# Elasticsearch

## Overview
GreptimeDB supports data ingestion through Elasticsearch's `_bulk` API. We map Elasticsearch's Index concept to GreptimeDB's Table, and users can specify the database name using the `db` URL parameter. Unlike native Elasticsearch, this API only supports data insertion, not modification or deletion. At the implementation level, both `index` and `create` commands in native Elasticsearch `_bulk` API requests are treated as creation operations by GreptimeDB. Additionally, GreptimeDB only parses the `_index` field from native `_bulk` API command requests while ignoring other fields.
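For example, the two command lines below behave identically once they reach GreptimeDB. This is a minimal sketch using `curl` against a local instance on the default HTTP port; the table name `es_demo` and the document fields are illustrative:

```shell
# Both `index` and `create` are treated as insert operations by GreptimeDB;
# only `_index` is read from each command line, other metadata such as
# `_id` is ignored. Table name `es_demo` is hypothetical.
curl -XPOST 'http://localhost:4000/v1/elasticsearch/_bulk' \
  -H 'Content-Type: application/json' \
  -d '{"index": {"_index": "es_demo", "_id": "1"}}
{"host": "web-01", "status": "ok"}
{"create": {"_index": "es_demo", "_id": "2"}}
{"host": "web-02", "status": "ok"}
'
```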
## HTTP API
GreptimeDB supports data ingestion via the Elasticsearch protocol through two HTTP endpoints:
- `/v1/elasticsearch/_bulk`: Users can use the POST method to write data in NDJSON format to GreptimeDB.

  For example, the following request will create a table named `test` and insert two records:

  ```
  POST /v1/elasticsearch/_bulk

  {"create": {"_index": "test", "_id": "1"}}
  {"name": "John", "age": 30}
  {"create": {"_index": "test", "_id": "2"}}
  {"name": "Jane", "age": 25}
  ```

- `/v1/elasticsearch/${index}/_bulk`: Users can use the POST method to write data in NDJSON format to the `${index}` table in GreptimeDB. If the POST request also contains an `_index` field, the `${index}` in the URL will be ignored.

  For example, the following request will create tables named `test` and `another_index`, and insert the corresponding data:

  ```
  POST /v1/elasticsearch/test/_bulk

  {"create": {"_id": "1"}}
  {"name": "John", "age": 30}
  {"create": {"_index": "another_index", "_id": "2"}}
  {"name": "Jane", "age": 25}
  ```
Users can also use the following HTTP URL parameters:
- `db`: Specifies the database name. Defaults to `public` if not specified.
- `pipeline_name`: Specifies the pipeline name. Defaults to GreptimeDB's internal pipeline `greptime_identity` if not specified.
- `version`: Specifies the pipeline version. Defaults to the latest version of the corresponding pipeline if not specified.
- `msg_field`: Specifies the name of the JSON field that contains the original log data. In Logstash and Filebeat, for example, this field is typically `message`. If specified, GreptimeDB will attempt to parse the data in this field as JSON; if parsing fails, the field is treated as a string.
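As a quick illustration of `db` and `msg_field`, the request below writes to a database and expands the JSON carried in the `message` field. This is a minimal sketch; the `test_db` database (which must already exist) and the `app_logs` table are illustrative:

```shell
# `db=test_db` selects the target database (hypothetical, created beforehand);
# `msg_field=message` tells GreptimeDB to parse the `message` field as JSON.
# If the field is not valid JSON, it is stored as a plain string instead.
curl -XPOST 'http://localhost:4000/v1/elasticsearch/_bulk?db=test_db&msg_field=message' \
  -H 'Content-Type: application/json' \
  -d '{"create": {"_index": "app_logs", "_id": "1"}}
{"message": "{\"level\": \"info\", \"msg\": \"server started\"}"}
'
```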
## Usage
### Use the HTTP API to ingest data
You can create a `request.json` file with the following content:

```json
{"create": {"_index": "es_test", "_id": "1"}}
{"name": "John", "age": 30}
{"create": {"_index": "es_test", "_id": "2"}}
{"name": "Jane", "age": 25}
```
Then use the `curl` command to send this file as the request body to GreptimeDB. Note the use of `--data-binary` rather than `-d`: when reading from a file, `-d` strips newlines, which would break the NDJSON payload:

```shell
curl -XPOST http://localhost:4000/v1/elasticsearch/_bulk \
  -H "Content-Type: application/json" --data-binary @request.json
```
You can then use a MySQL client to connect to GreptimeDB and run the following SQL to view the inserted data:

```sql
SELECT * FROM es_test;
```
You will see the following results:

```
mysql> SELECT * FROM es_test;
+------+------+----------------------------+
| age  | name | greptime_timestamp         |
+------+------+----------------------------+
|   30 | John | 2025-01-15 08:26:06.516665 |
|   25 | Jane | 2025-01-15 08:26:06.521510 |
+------+------+----------------------------+
2 rows in set (0.13 sec)
```
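If you would rather stay on HTTP, the same check can go through GreptimeDB's SQL-over-HTTP endpoint instead of the MySQL protocol. A minimal sketch, assuming the data was written to the default `public` database:

```shell
# Run the same SELECT over GreptimeDB's HTTP SQL API; results come back as JSON.
curl -XPOST 'http://localhost:4000/v1/sql?db=public' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -d 'sql=SELECT * FROM es_test'
```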
### Logstash
If you are using Logstash to collect logs, you can use the following configuration to write data to GreptimeDB:
```
output {
  elasticsearch {
    hosts => ["http://localhost:4000/v1/elasticsearch"]
    index => "my_index"
    parameters => {
      "pipeline_name" => "my_pipeline"
      "msg_field" => "message"
    }
  }
}
```
The `parameters` section is optional; adjust `hosts` and `index` to match your actual setup.
### Filebeat
If you are using Filebeat to collect logs, you can use the following configuration to write data to GreptimeDB:
```yaml
output.elasticsearch:
  hosts: ["http://localhost:4000/v1/elasticsearch"]
  index: "my_index"
  parameters:
    pipeline_name: my_pipeline
    msg_field: message
```
The `parameters` section is optional; adjust `hosts` and `index` to match your actual setup.