# ClickHouse Exporter

| Status        |           |
| ------------- |-----------|
| Stability     | [alpha]: metrics |
|               | [beta]: traces, logs |
| Distributions | [contrib] |
| Issues        | [![Open issues](https://img.shields.io/github/issues-search/open-telemetry/opentelemetry-collector-contrib?query=is%3Aissue%20is%3Aopen%20label%3Aexporter%2Fclickhouse%20&label=open&color=orange&logo=opentelemetry)](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues?q=is%3Aopen+is%3Aissue+label%3Aexporter%2Fclickhouse) [![Closed issues](https://img.shields.io/github/issues-search/open-telemetry/opentelemetry-collector-contrib?query=is%3Aissue%20is%3Aclosed%20label%3Aexporter%2Fclickhouse%20&label=closed&color=blue&logo=opentelemetry)](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues?q=is%3Aclosed+is%3Aissue+label%3Aexporter%2Fclickhouse) |
| Code coverage | [![codecov](https://codecov.io/github/open-telemetry/opentelemetry-collector-contrib/graph/main/badge.svg?component=exporter_clickhouse)](https://app.codecov.io/gh/open-telemetry/opentelemetry-collector-contrib/tree/main/?components%5B0%5D=exporter_clickhouse&displayType=list) |
| [Code Owners](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/CONTRIBUTING.md#becoming-a-code-owner) | [@hanjm](https://www.github.com/hanjm), [@Frapschen](https://www.github.com/Frapschen), [@SpencerTorres](https://www.github.com/SpencerTorres) |
| Emeritus      | [@dmitryax](https://www.github.com/dmitryax) |

[alpha]: https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/component-stability.md#alpha
[beta]: https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/component-stability.md#beta
[contrib]: https://github.com/open-telemetry/opentelemetry-collector-releases/tree/main/distributions/otelcol-contrib

This exporter supports sending OpenTelemetry data to [ClickHouse](https://clickhouse.com/).

> ClickHouse is an open-source, high-performance columnar OLAP database management system for real-time analytics using SQL.
> Throughput can be measured in rows per second or megabytes per second.
> If the data is placed in the page cache, a query that is not too complex is processed on modern hardware at a speed of approximately 2-10 GB/s of uncompressed data on a single server.
> If 10 bytes of columns are extracted, the speed is expected to be around 100-200 million rows per second.

Note: Always add the [batch processor](https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor/batchprocessor) to the collector pipeline, as the [ClickHouse documentation recommends](https://clickhouse.com/docs/en/introduction/performance/#performance-when-inserting-data):

> We recommend inserting data in packets of at least 1000 rows, or no more than a single request per second. When inserting to a MergeTree table from a tab-separated dump, the insertion speed can be from 50 to 200 MB/s.
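A minimal sketch of such a pipeline follows; the receiver and the batch settings shown are illustrative, not prescriptive:

```yaml
processors:
  batch:
    send_batch_size: 10000  # aim for large inserts, per the recommendation above
    timeout: 5s

service:
  pipelines:
    logs:
      receivers: [ otlp ]
      processors: [ batch ]
      exporters: [ clickhouse ]
```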
## Visualization Tools

#### Official ClickHouse Plugin for Grafana

The official [ClickHouse Datasource for Grafana](https://grafana.com/grafana/plugins/grafana-clickhouse-datasource/) contains features that integrate directly with this exporter.
You can view associated [logs](https://clickhouse.com/docs/en/integrations/grafana/query-builder#logs) and [traces](https://clickhouse.com/docs/en/integrations/grafana/query-builder#traces), as well as visualize other queries such as tables and time series graphs.
Learn [how to configure the OpenTelemetry integration](https://clickhouse.com/docs/en/integrations/grafana/config#opentelemetry).

#### Altinity's ClickHouse Plugin for Grafana

If the official plugin doesn't meet your needs, you can try the [Altinity plugin for ClickHouse](https://grafana.com/grafana/plugins/vertamedia-clickhouse-datasource/), which also supports a wide range of features.

### Logs

- Get the log severity count as a time series:

```sql
SELECT toDateTime(toStartOfInterval(Timestamp, INTERVAL 60 second)) as time, SeverityText, count() as count
FROM otel_logs
WHERE toStartOfFiveMinutes(Timestamp) >= toStartOfFiveMinutes(NOW() - INTERVAL 1 HOUR)
  AND Timestamp >= NOW() - INTERVAL 1 HOUR
GROUP BY SeverityText, time
ORDER BY time;
```

The default logs table is ordered by `(toStartOfFiveMinutes(Timestamp), ServiceName, Timestamp)`. For time range queries, filter on both `toStartOfFiveMinutes(Timestamp)` and `Timestamp`, and order by the tuple `(toStartOfFiveMinutes(Timestamp), Timestamp)` to use the primary key's read-in-order optimization. Apply `toStartOfFiveMinutes` to the range bound as well (e.g. `toStartOfFiveMinutes(NOW() - INTERVAL 1 HOUR)`) so the time bucket bounds are not truncated off the scan.

- Find any log:

```sql
SELECT Timestamp as log_time, Body
FROM otel_logs
WHERE toStartOfFiveMinutes(Timestamp) >= toStartOfFiveMinutes(NOW() - INTERVAL 1 HOUR)
  AND Timestamp >= NOW() - INTERVAL 1 HOUR
ORDER BY (toStartOfFiveMinutes(Timestamp), Timestamp) DESC
LIMIT 100;
```

- Find logs for a specific service:

```sql
SELECT Timestamp as log_time, Body
FROM otel_logs
WHERE ServiceName = 'clickhouse-exporter'
  AND toStartOfFiveMinutes(Timestamp) >= toStartOfFiveMinutes(NOW() - INTERVAL 1 HOUR)
  AND Timestamp >= NOW() - INTERVAL 1 HOUR
ORDER BY (toStartOfFiveMinutes(Timestamp), Timestamp) DESC
LIMIT 100;
```

- Find logs with a specific attribute:

```sql
SELECT Timestamp as log_time, Body
FROM otel_logs
WHERE LogAttributes['container_name'] = '/example_flog_1'
  AND toStartOfFiveMinutes(Timestamp) >= toStartOfFiveMinutes(NOW() - INTERVAL 1 HOUR)
  AND Timestamp >= NOW() - INTERVAL 1 HOUR
ORDER BY (toStartOfFiveMinutes(Timestamp), Timestamp) DESC
LIMIT 100;
```

- Find logs whose body contains a string token:

```sql
SELECT Timestamp as log_time, Body
FROM otel_logs
WHERE hasToken(Body, 'http')
  AND toStartOfFiveMinutes(Timestamp) >= toStartOfFiveMinutes(NOW() - INTERVAL 1 HOUR)
  AND Timestamp >= NOW() - INTERVAL 1 HOUR
ORDER BY (toStartOfFiveMinutes(Timestamp), Timestamp) DESC
LIMIT 100;
```

- Find logs whose body contains a string:

```sql
SELECT Timestamp as log_time, Body
FROM otel_logs
WHERE Body like '%http%'
  AND toStartOfFiveMinutes(Timestamp) >= toStartOfFiveMinutes(NOW() - INTERVAL 1 HOUR)
  AND Timestamp >= NOW() - INTERVAL 1 HOUR
ORDER BY (toStartOfFiveMinutes(Timestamp), Timestamp) DESC
LIMIT 100;
```

- Find logs whose body matches a regular expression:

```sql
SELECT Timestamp as log_time, Body
FROM otel_logs
WHERE match(Body, 'http')
  AND toStartOfFiveMinutes(Timestamp) >= toStartOfFiveMinutes(NOW() - INTERVAL 1 HOUR)
  AND Timestamp >= NOW() - INTERVAL 1 HOUR
ORDER BY (toStartOfFiveMinutes(Timestamp), Timestamp) DESC
LIMIT 100;
```

- Find logs by extracting a JSON value from the body:

```sql
SELECT Timestamp as log_time, Body
FROM otel_logs
WHERE JSONExtractFloat(Body, 'bytes') > 1000
  AND toStartOfFiveMinutes(Timestamp) >= toStartOfFiveMinutes(NOW() - INTERVAL 1 HOUR)
  AND Timestamp >= NOW() - INTERVAL 1 HOUR
ORDER BY (toStartOfFiveMinutes(Timestamp), Timestamp) DESC
LIMIT 100;
```
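The default logs table also stores `TraceId` and `SpanId` columns, so logs can be correlated with traces. A sketch, assuming the default schema (the trace ID value is illustrative):

```sql
-- Fetch the logs emitted within a given trace.
SELECT Timestamp as log_time, ServiceName, Body
FROM otel_logs
WHERE TraceId = '391dae938234560b16bb63f51501cb6f'
  AND toStartOfFiveMinutes(Timestamp) >= toStartOfFiveMinutes(NOW() - INTERVAL 1 HOUR)
  AND Timestamp >= NOW() - INTERVAL 1 HOUR
ORDER BY (toStartOfFiveMinutes(Timestamp), Timestamp) DESC
LIMIT 100;
```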
### Traces

- Find spans with a specific attribute:

```sql
SELECT Timestamp, TraceId, SpanId, ParentSpanId, SpanName, SpanKind, ServiceName, Duration,
       StatusCode, StatusMessage, toString(SpanAttributes), toString(ResourceAttributes),
       toString(Events.Name), toString(Links.TraceId)
FROM otel_traces
WHERE ServiceName = 'clickhouse-exporter'
  AND SpanAttributes['peer.service'] = 'telemetrygen-server'
  AND Timestamp >= NOW() - INTERVAL 1 HOUR
LIMIT 100;
```

- Find traces by trace ID (using the time primary index and the TraceId skip index):

```sql
WITH
    '391dae938234560b16bb63f51501cb6f' as trace_id,
    (SELECT min(Start) FROM otel_traces_trace_id_ts WHERE TraceId = trace_id) as start,
    (SELECT max(End) + 1 FROM otel_traces_trace_id_ts WHERE TraceId = trace_id) as end
SELECT Timestamp, TraceId, SpanId, ParentSpanId, SpanName, SpanKind, ServiceName, Duration,
       StatusCode, StatusMessage, toString(SpanAttributes), toString(ResourceAttributes),
       toString(Events.Name), toString(Links.TraceId)
FROM otel_traces
WHERE TraceId = trace_id
  AND Timestamp >= start
  AND Timestamp <= end
LIMIT 100;
```

- Find spans with an error status:

```sql
SELECT Timestamp, TraceId, SpanId, ParentSpanId, SpanName, SpanKind, ServiceName, Duration,
       StatusCode, StatusMessage, toString(SpanAttributes), toString(ResourceAttributes),
       toString(Events.Name), toString(Links.TraceId)
FROM otel_traces
WHERE ServiceName = 'clickhouse-exporter'
  AND StatusCode = 'Error'
  AND Timestamp >= NOW() - INTERVAL 1 HOUR
LIMIT 100;
```

- Find slow spans:

```sql
SELECT Timestamp, TraceId, SpanId, ParentSpanId, SpanName, SpanKind, ServiceName, Duration,
       StatusCode, StatusMessage, toString(SpanAttributes), toString(ResourceAttributes),
       toString(Events.Name), toString(Links.TraceId)
FROM otel_traces
WHERE ServiceName = 'clickhouse-exporter'
  AND Duration > 1 * 1e9 -- over 1 second; Duration is stored in nanoseconds
  AND Timestamp >= NOW() - INTERVAL 1 HOUR
LIMIT 100;
```

### Metrics

Metrics data is stored in different ClickHouse tables depending on the metric type. Each table name carries a suffix indicating which type of metric it stores.

| Metrics Type          | Metrics Table          |
| --------------------- | ---------------------- |
| sum                   | _sum                   |
| gauge                 | _gauge                 |
| histogram             | _histogram             |
| exponential histogram | _exponential_histogram |
| summary               | _summary               |

Before you write a metrics query, you need to know the type of metric you wish to query. If your metrics come from Prometheus (or anything else using the OpenMetrics protocol), you also need to know the [compatibility](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/compatibility/prometheus_and_openmetrics.md#prometheus-and-openmetrics-compatibility) between Prometheus (OpenMetrics) and OTLP metrics.

- Find a sum metric by name:

```sql
SELECT TimeUnix, MetricName, Attributes, Value
FROM otel_metrics_sum
WHERE MetricName = 'calls'
LIMIT 100;
```

- Find a sum metric by name and attribute:

```sql
SELECT TimeUnix, MetricName, Attributes, Value
FROM otel_metrics_sum
WHERE MetricName = 'calls'
  AND Attributes['service_name'] = 'featureflagservice'
LIMIT 100;
```

OTLP metrics [define two value types for a single data point](https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/metrics/v1/metrics.proto#L358) (double and integer); the exporter stores both in a single Float64 value.
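Histogram data points are stored with their bucket arrays. A sketch for reading raw histogram rows, assuming the default `otel_metrics_histogram` schema (`Count`, `Sum`, `ExplicitBounds`, `BucketCounts`) and an illustrative metric name:

```sql
SELECT TimeUnix, MetricName, Attributes, Count, Sum, ExplicitBounds, BucketCounts
FROM otel_metrics_histogram
WHERE MetricName = 'http.server.duration'
LIMIT 100;
```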
## Performance Guide

A single ClickHouse instance with 32 CPU cores and 128 GB RAM can handle around 20 TB (20 billion) of logs per day. With a compression ratio of 7-11, that is roughly 1.8 TB to 2.85 TB of compressed data on disk. Adding more ClickHouse nodes to the cluster increases capacity linearly.

A collector with an `otlp receiver / batch processor / clickhouse tcp exporter` pipeline can process around 40k log entries per second per CPU core. Adding more collector nodes increases throughput linearly.

## Configuration options

The following settings are required:

- `endpoint` (no default): The ClickHouse server address. Multiple hosts with ports are supported, for example:
  - tcp protocol: `tcp://addr1:port,tcp://addr2:port`, or with TLS: `tcp://addr1:port,addr2:port?secure=true`
  - http protocol: `http://addr1:port,addr2:port`, or https: `https://addr1:port,addr2:port`
  - clickhouse protocol: `clickhouse://addr1:port,addr2:port`, or with TLS: `clickhouse://addr1:port,addr2:port?secure=true`

  When multiple endpoints are provided, the driver handles load balancing and automatic failover. By default, it uses the `in_order` strategy (tries endpoints in the order specified). Alternatively, set `connection_open_strategy=round_robin` (distributes connections evenly) or `connection_open_strategy=random` (randomly selects endpoints) in `connection_params`. See the [connection_open_strategy documentation](https://pkg.go.dev/github.com/ClickHouse/clickhouse-go/v2#readme-connection-strategy).

  Many other ClickHouse-specific options can be configured through query parameters, e.g. `addr?dial_timeout=5s&compress=lz4`. For a full list of options, see the [ClickHouse driver documentation](https://pkg.go.dev/github.com/ClickHouse/clickhouse-go/v2#readme-connection-settings-reference).

Connection options:

- `username` (default = ): The authentication username.
- `password` (default = ): The authentication password.
- `ttl` (default = 0): The data time-to-live, for example 30m or 48h. 0 means no TTL.
- `database` (default = default): The database name. Overrides the database defined in `endpoint` when this setting is not equal to `default`.
- `connection_params` (default = {}): Extra connection parameters in map format. Query parameters provided in `endpoint` are individually overwritten if present in this map. Parameters can be either driver parameters (e.g. `connection_open_strategy`, `max_open_conns`) that control client-side behavior, or ClickHouse session settings (e.g. `max_execution_time`) that are passed to the server. See the [driver parameters list](https://pkg.go.dev/github.com/ClickHouse/clickhouse-go/v2#Options) for recognized driver options; all others are treated as session settings.
- `create_schema` (default = true): When set to true, runs DDL to create the database and tables (see [schema management](#schema-management)).
- `compress` (default = lz4): Controls the compression algorithm. Valid options: `none` (disabled), `zstd`, `lz4` (default), `gzip`, `deflate`, `br`, `true` (lz4). Ignored if `compress` is set in the `endpoint` or `connection_params`.
- `async_insert` (default = true): Enables [async inserts](https://clickhouse.com/docs/en/optimize/asynchronous-inserts). Ignored if async inserts are configured in the `endpoint` or `connection_params`. Async inserts may still be overridden server-side.
- `tls`: Advanced TLS configuration (see [TLS](#tls)).

Additional DSN features:

The underlying `clickhouse-go` module offers additional configuration. These can be set in the exporter's `endpoint` or `connection_params` config values.

- `client_info_product`: Must be in `productName/version` format with comma-separated entries. By default the exporter will append its binary build information. You can use this information to track the origin of `INSERT` statements in the `system.query_log` table.
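As a sketch combining several of the options above (host names and values are illustrative):

```yaml
exporters:
  clickhouse:
    endpoint: tcp://ch-1:9000,ch-2:9000?dial_timeout=10s
    connection_params:
      connection_open_strategy: round_robin  # driver parameter
      max_execution_time: "60"                # ClickHouse session setting
      client_info_product: my-collector/1.0   # shows up in system.query_log
```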
ClickHouse tables:

- `logs_table_name` (default = otel_logs): The table name for logs.
- `traces_table_name` (default = otel_traces): The table name for traces.
- `metrics_tables`
  - `gauge`
    - `name` (default = "otel_metrics_gauge")
  - `sum`
    - `name` (default = "otel_metrics_sum")
  - `summary`
    - `name` (default = "otel_metrics_summary")
  - `histogram`
    - `name` (default = "otel_metrics_histogram")
  - `exponential_histogram`
    - `name` (default = "otel_metrics_exp_histogram")

Cluster definition:

- `cluster_name` (default = ): Optional. If present, `ON CLUSTER cluster_name` is included when creating tables.

Table engine:

- `table_engine`
  - `name` (default = MergeTree)
  - `params` (default = )

Modifies the `ENGINE` definition when the table is created. If not set, `ENGINE` defaults to `MergeTree()`. Can be combined with `cluster_name` to enable [replication for fault tolerance](https://clickhouse.com/docs/en/architecture/replication).

Processing:

- `timeout` (default = 5s): The timeout for every attempt to send data to the backend.
- `sending_queue`
  - `enabled` (default = true)
  - `num_consumers` (default = 10): Number of consumers that dequeue batches; ignored if `enabled` is `false`.
  - `queue_size` (default = 1000): Maximum number of batches kept in memory before dropping data.
- `retry_on_failure`
  - `enabled` (default = true)
  - `initial_interval` (default = 5s): Time to wait after the first failure before retrying; ignored if `enabled` is `false`.
  - `max_interval` (default = 30s): The upper bound on backoff; ignored if `enabled` is `false`.
  - `max_elapsed_time` (default = 300s): The maximum amount of time spent trying to send a batch; ignored if `enabled` is `false`.

## TLS

The exporter supports TLS. To enable TLS, specify the `secure=true` query parameter in the `endpoint` URL, or use the `https` scheme. You may also use certificate authentication with the `tls` setting:

```yaml
exporters:
  clickhouse:
    endpoint: . . .
    tls:
      insecure: false
      insecure_skip_verify: false
      ca_file: CAroot.crt
      cert_file: client.crt
      key_file: client.key
```

The available `tls` options are inherited from [OpenTelemetry's TLS config structure](https://pkg.go.dev/go.opentelemetry.io/collector/config/configtls#ClientConfig); more options are available than shown in this example.

## Schema management

By default, the exporter will create the database and tables under the names defined in the config. This is fine for simple deployments, but for production workloads it is recommended that you manage your own schema by setting `create_schema` to `false` in the config. This prevents each exporter process from racing to create the database and tables, and makes it easier to upgrade the exporter in the future. In this mode, the only SQL sent to your server will be `INSERT` statements.

The default DDL used by the exporter can be found in `internal/sqltemplates`. Be sure to customize the indexes, TTL, and partitioning to fit your deployment. Column names and types must stay the same to preserve compatibility with the exporter's `INSERT` statements. As long as the column names/types match the `INSERT` statement, you can create whatever kind of table you want. See [ClickHouse's LogHouse](https://clickhouse.com/blog/building-a-logging-platform-with-clickhouse-and-saving-millions-over-datadog#schema) as an example of this flexibility.
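For example, a production deployment with an externally managed schema might configure the exporter like this (endpoint and database are illustrative):

```yaml
exporters:
  clickhouse:
    endpoint: tcp://clickhouse:9000
    database: otel
    create_schema: false  # schema is managed out-of-band; the exporter only issues INSERTs
```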
### Upgrading existing tables

Sometimes new columns are added to the exporter in a backwards compatible way. The exporter runs a `DESC TABLE` command on startup to determine which of these new columns are available on the table schema. If you already have tables created by a previous version of the exporter, you will need to add these new columns manually. Here is an example of a command you can use to update your existing table (adjust database and table names as needed):

```sql
ALTER TABLE otel.otel_logs ADD COLUMN IF NOT EXISTS EventName String CODEC(ZSTD(1));
```

To find the newest columns available, check the `internal/sqltemplates` folder. The `CREATE TABLE` statements will always have the latest columns.

In some cases, table changes will not be backwards compatible. Be sure to check the changelog for breaking changes before upgrading your collector.

### Optional table upgrades

As mentioned in the previous section, the exporter is able to detect which columns are present on the schema for backwards compatibility. Here are some columns you can add to your table to update the schema:

```sql
-- EventName
ALTER TABLE otel.otel_logs
    ADD COLUMN IF NOT EXISTS EventName String CODEC(ZSTD(1));

-- JSON tables only. These 3 columns are part of one feature, you must add all 3 at once.
ALTER TABLE otel.otel_logs
    ADD COLUMN IF NOT EXISTS ResourceAttributesKeys Array(LowCardinality(String)) CODEC(ZSTD(1)),
    ADD COLUMN IF NOT EXISTS ScopeAttributesKeys Array(LowCardinality(String)) CODEC(ZSTD(1)),
    ADD COLUMN IF NOT EXISTS LogAttributesKeys Array(LowCardinality(String)) CODEC(ZSTD(1)),
    -- Optional indices
    ADD INDEX IF NOT EXISTS idx_res_attr_keys ResourceAttributesKeys TYPE bloom_filter(0.01) GRANULARITY 1,
    ADD INDEX IF NOT EXISTS idx_scope_attr_keys ScopeAttributesKeys TYPE bloom_filter(0.01) GRANULARITY 1,
    ADD INDEX IF NOT EXISTS idx_log_attr_keys LogAttributesKeys TYPE bloom_filter(0.01) GRANULARITY 1;

-- JSON tables only. These 2 columns are part of one feature, you must add both at once.
ALTER TABLE otel.otel_traces
    ADD COLUMN IF NOT EXISTS ResourceAttributesKeys Array(LowCardinality(String)) CODEC(ZSTD(1)),
    ADD COLUMN IF NOT EXISTS SpanAttributesKeys Array(LowCardinality(String)) CODEC(ZSTD(1)),
    -- Optional indices
    ADD INDEX IF NOT EXISTS idx_res_attr_keys ResourceAttributesKeys TYPE bloom_filter(0.01) GRANULARITY 1,
    ADD INDEX IF NOT EXISTS idx_span_attr_keys SpanAttributesKeys TYPE bloom_filter(0.01) GRANULARITY 1;
```
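To verify which of these columns are present on an existing table, you can query ClickHouse's `system.columns`; the database, table, and column names here follow the snippets above, so adjust as needed:

```sql
SELECT name, type
FROM system.columns
WHERE database = 'otel'
  AND table = 'otel_logs'
  AND name IN ('EventName', 'ResourceAttributesKeys', 'ScopeAttributesKeys', 'LogAttributesKeys');
```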
## Example Config

This example shows how to configure the exporter to send data to a ClickHouse server. It uses the native protocol without TLS. The exporter will create the database and tables if they don't exist. The data is stored for 72 hours (3 days).

```yaml
receivers:
  examplereceiver:

exporters:
  clickhouse:
    endpoint: tcp://127.0.0.1:9000?dial_timeout=10s
    database: otel
    async_insert: true
    ttl: 72h
    compress: lz4
    create_schema: true
    logs_table_name: otel_logs
    traces_table_name: otel_traces
    timeout: 5s
    metrics_tables:
      gauge:
        name: "otel_metrics_gauge"
      sum:
        name: "otel_metrics_sum"
      summary:
        name: "otel_metrics_summary"
      histogram:
        name: "otel_metrics_histogram"
      exponential_histogram:
        name: "otel_metrics_exp_histogram"
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s
    # cluster_name: my_cluster
    # table_engine:
    #   name: ReplicatedMergeTree
    #   params:

service:
  pipelines:
    logs:
      receivers: [ examplereceiver ]
      exporters: [ clickhouse ]
```

## Experimental JSON support

JSON column types can be enabled per exporter using the `json` config option:

```yaml
exporters:
  clickhouse:
    endpoint: clickhouse://localhost:9000?enable_json_type=1
    json: true
```

Previously, the `clickhouse.json` feature gate was used to enable JSON for all ClickHouse exporter instances. This feature gate is now deprecated; use the `json` config option instead, which allows per-pipeline control. You may also need to add `enable_json_type=1` to your endpoint or `connection_params`.

The DDL has been updated, but feel free to tune the schema as needed. The DDL can be found in the `internal/sqltemplates` package. All `Map` columns have been replaced with `JSON`.

ClickHouse v25+ is recommended for reliable JSON support.

## Contributing

Before contributing, review the contribution guidelines in [CONTRIBUTING.md](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/CONTRIBUTING.md).

#### Integration tests

Integration tests can be run with the following command (includes unit tests):

```sh
go test -tags integration
```

*Note: Make sure integration tests pass after making changes to SQL.*