Timeseries can accept OpenTelemetry metrics through a durable queue in object storage instead of (or in addition to) the direct OTLP/HTTP endpoint. An OpenTelemetry Collector with the
opendata exporter writes batches into the queue, and a background consumer
inside the Timeseries server drains them into the TSDB.
This is the stateless zonal ingestion
pattern enabled by the opendata Buffer library applied to
metrics.
Why use it
The direct OTLP/HTTP endpoint couples ingest availability to TSDB availability. If the server is down or crashes before it has flushed an accepted request, the metrics are lost. Routing writes through object storage changes that:

- Producers keep writing even when the TSDB is unavailable.
- A crashed consumer resumes from the last acked batch.
- Traffic stays within the AZ, avoiding cross-zone transfer fees that add up at high metric volumes.
Architecture
Both paths converge on the same OTLP-to-Prometheus conversion and TSDB write, so semantics match exactly. You can run both at once if some pipelines need immediate writes and others tolerate queue latency.

Collector side
The opendata exporter lives in the
opendata-go repository and is
distributed as a standalone OTel Collector component. Build a custom collector
with the OpenTelemetry Collector Builder.
Builder config
builder-config.yaml
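The original builder config is not reproduced here, so the sketch below shows the general shape such a file takes. The exporter's Go module path is a placeholder (the real path lives in the opendata-go repository), and all versions are illustrative:

```yaml
# Sketch of an OpenTelemetry Collector Builder config.
# The opendata exporter module path below is a placeholder.
dist:
  name: otelcol-opendata
  output_path: ./dist
receivers:
  - gomod: go.opentelemetry.io/collector/receiver/otlpreceiver v0.109.0
processors:
  - gomod: go.opentelemetry.io/collector/processor/batchprocessor v0.109.0
exporters:
  - gomod: github.com/example/opendata-go/exporter/opendataexporter v0.1.0 # placeholder
```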
Dockerfile
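Likewise, a two-stage Dockerfile that runs the builder and ships the resulting binary could look like this; paths and versions are illustrative:

```dockerfile
# Build stage: install the OpenTelemetry Collector Builder and run it.
FROM golang:1.22 AS build
ENV CGO_ENABLED=0
RUN go install go.opentelemetry.io/collector/cmd/builder@v0.109.0
WORKDIR /src
COPY builder-config.yaml .
RUN builder --config builder-config.yaml

# Runtime stage: copy only the built collector and its config.
FROM gcr.io/distroless/static-debian12
COPY --from=build /src/dist/otelcol-opendata /otelcol-opendata
COPY collector-config.yaml /etc/collector-config.yaml
ENTRYPOINT ["/otelcol-opendata", "--config", "/etc/collector-config.yaml"]
```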
Exporter config
collector-config.yaml
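A pipeline wiring the exporter might look like the following. The exporter key is assumed to be opendata, the bucket name and thresholds are example values, and the fields themselves are documented in the table that follows:

```yaml
receivers:
  otlp:
    protocols:
      http:

processors:
  batch: # upstream batching controls the exporter's request rate (see below)
    timeout: 5s

exporters:
  opendata:
    object_store: s3://metrics-ingest-us-east-1a  # example; must match the consumer
    data_path_prefix: ingest/data
    manifest_path: ingest/manifest
    flush_interval: 2s
    flush_size_bytes: 8388608   # 8 MiB
    compression: zstd

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [opendata]
```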
| Field | Description |
|---|---|
| object_store | Bucket where batches and the manifest are written. Must match the consumer’s object_store. |
| data_path_prefix | Path prefix for batch objects. |
| manifest_path | Path to the queue manifest. The consumer reads the same path. |
| flush_interval | Maximum time a batch is held before flushing. |
| flush_size_bytes | Size threshold that triggers a flush. |
| compression | none or zstd. Zstd uses level 3. |
Each ConsumeMetrics call marshals the OTLP protobuf, writes it as one entry
with a 4-byte metadata header identifying it as metrics, and awaits durable
confirmation from object storage before returning to the pipeline. That means a
batch processor upstream is the right place to trade off request rate against
flush rate.
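As a sketch of that write path, assuming hypothetical names for the Buffer API (Buffer, Append) and an illustrative header encoding, the exporter's hot path looks roughly like this:

```go
package opendataexporter

import (
	"context"
	"encoding/binary"

	"go.opentelemetry.io/collector/pdata/pmetric"
)

// entryTypeMetrics stands in for the RFC 0006 metrics tag; the real value
// and byte order belong to the Buffer library, not this sketch.
const entryTypeMetrics uint32 = 1

// Buffer models the durable-queue write call. Append is assumed to return
// only after object storage has confirmed the batch containing the entry.
type Buffer interface {
	Append(ctx context.Context, entry []byte) error
}

type exporter struct {
	buffer Buffer
}

// ConsumeMetrics marshals the OTLP payload, prepends the 4-byte metadata
// header, and blocks until the entry is durable.
func (e *exporter) ConsumeMetrics(ctx context.Context, md pmetric.Metrics) error {
	var m pmetric.ProtoMarshaler
	payload, err := m.MarshalMetrics(md)
	if err != nil {
		return err
	}
	entry := make([]byte, 4+len(payload))
	binary.BigEndian.PutUint32(entry[:4], entryTypeMetrics)
	copy(entry[4:], payload)
	return e.buffer.Append(ctx, entry)
}
```

Because Append blocks until durability is confirmed, an upstream batch processor directly sets how often this path runs, which is the trade-off the paragraph above describes.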
Consumer side
Turn on the consumer by adding buffer_consumer to prometheus.yaml:
prometheus.yaml
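A minimal stanza, using the defaults from the table below; the exact top-level placement inside prometheus.yaml is an assumption here:

```yaml
buffer_consumer:
  object_store: s3://metrics-ingest-us-east-1a  # example; must match the exporter
  manifest_path: ingest/manifest                # default; must match the exporter
  poll_interval: 1s                             # default
```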
| Field | Default | Description |
|---|---|---|
| object_store | required | Bucket holding the queue. |
| manifest_path | ingest/manifest | Must match the exporter’s manifest_path. |
| poll_interval | 1s | Delay between polls when the queue is empty. |
Batch format
Batches are self-describing. Each file contains a record block (optionally compressed) followed by a 7-byte footer that indicates the compression type, record count, and format version. The consumer reads the footer, decompresses the record block if needed, and parses the length-prefixed entries. See RFC 0001 for the wire format and RFC 0006 for the 4-byte metadata header that tells the consumer a batch holds OTLP metrics.
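To make that concrete, here is an illustrative decoder. The footer field widths (1-byte compression, 4-byte count, 2-byte version), the byte order, and the 4-byte length prefix are assumptions made for this sketch; RFC 0001 is authoritative:

```go
package buffer

import (
	"encoding/binary"
	"fmt"

	"github.com/klauspost/compress/zstd"
)

// parseBatch splits a batch file into its raw entries.
func parseBatch(raw []byte) ([][]byte, error) {
	if len(raw) < 7 {
		return nil, fmt.Errorf("batch smaller than 7-byte footer")
	}
	footer := raw[len(raw)-7:]
	block := raw[:len(raw)-7]

	compression := footer[0]                        // assumed: 0 = none, 1 = zstd
	count := binary.BigEndian.Uint32(footer[1:5])   // record count
	version := binary.BigEndian.Uint16(footer[5:7]) // format version

	if compression == 1 {
		dec, err := zstd.NewReader(nil)
		if err != nil {
			return nil, err
		}
		defer dec.Close()
		if block, err = dec.DecodeAll(block, nil); err != nil {
			return nil, err
		}
	}

	// Entries are length-prefixed; a 4-byte big-endian prefix is assumed.
	entries := make([][]byte, 0, count)
	for len(block) > 0 {
		if len(block) < 4 {
			return nil, fmt.Errorf("truncated length prefix")
		}
		n := binary.BigEndian.Uint32(block[:4])
		if uint64(n) > uint64(len(block)-4) {
			return nil, fmt.Errorf("truncated entry")
		}
		entries = append(entries, block[4:4+n])
		block = block[4+n:]
	}
	if uint32(len(entries)) != count {
		return nil, fmt.Errorf("format v%d: footer says %d entries, parsed %d",
			version, count, len(entries))
	}
	return entries, nil
}
```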
Delivery semantics

At-least-once. If the consumer crashes between processing a batch and acking it, the batch is re-read on restart. Duplicate writes to the TSDB are idempotent: samples are keyed by (series, timestamp), so a replay overwrites with the same value.
Observability
The consumer publishes metrics under the buffer_ prefix on the Timeseries
/metrics endpoint.
| Metric | Type | Description |
|---|---|---|
| buffer_batches_collected | counter | Batches fetched from object store. |
| buffer_entries_collected | counter | Entries across collected batches. |
| buffer_bytes_collected | counter | Bytes read from object store. |
| buffer_acks | counter | Batch acks processed. |
| buffer_consumer_lag_seconds | gauge | Wall clock minus last batch ingestion time. |
| buffer_queue_length | gauge | Entries currently in the manifest queue. |
| buffer_fetch_duration_seconds | histogram | Per-batch fetch latency from object store. |
| buffer_gc_files_deleted | counter | Batch files cleaned by GC. |
| buffer_gc_files_failed | counter | Failed GC file deletions. |
| buffer_gc_duration_seconds | histogram | GC cycle wall time. |
| buffer_manifest_writes | counter | Manifest write attempts (label role). |
| buffer_manifest_conflicts | counter | Manifest CAS conflicts (label role). |
| tsdb_ingest_entries_skipped_total | counter | Entries skipped due to decode or conversion errors. |
A growing buffer_queue_length while buffer_acks stays flat means the consumer is falling behind. Check TSDB write latency and consider reducing flush_interval on the exporter side (smaller, more frequent batches reduce per-batch variance) or raising the consumer’s CPU budget.
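One way to encode that check, sketched as a Prometheus alert rule with arbitrary thresholds and windows:

```yaml
groups:
  - name: buffer-consumer
    rules:
      - alert: BufferConsumerFallingBehind
        # Queue keeps growing while acks are flat; 10m/15m windows are examples.
        expr: |
          deriv(buffer_queue_length[10m]) > 0 and rate(buffer_acks[10m]) == 0
        for: 15m
```

A simpler single-metric alternative is alerting on buffer_consumer_lag_seconds exceeding a threshold you choose.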
Running both paths
buffer_consumer and the OTLP/HTTP endpoint coexist. A common setup is:
- Route high-volume, lossy-tolerant OTel traffic through the queue.
- Keep the direct endpoint for low-latency writes (scrapers, remote-write senders, local agents).
Both paths end in the same TimeSeriesDb::write, so
there is no semantic difference between a metric that arrived via the queue
and one that arrived over HTTP.