# AI Token Ledger Explained > The emerging system of record for AI work — metering, tracing, storing, and governing every token transaction **Source:** [know.imbila.ai/ai-token-ledger-explainer](https://know.imbila.ai/ai-token-ledger-explainer) **Validated:** March 2026 | **Author:** [Imbila.AI](https://imbila.ai) | **License:** CC-BY-SA-4.0 **Structured data:** [ai-token-ledger.json](https://raw.githubusercontent.com/imbilawork/know.agent/main/agentapi/data/ai-token-ledger.json) --- ## What Is an AI Token Ledger An AI Token Ledger is a system of record that meters, logs, attributes and audits every unit of work performed by AI agents and LLM-powered applications. It treats the **token** — the fundamental unit of LLM input and output — as an atomic unit of economic activity, the same way cloud billing treats API calls or fintech ledgers treat financial transactions. No single product or protocol owns this space today. What exists is a **composable architecture** assembled from open-source components spanning observability, API gateways, workflow engines, immutable storage, and policy engines. Think of it as the "Stripe + Ledger + Observability + Policy Engine" for AI tokens — each layer handled by a different tool, none of them yet unified into one canonical system. The problem it solves is straightforward: as organisations deploy autonomous agents that make API calls, invoke tools, chain reasoning steps, and consume real money in tokens, they need to know **what happened, who did it, what it cost, and whether it was authorised**. Without a system of record, AI spend is ungovernable and agent behaviour is unauditable. ### The Problem **Invisible AI spend and unauditable agent actions.** Tokens are consumed across providers, models, agents, and workflows with no unified accounting. Traditional logging captures requests but not economic attribution or decision provenance. ### The Solution **A composable stack that meters, traces, stores, and governs every token transaction.** By layering an LLM gateway, observability platform, event stream, immutable ledger, and policy engine, teams create a complete audit trail from prompt to cost allocation. ### The Result **Full visibility into AI economics and agent behaviour.** Every API call is metered (tokens, latency, cost). Every agent action is logged (inputs, outputs, tools used). Every decision is auditable (human vs agent vs chain-of-agents). Every cost is allocatable (per user, agent, task, customer). ### How It Flows ``` Agent / User ↓ LLM Gateway (LiteLLM / Portkey / Helicone) ↓ Observability (Langfuse / OpenTelemetry) ↓ Event Stream (Kafka / Inngest / Temporal) ↓ Immutable Ledger (Postgres append-only / immudb) ↓ Analytics (ClickHouse) ↓ Governance (Open Policy Agent / custom rules) ``` --- ## Why It Matters AI spend is now a material line item. As organisations move from single-prompt applications to multi-agent systems that run autonomously for hours or days, the gap between "tokens consumed" and "value delivered" becomes a governance problem. Every untracked agent run is an unaudited financial transaction. ### Key Numbers | Metric | Value | |--------|-------| | **10B+** | Tokens processed daily through LiteLLM gateway alone (per BerriAI) | | **24K+** | GitHub stars on Langfuse (largest open-source LLM observability platform) | | **$0** | Cost of a unified token ledger standard today — because none exists yet | | **5** | Distinct infrastructure layers required to build a complete AI system of record | ### Why not just use cloud billing? Cloud provider bills tell you aggregate spend. They don't tell you which agent made which decision, whether it was authorised, what reasoning chain led to the cost, or how to attribute that cost to a specific customer, project, or workflow. The AI Token Ledger fills the gap between cloud invoices and operational accountability. --- ## How It Works The system of record for AI work is built from five composable layers. Each layer has open-source options at production maturity. ### Layer A — Observability & Tracing Captures token usage, latency, cost, prompts, responses, and traces across agent steps. - **Langfuse** — Open-source LLM engineering platform (YC W23). Tracks traces, sessions, token usage, cost per request. Supports OpenTelemetry ingestion natively since v3. Self-hostable via Docker or Kubernetes. Architecture: Postgres + ClickHouse + Redis + S3. License: MIT (core). [github.com/langfuse/langfuse](https://github.com/langfuse/langfuse) - **Helicone** — Open-source LLM observability (YC W23). Proxy-based: change one line to start logging. Rust-based AI gateway. [github.com/Helicone/helicone](https://github.com/Helicone/helicone) - **OpenTelemetry GenAI Semantic Conventions** — Emerging standard for LLM telemetry. Defines span attributes for `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, `gen_ai.request.model`. Agent-level conventions for `create_agent` and `invoke_agent`. Status: experimental. [opentelemetry.io/docs/specs/semconv/gen-ai/](https://opentelemetry.io/docs/specs/semconv/gen-ai/) - **OpenLLMetry** (by Traceloop) — Extends OTel semantic conventions for GenAI. Libraries for Python, TypeScript, Go, Ruby. Contributors lead the OTel GenAI SIG. ### Layer B — API Gateway / Proxy Centralises all LLM calls through a single control point. Enforces rate limits, budgets, routing. - **LiteLLM** — Python SDK + Proxy Server. Calls 100+ LLMs in OpenAI-compatible format. Multi-tenant cost tracking. 8ms P95 latency at 1K RPS. License: MIT. [github.com/BerriAI/litellm](https://github.com/BerriAI/litellm) - **Portkey AI Gateway** — <1ms latency, 122KB footprint. Routes to 1,600+ models. 10B+ tokens daily. License: MIT. [github.com/Portkey-AI/gateway](https://github.com/Portkey-AI/gateway) - **Helicone AI Gateway** — Rust-based, open-source. 20+ providers. [github.com/Helicone/ai-gateway](https://github.com/Helicone/ai-gateway) ### Layer C — Event & Workflow Engine Treats every agent action as a durable event. Provides replayability, fault tolerance, audit trails. - **Temporal** — Durable execution platform. Fork of Uber's Cadence. License: MIT. [github.com/temporalio/temporal](https://github.com/temporalio/temporal) - **Inngest** — Serverless event-driven workflow engine. License: SSPL + Apache 2.0. [github.com/inngest/inngest](https://github.com/inngest/inngest) - **Apache Kafka** — Distributed event streaming. Append-only commit log. License: Apache 2.0. [github.com/apache/kafka](https://github.com/apache/kafka) ### Layer D — Immutable Ledger / Storage Stores the permanent, tamper-proof record. **Most underdeveloped layer** — no dominant open-source AI token ledger exists. - **PostgreSQL with append-only tables** — Most common pragmatic choice. INSERT-only patterns, triggers to prevent UPDATE/DELETE. - **immudb** — Open-source immutable database. Cryptographic verification via Merkle trees. License: BSL 1.1. [github.com/codenotary/immudb](https://github.com/codenotary/immudb) - **ClickHouse** — Column-oriented OLAP. Used by Langfuse as analytics backend. License: Apache 2.0. - **AWS QLDB** — Discontinued. immudb is the primary open-source alternative. ### Layer E — Policy & Governance - **Open Policy Agent (OPA)** — CNCF Graduated. Policy-as-code via Rego. License: Apache 2.0. [github.com/open-policy-agent/opa](https://github.com/open-policy-agent/opa) --- ## Ecosystem ### When to Use What | Need | Use This | Why | |------|----------|-----| | Track token usage + cost | Langfuse | Purpose-built LLM observability with native OTel | | Centralise LLM calls + budgets | LiteLLM | OpenAI-compatible proxy with spend tracking | | Fast gateway + guardrails | Portkey | Sub-ms latency, 1,600+ models | | Proxy-first observability | Helicone | One-line integration, Rust gateway | | Standardised telemetry | OTel GenAI SemConv | Vendor-neutral LLM span attributes | | Durable workflows | Temporal | Complete execution history | | Serverless workflows | Inngest | No worker fleet, built-in retries | | Tamper-proof storage | immudb | Cryptographic Merkle tree verification | | Cost analytics | ClickHouse | Column-oriented OLAP | | Policy governance | OPA | CNCF Graduated, Rego language | ### Proven Combinations - **LiteLLM + Langfuse** — Most documented integration. Endorsed by both projects. Lemonade runs this in production. - **Helicone + n8n** — Official n8n nodes package. - **OpenTelemetry + Langfuse** — Langfuse v3 natively ingests OTel spans (v1.37+). - **OpenTelemetry + Datadog** — Datadog LLM Observability supports OTel GenAI SemConv natively. --- ## Use Cases - **FinOps for AI** — Track and allocate token spend per team, project, customer. Enforce budget caps. Generate chargeback reports. - **Agent Audit Trails** — Log every autonomous agent decision: tools invoked, reasoning chains, data accessed. Critical for regulated industries. - **Multi-Agent Cost Attribution** — Attribute full cost chains across Agent A → B → C across different models back to originating request. - **Compliance & Regulatory Reporting** — Demonstrate AI systems operated within policies. Immutable ledger + policy engine. - **AI-Powered Product Billing** — Accurate token metering per customer request for AI feature billing. - **Incident Investigation & Replay** — Replay exact call sequences via Temporal/Inngest; trace detail via Langfuse. --- ## Evolution | Period | Milestone | |--------|-----------| | 2017–2019 | OpenTelemetry formed. OPA accepted into CNCF. Temporal forked from Cadence. | | 2020–2022 | GPT-3 launches. Ad-hoc LLM logging. immudb gains financial services adoption. | | 2023 | Langfuse, Helicone launch (YC W23). LiteLLM, Portkey gain traction. Gateway + observability pattern emerges. | | 2024 | OTel GenAI SemConv published (experimental). Agent-level conventions added. AWS QLDB discontinued. | | 2025–2026 | Langfuse v3 adds OTel ingestion. LiteLLM processes 10B+ tokens/day. Portkey handles 400B+ tokens. Five-layer architecture solidifies. Still no unified standard. | --- ## Decision Guide ### Build this stack when - Running autonomous agents consuming real budget unsupervised - Need cost attribution to customers, projects, or business units - Regulated industry requiring auditable agent decisions - Scaling past prototype with multi-provider LLM usage ### Skip this complexity when - Single LLM integration with low, predictable usage - AI spend is immaterial - No autonomous agent decisions — just human-initiated completions - Early prototyping where infrastructure overhead slows you down ### Minimum Viable Stack 1. **LiteLLM** (gateway + metering) 2. **Langfuse** (observability + tracing) 3. **PostgreSQL** (append-only tables) Add Temporal/Inngest, immudb, and OPA as requirements grow. --- ## Ledger Schema — Conceptual Primitive ``` event_id — Unique identifier for this token transaction agent_id — Which agent (or agent chain) made the call human_id — Which human initiated or is responsible workflow_id — Which workflow / session this belongs to tokens_in — Input tokens consumed tokens_out — Output tokens generated cost — Calculated cost in currency model — Which model was called provider — Which provider served the request tools_used — Which tools / functions were invoked timestamp — When it happened (UTC) trace_id — OpenTelemetry trace ID for full context ``` --- ## Resources ### Standards & Specifications - [OTel GenAI Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/) - [OTel GenAI Metrics Spec](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/) - [OTel GenAI Agent Spans](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-agent-spans/) - [OpenLLMetry Semantic Conventions](https://www.traceloop.com/docs/openllmetry/contributing/semantic-conventions) ### Key Documentation - [Langfuse Self-Hosting Guide](https://langfuse.com/self-hosting) - [Langfuse OTel Integration](https://langfuse.com/integrations/native/opentelemetry) - [LiteLLM Proxy Docs](https://docs.litellm.ai/docs/simple_proxy) - [LiteLLM + Langfuse Integration](https://langfuse.com/integrations/gateways/litellm) - [Portkey Open Source Docs](https://portkey.ai/docs/product/open-source) - [immudb Documentation](https://docs.immudb.io/) - [OPA Documentation](https://www.openpolicyagent.org/docs/latest/) - [Temporal Documentation](https://temporal.io/) --- *Content validated March 2026. Independent educational explainer by [Imbila.AI](https://imbila.ai).*