# Lumina

[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![npm version](https://img.shields.io/npm/v/@uselumina/sdk.svg)](https://www.npmjs.com/package/@uselumina/sdk)
[![PyPI version](https://img.shields.io/pypi/v/lumina-sdk.svg)](https://pypi.org/project/lumina-sdk/)
[![CI](https://github.com/use-lumina/Lumina/actions/workflows/ci.yml/badge.svg)](https://github.com/use-lumina/Lumina/actions/workflows/ci.yml)
[![GitHub stars](https://img.shields.io/github/stars/use-lumina/Lumina?style=social)](https://github.com/use-lumina/Lumina/stargazers)

**Production-grade observability for AI applications.**

Lumina is an OpenTelemetry-native platform for monitoring LLM applications in production. Track costs, latency, and quality across distributed AI systems with full trace visibility and regression testing.

Self-hosted. Open source. Zero vendor lock-in.

![Lumina Dashboard](./docs/assets/screenshots/dashboard-home.png)

---

## Why Lumina

AI applications are fundamentally different from traditional software. Token costs accumulate rapidly. Response quality degrades silently. Latency compounds across multi-step workflows. Production incidents require full trace context to debug.

Existing APM tools treat LLM calls as opaque HTTP requests. Lumina provides native observability for AI systems with automatic cost calculation, quality tracking, and hierarchical tracing for complex pipelines like RAG and agents.

Built on OpenTelemetry standards, Lumina integrates into your existing infrastructure without vendor lock-in.

---

## Features

**Cost Tracking**
Automatic cost calculation for OpenAI, Anthropic, and other providers. Track spending per service, model, user, or query.

**Distributed Tracing**
OpenTelemetry-native architecture with hierarchical span support. Visualize complex AI workflows including RAG pipelines, agent loops, and multi-model systems.

**Regression Testing**
Capture production traces and replay them with modified prompts. Compare responses side-by-side with semantic similarity scoring to detect quality regressions before deployment.

**Real-Time Analytics**
Query traces by service, model, tags, cost, latency, or custom metadata. Built on PostgreSQL with efficient indexing for production workloads.

**Smart Alerting**
Configure thresholds for cost spikes, latency degradation, and quality drops. Webhook integration for Slack, PagerDuty, or custom endpoints.

**Production Ready**
PostgreSQL backend with automatic schema management. NATS-based queue for reliable ingestion. Configurable retention policies and rate limits.
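
The threshold alerts described under Smart Alerting can be pictured roughly as follows. This is a minimal sketch, not Lumina's alerting code: the `Threshold` class, the metric names, and the limits are hypothetical, and it only handles "value exceeds limit" rules. The payload shape relies on Slack incoming webhooks accepting a JSON body with a `text` field.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    metric: str   # hypothetical metric name, e.g. "cost_usd" or "latency_ms"
    limit: float  # alert when the observed value exceeds this


def breached(thresholds: list[Threshold], observed: dict[str, float]) -> list[str]:
    """Return one human-readable message per exceeded threshold."""
    messages = []
    for t in thresholds:
        value = observed.get(t.metric)
        if value is not None and value > t.limit:
            messages.append(f"{t.metric} is {value:g}, above the limit of {t.limit:g}")
    return messages


def slack_payload(messages: list[str]) -> dict:
    """Build a JSON body for a Slack incoming webhook."""
    return {"text": "Lumina alert:\n" + "\n".join(f"- {m}" for m in messages)}


thresholds = [Threshold("cost_usd", 50.0), Threshold("latency_ms", 2000.0)]
observed = {"cost_usd": 72.5, "latency_ms": 850.0}
print(slack_payload(breached(thresholds, observed)))
```

Only the cost threshold is exceeded here, so the payload carries a single message. A quality-drop rule would need a "falls below" comparison instead.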

---

## Quick Start

### Docker Compose (Recommended)

Start Lumina with all services in under 60 seconds:

```bash
git clone https://github.com/use-lumina/Lumina.git
cd Lumina/infra/docker
docker compose up -d
```

Access the dashboard at `http://localhost:3000`.

All services run without authentication by default. In production, restrict access with network isolation or an authenticating reverse proxy.

### Manual Setup

For development or custom deployments:

```bash
# 1. Clone and install dependencies
git clone https://github.com/use-lumina/Lumina.git
cd Lumina
bun install

# 2. Initialize database
createdb lumina

# 3. Start services (each command blocks, so run each one in its own
#    terminal from the repository root)
(cd services/ingestion && bun run dev)  # Port 9411
(cd services/api && bun run dev)        # Port 8081
(cd services/replay && bun run dev)     # Port 8082
(cd apps/dashboard && bun run dev)      # Port 3000
```

---

## Instrument Your Application

### TypeScript / JavaScript

```bash
npm install @uselumina/sdk
```

```typescript
import Anthropic from '@anthropic-ai/sdk';
import { initLumina } from '@uselumina/sdk';

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

const lumina = initLumina({
  endpoint: 'http://localhost:9411/v1/traces',
  service_name: 'my-service',
});

const response = await lumina.traceLLM(
  async () =>
    anthropic.messages.create({
      model: 'claude-sonnet-4-5',
      max_tokens: 1024,
      messages: [{ role: 'user', content: 'Hello!' }],
    }),
  {
    name: 'chat-completion',
    system: 'anthropic',
    prompt: 'Hello!',
    metadata: { userId: 'user-123' },
  }
);
```

### Python

```bash
pip install lumina-sdk
```

```python
import os

import anthropic
from lumina import init_lumina

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

lumina = init_lumina({
    "endpoint": "http://localhost:9411/v1/traces",
    "service_name": "my-service",
})

response = lumina.trace_llm(
    lambda: client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello!"}],
    ),
    name="chat-completion",
    system="anthropic",
    prompt="Hello!",
    metadata={"user_id": "user-123"},
)
```

Traces appear in the dashboard as soon as they are ingested, with automatic cost calculation and token tracking.

### Hierarchical Tracing

Track complex workflows with parent-child span relationships:

```typescript
const result = await lumina.trace('rag_pipeline', async (parentSpan) => {
  parentSpan.setAttribute('user_query', query);

  // Child operation 1: Vector retrieval
  const documents = await lumina.trace('vector_search', async () => {
    return await vectorDB.search(query);
  });

  // Child operation 2: LLM synthesis (automatically nested)
  const completion = await lumina.traceLLM(
    () =>
      anthropic.messages.create({
        model: 'claude-sonnet-4-5',
        max_tokens: 1024,
        messages: [{ role: 'user', content: buildPrompt(query, documents) }],
      }),
    { name: 'llm_synthesis', system: 'anthropic' }
  );

  return completion;
});
```

View the complete trace hierarchy in the dashboard with per-span costs and latency breakdowns.

---

## Architecture

```
┌─────────────────────────────────┐
│  Application                    │
│  + @uselumina/sdk (TypeScript)  │
│  + lumina-sdk (Python)          │
└──────────┬──────────────────────┘
           │ OTLP/HTTP
           v
┌──────────────────────────────────────────────┐
│  Lumina Platform                             │
│                                              │
│  ┌──────────┐    ┌──────────┐   ┌────────┐ │
│  │Ingestion │───►│   NATS   │──►│Workers │ │
│  │  :9411   │    │  Queue   │   │(Async) │ │
│  └──────────┘    └──────────┘   └───┬────┘ │
│                                      │      │
│  ┌──────────┐    ┌──────────────────▼────┐ │
│  │ Query    │◄───│    PostgreSQL         │ │
│  │ API      │    │  (Traces + Analytics) │ │
│  │  :8081   │    └───────────────────────┘ │
│  └────┬─────┘                               │
│       │                                     │
│  ┌────▼──────┐   ┌──────────────┐          │
│  │ Dashboard │   │Replay Engine │          │
│  │  :3000    │   │    :8082     │          │
│  └───────────┘   └──────────────┘          │
└──────────────────────────────────────────────┘
```

**Ingestion Service**
Receives OTLP traces over HTTP. Validates schema and publishes to NATS for async processing.

**Worker Pool**
Consumes traces from queue. Calculates costs, extracts metadata, and persists to PostgreSQL.

**Query API**
Provides REST endpoints for trace retrieval, analytics, and filtering. Powers the dashboard and external integrations.

**Replay Engine**
Captures production traces, re-executes with modified parameters, and compares outputs using semantic similarity models.

**Dashboard**
Next.js application for visualization, trace inspection, and replay management.

Full architecture documentation: [docs/guides/ARCHITECTURE.md](./docs/guides/ARCHITECTURE.md)
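
The cost step in the Worker Pool comes down to multiplying token counts by per-token rates. The sketch below illustrates that arithmetic only. The `PRICES_PER_MTOK` table, its placeholder rates, the `example-model` name, and the `llm_cost_usd` helper are all hypothetical; real rates come from each provider's published pricing.

```python
from decimal import Decimal

# Hypothetical price table: USD per one million tokens, keyed by model name.
# The figures are placeholders for illustration, not real provider pricing.
PRICES_PER_MTOK = {
    "example-model": {"input": Decimal("3.00"), "output": Decimal("15.00")},
}

MILLION = Decimal(1_000_000)


def llm_cost_usd(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the cost of one LLM call in USD for the given token counts."""
    rates = PRICES_PER_MTOK[model]  # raises KeyError for unknown models
    input_cost = rates["input"] * input_tokens / MILLION
    output_cost = rates["output"] * output_tokens / MILLION
    return input_cost + output_cost


print(llm_cost_usd("example-model", input_tokens=1_200, output_tokens=350))
```

`Decimal` is used instead of `float` so that many small per-call amounts can be summed without binary rounding drift.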

---

## Self-Hosted Limits

The open-source version includes usage limits to prevent abuse:

- **50,000 traces per day** — Resets at midnight UTC
- **7-day retention** — Automatic cleanup of older traces
- **All features enabled** — No paywalled functionality

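These limits correspond to the `DAILY_TRACE_LIMIT` and `TRACE_RETENTION_DAYS` settings shown under [Configuration](#configuration). To lift them, deploy with custom values, or consider the managed cloud offering.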
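
As a rough illustration of how a daily quota with a midnight UTC reset and a retention cutoff behave, consider the sketch below. It is not Lumina's implementation: the `DailyQuota` class and `retention_cutoff` helper are hypothetical, and only the two default values are taken from this README.

```python
from datetime import datetime, timedelta, timezone

DAILY_TRACE_LIMIT = 50_000
TRACE_RETENTION_DAYS = 7


class DailyQuota:
    """Counts accepted traces per UTC calendar day."""

    def __init__(self, limit: int = DAILY_TRACE_LIMIT) -> None:
        self.limit = limit
        self.day = None  # UTC date the current count belongs to
        self.count = 0

    def try_accept(self, now: datetime) -> bool:
        """Return True and count the trace if today's quota has room."""
        today = now.astimezone(timezone.utc).date()
        if today != self.day:  # first trace after midnight UTC: reset
            self.day, self.count = today, 0
        if self.count >= self.limit:
            return False
        self.count += 1
        return True


def retention_cutoff(now: datetime, days: int = TRACE_RETENTION_DAYS) -> datetime:
    """Traces older than the returned timestamp are eligible for cleanup."""
    return now - timedelta(days=days)


quota = DailyQuota(limit=2)  # tiny limit so the rejection is easy to see
late = datetime(2024, 1, 1, 23, 59, tzinfo=timezone.utc)
print([quota.try_accept(late) for _ in range(3)])     # third call is rejected
print(quota.try_accept(late + timedelta(minutes=2)))  # new UTC day: accepted
```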

---

## Configuration

### Environment Variables

```bash
# Database
DATABASE_URL=postgres://user:password@localhost:5432/lumina

# Service Ports
INGESTION_PORT=9411
QUERY_PORT=8081
REPLAY_PORT=8082

# Retention Policy
TRACE_RETENTION_DAYS=7
DAILY_TRACE_LIMIT=50000

# Replay (optional - only for LLM re-execution)
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
```

### SDK Configuration

```typescript
const lumina = initLumina({
  endpoint: 'http://localhost:9411/v1/traces',
  service_name: 'my-service',

  // Optional: disable in development
  enabled: process.env.NODE_ENV === 'production',

  // Optional: custom metadata
  resource: {
    environment: 'production',
    version: '1.2.0',
  },
});
```

---

## API Reference

### Ingestion (Port 9411)

```bash
POST /v1/traces
```

Accepts OTLP/JSON formatted traces. Compatible with OpenTelemetry SDKs.
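
For reference, a minimal OTLP/JSON request body can be built by hand, as sketched below using only the Python standard library. The envelope (`resourceSpans`, `scopeSpans`, hex-encoded IDs, nanosecond timestamps as strings) follows the OpenTelemetry protocol. The helper names and the `manual-example` scope are hypothetical, and which span attributes Lumina reads for cost and token tracking is not covered here.

```python
import json
import secrets
import time
import urllib.request


def build_otlp_payload(service_name: str, span_name: str,
                       start_ns: int, end_ns: int) -> dict:
    """Build a minimal OTLP/JSON export request containing one span."""
    return {
        "resourceSpans": [{
            "resource": {
                "attributes": [
                    {"key": "service.name", "value": {"stringValue": service_name}},
                ],
            },
            "scopeSpans": [{
                "scope": {"name": "manual-example"},
                "spans": [{
                    "traceId": secrets.token_hex(16),  # 32 hex characters
                    "spanId": secrets.token_hex(8),    # 16 hex characters
                    "name": span_name,
                    "kind": 1,  # SPAN_KIND_INTERNAL
                    "startTimeUnixNano": str(start_ns),
                    "endTimeUnixNano": str(end_ns),
                }],
            }],
        }],
    }


def send_traces(payload: dict,
                endpoint: str = "http://localhost:9411/v1/traces") -> int:
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


now_ns = time.time_ns()
payload = build_otlp_payload("my-service", "chat-completion",
                             start_ns=now_ns - 250_000_000, end_ns=now_ns)
print(json.dumps(payload, indent=2))
# send_traces(payload)  # uncomment with the ingestion service running
```

In practice an OpenTelemetry SDK or the Lumina SDK produces this body for you; building it by hand is mainly useful for smoke-testing the endpoint.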

### Query API (Port 8081)

```bash
GET  /api/traces
GET  /api/traces/:id
GET  /api/analytics/cost
GET  /api/analytics/latency
POST /api/alerts
```

Full API documentation: [docs/api/API_REFERENCE.md](./docs/api/API_REFERENCE.md)
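
A small client for the read endpoints above might look like the following sketch. It assumes the API responds with JSON and does not assume any particular response fields or query parameters. The `trace_url` and `get_json` helpers and the `abc123` trace ID are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

QUERY_API = "http://localhost:8081"  # default Query API port from this README


def trace_url(trace_id: str, base_url: str = QUERY_API) -> str:
    """Build the URL for GET /api/traces/:id, escaping the trace ID."""
    return f"{base_url.rstrip('/')}/api/traces/{quote(trace_id, safe='')}"


def get_json(url: str, timeout: float = 5.0):
    """GET a URL and decode the body, assuming the API responds with JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


print(trace_url("abc123"))  # prints http://localhost:8081/api/traces/abc123
# trace = get_json(trace_url("abc123"))  # needs a running Query API
# costs = get_json(f"{QUERY_API}/api/analytics/cost")
```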

### Replay Engine (Port 8082)

```bash
POST /replay/capture     # Create replay set from trace IDs
POST /replay/run         # Execute replay with new parameters
GET  /replay/:id         # Get replay status
GET  /replay/:id/diff    # Compare original vs replayed outputs
```
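
The diff step compares original and replayed outputs by semantic similarity. This README does not name the model or metric involved, so the sketch below assumes one common approach: cosine similarity between embedding vectors, with a cutoff below which a replay counts as a regression. The toy vectors, the `0.8` threshold, and both function names are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def is_regression(original: Sequence[float], replayed: Sequence[float],
                  threshold: float = 0.8) -> bool:
    """Flag a replay whose output drifted below the similarity threshold."""
    return cosine_similarity(original, replayed) < threshold


# Toy 3-dimensional "embeddings"; real ones come from an embedding model.
original = [0.9, 0.1, 0.0]
close_replay = [0.8, 0.2, 0.0]
drifted_replay = [0.0, 0.1, 0.9]

print(is_regression(original, close_replay))    # similar output: not flagged
print(is_regression(original, drifted_replay))  # very different output: flagged
```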

---

## Use Cases

**Production Monitoring**
Track all LLM calls across microservices. Identify expensive queries, slow endpoints, and quality degradations.

**Cost Optimization**
Analyze spending by model, service, and user. Find opportunities to switch models or optimize prompts.

**Regression Testing**
Test prompt changes against real production queries before deployment. Catch quality regressions with semantic scoring.

**Debugging**
Reproduce production issues with full trace context. View prompt, response, model parameters, and execution timeline.

**Compliance**
Audit all AI interactions with complete logs. Filter by user, timestamp, or custom metadata.

---

## Documentation

- [Quickstart Guide](./docs/guides/quickstart.md) — Get running in 5 minutes
- [Architecture](./docs/guides/architecture.md) — System design and components
- [API Reference](./docs/api/api-reference.md) — Complete REST API documentation
- [Multi-Span Tracing](./docs/guides/multi-span-tracing.md) — Hierarchical trace implementation
- [Troubleshooting](./docs/guides/troubleshooting.md) — Common issues and solutions

---

## Examples

Full example applications in [`/examples`](./examples):

**TypeScript**

- **[openai-basic](./examples/typescript/openai-basic)** — Basic GPT integration
- **[anthropic-basic](./examples/typescript/anthropic-basic)** — Basic Claude integration
- **[nextjs-rag](./examples/typescript/nextjs-rag)** — Next.js RAG application with hierarchical tracing

**Python**

- **[openai-basic](./examples/python/openai-basic)** — Basic GPT integration
- **[anthropic-basic](./examples/python/anthropic-basic)** — Basic Claude integration

---

## Deployment

### Docker

Use the provided Docker Compose configuration for production deployments:

```bash
cd infra/docker
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
```

### Kubernetes

Helm charts are on the roadmap. Until they ship, see [infra/k8s](./infra/k8s) for manifests.

### Managed Cloud

For teams preferring a fully managed solution with enterprise features, SSO, and SLA guarantees, visit [uselumina.io](https://uselumina.io).

---

## Requirements

- **Runtime:** Bun 1.0+ or Node.js 20+
- **Database:** PostgreSQL 14+
- **Queue:** NATS 2.9+
- **Memory:** 2GB minimum, 4GB recommended

---

## Roadmap

- [ ] Helm charts for Kubernetes deployment
- [ ] Support for additional LLM providers (Cohere, Replicate, Together)
- [ ] Custom alert rules with complex thresholds
- [ ] Trace sampling for high-volume workloads
- [ ] Multi-tenancy support
- [ ] Prometheus metrics export

See [GitHub Issues](https://github.com/use-lumina/Lumina/issues) for detailed roadmap and feature requests.

---

## Contributing

Contributions are welcome from developers of all experience levels.

**Getting Started**

1. Read the [Contributing Guide](./CONTRIBUTING.md)
2. Browse [Good First Issues](https://github.com/use-lumina/Lumina/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22)
3. Join the discussion in [GitHub Discussions](https://github.com/use-lumina/Lumina/discussions)

**Development Setup**

```bash
git clone https://github.com/use-lumina/Lumina.git
cd Lumina
bun install
createdb lumina

# Run tests
bun test

# Start services in dev mode
bun run dev
```

---

## License

Apache 2.0 — See [LICENSE](./LICENSE) for details.

Lumina is free and open-source software. Use it for any purpose, including commercial projects, subject to the terms of the Apache 2.0 license.

---

## Support

- **Documentation:** [docs.uselumina.io](http://docs.uselumina.io/)
- **Bug Reports:** [GitHub Issues](https://github.com/use-lumina/Lumina/issues)
- **Discussions:** [GitHub Discussions](https://github.com/use-lumina/Lumina/discussions)
- **Commercial Support:** Contact us at [uselumina.io](https://uselumina.io)

---

Built by the open-source community. [Star us on GitHub](https://github.com/use-lumina/Lumina) if Lumina helps your team.