# Tracing Guide

---

### Overview

AxonHub captures every inbound request in a thread-aware trace without forcing you to adopt a new SDK. If your client already speaks the OpenAI-compatible protocol, you can opt into observability simply by forwarding trace and thread headers, or rely on AxonHub to create them for you automatically.

Key benefits of tracing:

- **Observability**: Gain clear visibility into every user message and all associated agent requests.
- **Performance optimization**: AxonHub prioritizes routing requests within the same trace to the same upstream channel. This significantly improves provider-side cache hit rates (e.g., Anthropic's Prompt Caching), reducing latency and lowering costs.
- **Efficient debugging**: Reconstruct the full conversation context using thread IDs to quickly pinpoint issues in multi-turn interactions.

### Key Concepts

- **Thread ID (`AH-Thread-Id`)** – Represents a complete user conversation session. Links multiple traces together so you can follow the entire user journey across multiple messages.
- **Trace ID (`AH-Trace-Id`)** – Represents a single user message and all the agent requests it triggers. AxonHub can auto-generate an ID per request, so you must provide this header yourself whenever multiple requests should be linked under one trace; otherwise each request is recorded separately.
- **Request** – The smallest unit: a single API call, containing the complete request/response data, latency, token usage, and other details.
- **Extra trace headers** – Configure fallbacks (e.g. `Sentry-Trace`) to reuse existing observability tooling.
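The concepts above can be sketched as plain Go types plus an ID-minting helper. Everything here is illustrative: the `Thread`/`Trace`/`Request` struct names are not AxonHub's internal schema, and `newID` is just one way for a client to mint header values (any unique string works).

```go
package main

import (
    "crypto/rand"
    "encoding/hex"
    "fmt"
)

// newID mints a random hex identifier suitable for use as an
// AH-Trace-Id or AH-Thread-Id header value.
func newID(prefix string) string {
    b := make([]byte, 8)
    rand.Read(b)
    return prefix + "-" + hex.EncodeToString(b)
}

// Illustrative model of the hierarchy, not AxonHub internals.
type Request struct{ Model string }

type Trace struct {
    ID       string
    Requests []Request // all agent calls triggered by one user message
}

type Thread struct {
    ID     string
    Traces []Trace // one trace per user message
}

func main() {
    thread := Thread{ID: newID("thread")}
    // First user message: the agent makes three API calls under one trace.
    thread.Traces = append(thread.Traces, Trace{
        ID:       newID("trace"),
        Requests: []Request{{"gpt-4o"}, {"gpt-4o"}, {"gpt-4o"}},
    })
    fmt.Printf("thread %s: %d trace(s), %d request(s) in trace 1\n",
        thread.ID, len(thread.Traces), len(thread.Traces[0].Requests))
}
```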
### Thread, Trace, and Request Relationship

```
Thread (complete user conversation session)
├── Trace 1 (user message 1 + all agent requests)
│   ├── Request 1 (agent call 1)
│   ├── Request 2 (agent call 2)
│   └── Request 3 (agent call 3)
└── Trace 2 (user message 2 + all agent requests)
    ├── Request 4 (agent call 4)
    └── Request 5 (agent call 5)
```

- **Thread**: Represents a complete user conversation session, containing multiple user messages (each message corresponds to a trace)
- **Trace**: Represents a single user message and all the agent requests it triggers during processing
- **Request**: Represents a single API call to an LLM or other service, containing detailed information such as request body, response body, and token usage

**Hierarchy**:

- 1 Thread can contain multiple Traces (one trace per user message)
- 1 Trace can contain multiple Requests (all agent calls triggered by that message)
- 1 Request belongs to exactly 1 Trace
- 1 Trace belongs to at most 1 Thread (the association is optional)

**Practical Use Cases**:

- **Single message with agent**: 1 Thread → 1 Trace → N Requests (the user sends one message, the agent makes multiple API calls)
- **Multi-turn conversation**: 1 Thread → multiple Traces (one trace per user message) → N Requests per trace
- **Independent request**: no Thread → 1 Trace → 1 Request (a single API call without conversation context)

### Configuration

```yaml
# config.yml
trace:
  thread_header: "AH-Thread-Id"
  trace_header: "AH-Trace-Id"
  extra_trace_headers:
    - "Sentry-Trace"
```

- Set `extra_trace_headers` to reuse existing instrumentation headers.
- Leave the header fields empty to fall back to the defaults shown above.
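The configured header names can be attached from any HTTP client, not just the SDKs shown later. A minimal stdlib sketch that builds (but does not send) a traced request; the endpoint URL and ID values are placeholders:

```go
package main

import (
    "fmt"
    "net/http"
    "strings"
)

// newTracedRequest builds a chat completion request carrying AxonHub's
// default trace headers. The header names must match the thread_header
// and trace_header values in config.yml.
func newTracedRequest(traceID, threadID string) (*http.Request, error) {
    body := strings.NewReader(`{"model":"gpt-4o","messages":[{"role":"user","content":"ping"}]}`)
    req, err := http.NewRequest(http.MethodPost,
        "https://your-axonhub-instance/v1/chat/completions", body)
    if err != nil {
        return nil, err
    }
    req.Header.Set("Content-Type", "application/json")
    req.Header.Set("AH-Trace-Id", traceID)
    req.Header.Set("AH-Thread-Id", threadID)
    // A header listed under extra_trace_headers (e.g. Sentry-Trace) is
    // only consulted as a fallback when AH-Trace-Id is absent.
    return req, nil
}

func main() {
    req, err := newTracedRequest("trace-example-123", "thread-example-abc")
    if err != nil {
        panic(err)
    }
    fmt.Println(req.Header.Get("AH-Trace-Id"), req.Header.Get("AH-Thread-Id"))
}
```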
### Using Tracing with OpenAI-Compatible Clients

```bash
curl https://your-axonhub-instance/v1/chat/completions \
  -H "Authorization: Bearer ${AXONHUB_API_KEY}" \
  -H "Content-Type: application/json" \
  -H "AH-Trace-Id: at-demo-123" \
  -H "AH-Thread-Id: thread-abc" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      { "role": "user", "content": "Diagnose latency in my pipeline" }
    ]
  }'
```

- Provide `AH-Trace-Id` when you need sequential requests to appear under the same trace; without it, AxonHub auto-generates an ID for each standalone call and logs the requests independently.
- All standard OpenAI SDKs work out of the box; no code changes are needed beyond optional header injection.

### SDK Examples

Looking for complete runnable samples? See `integration_test/openai/trace_multiple_requests/trace_test.go` and `integration_test/anthropic/trace_multiple_requests/trace_test.go`. The snippets below extract the essentials for production code.

#### OpenAI Go SDK

```go
package traces

import (
    "context"

    "github.com/openai/openai-go/v3"
    "github.com/openai/openai-go/v3/option"
)

func sendTracedChat(ctx context.Context, apiKey string) (*openai.ChatCompletion, error) {
    client := openai.NewClient(
        option.WithAPIKey(apiKey),
        option.WithBaseURL("https://your-axonhub-instance/v1"),
    )

    params := openai.ChatCompletionNewParams{
        Model: openai.ChatModel("gpt-4o"),
        Messages: []openai.ChatCompletionMessageParamUnion{
            openai.UserMessage("Diagnose latency in my pipeline"),
        },
    }

    // Pass trace and thread headers at request level
    return client.Chat.Completions.New(ctx, params,
        option.WithHeader("AH-Trace-Id", "trace-example-123"),
        option.WithHeader("AH-Thread-Id", "thread-example-abc"),
    )
}
```

#### Anthropic Go SDK

```go
package traces

import (
    "context"

    anthropic "github.com/anthropics/anthropic-sdk-go"
    "github.com/anthropics/anthropic-sdk-go/option"
)

func sendTracedMessage(ctx context.Context, apiKey string) (*anthropic.Message, error) {
    client := anthropic.NewClient(
        option.WithAPIKey(apiKey),
        option.WithBaseURL("https://your-axonhub-instance/anthropic"),
    )

    params := anthropic.MessageNewParams{
        Model: anthropic.Model("claude-3-5-sonnet"),
        Messages: []anthropic.MessageParam{
            anthropic.NewUserMessage(
                anthropic.NewTextBlock("Diagnose latency in my pipeline"),
            ),
        },
    }

    // Pass trace and thread headers at request level
    return client.Messages.New(ctx, params,
        option.WithHeader("AH-Trace-Id", "trace-example-123"),
        option.WithHeader("AH-Thread-Id", "thread-example-abc"),
    )
}
```

### Data Storage for Trace Payloads

- Decide whether to keep full request/response bodies by adjusting your storage policy. Disable it if you only need metadata.
- Configure a default storage location in the admin console; AxonHub falls back to the primary storage if the preferred option is unavailable.
- Large payloads can live in external storage (local disk, S3, or GCS) so traces stay responsive even when responses are big.

### Claude Code Trace Support

- Turn on Claude Code extraction with `server.trace.claude_code_trace_enabled: true` so AxonHub can pick up trace IDs automatically.
- The `/anthropic/v1/messages` (and `/v1/messages`) endpoints reuse the Claude Code `metadata.user_id` as the trace ID while keeping your payload untouched for downstream use.
- If you already send a trace header, AxonHub keeps your value; manual instrumentation and auto-extraction work together.

### Codex Trace Support

- Turn on Codex extraction with `server.trace.codex_trace_enabled: true` so AxonHub can reuse the `Session_id` header as the trace ID.
- If you already send a trace header, AxonHub keeps your value; manual instrumentation and auto-extraction work together.

### Exploring Traces in the Console

1. Navigate to **Traces** in the AxonHub admin console.
2. Filter by project, model, or time range to locate the trace of interest.
3. Expand a trace to inspect spans, prompt/response payloads, timing, and channel metadata.
4. Jump to the linked thread to review the overall conversation timeline alongside trace details.
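For reference, the Claude Code extraction described earlier operates on the request body rather than a header. The sketch below illustrates the kind of lookup involved, assuming a standard Anthropic Messages payload; the function name and struct layout are illustrative, not AxonHub internals:

```go
package main

import (
    "encoding/json"
    "fmt"
)

// extractClaudeCodeTraceID pulls metadata.user_id out of an Anthropic
// Messages request body, mirroring the behavior enabled by
// claude_code_trace_enabled. Returns "" when the field is absent.
func extractClaudeCodeTraceID(body []byte) string {
    var payload struct {
        Metadata struct {
            UserID string `json:"user_id"`
        } `json:"metadata"`
    }
    if err := json.Unmarshal(body, &payload); err != nil {
        return ""
    }
    return payload.Metadata.UserID
}

func main() {
    body := []byte(`{
        "model": "claude-3-5-sonnet",
        "metadata": {"user_id": "session-3f2a"},
        "messages": [{"role": "user", "content": "hello"}]
    }`)
    fmt.Println(extractClaudeCodeTraceID(body)) // prints "session-3f2a"
}
```

Because a manually supplied `AH-Trace-Id` header takes precedence, this extraction only fills the gap when no header is present.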
#### Trace Details

The trace details page displays the request timeline, token usage, and cache hit status.
### Troubleshooting

- **No trace recorded** – Ensure the request is authenticated and the project ID is resolved (the API key must belong to a project).
- **Missing thread linkage** – Provide `AH-Thread-Id` or create threads via the API before sending requests.
- **Unexpected trace IDs** – Check for upstream reverse proxies overriding or stripping the headers.