---
name: langfuse-observability
description: Instrument LLM applications with Langfuse tracing. Use when setting up Langfuse, adding observability to LLM calls, or auditing existing instrumentation.
---

# Langfuse Observability

Instrument LLM applications with Langfuse tracing, following best practices and tailored to your use case.

## When to Use

- Setting up Langfuse in a new project
- Auditing existing Langfuse instrumentation
- Adding observability to LLM calls

## Workflow

### 1. Assess Current State

Check the project:

- Is Langfuse SDK installed?
- What LLM frameworks are used? (OpenAI SDK, LangChain, LlamaIndex, Vercel AI SDK, etc.)
- Is there existing instrumentation?

**No integration yet:** Set up Langfuse using a framework integration if available. Integrations capture more context automatically and require less code than manual instrumentation.

**Integration exists:** Audit against baseline requirements below.

### 2. Verify Baseline Requirements

Every trace should have these fundamentals:

| Requirement | Check | Why |
|-------------|-------|-----|
| Model name | Is the LLM model captured? | Enables model comparison and filtering |
| Token usage | Are input/output tokens tracked? | Enables automatic cost calculation |
| Good trace names | Are names descriptive? (`chat-response`, not `trace-1`) | Makes traces findable and filterable |
| Span hierarchy | Are multi-step operations nested properly? | Shows which step is slow or failing |
| Correct observation types | Are generations marked as generations? | Enables model-specific analytics |
| Sensitive data masked | Is PII/confidential data excluded or masked? | Prevents data leakage |
| Trace input/output | Does the trace capture the full data being processed as input, and the result as output? | Enables debugging and understanding what was processed |

Framework integrations (OpenAI, LangChain, etc.) handle model name, tokens, and observation types automatically.
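
Several rows of the baseline table can be approximated in code when auditing traces in bulk. The sketch below is illustrative only. It assumes each trace is available as a plain dictionary with hypothetical keys (`name`, `model`, `usage`, `input`, `output`), which is not necessarily the shape Langfuse returns, and it covers just four of the requirements. The `audit_trace` helper and the `GENERIC_NAME` pattern are made up for this example.

```python
import re
from typing import Any

# Hypothetical heuristic: names such as "trace-1", "span 42", or "" carry no meaning.
GENERIC_NAME = re.compile(r"^(trace|span|generation|test)?[-_ ]?\d*$", re.IGNORECASE)


def audit_trace(trace: dict[str, Any]) -> list[str]:
    """Return the baseline requirements that this trace dictionary fails."""
    problems = []
    # Model name: needed for model comparison and filtering.
    if not trace.get("model"):
        problems.append("model name")
    # Token usage: both input and output counts are needed for cost calculation.
    usage = trace.get("usage") or {}
    if "input_tokens" not in usage or "output_tokens" not in usage:
        problems.append("token usage")
    # Trace name: flag empty or generic names.
    if GENERIC_NAME.match(trace.get("name") or ""):
        problems.append("good trace name")
    # Trace input/output: both sides should be recorded.
    if trace.get("input") is None or trace.get("output") is None:
        problems.append("trace input/output")
    return problems
```

Calling `audit_trace({"name": "trace-1", "input": "hi"})` reports all four problems, while a trace with a descriptive name, a model, token counts, and input/output returns an empty list. Fields that an integration fills in automatically, such as the model name and token usage, should rarely show up in the result.
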
Prefer integrations over manual instrumentation.

Docs: https://langfuse.com/docs/tracing

### 3. Explore Traces First

Once baseline instrumentation is working, encourage the user to explore their traces in the Langfuse UI before adding more context:

"Your traces are now appearing in Langfuse. Take a look at a few of them and see what data is being captured, what's useful, and what's missing. This will help us decide what additional context to add."

This helps the user:

- Understand what they're already getting
- Form opinions about what's missing
- Ask better questions about what they need

### 4. Discover Additional Context Needs

Determine what additional instrumentation would be valuable. **Infer from code when possible; ask only when unclear.**

**Infer from code:**

| If you see in code... | Infer | Suggest |
|-----------------------|-------|---------|
| Conversation history, chat endpoints, message arrays | Multi-turn app | `session_id` |
| User authentication, `user_id` variables | User-aware app | `user_id` on traces |
| Multiple distinct endpoints/features | Multi-feature app | `feature` tag |
| Customer/tenant identifiers | Multi-tenant app | `customer_id` or tier tag |
| Feedback collection, ratings | Has user feedback | Capture as scores |

**Only ask when not obvious from code:**

- "How do you know when a response is good vs bad?" → Determines scoring approach
- "What would you want to filter by in a dashboard?" → Surfaces non-obvious tags
- "Are there different user segments you'd want to compare?" → Customer tiers, plans, etc.

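
Once the signals above have been identified, it can help to gather the resulting attributes in one place before attaching them to a trace. The following is a minimal sketch, not a Langfuse API: `build_trace_context` is a hypothetical helper, and the `key:value` tag naming is an illustrative convention rather than something Langfuse requires. How the resulting values are attached depends on the SDK version and integration in use, so that step is left out.

```python
from typing import Optional


def build_trace_context(
    session_id: Optional[str] = None,
    user_id: Optional[str] = None,
    feature: Optional[str] = None,
    customer_tier: Optional[str] = None,
) -> dict:
    """Collect only the attributes the application can actually supply."""
    context: dict = {}
    if session_id:
        context["session_id"] = session_id  # multi-turn apps
    if user_id:
        context["user_id"] = user_id  # user-aware apps
    # Tags are plain strings; the "key:value" format is an assumption of this sketch.
    tags = []
    if feature:
        tags.append(f"feature:{feature}")
    if customer_tier:
        tags.append(f"customer_tier:{customer_tier}")
    if tags:
        context["tags"] = tags
    return context
```

Leaving out attributes the application cannot supply keeps traces free of placeholder values, which matches the guidance to add only what is relevant.
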
**Additions and their value:**

| Addition | Why | Docs |
|----------|-----|------|
| `session_id` | Groups conversations together | https://langfuse.com/docs/tracing-features/sessions |
| `user_id` | Enables user filtering and cost attribution | https://langfuse.com/docs/tracing-features/users |
| User feedback score | Enables quality filtering and trends | https://langfuse.com/docs/scores/overview |
| `feature` tag | Per-feature analytics | https://langfuse.com/docs/tracing-features/tags |
| `customer_tier` tag | Cost/quality breakdown by segment | https://langfuse.com/docs/tracing-features/tags |

These are NOT baseline requirements. Only add what's relevant based on inference or user input.

### 5. Guide to UI

After adding context, point users to relevant UI features:

- Traces view: See individual requests
- Sessions view: See grouped conversations (if `session_id` was added)
- Dashboard: Build filtered views using tags
- Scores: Filter by quality metrics

## Framework Integrations

Prefer these over manual instrumentation:

| Framework | Integration | Docs |
|-----------|-------------|------|
| OpenAI SDK | Drop-in replacement | https://langfuse.com/docs/integrations/openai |
| LangChain | Callback handler | https://langfuse.com/docs/integrations/langchain |
| LlamaIndex | Callback handler | https://langfuse.com/docs/integrations/llama-index |
| Vercel AI SDK | OpenTelemetry exporter | https://langfuse.com/docs/integrations/vercel-ai-sdk |
| LiteLLM | Callback or proxy | https://langfuse.com/docs/integrations/litellm |

Full list: https://langfuse.com/docs/integrations

## Always Explain Why

When suggesting additions, explain the user benefit:

```
"I recommend adding session_id to your traces.

Why: This groups messages from the same conversation together. You'll be able to see full conversation flows in the Sessions view, making it much easier to debug multi-turn interactions.

Learn more: https://langfuse.com/docs/tracing-features/sessions"
```

## Common Mistakes

| Mistake | Problem | Fix |
|---------|---------|-----|
| No `flush()` in scripts | Buffered traces may be lost when the script exits | Call `langfuse.flush()` before exit |
| Flat traces | Can't see which step failed | Use nested spans for distinct steps |
| Generic trace names | Hard to filter | Use descriptive names: `chat-response`, `doc-summary` |
| Logging sensitive data | Data leakage risk | Mask PII before tracing |
| Manual instrumentation when integration exists | More code, less context | Use framework integration |
| Langfuse import before env vars loaded | Langfuse initializes with missing/wrong credentials | Import Langfuse AFTER loading environment variables (e.g., after `load_dotenv()`) |
| Wrong import order with OpenAI | Langfuse can't patch the OpenAI client | Import Langfuse and call its setup BEFORE importing the OpenAI client |
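
For the "Logging sensitive data" row, masking can be done with a small function applied to inputs and outputs before they are traced. The sketch below is a minimal, standard-library example that only handles email addresses. The `mask_pii` name and the `[EMAIL]` placeholder are illustrative choices. The Langfuse SDK documentation describes a client-level masking option that accepts a function of roughly this shape, but treat that wiring as an assumption and confirm it against the docs for your SDK version.

```python
import re
from typing import Any

# Deliberately simple pattern; real PII detection usually needs more than one regex.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def mask_pii(data: Any, **kwargs: Any) -> Any:
    """Recursively replace email addresses in strings, dicts, and lists.

    The **kwargs catch-all is an assumption: it only exists so the function
    tolerates extra keyword arguments if it is registered as a masking hook.
    """
    if isinstance(data, str):
        return EMAIL.sub("[EMAIL]", data)
    if isinstance(data, dict):
        return {key: mask_pii(value) for key, value in data.items()}
    if isinstance(data, (list, tuple)):
        return [mask_pii(item) for item in data]
    return data  # numbers, booleans, and None pass through unchanged
```

Because the function walks nested structures, it can be applied to a whole message array (for example `mask_pii({"messages": [...]})`) rather than to each string individually.
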