--- name: build-rag-with-telnyx-inference title: "Build RAG with Telnyx Inference" description: "Build a retrieval-augmented generation API with Telnyx embeddings and chat completions." language: python framework: flask telnyx_products: [AI Inference] --- # Build RAG with Telnyx Inference Build a retrieval-augmented generation API with Telnyx embeddings and chat completions. ## Telnyx API Endpoints Used - **Embeddings**: `POST /v2/ai/embeddings` - create vectors for documents and questions - **AI Inference**: `POST /v2/ai/chat/completions` - [API reference](https://developers.telnyx.com/api/inference/chat-completions) ## Architecture ``` User question | v Embed question with Telnyx | v Compare against document embeddings | v Send retrieved context to Telnyx AI | v Grounded answer + source titles ``` ## Environment Variables Copy `.env.example` to `.env` and fill in: | Variable | Type | Example | Required | Description | Where to get it | |----------|------|---------|----------|-------------|-----------------| | `TELNYX_API_KEY` | `string` | `KEY0123456789ABCDEF` | **yes** | Telnyx API v2 key | [Portal](https://portal.telnyx.com/api-keys) | | `AI_MODEL` | `string` | `moonshotai/Kimi-K2.6` | no | Telnyx chat model | [Models](https://developers.telnyx.com/docs/inference/models) | | `EMBEDDING_MODEL` | `string` | `thenlper/gte-large` | no | Telnyx embedding model | [Models](https://developers.telnyx.com/docs/inference/models) | | `HOST` | `string` | `127.0.0.1` | no | Local server host | - | | `PORT` | `integer` | `5000` | no | Local server port | - | ## Setup ```bash git clone https://github.com/team-telnyx/telnyx-code-examples.git cd telnyx-code-examples/build-rag-with-telnyx-inference-python cp .env.example .env pip install -r requirements.txt python app.py ``` ## API Reference ### `POST /rag/ask` Ask a question. The app retrieves relevant in-memory support docs and answers using only that context. ```bash curl -X POST http://localhost:5000/rag/ask \ -H "Content-Type: application/json" \ -d '{ "question": "Production signup broke after rotating an API key. Logs show 401 errors. What should we check first?" }' ``` **Response:** ```json { "answer": "Check that production services are using the new active API key and that the key has the required permissions. Also verify no old key is cached in deployment secrets.", "model": "moonshotai/Kimi-K2.6", "embedding_model": "thenlper/gte-large", "sources": [ {"title": "API Key Authentication", "score": 0.9123}, {"title": "Verification Message Delivery", "score": 0.7811} ] } ``` ### `GET /documents` Returns the sample knowledge base. ### `GET /health` Returns service status, configured models, and document count. ## Troubleshooting | Issue | Cause | Fix | |-------|-------|-----| | `401 Unauthorized` | Invalid or missing Telnyx API key | Verify `TELNYX_API_KEY` in `.env` | | Slow first request | The app creates document embeddings lazily | First request may take longer; later requests reuse embeddings in memory | | Weak answers | Sample knowledge base is too small | Add more documents or replace `DOCUMENTS` with your own content | ## Related Examples - [Run LLM Inference (Python)](https://raw.githubusercontent.com/team-telnyx/telnyx-code-examples/main/run-llm-inference-python/README.md) - [Extract Structured JSON with AI (Python)](https://raw.githubusercontent.com/team-telnyx/telnyx-code-examples/main/extract-structured-json-with-ai-python/README.md) - [AI Assistant Knowledge Base (Python)](https://raw.githubusercontent.com/team-telnyx/telnyx-code-examples/main/ai-assistant-knowledge-base-python/README.md) ## Resources - [AI Inference Guide](https://developers.telnyx.com/docs/inference) - [Telnyx Developer Docs](https://developers.telnyx.com) - [Telnyx Portal](https://portal.telnyx.com) ## Why Telnyx Telnyx is an **AI Communications Infrastructure** platform - voice, messaging, SIP, AI, and IoT on one private, global network.