# Embedding API Reference ## Overview AxonHub provides comprehensive support for text and multimodal embedding generation through OpenAI-compatible and Jina AI-specific APIs. ## Key Benefits - **OpenAI Compatibility**: Use existing OpenAI SDKs without modification - **Jina AI Support**: Native Jina embedding format support for specialized use cases - **Multiple Input Types**: Support for single text, text arrays, token arrays, and multiple token arrays - **Flexible Output Formats**: Choose between float arrays or base64-encoded embeddings ## Supported Endpoints **Endpoints:** - `POST /v1/embeddings` - OpenAI-compatible embedding API - `POST /jina/v1/embeddings` - Jina AI-specific embedding API ## Request Format ```json { "input": "The text to embed", "model": "text-embedding-3-small", "encoding_format": "float", "dimensions": 1536, "user": "user-id" } ``` **Parameters:** | Parameter | Type | Required | Description | |-----------|------|----------|-------------| | `input` | string \| string[] \| number[] \| number[][] | ✅ | The text(s) to embed. Can be a single string, array of strings, token array, or multiple token arrays. | | `model` | string | ✅ | The model to use for embedding generation. | | `encoding_format` | string | ❌ | Format to return embeddings in. Either `float` or `base64`. Default: `float`. | | `dimensions` | integer | ❌ | Number of dimensions for the output embeddings. | | `user` | string | ❌ | Unique identifier for the end-user. | **Jina-Specific Parameters:** | Parameter | Type | Required | Description | |-----------|------|----------|-------------| | `task` | string | ❌ | Task type for Jina embeddings. Options: `text-matching`, `retrieval.query`, `retrieval.passage`, `separation`, `classification`, `none`. | ## Response Format ```json { "object": "list", "data": [ { "object": "embedding", "embedding": [0.123, 0.456, ...], "index": 0 } ], "model": "text-embedding-3-small", "usage": { "prompt_tokens": 4, "total_tokens": 4 } } ``` ## Examples ### OpenAI SDK (Python) ```python import openai client = openai.OpenAI( api_key="your-axonhub-api-key", base_url="http://localhost:8090/v1" ) response = client.embeddings.create( input="Hello, world!", model="text-embedding-3-small" ) print(response.data[0].embedding[:5]) # First 5 dimensions ``` ### OpenAI SDK (Go) ```go package main import ( "context" "fmt" "log" "github.com/openai/openai-go" "github.com/openai/openai-go/option" ) func main() { client := openai.NewClient( option.WithAPIKey("your-axonhub-api-key"), option.WithBaseURL("http://localhost:8090/v1"), ) embedding, err := client.Embeddings.New(context.TODO(), openai.EmbeddingNewParams{ Input: openai.Union[string](openai.String("Hello, world!")), Model: openai.String("text-embedding-3-small"), option.WithHeader("AH-Trace-Id", "trace-example-123"), option.WithHeader("AH-Thread-Id", "thread-example-abc"), }) if err != nil { log.Fatal(err) } fmt.Printf("Embedding dimensions: %d\n", len(embedding.Data[0].Embedding)) fmt.Printf("First 5 values: %v\n", embedding.Data[0].Embedding[:5]) } ``` ### Multiple Texts ```python response = client.embeddings.create( input=["Hello, world!", "How are you?"], model="text-embedding-3-small" ) for i, data in enumerate(response.data): print(f"Text {i}: {data.embedding[:3]}...") ``` ### Jina-Specific Task ```python import requests response = requests.post( "http://localhost:8090/jina/v1/embeddings", headers={ "Authorization": "Bearer your-axonhub-api-key", "Content-Type": "application/json" }, json={ "input": "What is machine learning?", "model": "jina-embeddings-v2-base-en", "task": "retrieval.query" } ) result = response.json() print(result["data"][0]["embedding"][:5]) ``` ## Authentication The Embedding API uses Bearer token authentication: - **Header**: `Authorization: Bearer ` The API keys are managed through AxonHub's API Key management system. ## Best Practices 1. **Use Tracing Headers**: Include `AH-Trace-Id` and `AH-Thread-Id` headers for better observability 2. **Batch Requests**: When embedding multiple texts, send them in a single request for better performance 3. **Choose Appropriate Dimensions**: Use the `dimensions` parameter to reduce embedding size if full dimensionality isn't needed 4. **Select Proper Encoding**: Use `base64` encoding if you need to transmit embeddings over the network to reduce payload size 5. **Jina Task Types**: When using Jina embeddings, select the appropriate `task` type for your use case to optimize retrieval quality ## Related Resources - [OpenAI API](openai-api.md) - [Rerank API](rerank-api.md)