# Model Management Guide AxonHub provides a flexible model management system that supports mapping abstract models to specific channels and model implementations through Model Associations, enabling a unified model interface and intelligent channel selection. ## 🎯 Core Concepts ### Model A Model in AxonHub is an abstract model definition containing: - **ModelID** - Unique model identifier (e.g., `gpt-4`, `claude-3-opus`) - **Developer** - Model developer (e.g., `openai`, `anthropic`) - **Settings** - Model settings, including association rules - **ModelCard** - Model metadata (capabilities, costs, limits, etc.) ### Channel A Channel is a specific AI service provider connection containing: - List of supported models - API configuration and authentication - Channel weight and tags ### Model Association Model Associations define the mapping relationships between abstract models and channels, supporting multiple matching strategies. ## 🔗 Model-Channel Relationship ### Relationship Diagram ``` ┌─────────────────┐ │ AxonHub Model │ (Abstract Model) │ ID: gpt-4 │ └────────┬────────┘ │ │ Associations (Association Rules) │ ▼ ┌─────────────────────────────────────────────────┐ │ ModelChannelConnection │ │ ┌──────────────┐ ┌──────────────┐ │ │ │ Channel A │ │ Channel B │ │ │ │ gpt-4-turbo │ │ gpt-4 │ │ │ │ Priority: 0 │ │ Priority: 1 │ │ │ └──────────────┘ └──────────────┘ │ └─────────────────────────────────────────────────┘ │ │ Load Balancer │ ▼ ┌─────────────────┐ │ Selected │ (Selected Candidate) │ Candidate │ └─────────────────┘ ``` ### Request Flow 1. **Request Arrives** - Client requests to use model `gpt-4` 2. **Model Query** - System looks up the Model entity with ModelID `gpt-4` 3. **Association Resolution** - Resolves matching channels and models based on model's Associations 4. **Candidate Generation** - Generates `ChannelModelCandidate` list 5. **Load Balancing** - Applies load balancing strategies to sort candidates 6. **Request Execution** - Executes request using the first candidate 7. **Failure Retry** - Switches to next candidate on failure ## 📋 Model Association Types ### 1. channel_model - Specific Channel, Specific Model Precisely matches a specific model in a specific channel. ```json { "type": "channel_model", "priority": 0, "channelModel": { "channelId": 1, "modelId": "gpt-4-turbo" } } ``` **Use Cases**: - Need precise control over which channel a model uses - Primary channel with highest priority ### 2. channel_regex - Specific Channel, Regex Match Matches all models matching the regex pattern in a specific channel. ```json { "type": "channel_regex", "priority": 1, "channelRegex": { "channelId": 2, "pattern": "gpt-4.*" } } ``` **Use Cases**: - Channel supports multiple model variants - Need flexible model name matching ### 3. regex - All Channels, Regex Match Matches models matching the regex pattern across all enabled channels. ```json { "type": "regex", "priority": 2, "regex": { "pattern": "gpt-4.*", "exclude": [ { "channelNamePattern": ".*test.*", "channelIds": [5, 6], "channelTags": ["beta"] } ] } } ``` **Exclusion Rules**: - `channelNamePattern` - Exclude channels with matching names - `channelIds` - Exclude channels with specific IDs - `channelTags` - Exclude channels with specific tags **Use Cases**: - Broadly match models across multiple channels - Need to exclude specific channels ### 4. model - All Channels, Specific Model Matches a specific model across all enabled channels. ```json { "type": "model", "priority": 3, "modelId": { "modelId": "gpt-4", "exclude": [ { "channelNamePattern": ".*backup.*", "channelIds": [10], "channelTags": ["low-priority"] } ] } } ``` **Use Cases**: - All channels supporting the model can be candidates - Need to exclude specific channels ### 5. channel_tags_model - Tagged Channels, Specific Model Matches a specific model in channels with specified tags (OR logic). ```json { "type": "channel_tags_model", "priority": 4, "channelTagsModel": { "channelTags": ["production", "high-performance"], "modelId": "gpt-4" } } ``` **Use Cases**: - Select candidates based on channel tags - Environment isolation (production/test) ### 6. channel_tags_regex - Tagged Channels, Regex Match Matches models matching the regex pattern in channels with specified tags (OR logic). ```json { "type": "channel_tags_regex", "priority": 5, "channelTagsRegex": { "channelTags": ["openai", "azure"], "pattern": "gpt-4.*" } } ``` **Use Cases**: - Select based on provider tags - Flexible matching of model variants ## 🚀 Quick Start ### 1. Create a Model ```bash curl -X POST http://localhost:8090/api/models \ -H "Content-Type: application/json" \ -H "Authorization: Bearer YOUR_ADMIN_TOKEN" \ -d '{ "developer": "openai", "modelId": "gpt-4", "name": "GPT-4", "type": "chat", "group": "gpt", "icon": "🤖", "settings": { "associations": [ { "type": "channel_model", "priority": 0, "channelModel": { "channelId": 1, "modelId": "gpt-4-turbo" } }, { "type": "regex", "priority": 1, "regex": { "pattern": "gpt-4.*", "exclude": [ { "channelTags": ["test"] } ] } } ] } }' ``` ### 2. Configure Channels Ensure channels are configured and support the corresponding models: ```bash curl -X POST http://localhost:8090/api/channels \ -H "Content-Type: application/json" \ -H "Authorization: Bearer YOUR_ADMIN_TOKEN" \ -d '{ "name": "OpenAI Primary", "type": "openai", "enabled": true, "weight": 100, "tags": ["production", "openai"], "config": { "apiKey": "sk-...", "baseUrl": "https://api.openai.com/v1" }, "models": [ { "requestModel": "gpt-4", "actualModel": "gpt-4-turbo" } ] }' ``` ### 3. Use the Model Clients use the abstract ModelID for requests: ```python from openai import OpenAI client = OpenAI( api_key="your-axonhub-api-key", base_url="http://localhost:8090/v1" ) response = client.chat.completions.create( model="gpt-4", # Use abstract ModelID messages=[{"role": "user", "content": "Hello!"}] ) ``` The system will automatically: 1. Look up the `gpt-4` model entity 2. Resolve association rules 3. Select the optimal channel 4. Map to the actual model (e.g., `gpt-4-turbo`) ## 🎛️ Advanced Configuration ### Priority Control Associations are processed in priority order, with smaller priority values being higher priority: ```json { "settings": { "associations": [ { "type": "channel_model", "priority": 0, // Highest priority "channelModel": { "channelId": 1, "modelId": "gpt-4-turbo" } }, { "type": "regex", "priority": 10, // Lower priority "regex": { "pattern": "gpt-4.*" } } ] } } ``` ### Deduplication The system automatically deduplicates, so the same (channel, model) combination will only appear once. ### Cache Optimization Association resolution results are cached. Cache invalidation conditions: - Channel count changes - Channel update time changes - Model update time changes - Cache expires (5 minutes) ### Fallback Strategy When a model doesn't exist, it can fall back to traditional channel selection: ```json { "queryAllChannelModels": true, "fallbackToChannelsOnModelNotFound": true } ``` ## 📊 Monitoring and Debugging ### Query Model Associations ```bash # Query channel connections for a model curl -X POST http://localhost:8090/api/models/connections \ -H "Content-Type: application/json" \ -H "Authorization: Bearer YOUR_ADMIN_TOKEN" \ -d '{ "associations": [ { "type": "regex", "regex": { "pattern": "gpt-4.*" } } ] }' ``` ### Query Unassociated Channels ```bash curl -X GET http://localhost:8090/api/models/unassociated-channels \ -H "Authorization: Bearer YOUR_ADMIN_TOKEN" ``` ### Debug Logs ```bash # View model selection logs tail -f axonhub.log | grep "selected model candidates" # View association resolution logs tail -f axonhub.log | grep "association resolution" # View candidate selection logs tail -f axonhub.log | grep "Load balanced candidates" ``` ## 🔧 Best Practices ### 1. Model Design - **Use Standardized ModelIDs** - e.g., `gpt-4`, `claude-3-opus` - **Reasonable Priority Settings** - Primary channels 0-10, backup channels 10-100 - **Flexible Regex Usage** - Avoid overly broad patterns ### 2. Channel Organization - **Use Tags for Classification** - e.g., `production`, `test`, `backup` - **Set Reasonable Weights** - Use in conjunction with load balancing - **Regular Cleanup** - Remove unused channels ### 3. Association Strategies - **Precision First** - Use `channel_model` for primary channels - **Flexible Backup** - Use `regex` or `model` for backup - **Environment Isolation** - Use tag associations to separate production/test environments ### 4. Performance Optimization - **Leverage Caching** - Association resolution results are cached - **Avoid Frequent Updates** - Minimize update frequency for models and channels - **Monitor Candidate Count** - Too many candidates can impact performance ## 🐛 Common Issues ### Q: Why did the request fail? A: Check the following: 1. Does the model exist and is it enabled? 2. Are association rules correctly configured? 3. Are channels enabled and support the corresponding models? 4. Check logs for specific errors ### Q: How do I verify associations are working? A: 1. Use the `/api/models/connections` API to query association results 2. Enable debug logs to view the candidate selection process 3. Send test requests to verify routing ### Q: How many association rules are supported? A: Theoretically unlimited, but recommended: - No more than 10 associations per model - Total associations not exceeding 100 - Avoid overly complex regex patterns ### Q: How to implement A/B testing? A: Use multiple associations with the same priority: ```json { "settings": { "associations": [ { "type": "channel_model", "priority": 0, "channelModel": { "channelId": 1, "modelId": "gpt-4-turbo" } }, { "type": "channel_model", "priority": 0, "channelModel": { "channelId": 2, "modelId": "gpt-4" } } ] } } ``` Load balancing will select among candidates with the same priority. ### Q: How to exclude specific channels? A: Use the `exclude` field: ```json { "type": "regex", "regex": { "pattern": "gpt-4.*", "exclude": [ { "channelNamePattern": ".*test.*", "channelIds": [5, 6], "channelTags": ["beta"] } ] } } ``` ## 🔗 Related Documentation - [Adaptive Load Balancing Guide](load-balance.md) - [API Key Profiles Guide](api-key-profiles.md) - [Unified API Documentation](../api-reference/unified-api.md) - [Tracing and Debugging](tracing.md)