openapi: 3.0.1
info:
  title: Not Diamond API
  description: >-
    REST API for the Not Diamond AI model router. The modelSelect endpoint
    routes a prompt to the best LLM across providers based on quality, cost,
    and latency tradeoffs. Additional endpoints list supported models, report
    feedback and latency metrics to personalize routing in real time, and
    train custom routers from evaluation datasets.
  termsOfService: https://www.notdiamond.ai/terms
  contact:
    name: Not Diamond
    url: https://www.notdiamond.ai
  version: '2.0'
servers:
  - url: https://api.notdiamond.ai/v2
paths:
  /modelRouter/modelSelect:
    post:
      operationId: modelSelect
      tags:
        - Model Routing
      summary: Route a prompt to the best model.
      description: >-
        Given OpenAI-format messages and a list of candidate LLM providers,
        returns the recommended provider/model along with a session ID for
        the routing decision. Supports cost/latency tradeoffs and
        personalized routing via a preference ID.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ModelSelectRequest'
      responses:
        '200':
          description: The recommended provider(s) and a session ID.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ModelSelectResponse'
        '401':
          description: Missing or invalid API key.
        '422':
          description: Validation error.
  /models:
    get:
      operationId: listModels
      tags:
        - Models
      summary: List supported models.
      description: >-
        Lists all supported text generation models with provider, context
        length, and per-million-token input/output pricing. Results are
        cacheable for one hour.
      parameters:
        - name: provider
          in: query
          required: false
          description: Filter by one or more provider names.
          schema:
            type: array
            items:
              type: string
        - name: openrouter_only
          in: query
          required: false
          description: Return only OpenRouter-supported models.
          schema:
            type: boolean
            default: false
      responses:
        '200':
          description: A list of supported models.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ModelsListResponse'
  /report/metrics/feedback:
    post:
      operationId: reportFeedback
      tags:
        - Feedback
      summary: Report outcome feedback for a routing session.
      description: >-
        Submits binary outcome feedback (0 or 1) for the provider selected in
        a routing session so Not Diamond can personalize future routing.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/FeedbackRequest'
      responses:
        '200':
          description: Feedback accepted.
        '401':
          description: Missing or invalid API key.
        '422':
          description: Validation error.
  /report/metrics/latency:
    post:
      operationId: reportLatency
      tags:
        - Feedback
      summary: Report latency metrics for a routing session.
      description: >-
        Reports observed latency (for example tokens per second) for the
        provider selected in a routing session.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/LatencyRequest'
      responses:
        '200':
          description: Latency metrics accepted.
        '401':
          description: Missing or invalid API key.
        '422':
          description: Validation error.
  /pzn/trainCustomRouter:
    post:
      operationId: trainCustomRouter
      tags:
        - Custom Routers
      summary: Train a custom router.
      description: >-
        Trains a custom router from an uploaded CSV evaluation dataset.
        Training is asynchronous and typically takes 5-15 minutes. Returns a
        preference ID used to drive personalized routing in subsequent
        modelSelect calls.
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              $ref: '#/components/schemas/TrainCustomRouterRequest'
      responses:
        '200':
          description: The trained router's preference ID.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/TrainCustomRouterResponse'
        '401':
          description: Missing or invalid API key.
        '422':
          description: Validation error.
  /preferences/userPreferenceCreate:
    post:
      operationId: createUserPreference
      tags:
        - Custom Routers
      summary: Create a user preference.
      description: >-
        Creates a user preference (router) and returns its preference ID,
        which can be referenced in modelSelect calls and custom router
        training.
      requestBody:
        required: false
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/UserPreferenceCreateRequest'
      responses:
        '200':
          description: The created preference ID.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/UserPreferenceCreateResponse'
        '401':
          description: Missing or invalid API key.
components:
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      description: Not Diamond API key passed as a Bearer token.
  schemas:
    Message:
      type: object
      description: An OpenAI-format chat message.
      properties:
        role:
          type: string
          example: user
        content:
          type: string
      required:
        - role
        - content
    LLMProvider:
      type: object
      description: A candidate provider/model that the router may select.
      properties:
        provider:
          type: string
          description: Provider identifier (e.g. openai, anthropic). Omit when using an OpenRouter slug.
          example: openai
        model:
          type: string
          description: Model identifier, or an OpenRouter slug (e.g. openai/gpt-4o).
          example: gpt-4o
        is_custom:
          type: boolean
          description: Set to true to describe a custom model.
        context_length:
          type: integer
          description: Maximum context length (required for custom models).
        input_price:
          type: number
          description: Cost per million input tokens in USD (required for custom models).
        output_price:
          type: number
          description: Cost per million output tokens in USD (required for custom models).
        latency:
          type: number
          description: Expected latency for the custom model.
      required:
        - model
    ModelSelectRequest:
      type: object
      properties:
        messages:
          type: array
          description: Array of OpenAI-format message objects.
          items:
            $ref: '#/components/schemas/Message'
        llm_providers:
          type: array
          description: List of candidate LLM providers to route between. At least one required.
          items:
            $ref: '#/components/schemas/LLMProvider'
        tools:
          type: array
          nullable: true
          description: OpenAI-format function-calling tools.
          items:
            type: object
        preference_id:
          type: string
          nullable: true
          description: Preference ID for personalized or custom routing.
        tradeoff:
          type: string
          nullable: true
          description: Optimization tradeoff strategy.
          enum:
            - cost
            - latency
        cost_quality_tradeoff:
          type: integer
          nullable: true
          minimum: 0
          maximum: 10
          description: Continuous cost/quality blend (0 = quality, 10 = cost).
        hash_content:
          type: boolean
          default: false
          description: Whether to hash message content for privacy.
        metric:
          type: string
          default: accuracy
          description: Optimization metric for model selection.
        max_model_depth:
          type: integer
          nullable: true
          description: Maximum number of models to consider for routing.
        previous_session:
          type: string
          nullable: true
          description: Previous session ID to link related requests.
      required:
        - messages
        - llm_providers
    ModelSelectResponse:
      type: object
      properties:
        providers:
          type: array
          description: List containing the selected provider/model.
          items:
            $ref: '#/components/schemas/LLMProvider'
        session_id:
          type: string
          description: Unique session ID for this routing decision.
    Model:
      type: object
      properties:
        provider:
          type: string
          example: openai
        model:
          type: string
          example: gpt-4o
        context_length:
          type: integer
        input_price:
          type: number
          description: Cost per million input tokens in USD.
        output_price:
          type: number
          description: Cost per million output tokens in USD.
        openrouter_model:
          type: string
          nullable: true
          description: OpenRouter slug if supported.
    ModelsListResponse:
      type: object
      properties:
        models:
          type: array
          items:
            $ref: '#/components/schemas/Model'
        total:
          type: integer
        deprecated_models:
          type: array
          items:
            $ref: '#/components/schemas/Model'
    FeedbackRequest:
      type: object
      properties:
        session_id:
          type: string
          description: The session ID returned by modelSelect.
        provider:
          $ref: '#/components/schemas/LLMProvider'
        feedback:
          type: object
          description: Feedback payload. The outcome value must be 0 or 1.
          properties:
            value:
              type: integer
              enum:
                - 0
                - 1
      required:
        - session_id
        - provider
        - feedback
    LatencyRequest:
      type: object
      properties:
        session_id:
          type: string
        provider:
          $ref: '#/components/schemas/LLMProvider'
        feedback:
          type: object
          properties:
            tokens_per_second:
              type: number
      required:
        - session_id
        - provider
        - feedback
    TrainCustomRouterRequest:
      type: object
      properties:
        dataset_file:
          type: string
          format: binary
          description: CSV file with evaluation data (minimum 25 samples).
        language:
          type: string
          enum:
            - english
            - multilingual
        llm_providers:
          type: string
          description: JSON array of model providers.
        prompt_column:
          type: string
          description: CSV column name containing prompts.
        maximize:
          type: boolean
          description: Whether higher scores indicate better performance.
        preference_id:
          type: string
          nullable: true
          description: Existing router ID to update.
        override:
          type: boolean
          nullable: true
          default: true
          description: Override existing router.
      required:
        - dataset_file
        - language
        - llm_providers
        - prompt_column
        - maximize
    TrainCustomRouterResponse:
      type: object
      properties:
        preference_id:
          type: string
          description: Unique identifier for the trained router.
    UserPreferenceCreateRequest:
      type: object
      properties:
        name:
          type: string
          nullable: true
          description: Optional name for the preference/router.
    UserPreferenceCreateResponse:
      type: object
      properties:
        preference_id:
          type: string
security:
  - bearerAuth: []