openapi: 3.1.0
info:
  title: Cohere Chat API
  description: >-
    The Cohere Chat API enables developers to integrate large language model
    text generation capabilities into their applications through a
    conversational interface. It supports multi-turn conversations, tool use
    with JSON schema definitions, retrieval-augmented generation, and
    streaming responses. The API is available via the v2 endpoint and works
    with Cohere's Command family of models.
  version: '2.0'
  contact:
    name: Cohere Support
    url: https://support.cohere.com
  termsOfService: https://cohere.com/terms-of-use
externalDocs:
  description: Cohere Chat API Documentation
  url: https://docs.cohere.com/reference/chat
servers:
  - url: https://api.cohere.com
    description: Cohere Production Server
tags:
  - name: Chat
    description: >-
      Endpoints for generating text responses through conversational
      interactions with Cohere language models.
security:
  - bearerAuth: []
paths:
  /v2/chat:
    post:
      operationId: chat
      summary: Chat with a Cohere model
      description: >-
        Generates a text response to a user message. Accepts a list of chat
        messages in chronological order representing a conversation between
        the user and the model. Messages can include User, Assistant, Tool,
        and System roles. Supports tool use, retrieval-augmented generation,
        and structured JSON output.
      tags:
        - Chat
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ChatRequest'
      responses:
        '200':
          description: Successful chat completion response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ChatResponse'
        '400':
          description: Bad request due to invalid parameters
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          description: Unauthorized due to missing or invalid API key
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '429':
          description: Rate limit exceeded
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
  /v2/chat/stream:
    post:
      operationId: chatStream
      summary: Chat with streaming response
      description: >-
        Generates a text response to a user message and streams it using
        server-sent events (SSE). Partial results are delivered as they are
        generated, enabling real-time display in user interfaces. Emits
        stream-start, content-delta, citation-start, citation-end, and
        stream-end events.
      tags:
        - Chat
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ChatRequest'
      responses:
        '200':
          description: Streaming chat completion response delivered as SSE events
          content:
            text/event-stream:
              schema:
                $ref: '#/components/schemas/StreamEvent'
        '400':
          description: Bad request due to invalid parameters
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          description: Unauthorized due to missing or invalid API key
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '429':
          description: Rate limit exceeded
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
components:
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      description: >-
        Bearer authentication using a Cohere API key. Pass the API key in the
        Authorization header as a Bearer token.
  schemas:
    ChatRequest:
      type: object
      required:
        - model
        - messages
      properties:
        model:
          type: string
          description: >-
            The name of a compatible Cohere model to use for generation.
          example: command-r-plus
        messages:
          type: array
          description: >-
            A list of chat messages in chronological order representing a
            conversation between the user and the model. Messages can be from
            User, Assistant, Tool, and System roles.
          items:
            $ref: '#/components/schemas/Message'
        tools:
          type: array
          description: >-
            A list of tools (functions) available to the model. The model may
            choose to call these tools during generation. Each tool is defined
            with a name, description, and JSON schema for parameters.
          items:
            $ref: '#/components/schemas/Tool'
        max_tokens:
          type: integer
          description: >-
            The maximum number of output tokens the model will generate in the
            response. If not set, defaults to the model's maximum output token
            limit.
          minimum: 1
        stop_sequences:
          type: array
          description: >-
            A list of up to 5 strings that the model will use to stop
            generation. If the model generates a string matching any entry, it
            will stop generating tokens.
          items:
            type: string
          maxItems: 5
        temperature:
          type: number
          description: >-
            A non-negative float that tunes the degree of randomness in
            generation. Lower temperatures mean less random generations and
            higher temperatures mean more random generations. Defaults to 0.3.
          minimum: 0
          maximum: 2
          default: 0.3
        response_format:
          type: object
          description: >-
            Controls the format of the model's output. Set type to json_object
            to force JSON output. Optionally provide a JSON Schema to ensure a
            specific structure.
          properties:
            type:
              type: string
              enum:
                - text
                - json_object
              description: >-
                The format type for the response output.
            json_schema:
              type: object
              description: >-
                An optional JSON Schema that the output must conform to when
                type is json_object.
        safety_mode:
          type: string
          enum:
            - CONTEXTUAL
            - STRICT
            - NONE
          description: >-
            Used to select the safety instruction inserted into the prompt.
            Defaults to CONTEXTUAL.
          default: CONTEXTUAL
    Message:
      type: object
      required:
        - role
      properties:
        role:
          type: string
          enum:
            - user
            - assistant
            - system
            - tool
          description: >-
            The role of the message author in the conversation.
        content:
          type: string
          description: >-
            The text content of the message.
        tool_call_id:
          type: string
          description: >-
            The ID of the tool call this message is responding to. Required
            when role is tool.
        tool_calls:
          type: array
          description: >-
            Tool calls generated by the model. Present when role is assistant
            and the model decided to call tools.
          items:
            $ref: '#/components/schemas/ToolCall'
    Tool:
      type: object
      required:
        - type
        - function
      properties:
        type:
          type: string
          enum:
            - function
          description: >-
            The type of tool. Currently only function is supported.
        function:
          type: object
          required:
            - name
            - description
          properties:
            name:
              type: string
              description: >-
                The name of the function to be called.
            description:
              type: string
              description: >-
                A description of what the function does.
            parameters:
              type: object
              description: >-
                The parameters the function accepts, described as a JSON Schema
                object.
    ToolCall:
      type: object
      properties:
        id:
          type: string
          description: >-
            The unique identifier for this tool call.
        type:
          type: string
          enum:
            - function
          description: >-
            The type of tool call.
        function:
          type: object
          properties:
            name:
              type: string
              description: >-
                The name of the function to call.
            arguments:
              type: string
              description: >-
                The arguments to pass to the function, as a JSON string.
    ChatResponse:
      type: object
      properties:
        id:
          type: string
          description: >-
            Unique identifier for the chat completion.
        finish_reason:
          type: string
          enum:
            - complete
            - max_tokens
            - stop_sequence
            - tool_call
            - error
            - timeout
          description: >-
            The reason the chat request finished. Values include complete,
            max_tokens, stop_sequence, tool_call, error, and timeout.
        message:
          $ref: '#/components/schemas/Message'
        usage:
          $ref: '#/components/schemas/Usage'
    Usage:
      type: object
      properties:
        billed_units:
          type: object
          properties:
            input_tokens:
              type: integer
              description: >-
                The number of billed input tokens.
            output_tokens:
              type: integer
              description: >-
                The number of billed output tokens.
        tokens:
          type: object
          properties:
            input_tokens:
              type: integer
              description: >-
                The total number of input tokens processed.
            output_tokens:
              type: integer
              description: >-
                The total number of output tokens generated.
    StreamEvent:
      type: object
      description: >-
        A server-sent event emitted during streaming chat generation. Event
        types include stream-start, content-delta, citation-start,
        citation-end, tool-call-start, tool-call-delta, tool-call-end, and
        stream-end.
      properties:
        event_type:
          type: string
          enum:
            - stream-start
            - content-delta
            - citation-start
            - citation-end
            - tool-call-start
            - tool-call-delta
            - tool-call-end
            - stream-end
          description: >-
            The type of streaming event.
        delta:
          type: object
          description: >-
            The incremental content payload for delta events.
          properties:
            message:
              type: object
              properties:
                content:
                  type: object
                  properties:
                    text:
                      type: string
                      description: >-
                        The incremental text content.
    Error:
      type: object
      properties:
        message:
          type: string
          description: >-
            A human-readable error message describing what went wrong.
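
# Illustrative sketch only, not part of the specification: one possible
# ChatRequest body for POST /v2/chat, written in YAML for readability (it
# would be serialized as JSON when sent, with the API key in the
# Authorization header as described under bearerAuth). It supplies the two
# required fields (model, messages) and one tool that follows the Tool
# schema. The tool name get_weather, its parameters, the message text, and
# the max_tokens value are hypothetical and chosen purely for illustration;
# command-r-plus is the example model name given in the ChatRequest schema.
#
#   model: command-r-plus
#   messages:
#     - role: system
#       content: You are a concise assistant.
#     - role: user
#       content: What is the weather in Toronto?
#   tools:
#     - type: function
#       function:
#         name: get_weather            # hypothetical tool name
#         description: Look up the current weather for a city.
#         parameters:                  # JSON Schema object, per the Tool schema
#           type: object
#           properties:
#             city:
#               type: string
#           required:
#             - city
#   max_tokens: 200                    # assumed value; must be >= 1
#   temperature: 0.3                   # the documented default
#
# If the model decides to call the tool, the schemas suggest the response
# would carry finish_reason tool_call and an assistant message with
# tool_calls. A follow-up request would then append a message with role
# tool and the matching tool_call_id.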