components: parameters: parameters_prefer_header: description: >- Leave the request open and wait for the model to finish generating output. Set to `wait=n` where n is a number of seconds between 1 and 60. See https://replicate.com/docs/topics/predictions/create-a-prediction#sync-mode for more information. in: header name: Prefer schema: example: wait=5 pattern: ^wait(=([1-9]|[1-5][0-9]|60))?$ type: string schemas: schemas_prediction_request: additionalProperties: false properties: input: description: > The model's input as a JSON object. The input schema depends on what model you are running. To see the available inputs, click the "API" tab on the model you are running or [get the model version](#models.versions.get) and look at its `openapi_schema` property. For example, [stability-ai/sdxl](https://replicate.com/stability-ai/sdxl) takes `prompt` as an input. Files should be passed as HTTP URLs or data URLs. Use an HTTP URL when: - you have a large file > 256kb - you want to be able to use the file multiple times - you want your prediction metadata to be associable with your input files Use a data URL when: - you have a small file <= 256kb - you don't want to upload and host the file somewhere - you don't need to use the file again (Replicate will not store it) type: object stream: description: > **This field is deprecated.** Request a URL to receive streaming output using [server-sent events (SSE)](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events). This field is no longer needed as the returned prediction will always have a `stream` entry in its `url` property if the model supports streaming. type: boolean webhook: description: > An HTTPS URL for receiving a webhook when the prediction has new output. The webhook will be a POST request where the request body is the same as the response body of the [get prediction](#predictions.get) operation.
If there are network problems, we will retry the webhook a few times, so make sure it can be safely called more than once. Replicate will not follow redirects when sending webhook requests to your service, so be sure to specify a URL that will resolve without redirecting. type: string webhook_events_filter: description: > By default, we will send requests to your webhook URL whenever there are new outputs or the prediction has finished. You can change which events trigger webhook requests by specifying `webhook_events_filter` in the prediction request: - `start`: immediately on prediction start - `output`: each time a prediction generates an output (note that predictions can generate multiple outputs) - `logs`: each time log output is generated by a prediction - `completed`: when the prediction reaches a terminal state (succeeded/canceled/failed) For example, if you only wanted requests to be sent at the start and end of the prediction, you would provide: ```json { "input": { "text": "Alice" }, "webhook": "https://example.com/my-webhook", "webhook_events_filter": ["start", "completed"] } ``` Requests for event types `output` and `logs` will be sent at most once every 500ms. If you request `start` and `completed` webhooks, then they'll always be sent regardless of throttling. items: enum: - start - output - logs - completed type: string type: array required: - input type: object schemas_training_request: properties: destination: description: > A string representing the desired model to push to in the format `{destination_model_owner}/{destination_model_name}`. This should be an existing model owned by the user or organization making the API request. If the destination is invalid, the server will return an appropriate 4XX response. type: string input: description: | An object containing inputs to the Cog model's `train()` function. type: object webhook: description: >- An HTTPS URL for receiving a webhook when the training completes. 
The webhook will be a POST request where the request body is the same as the response body of the [get training](#trainings.get) operation. If there are network problems, we will retry the webhook a few times, so make sure it can be safely called more than once. Replicate will not follow redirects when sending webhook requests to your service, so be sure to specify a URL that will resolve without redirecting. type: string webhook_events_filter: description: > By default, we will send requests to your webhook URL whenever there are new outputs or the training has finished. You can change which events trigger webhook requests by specifying `webhook_events_filter` in the training request: - `start`: immediately on training start - `output`: each time a training generates an output (note that trainings can generate multiple outputs) - `logs`: each time log output is generated by a training - `completed`: when the training reaches a terminal state (succeeded/canceled/failed) For example, if you only wanted requests to be sent at the start and end of the training, you would provide: ```json { "destination": "my-organization/my-model", "input": { "text": "Alice" }, "webhook": "https://example.com/my-webhook", "webhook_events_filter": ["start", "completed"] } ``` Requests for event types `output` and `logs` will be sent at most once every 500ms. If you request `start` and `completed` webhooks, then they'll always be sent regardless of throttling. items: enum: - start - output - logs - completed type: string type: array required: - destination - input type: object schemas_version_prediction_request: additionalProperties: false properties: input: description: > The model's input as a JSON object. The input schema depends on what model you are running. To see the available inputs, click the "API" tab on the model you are running or [get the model version](#models.versions.get) and look at its `openapi_schema` property. 
For example, [stability-ai/sdxl](https://replicate.com/stability-ai/sdxl) takes `prompt` as an input. Files should be passed as HTTP URLs or data URLs. Use an HTTP URL when: - you have a large file > 256kb - you want to be able to use the file multiple times - you want your prediction metadata to be associable with your input files Use a data URL when: - you have a small file <= 256kb - you don't want to upload and host the file somewhere - you don't need to use the file again (Replicate will not store it) type: object stream: description: > **This field is deprecated.** Request a URL to receive streaming output using [server-sent events (SSE)](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events). This field is no longer needed as the returned prediction will always have a `stream` entry in its `url` property if the model supports streaming. type: boolean version: description: The ID of the model version that you want to run. type: string webhook: description: > An HTTPS URL for receiving a webhook when the prediction has new output. The webhook will be a POST request where the request body is the same as the response body of the [get prediction](#predictions.get) operation. If there are network problems, we will retry the webhook a few times, so make sure it can be safely called more than once. Replicate will not follow redirects when sending webhook requests to your service, so be sure to specify a URL that will resolve without redirecting. type: string webhook_events_filter: description: > By default, we will send requests to your webhook URL whenever there are new outputs or the prediction has finished. 
You can change which events trigger webhook requests by specifying `webhook_events_filter` in the prediction request: - `start`: immediately on prediction start - `output`: each time a prediction generates an output (note that predictions can generate multiple outputs) - `logs`: each time log output is generated by a prediction - `completed`: when the prediction reaches a terminal state (succeeded/canceled/failed) For example, if you only wanted requests to be sent at the start and end of the prediction, you would provide: ```json { "version": "5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa", "input": { "text": "Alice" }, "webhook": "https://example.com/my-webhook", "webhook_events_filter": ["start", "completed"] } ``` Requests for event types `output` and `logs` will be sent at most once every 500ms. If you request `start` and `completed` webhooks, then they'll always be sent regardless of throttling. items: enum: - start - output - logs - completed type: string type: array required: - version - input type: object securitySchemes: bearerAuth: bearerFormat: > All API requests must include a valid API token in the `Authorization` request header. The token must be prefixed by "Bearer", followed by a space and the token value. Example: `Authorization: Bearer r8_Hw***********************************` Find your tokens at https://replicate.com/account/api-tokens scheme: bearer type: http externalDocs: description: Replicate HTTP API url: https://replicate.com/docs/reference/http info: contact: email: team@replicate.com description: >- AI can do extraordinary things, but it's still too hard to use. We don't believe AI is inherently hard. We just don't have the right tools and abstractions yet. We're building tools so all software engineers can use AI as if it were normal software. You should be able to import an image generator the same way you import an npm package. You should be able to customize a model as easily as you can fork something on GitHub.
termsOfService: https://replicate.com/terms title: Replicate version: 1.0.0-a1 openapi: 3.1.0 paths: /account: get: description: > Returns information about the user or organization associated with the provided API token. Example cURL request: ```console curl -s \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/account ``` The response will be a JSON object describing the account: ```json { "type": "organization", "username": "acme", "name": "Acme Corp, Inc.", "github_url": "https://github.com/acme" } ``` operationId: account.get responses: '200': content: application/json: schema: properties: github_url: description: The GitHub URL of the account. format: uri type: string name: description: The name of the account. type: string type: description: The account type. Can be a user or an organization. enum: - organization - user type: string username: description: The username of the account. type: string type: object description: Success summary: Get the Authenticated Account tags: - Accounts /collections: get: description: | Example cURL request: ```console curl -s \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/collections ``` The response will be a paginated JSON list of collection objects: ```json { "next": null, "previous": null, "results": [ { "name": "Super resolution", "slug": "super-resolution", "description": "Upscaling models that create high-quality images from low-quality images."
} ] } ``` operationId: collections.list responses: '200': description: Success summary: List Collections of Models tags: - Collections /collections/{collection_slug}: get: description: > Example cURL request: ```console curl -s \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/collections/super-resolution ``` The response will be a collection object with a nested list of the models in that collection: ```json { "name": "Super resolution", "slug": "super-resolution", "description": "Upscaling models that create high-quality images from low-quality images.", "models": [...] } ``` operationId: collections.get parameters: - description: > The slug of the collection, like `super-resolution` or `image-restoration`. See [replicate.com/collections](https://replicate.com/collections). in: path name: collection_slug required: true schema: type: string responses: '200': description: Success summary: Get a Collection of Models tags: - Collections - Slug /deployments: get: description: > Get a list of deployments associated with the current account, including the latest release configuration for each deployment. 
Example cURL request: ```console curl -s \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/deployments ``` The response will be a paginated JSON array of deployment objects, sorted with the most recent deployment first: ```json { "next": "http://api.replicate.com/v1/deployments?cursor=cD0yMDIzLTA2LTA2KzIzJTNBNDAlM0EwOC45NjMwMDAlMkIwMCUzQTAw", "previous": null, "results": [ { "owner": "replicate", "name": "my-app-image-generator", "current_release": { "number": 1, "model": "stability-ai/sdxl", "version": "da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf", "created_at": "2024-02-15T16:32:57.018467Z", "created_by": { "type": "organization", "username": "acme", "name": "Acme Corp, Inc.", "github_url": "https://github.com/acme" }, "configuration": { "hardware": "gpu-t4", "min_instances": 1, "max_instances": 5 } } } ] } ``` operationId: deployments.list responses: '200': content: application/json: schema: properties: next: description: >- A URL pointing to the next page of deployment objects if any nullable: true type: string previous: description: >- A URL pointing to the previous page of deployment objects if any nullable: true type: string results: description: An array containing a page of deployment objects items: properties: current_release: properties: configuration: properties: hardware: description: >- The SKU for the hardware used to run the model. type: string max_instances: description: The maximum number of instances for scaling. type: integer min_instances: description: The minimum number of instances for scaling. type: integer type: object created_at: description: The time the release was created. format: date-time type: string created_by: properties: github_url: description: >- The GitHub URL of the account that created the release. format: uri type: string name: description: >- The name of the account that created the release. type: string type: description: >- The account type of the creator.
Can be a user or an organization. enum: - organization - user type: string username: description: >- The username of the account that created the release. type: string type: object model: description: >- The model identifier string in the format of `{model_owner}/{model_name}`. type: string number: description: >- The release number. This is an auto-incrementing integer that starts at 1, and is set automatically when a deployment is created. type: integer version: description: The ID of the model version used in the release. type: string type: object name: description: The name of the deployment. type: string owner: description: The owner of the deployment. type: string type: object type: array type: object description: Success summary: List Deployments tags: - Deployments post: description: | Create a new deployment. Example cURL request: ```console curl -s \ -X POST \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "name": "my-app-image-generator", "model": "stability-ai/sdxl", "version": "da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf", "hardware": "gpu-t4", "min_instances": 0, "max_instances": 3 }' \ https://api.replicate.com/v1/deployments ``` The response will be a JSON object describing the deployment: ```json { "owner": "acme", "name": "my-app-image-generator", "current_release": { "number": 1, "model": "stability-ai/sdxl", "version": "da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf", "created_at": "2024-02-15T16:32:57.018467Z", "created_by": { "type": "organization", "username": "acme", "name": "Acme Corp, Inc.", "github_url": "https://github.com/acme" }, "configuration": { "hardware": "gpu-t4", "min_instances": 0, "max_instances": 3 } } } ``` operationId: deployments.create requestBody: content: application/json: schema: properties: hardware: description: >- The SKU for the hardware used to run the model. Possible values can be retrieved from the `hardware.list` endpoint.
type: string max_instances: description: The maximum number of instances for scaling. maximum: 20 minimum: 0 type: integer min_instances: description: The minimum number of instances for scaling. maximum: 5 minimum: 0 type: integer model: description: >- The full name of the model that you want to deploy e.g. stability-ai/sdxl. type: string name: description: The name of the deployment. type: string version: description: >- The 64-character string ID of the model version that you want to deploy. type: string required: - name - model - version - hardware - min_instances - max_instances type: object required: true responses: '200': content: application/json: schema: properties: current_release: properties: configuration: properties: hardware: description: The SKU for the hardware used to run the model. type: string max_instances: description: The maximum number of instances for scaling. type: integer min_instances: description: The minimum number of instances for scaling. type: integer type: object created_at: description: The time the release was created. format: date-time type: string created_by: properties: github_url: description: >- The GitHub URL of the account that created the release. format: uri type: string name: description: The name of the account that created the release. type: string type: description: >- The account type of the creator. Can be a user or an organization. enum: - organization - user type: string username: description: >- The username of the account that created the release. type: string type: object model: description: >- The model identifier string in the format of `{model_owner}/{model_name}`. type: string number: description: The release number. type: integer version: description: The ID of the model version used in the release. type: string type: object name: description: The name of the deployment. type: string owner: description: The owner of the deployment. 
type: string type: object description: Success summary: Create a Deployment tags: - Deployments /deployments/{deployment_owner}/{deployment_name}: delete: description: > Delete a deployment. Deployment deletion has some restrictions: - You can only delete deployments that have been offline and unused for at least 15 minutes. Example cURL request: ```console curl -s -X DELETE \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/deployments/acme/my-app-image-generator ``` The response will be an empty 204, indicating the deployment has been deleted. operationId: deployments.delete parameters: - description: | The name of the user or organization that owns the deployment. in: path name: deployment_owner required: true schema: type: string - description: | The name of the deployment. in: path name: deployment_name required: true schema: type: string responses: '204': description: Success summary: Delete a Deployment tags: - Deployments - Name - Owner get: description: > Get information about a deployment by name, including the current release. Example cURL request: ```console curl -s \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/deployments/acme/my-app-image-generator ``` The response will be a JSON object describing the deployment: ```json { "owner": "acme", "name": "my-app-image-generator", "current_release": { "number": 1, "model": "stability-ai/sdxl", "version": "da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf", "created_at": "2024-02-15T16:32:57.018467Z", "created_by": { "type": "organization", "username": "acme", "name": "Acme Corp, Inc.", "github_url": "https://github.com/acme" }, "configuration": { "hardware": "gpu-t4", "min_instances": 1, "max_instances": 5 } } } ``` operationId: deployments.get parameters: - description: | The name of the user or organization that owns the deployment.
in: path name: deployment_owner required: true schema: type: string - description: | The name of the deployment. in: path name: deployment_name required: true schema: type: string responses: '200': content: application/json: schema: properties: current_release: properties: configuration: properties: hardware: description: The SKU for the hardware used to run the model. type: string max_instances: description: The maximum number of instances for scaling. type: integer min_instances: description: The minimum number of instances for scaling. type: integer type: object created_at: description: The time the release was created. format: date-time type: string created_by: properties: github_url: description: >- The GitHub URL of the account that created the release. format: uri type: string name: description: The name of the account that created the release. type: string type: description: >- The account type of the creator. Can be a user or an organization. enum: - organization - user type: string username: description: >- The username of the account that created the release. type: string type: object model: description: >- The model identifier string in the format of `{model_owner}/{model_name}`. type: string number: description: The release number. type: integer version: description: The ID of the model version used in the release. type: string type: object name: description: The name of the deployment. type: string owner: description: The owner of the deployment. type: string type: object description: Success summary: Get a Deployment tags: - Deployments - Name - Owner patch: description: > Update properties of an existing deployment, including hardware, min/max instances, and the deployment's underlying model [version](https://replicate.com/docs/how-does-replicate-work#versions). 
Example cURL request: ```console curl -s \ -X PATCH \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ -H "Content-Type: application/json" \ -d '{"min_instances": 3, "max_instances": 10}' \ https://api.replicate.com/v1/deployments/acme/my-app-image-generator ``` The response will be a JSON object describing the deployment: ```json { "owner": "acme", "name": "my-app-image-generator", "current_release": { "number": 2, "model": "stability-ai/sdxl", "version": "da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf", "created_at": "2024-02-15T16:32:57.018467Z", "created_by": { "type": "organization", "username": "acme", "name": "Acme Corp, Inc.", "github_url": "https://github.com/acme" }, "configuration": { "hardware": "gpu-t4", "min_instances": 3, "max_instances": 10 } } } ``` Updating any deployment properties will increment the `number` field of the `current_release`. operationId: deployments.update parameters: - description: | The name of the user or organization that owns the deployment. in: path name: deployment_owner required: true schema: type: string - description: | The name of the deployment. in: path name: deployment_name required: true schema: type: string requestBody: content: application/json: schema: properties: hardware: description: >- The SKU for the hardware used to run the model. Possible values can be retrieved from the `hardware.list` endpoint. type: string max_instances: description: The maximum number of instances for scaling. maximum: 20 minimum: 0 type: integer min_instances: description: The minimum number of instances for scaling. maximum: 5 minimum: 0 type: integer version: description: The ID of the model version that you want to deploy. type: string type: object responses: '200': content: application/json: schema: properties: current_release: properties: configuration: properties: hardware: description: The SKU for the hardware used to run the model.
type: string max_instances: description: The maximum number of instances for scaling. type: integer min_instances: description: The minimum number of instances for scaling. type: integer type: object created_at: description: The time the release was created. format: date-time type: string created_by: properties: github_url: description: >- The GitHub URL of the account that created the release. format: uri type: string name: description: The name of the account that created the release. type: string type: description: >- The account type of the creator. Can be a user or an organization. enum: - organization - user type: string username: description: >- The username of the account that created the release. type: string type: object model: description: >- The model identifier string in the format of `{model_owner}/{model_name}`. type: string number: description: The release number. type: integer version: description: The ID of the model version used in the release. type: string type: object name: description: The name of the deployment. type: string owner: description: The owner of the deployment. type: string type: object description: Success summary: Update a Deployment tags: - Deployments - Name - Owner /deployments/{deployment_owner}/{deployment_name}/predictions: post: description: > Create a prediction for the deployment and inputs you provide. Example cURL request: ```console curl -s -X POST -H 'Prefer: wait' \ -d '{"input": {"prompt": "A photo of a bear riding a bicycle over the moon"}}' \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ -H 'Content-Type: application/json' \ https://api.replicate.com/v1/deployments/acme/my-app-image-generator/predictions ``` The request will wait up to 60 seconds for the model to run. If this time is exceeded, the prediction will be returned in a `"starting"` state and will need to be retrieved using the `predictions.get` endpoint.
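If the prediction comes back before it has finished, one way to handle it is to poll until it reaches a terminal state (succeeded, failed, or canceled). The following Python sketch is illustrative only: it assumes the prediction object exposes `id` and `status` fields, and `fetch` is a hypothetical callable that wraps the `predictions.get` endpoint. Neither is defined by this operation.

```python
import time
from typing import Callable

# Terminal states as listed for the `completed` webhook event.
TERMINAL_STATES = {"succeeded", "failed", "canceled"}


def wait_for_prediction(
    prediction: dict,
    fetch: Callable[[str], dict],
    interval: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the prediction is terminal or max_polls is reached.

    `fetch` is a hypothetical wrapper around predictions.get that takes a
    prediction ID and returns the prediction as a dict. The `id` and
    `status` field names are assumptions about the prediction object.
    """
    for _ in range(max_polls):
        if prediction.get("status") in TERMINAL_STATES:
            return prediction
        sleep(interval)
        prediction = fetch(prediction["id"])
    # Still not terminal: return the last snapshot and let the caller decide.
    return prediction
```

The `fetch` and `sleep` arguments are injectable so the loop can be exercised without network access.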
For a complete overview of the `deployments.predictions.create` API, check out our documentation on [creating a prediction](https://replicate.com/docs/topics/predictions/create-a-prediction), which covers a variety of use cases. operationId: deployments.predictions.create parameters: - description: | The name of the user or organization that owns the deployment. in: path name: deployment_owner required: true schema: type: string - description: | The name of the deployment. in: path name: deployment_name required: true schema: type: string - $ref: '#/components/parameters/parameters_prefer_header' requestBody: content: application/json: schema: $ref: '#/components/schemas/schemas_prediction_request' responses: '201': description: >- Prediction has been created. If the `Prefer: wait` header is provided it will contain the final output. '202': description: Prediction has been created but does not yet have all outputs summary: Create a Prediction Using a Deployment tags: - Deployments - Name - Owner - Predictions /hardware: get: description: | Example cURL request: ```console curl -s \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/hardware ``` The response will be a JSON array of hardware objects: ```json [ {"name": "CPU", "sku": "cpu"}, {"name": "Nvidia T4 GPU", "sku": "gpu-t4"}, {"name": "Nvidia A40 GPU", "sku": "gpu-a40-small"}, {"name": "Nvidia A40 (Large) GPU", "sku": "gpu-a40-large"} ] ``` operationId: hardware.list responses: '200': content: application/json: schema: items: properties: name: description: The name of the hardware. type: string sku: description: The SKU of the hardware. type: string type: object type: array description: Success summary: List Available Hardware for Models tags: - Hardware /models: get: description: > Get a paginated list of public models.
Example cURL request: ```console curl -s \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/models ``` The response will be a paginated JSON array of model objects: ```json { "next": null, "previous": null, "results": [ { "url": "https://replicate.com/acme/hello-world", "owner": "acme", "name": "hello-world", "description": "A tiny model that says hello", "visibility": "public", "github_url": "https://github.com/replicate/cog-examples", "paper_url": null, "license_url": null, "run_count": 5681081, "cover_image_url": "...", "default_example": {...}, "latest_version": {...} } ] } ``` The `cover_image_url` string is an HTTPS URL for an image file. This can be: - An image uploaded by the model author. - The output file of the example prediction, if the model author has not set a cover image. - The input file of the example prediction, if the model author has not set a cover image and the example prediction has no output file. - A generic fallback image. operationId: models.list responses: '200': description: Success summary: List Public Models tags: - Models post: description: > Create a model. Example cURL request: ```console curl -s -X POST \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ -H 'Content-Type: application/json' \ -d '{"owner": "alice", "name": "my-model", "description": "An example model", "visibility": "public", "hardware": "cpu"}' \ https://api.replicate.com/v1/models ``` The response will be a model object in the following format: ```json { "url": "https://replicate.com/alice/my-model", "owner": "alice", "name": "my-model", "description": "An example model", "visibility": "public", "github_url": null, "paper_url": null, "license_url": null, "run_count": 0, "cover_image_url": null, "default_example": null, "latest_version": null } ``` Note that there is a limit of 1,000 models per account.
For most purposes, we recommend using a single model and pushing new [versions](https://replicate.com/docs/how-does-replicate-work#versions) of the model as you make changes to it. operationId: models.create requestBody: content: application/json: schema: properties: cover_image_url: description: >- A URL for the model's cover image. This should be an image file. type: string description: description: A description of the model. type: string github_url: description: A URL for the model's source code on GitHub. type: string hardware: description: >- The SKU for the hardware used to run the model. Possible values can be retrieved from the `hardware.list` endpoint. type: string license_url: description: A URL for the model's license. type: string name: description: >- The name of the model. This must be unique among all models owned by the user or organization. type: string owner: description: >- The name of the user or organization that will own the model. This must be the same as the user or organization that is making the API request. In other words, the API token used in the request must belong to this user or organization. type: string paper_url: description: A URL for the model's paper. type: string visibility: description: >- Whether the model should be public or private. A public model can be viewed and run by anyone, whereas a private model can be viewed and run only by the user or organization members that own the model. enum: - public - private type: string required: - owner - name - visibility - hardware type: object required: true responses: '201': description: Success summary: Create a Model tags: - Models query: description: > Get a list of public models matching a search query. 
Example cURL request: ```console curl -s -X QUERY \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ -H "Content-Type: text/plain" \ -d "hello" \ https://api.replicate.com/v1/models ``` The response will be a paginated JSON object containing an array of model objects: ```json { "next": null, "previous": null, "results": [ { "url": "https://replicate.com/acme/hello-world", "owner": "acme", "name": "hello-world", "description": "A tiny model that says hello", "visibility": "public", "github_url": "https://github.com/replicate/cog-examples", "paper_url": null, "license_url": null, "run_count": 5681081, "cover_image_url": "...", "default_example": {...}, "latest_version": {...} } ] } ``` The `cover_image_url` string is an HTTPS URL for an image file. This can be: - An image uploaded by the model author. - The output file of the example prediction, if the model author has not set a cover image. - The input file of the example prediction, if the model author has not set a cover image and the example prediction has no output file. - A generic fallback image. operationId: models.search requestBody: content: text/plain: schema: description: The search query type: string required: true responses: '200': description: Success summary: Search Public Models tags: [] /models/{model_owner}/{model_name}: delete: description: > Delete a model. Model deletion has some restrictions: - You can only delete models you own. - You can only delete private models. - You can only delete models that have no versions associated with them. Currently you'll need to [delete the model's versions](#models.versions.delete) before you can delete the model itself. Example cURL request: ```console curl -s -X DELETE \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/models/replicate/hello-world ``` The response will be an empty 204, indicating the model has been deleted.
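The restrictions above can be checked client-side before sending the DELETE request. This is a rough Python sketch, not part of the API: `deletion_blockers` is a hypothetical helper that inspects a model object as returned by `models.get`, and it assumes that a `latest_version` of `null` means the model has no versions.

```python
from typing import List


def deletion_blockers(model: dict, account_username: str) -> List[str]:
    """Return the reasons a model would likely be rejected for deletion.

    `model` is a model object (owner, visibility, latest_version, ...).
    `account_username` is the username tied to the API token in use.
    Assumption: latest_version is None only when the model has no versions.
    """
    blockers = []
    if model.get("owner") != account_username:
        blockers.append("model is not owned by this account")
    if model.get("visibility") != "private":
        blockers.append("model is not private")
    if model.get("latest_version") is not None:
        blockers.append("model still has at least one version")
    return blockers
```

An empty list means none of the listed restrictions appear to apply. The server remains the source of truth, so the DELETE request can still be rejected.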
operationId: models.delete parameters: - description: | The name of the user or organization that owns the model. in: path name: model_owner required: true schema: type: string - description: | The name of the model. in: path name: model_name required: true schema: type: string responses: '204': description: Success summary: Delete a Model tags: - Model - Name - Owner get: description: > Example cURL request: ```console curl -s \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/models/replicate/hello-world ``` The response will be a model object in the following format: ```json { "url": "https://replicate.com/replicate/hello-world", "owner": "replicate", "name": "hello-world", "description": "A tiny model that says hello", "visibility": "public", "github_url": "https://github.com/replicate/cog-examples", "paper_url": null, "license_url": null, "run_count": 5681081, "cover_image_url": "...", "default_example": {...}, "latest_version": {...} } ``` The `cover_image_url` string is an HTTPS URL for an image file. This can be: - An image uploaded by the model author. - The output file of the example prediction, if the model author has not set a cover image. - The input file of the example prediction, if the model author has not set a cover image and the example prediction has no output file. - A generic fallback image. The `default_example` object is a [prediction](#predictions.get) created with this model. The `latest_version` object is the model's most recently pushed [version](#models.versions.get). operationId: models.get parameters: - description: | The name of the user or organization that owns the model. in: path name: model_owner required: true schema: type: string - description: | The name of the model.
in: path name: model_name required: true schema: type: string responses: '200': description: Success summary: Get a Model tags: - Model - Name - Owner /models/{model_owner}/{model_name}/predictions: post: description: > Create a prediction for the model and inputs you provide. Example cURL request: ```console curl -s -X POST -H 'Prefer: wait' \ -d '{"input": {"prompt": "Write a short poem about the weather."}}' \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ -H 'Content-Type: application/json' \ https://api.replicate.com/v1/models/meta/meta-llama-3-70b-instruct/predictions ``` The request will wait up to 60 seconds for the model to run. If this time is exceeded, the prediction will be returned in a `"starting"` state and will need to be retrieved using the `predictions.get` endpoint. For a complete overview of the `models.predictions.create` API, check out our documentation on [creating a prediction](https://replicate.com/docs/topics/predictions/create-a-prediction), which covers a variety of use cases. operationId: models.predictions.create parameters: - description: | The name of the user or organization that owns the model. in: path name: model_owner required: true schema: type: string - description: | The name of the model. in: path name: model_name required: true schema: type: string - $ref: '#/components/parameters/parameters_prefer_header' requestBody: content: application/json: schema: $ref: '#/components/schemas/schemas_prediction_request' responses: '201': description: >- Prediction has been created. If the `Prefer: wait` header is provided it will contain the final output.
'202': description: Prediction has been created but does not yet have all outputs summary: Create a Prediction Using an Official Model tags: - Model - Name - Owner - Predictions /models/{model_owner}/{model_name}/versions: get: description: > Example cURL request: ```console curl -s \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/models/replicate/hello-world/versions ``` The response will be a JSON array of model version objects, sorted with the most recent version first: ```json { "next": null, "previous": null, "results": [ { "id": "5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa", "created_at": "2022-04-26T19:29:04.418669Z", "cog_version": "0.3.0", "openapi_schema": {...} } ] } ``` operationId: models.versions.list parameters: - description: | The name of the user or organization that owns the model. in: path name: model_owner required: true schema: type: string - description: | The name of the model. in: path name: model_name required: true schema: type: string responses: '200': description: Success summary: List Model Versions tags: - Model - Name - Owner /models/{model_owner}/{model_name}/versions/{version_id}: delete: description: > Delete a model version and all associated predictions, including all output files. Model version deletion has some restrictions: - You can only delete versions from models you own. - You can only delete versions from private models. - You cannot delete a version if someone other than you has run predictions with it. - You cannot delete a version if it is being used as the base model for a fine tune/training. - You cannot delete a version if it has an associated deployment. - You cannot delete a version if another model version is overridden to use it. 
Example cURL request: ```command curl -s -X DELETE \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/models/replicate/hello-world/versions/5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa ``` The response will be an empty 202, indicating the deletion request has been accepted. It might take a few minutes to be processed. operationId: models.versions.delete parameters: - description: | The name of the user or organization that owns the model. in: path name: model_owner required: true schema: type: string - description: | The name of the model. in: path name: model_name required: true schema: type: string - description: | The ID of the version. in: path name: version_id required: true schema: type: string responses: '202': description: >- Deletion request has been accepted. It might take a few minutes to be processed. summary: Delete a Model Version tags: - Model - Name - Owner - Version get: description: > Example cURL request: ```console curl -s \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/models/replicate/hello-world/versions/5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa ``` The response will be the version object: ```json { "id": "5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa", "created_at": "2022-04-26T19:29:04.418669Z", "cog_version": "0.3.0", "openapi_schema": {...} } ``` Every model describes its inputs and outputs with [OpenAPI Schema Objects](https://spec.openapis.org/oas/latest.html#schemaObject) in the `openapi_schema` property. 
The `openapi_schema.components.schemas.Input` property for the [replicate/hello-world](https://replicate.com/replicate/hello-world) model looks like this: ```json { "type": "object", "title": "Input", "required": [ "text" ], "properties": { "text": { "x-order": 0, "type": "string", "title": "Text", "description": "Text to prefix with 'hello '" } } } ``` The `openapi_schema.components.schemas.Output` property for the [replicate/hello-world](https://replicate.com/replicate/hello-world) model looks like this: ```json { "type": "string", "title": "Output" } ``` For more details, see the docs on [Cog's supported input and output types](https://github.com/replicate/cog/blob/75b7802219e7cd4cee845e34c4c22139558615d4/docs/python.md#input-and-output-types) operationId: models.versions.get parameters: - description: | The name of the user or organization that owns the model. in: path name: model_owner required: true schema: type: string - description: | The name of the model. in: path name: model_name required: true schema: type: string - description: | The ID of the version. in: path name: version_id required: true schema: type: string responses: '200': description: Success summary: Get a Model Version tags: - Model - Name - Owner - Version /models/{model_owner}/{model_name}/versions/{version_id}/trainings: post: description: > Start a new training of the model version you specify. 
Example request body: ```json { "destination": "{new_owner}/{new_name}", "input": { "input_images": "https://example.com/my-input-images.zip" }, "webhook": "https://example.com/my-webhook" } ``` Example cURL request: ```console curl -s -X POST \ -d '{"destination": "{new_owner}/{new_name}", "input": {"input_images": "https://example.com/my-input-images.zip"}}' \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ -H 'Content-Type: application/json' \ https://api.replicate.com/v1/models/stability-ai/sdxl/versions/da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf/trainings ``` The response will be the training object: ```json { "id": "zz4ibbonubfz7carwiefibzgga", "model": "stability-ai/sdxl", "version": "da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf", "input": { "input_images": "https://example.com/my-input-images.zip" }, "logs": "", "error": null, "status": "starting", "created_at": "2023-09-08T16:32:56.990893084Z", "urls": { "cancel": "https://api.replicate.com/v1/predictions/zz4ibbonubfz7carwiefibzgga/cancel", "get": "https://api.replicate.com/v1/predictions/zz4ibbonubfz7carwiefibzgga" } } ``` As models can take several minutes or more to train, the result will not be available immediately. To get the final result of the training, you should either provide a `webhook` HTTPS URL for us to call when the results are ready, or poll the [get a training](#trainings.get) endpoint until it has finished. When a training completes, it creates a new [version](https://replicate.com/docs/how-does-replicate-work#terminology) of the model at the specified destination. To find some models to train on, check out the [trainable language models collection](https://replicate.com/collections/trainable-language-models). operationId: trainings.create parameters: - description: | The name of the user or organization that owns the model. in: path name: model_owner required: true schema: type: string - description: | The name of the model.
in: path name: model_name required: true schema: type: string - description: | The ID of the version. in: path name: version_id required: true schema: type: string requestBody: content: application/json: schema: $ref: '#/components/schemas/schemas_training_request' responses: '201': description: Success summary: Create a Training tags: - Model - Name - Owner - Version /predictions: get: description: > Get a paginated list of all predictions created by the user or organization associated with the provided API token. This will include predictions created from the API and the website. It will return 100 records per page. Example cURL request: ```console curl -s \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/predictions ``` The response will be a paginated JSON array of prediction objects, sorted with the most recent prediction first: ```json { "next": null, "previous": null, "results": [ { "completed_at": "2023-09-08T16:19:34.791859Z", "created_at": "2023-09-08T16:19:34.907244Z", "data_removed": false, "error": null, "id": "gm3qorzdhgbfurvjtvhg6dckhu", "input": { "text": "Alice" }, "metrics": { "predict_time": 0.012683 }, "output": "hello Alice", "started_at": "2023-09-08T16:19:34.779176Z", "source": "api", "status": "succeeded", "urls": { "get": "https://api.replicate.com/v1/predictions/gm3qorzdhgbfurvjtvhg6dckhu", "cancel": "https://api.replicate.com/v1/predictions/gm3qorzdhgbfurvjtvhg6dckhu/cancel" }, "model": "replicate/hello-world", "version": "5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa" } ] } ``` `id` will be the unique ID of the prediction. `source` will indicate how the prediction was created. Possible values are `web` or `api`. `status` will be the status of the prediction. Refer to [get a single prediction](#predictions.get) for possible values. `urls` will be a convenience object that can be used to construct new API requests for the given prediction.
If the requested model version supports streaming, this will have a `stream` entry with an HTTPS URL that you can use to construct an [`EventSource`](https://developer.mozilla.org/en-US/docs/Web/API/EventSource). `model` will be the model identifier string in the format of `{model_owner}/{model_name}`. `version` will be the unique ID of the model version used to create the prediction. `data_removed` will be `true` if the input and output data have been deleted. operationId: predictions.list responses: '200': description: Success summary: List Predictions tags: - Predictions post: description: > Create a prediction for the model version and inputs you provide. Example cURL request: ```console curl -s -X POST -H 'Prefer: wait' \ -d '{"version": "5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa", "input": {"text": "Alice"}}' \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ -H 'Content-Type: application/json' \ https://api.replicate.com/v1/predictions ``` The request will wait up to 60 seconds for the model to run. If this time is exceeded, the prediction will be returned in a `"starting"` state and will need to be retrieved using the `predictions.get` endpoint. For a complete overview of the `predictions.create` API, check out our documentation on [creating a prediction](https://replicate.com/docs/topics/predictions/create-a-prediction), which covers a variety of use cases. operationId: predictions.create parameters: - $ref: '#/components/parameters/parameters_prefer_header' requestBody: content: application/json: schema: $ref: '#/components/schemas/schemas_version_prediction_request' responses: '201': description: >- Prediction has been created. If the `Prefer: wait` header is provided it will contain the final output. '202': description: Prediction has been created but does not yet have all outputs summary: Create a Prediction tags: - Predictions /predictions/{prediction_id}: get: description: > Get the current state of a prediction.
Example cURL request: ```console curl -s \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/predictions/gm3qorzdhgbfurvjtvhg6dckhu ``` The response will be the prediction object: ```json { "id": "gm3qorzdhgbfurvjtvhg6dckhu", "model": "replicate/hello-world", "version": "5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa", "input": { "text": "Alice" }, "logs": "", "output": "hello Alice", "error": null, "status": "succeeded", "created_at": "2023-09-08T16:19:34.765994Z", "data_removed": false, "started_at": "2023-09-08T16:19:34.779176Z", "completed_at": "2023-09-08T16:19:34.791859Z", "metrics": { "predict_time": 0.012683 }, "urls": { "cancel": "https://api.replicate.com/v1/predictions/gm3qorzdhgbfurvjtvhg6dckhu/cancel", "get": "https://api.replicate.com/v1/predictions/gm3qorzdhgbfurvjtvhg6dckhu" } } ``` `status` will be one of: - `starting`: the prediction is starting up. If this status lasts longer than a few seconds, then it's typically because a new worker is being started to run the prediction. - `processing`: the `predict()` method of the model is currently running. - `succeeded`: the prediction completed successfully. - `failed`: the prediction encountered an error during processing. - `canceled`: the prediction was canceled by its creator. In the case of success, `output` will be an object containing the output of the model. Any files will be represented as HTTPS URLs. You'll need to pass the `Authorization` header to request them. In the case of failure, `error` will contain the error encountered during the prediction. Terminated predictions (with a status of `succeeded`, `failed`, or `canceled`) will include a `metrics` object with a `predict_time` property showing the amount of CPU or GPU time, in seconds, that the prediction used while running. It won't include time waiting for the prediction to start. 
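The status values above lend themselves to a simple polling loop. The following Python sketch is an illustration under stated assumptions, not part of the API: `is_terminal` and `wait_for_prediction` are hypothetical helper names, and `fetch` stands in for whatever function you use to call this endpoint and return the decoded prediction object.

```python
import time
from typing import Callable, Dict

# Terminal statuses, as listed in the description above.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def is_terminal(prediction: Dict) -> bool:
    """Return True once the prediction can no longer change status."""
    return prediction.get("status") in TERMINAL_STATUSES


def wait_for_prediction(
    fetch: Callable[[], Dict],
    interval: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call `fetch` until it returns a prediction in a terminal state.

    `fetch` is an assumed caller-supplied function that performs the GET
    request and returns the parsed JSON body.
    """
    for attempt in range(max_polls):
        prediction = fetch()
        if is_terminal(prediction):
            return prediction
        if attempt < max_polls - 1:
            sleep(interval)  # pause before the next poll
    raise TimeoutError(f"prediction not finished after {max_polls} polls")
```

Injecting `fetch` and `sleep` keeps the loop independent of any particular HTTP client and makes it easy to exercise without network access.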
All input parameters, output values, and logs are automatically removed after an hour, by default, for predictions created through the API. You must save a copy of any data or files in the output if you'd like to continue using them. The `output` key will still be present, but its value will be `null` after the output has been removed. Output files are served by `replicate.delivery` and its subdomains. If you use an allow list of external domains for your assets, add `replicate.delivery` and `*.replicate.delivery` to it. operationId: predictions.get parameters: - description: | The ID of the prediction to get. in: path name: prediction_id required: true schema: type: string responses: '200': description: Success summary: Get a Prediction tags: - Predictions /predictions/{prediction_id}/cancel: post: operationId: predictions.cancel parameters: - description: | The ID of the prediction to cancel. in: path name: prediction_id required: true schema: type: string responses: '200': description: Success summary: Cancel a Prediction tags: - Cancel - Predictions /trainings: get: description: > Get a paginated list of all trainings created by the user or organization associated with the provided API token. This will include trainings created from the API and the website. It will return 100 records per page. Example cURL request: ```console curl -s \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/trainings ``` The response will be a paginated JSON array of training objects, sorted with the most recent training first: ```json { "next": null, "previous": null, "results": [ { "completed_at": "2023-09-08T16:41:19.826523Z", "created_at": "2023-09-08T16:32:57.018467Z", "error": null, "id": "zz4ibbonubfz7carwiefibzgga", "input": { "input_images": "https://example.com/my-input-images.zip" }, "metrics": { "predict_time": 502.713876 }, "output": { "version": "...", "weights": "..."
}, "started_at": "2023-09-08T16:32:57.112647Z", "source": "api", "status": "succeeded", "urls": { "get": "https://api.replicate.com/v1/trainings/zz4ibbonubfz7carwiefibzgga", "cancel": "https://api.replicate.com/v1/trainings/zz4ibbonubfz7carwiefibzgga/cancel" }, "model": "stability-ai/sdxl", "version": "da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf" } ] } ``` `id` will be the unique ID of the training. `source` will indicate how the training was created. Possible values are `web` or `api`. `status` will be the status of the training. Refer to [get a single training](#trainings.get) for possible values. `urls` will be a convenience object that can be used to construct new API requests for the given training. `version` will be the unique ID of the model version used to create the training. operationId: trainings.list responses: '200': description: Success summary: List Trainings tags: - Trainings /trainings/{training_id}: get: description: > Get the current state of a training. Example cURL request: ```console curl -s \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/trainings/zz4ibbonubfz7carwiefibzgga ``` The response will be the training object: ```json { "completed_at": "2023-09-08T16:41:19.826523Z", "created_at": "2023-09-08T16:32:57.018467Z", "error": null, "id": "zz4ibbonubfz7carwiefibzgga", "input": { "input_images": "https://example.com/my-input-images.zip" }, "logs": "...", "metrics": { "predict_time": 502.713876 }, "output": { "version": "...", "weights": "..." }, "started_at": "2023-09-08T16:32:57.112647Z", "status": "succeeded", "urls": { "get": "https://api.replicate.com/v1/trainings/zz4ibbonubfz7carwiefibzgga", "cancel": "https://api.replicate.com/v1/trainings/zz4ibbonubfz7carwiefibzgga/cancel" }, "model": "stability-ai/sdxl", "version": "da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf" } ``` `status` will be one of: - `starting`: the training is starting up.
If this status lasts longer than a few seconds, then it's typically because a new worker is being started to run the training. - `processing`: the `train()` method of the model is currently running. - `succeeded`: the training completed successfully. - `failed`: the training encountered an error during processing. - `canceled`: the training was canceled by its creator. In the case of success, `output` will be an object containing the output of the model. Any files will be represented as HTTPS URLs. You'll need to pass the `Authorization` header to request them. In the case of failure, `error` will contain the error encountered during the training. Terminated trainings (with a status of `succeeded`, `failed`, or `canceled`) will include a `metrics` object with a `predict_time` property showing the amount of CPU or GPU time, in seconds, that the training used while running. It won't include time waiting for the training to start. operationId: trainings.get parameters: - description: | The ID of the training to get. in: path name: training_id required: true schema: type: string responses: '200': description: Success summary: Get a Training tags: - Training /trainings/{training_id}/cancel: post: operationId: trainings.cancel parameters: - description: | The ID of the training you want to cancel. in: path name: training_id required: true schema: type: string responses: '200': description: Success summary: Cancel a Training tags: - Cancel - Training /webhooks/default/secret: get: description: > Get the signing secret for the default webhook endpoint. This is used to verify that webhook requests are coming from Replicate. Example cURL request: ```console curl -s \ -H "Authorization: Bearer $REPLICATE_API_TOKEN" \ https://api.replicate.com/v1/webhooks/default/secret ``` The response will be a JSON object with a `key` property: ```json { "key": "..." 
} ``` operationId: webhooks.default.secret.get responses: '200': content: application/json: schema: properties: key: description: The signing secret. type: string type: object description: Success summary: Get the Signing Secret for the Default Webhook tags: - Secrets - Webhooks security: - bearerAuth: [] servers: - url: https://api.replicate.com/v1 tags: - name: Accounts - name: Cancel - name: Collections - name: Deployments - name: Hardware - name: Model - name: Models - name: Name - name: Owner - name: Predictions - name: Secrets - name: Slug - name: Training - name: Trainings - name: Version - name: Webhooks