openapi: 3.0.3 info: title: Data Repository Service version: 1.5.0 x-logo: url: >- https://www.ga4gh.org/wp-content/themes/ga4gh/dist/assets/svg/logos/logo-full-color.svg termsOfService: https://www.ga4gh.org/terms-and-conditions/ contact: name: GA4GH Cloud Work Stream email: ga4gh-cloud@ga4gh.org license: name: Apache 2.0 url: >- https://raw.githubusercontent.com/ga4gh/data-repository-service-schemas/master/LICENSE servers: - url: https://{serverURL}/ga4gh/drs/v1 variables: serverURL: default: drs.example.org description: > DRS server endpoints MUST be prefixed by the '/ga4gh/drs/v1' endpoint path security: - {} - BasicAuth: [] - BearerAuth: [] tags: - name: Introduction description: > The Data Repository Service (DRS) API provides a generic interface to data repositories so data consumers, including workflow systems, can access data objects in a single, standard way regardless of where they are stored and how they are managed. The primary functionality of DRS is to map a logical ID to a means for physically retrieving the data represented by the ID. The sections below describe the characteristics of those IDs, the types of data supported, how they can be pointed to using URIs, and how clients can use these URIs to ultimately make successful DRS API requests. This document also describes the DRS API in detail and provides information on the specific endpoints, request formats, and responses. This specification is intended for developers of DRS-compatible services and of clients that will call these DRS services. The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in [RFC 2119](https://datatracker.ietf.org/doc/html/rfc2119). 
- name: DRS API Principles description: > ## DRS IDs Each implementation of DRS can choose its own id scheme, as long as it follows these guidelines: * DRS IDs are strings made up of uppercase and lowercase letters, decimal digits, hyphen, period, underscore and tilde [A-Za-z0-9.-_~]. See [RFC 3986 § 2.3](https://datatracker.ietf.org/doc/html/rfc3986#section-2.3). * DRS IDs can contain other characters, but they MUST be encoded into valid DRS IDs whenever they are used in API calls. This is because non-encoded IDs may interfere with the interpretation of the objects/{id}/access endpoint. To overcome this limitation, percent-encode the ID; see [RFC 3986 § 2.4](https://datatracker.ietf.org/doc/html/rfc3986#section-2.4). * One DRS ID MUST always return the same object data (or, in the case of a collection, the same set of objects). This constraint aids with reproducibility. * DRS implementations MAY have more than one ID that maps to the same object. * DRS version 1.x does NOT support semantics around multiple versions of an object. (For example, there’s no notion of “get latest version” or “list all versions”.) Individual implementations MAY choose an ID scheme that includes version hints. ## DRS URIs For convenience, including when passing content references to a [WES server](https://github.com/ga4gh/workflow-execution-service-schemas), we define a [URI scheme](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Generic_syntax) for DRS-accessible content. This section documents the syntax of DRS URIs, and the rules clients follow for translating a DRS URI into a URL that they use for making the DRS API calls described in this spec. There are two styles of DRS URIs, Hostname-based and Compact Identifier-based, both using the `drs://` URI scheme. DRS servers may choose either style when exposing references to their content. DRS clients MUST support resolving both styles. 
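As a minimal sketch of the percent-encoding guideline listed under DRS IDs above, the following Python shows how a client might encode an ID before placing it in a request path. The helper names `encode_drs_id` and `object_path` are hypothetical and not part of this specification; only the `/ga4gh/drs/v1` prefix and the `objects/{id}` path come from the text.

```python
from urllib.parse import quote

# Path prefix taken from the servers block of this specification.
DRS_PATH_PREFIX = "/ga4gh/drs/v1"


def encode_drs_id(raw_id: str) -> str:
    """Percent-encode every character outside the RFC 3986 unreserved set.

    Hypothetical helper. safe="" makes quote() encode "/" as well, so an ID
    containing a slash cannot be confused with the objects/{id}/access path.
    """
    return quote(raw_id, safe="")


def object_path(raw_id: str) -> str:
    """Build the request path used to look up a single object."""
    return f"{DRS_PATH_PREFIX}/objects/{encode_drs_id(raw_id)}"


if __name__ == "__main__":
    # IDs made only of unreserved characters pass through unchanged.
    print(object_path("314159"))
    # Other characters are percent-encoded before use in an API call.
    print(object_path("study/sample 1"))
```

Encoding the slash is the important design choice here: leaving it unescaped would change how many path segments the server sees.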
Tip: > See [Appendix: Background Notes on DRS URIs](#tag/Background-Notes-on-DRS-URIs) for more information on our design motivations for DRS URIs. ### Hostname-based DRS URIs Hostname-based DRS URIs are simpler than compact identifier-based URIs. They contain the DRS server name and the DRS ID only and can be converted directly into a fetchable URL based on a simple rule. They take the form: ``` drs://<hostname>/<id> ``` DRS URIs of this form mean *\"you can fetch the content with DRS id \<id\> from the DRS server at \<hostname\>\"*. For example, here are the client resolution steps if the URI is: ``` drs://drs.example.org/314159 ``` 1. The client parses the string to extract the hostname of “drs.example.org” and the id of “314159”. 2. The client makes a GET request to the DRS server, using the standard DRS URL syntax: ``` GET https://drs.example.org/ga4gh/drs/v1/objects/314159 ``` The protocol is always https and the port is always the standard 443 SSL port. It is invalid to include a different port in a DRS hostname-based URI. Tip: > See the [Appendix: Hostname-Based URIs](#tag/Hostname-Based-URIs) for information on how hostname-based DRS URI resolution to URLs is likely to change in the future, when the DRS v2 major release happens. ### Compact Identifier-based DRS URIs Compact Identifier-based DRS URIs use resolver registry services (specifically, [identifiers.org](https://identifiers.org/) and [n2t.net (Name-To-Thing)](https://n2t.net/)) to provide a layer of indirection between the DRS URI and the DRS server name: the actual DNS name of the DRS server is not present in the URI. This approach is based on the Joint Declaration of Data Citation Principles as detailed by [Wimalaratne et al (2018)](https://www.nature.com/articles/sdata201829). For more information, see the document [More Background on Compact Identifiers](./more-background-on-compact-identifiers.html). 
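Returning briefly to the hostname-based style described earlier in this section, its two resolution steps can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the function name `resolve_hostname_drs_uri` is hypothetical, and the sketch assumes the ID in the URI is already percent-encoded where needed.

```python
def resolve_hostname_drs_uri(uri: str) -> str:
    """Translate a hostname-based DRS URI into the HTTPS URL to GET.

    Hypothetical helper; assumes the ID part is already a valid DRS ID.
    """
    scheme = "drs://"
    if not uri.startswith(scheme):
        raise ValueError("not a DRS URI")
    rest = uri[len(scheme):]
    if ":" in rest:
        # A colon signals a compact identifier or an explicit port,
        # neither of which is valid in a hostname-based DRS URI.
        raise ValueError("not a hostname-based DRS URI")
    # Step 1: split into hostname and ID at the first slash.
    hostname, sep, drs_id = rest.partition("/")
    if not sep or not hostname or not drs_id:
        raise ValueError("expected drs://<hostname>/<id>")
    # Step 2: always https on the default port, with the standard prefix.
    return f"https://{hostname}/ga4gh/drs/v1/objects/{drs_id}"


if __name__ == "__main__":
    print(resolve_hostname_drs_uri("drs://drs.example.org/314159"))
```

No network request is needed for this style, which is why the mapping fits in a pure function.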
Compact Identifiers take the form: ``` drs://[provider_code/]namespace:accession ``` Together, provider code and the namespace are referred to as the `prefix`. The provider code is optional and is used by identifiers.org/n2t.net for compact identifier resolver mirrors. Both the `provider_code` and `namespace` disallow spaces or punctuation, only lowercase alphanumerical characters, underscores and dots are allowed (e.g. [A-Za-z0-9._]). Tip: > See the [Appendix: Compact Identifier-Based URIs](#tag/Compact-Identifier-Based-URIs) for more background on Compact Identifiers and resolver registry services like identifiers.org/n2t.net (aka meta-resolvers), how to register prefixes, possible caching strategies, and security considerations. #### For DRS Servers If your DRS implementation will issue DRS URIs based *on your own* compact identifiers, you MUST first register a new prefix with identifiers.org (which is automatically mirrored to n2t.net). You will also need to include a provider resolver resource in this registration which links the prefix to your DRS server, so that DRS clients can get sufficient information to make a successful DRS GET request. For clarity, we recommend you choose a namespace beginning with `drs`. #### For DRS Clients A DRS client parses the DRS URI compact identifier components to extract the prefix and the accession, and then uses meta-resolver APIs to locate the actual DRS server. For example, here are the client resolution steps if the URI is: ``` drs://drs.42:314159 ``` 1. The client parses the string to extract the prefix of `drs.42` and the accession of `314159`, using the first occurrence of a colon (":") character after the initial `drs://` as a delimiter. (The colon character is not allowed in a Hostname-based DRS URI, making it easy to tell them apart.) 2. The client makes API calls to a meta-resolver to look up the URL pattern for the namespace. 
(See [Calling Meta-Resolver APIs for Compact Identifier-Based DRS URIs](#section/Calling-Meta-Resolver-APIs-for-Compact-Identifier-Based-DRS-URIs) for details.) The URL pattern is a string containing a `{$id}` parameter, such as: ``` https://drs.myexample.org/ga4gh/drs/v1/objects/{$id} ``` 3. The client generates a DRS URL from the URL template by replacing {$id} with the accession it extracted in step 1. It then makes a GET request to the DRS server: ``` GET https://drs.myexample.org/ga4gh/drs/v1/objects/314159 ``` 4. The client follows any HTTP redirects returned in step 3, in case the resolver goes through an extra layer of redirection. For performance reasons, DRS clients SHOULD cache the URL pattern returned in step 2, with a suggested 24 hour cache life. ### Choosing a URI Style DRS servers can choose to issue either hostname-based or compact identifier-based DRS URIs, and can be confident that compliant DRS clients will support both. DRS clients must be able to accommodate both URI types. Tradeoffs that DRS server builders, and third parties who need to cite DRS objects in datasets, workflows or elsewhere, may want to consider include: *Table 1: Choosing a URI Style* | | Hostname-based | Compact Identifier-based | |-------------------|----------------|--------------------------| | URI Durability | URIs are valid for as long as the server operator maintains ownership of the published DNS address. (They can of course point that address at different physical serving infrastructure as often as they would like.) | URIs are valid for as long as the server operator maintains ownership of the published compact identifier resolver namespace. (They also depend on the meta-resolvers like identifiers.org/n2t.net remaining operational, which is intended to be essentially forever.) | | Client Efficiency | URIs require minimal client logic, and no network requests, to resolve. | URIs require small client logic, and 1-2 cacheable network requests, to resolve. 
| | Security | Servers have full control over their own security practices. | Server operators, in addition to maintaining their own security practices, should confirm they are comfortable with the resolver registry security practices, including protection against denial of service and namespace-hijacking attacks. (See the [Appendix: Compact Identifier-Based URIs](#tag/Compact-Identifier-Based-URIs) for more information on resolver registry security.) | ## DRS Datatypes DRS's job is data access, period. Therefore, the DRS API supports a simple flat content model -- every `DrsObject`, like a file, represents a single opaque blob of bytes. DRS has no understanding of the meaning of objects and only provides simple domain-agnostic metadata. Understanding the semantics of specific object types is the responsibility of the applications that use DRS to fetch those objects (e.g. samtools for BAM files, DICOM viewers for DICOM objects). ### Atomic Objects DRS can be used to access individual objects of all kinds, simple or complex, large or small, stored in type-specific formats (e.g. BAM files, VCF files, CSV files). At the API level these are all the same; at the application level, DRS clients and servers are expected to agree on object semantics using non-DRS mechanisms, including but not limited to the GA4GH Data Connect API. ### Compound Objects DRS can also be used to access compound objects, consisting of two or more atomic objects related to each other in a well-specified way. See the [Appendix: Compound Objects](#tag/Working-With-Compound-Objects) for suggested best practices for working with compound objects. ### [DEPRECATED] Bundles Previous versions of the DRS API spec included support for a *bundle* content type, which was a folder-like collection of other DRS objects (either blobs or bundles), represented by a `DrsObject` with a `contents` array. 
As of v1.3, bundles have been deprecated in favor of the best practices documented in the [Appendix: Compound Objects](#tag/Working-With-Compound-Objects). A future version of the API spec may remove bundle support entirely and/or replace bundles with a scalable approach based on the needs of our driver projects. ## Read-only DRS v1 is a read-only API. We expect that each implementation will define its own mechanisms and interfaces (graphical and/or programmatic) for adding and updating data. ## Standards The DRS API specification is written in OpenAPI and embodies a RESTful service philosophy. It uses JSON in requests and responses and standard HTTPS on port 443 for information transport. Optionally, it supports authentication and authorization using the [GA4GH Passport](https://github.com/ga4gh-duri/ga4gh-duri.github.io/tree/master/researcher_ids) standard. - name: Authorization & Authentication description: > ## Making DRS Requests The DRS implementation is responsible for defining and enforcing an authorization policy that determines which users are allowed to make which requests. GA4GH recommends that DRS implementations use an OAuth 2.0 [bearer token](https://oauth.net/2/bearer-tokens/) or a [GA4GH Passport](https://github.com/ga4gh-duri/ga4gh-duri.github.io/tree/master/researcher_ids), although they can choose other mechanisms if appropriate. ## Fetching DRS Objects The DRS API allows implementers to support a variety of different content access policies, depending on what `AccessMethod` records they return. Implementers have a choice to make the GET /objects/{object_id} and GET /objects/{object_id}/access/{access_id} calls open or requiring a Basic, Bearer, or Passport token (Passport requiring a POST). 
The following describes the various access approaches following a successful GET/POST /objects/{object_id} request in order to then obtain access to the bytes for a given object ID/access ID: * public content: * server provides an `access_url` with a `url` and no `headers` * caller fetches the object bytes without providing any auth info * private content that requires the caller to have out-of-band auth knowledge (e.g. service account credentials): * server provides an `access_url` with a `url` and no `headers` * caller fetches the object bytes, passing the auth info they obtained out-of-band * private content that requires the caller to pass an Authorization token: * server provides an `access_url` with a `url` and `headers` * caller fetches the object bytes, passing auth info via the specified header(s) * private content that uses an expensive-to-generate auth mechanism (e.g. a signed URL): * server provides an `access_id` * caller passes the `access_id` to the `/access` endpoint * server provides an `access_url` with the generated mechanism (e.g. a signed URL in the `url` field) * caller fetches the object bytes from the `url` (passing auth info from the specified headers, if any) In the approaches above [GA4GH Passports](https://github.com/ga4gh-duri/ga4gh-duri.github.io/tree/master/researcher_ids) are not mentioned and that is on purpose. A DRS server may return a Bearer token or other platform-specific token in a header in response to a valid Bearer token or GA4GH Passport (Option 3 above). But it is not the responsibility of a DRS server to return a Passport; that is the responsibility of a Passport Broker and outside the scope of DRS. DRS implementers should ensure their solutions restrict access to targets as much as possible, detect attempts to exploit through log monitoring, and are prepared to take action if an exploit in their DRS implementation is detected. 
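The four access approaches listed above reduce to one client-side decision: use the `access_url` if the server supplied one, otherwise exchange the `access_id` at the `/access` endpoint. The Python below is a rough sketch of that decision only. The names `parse_headers` and `plan_fetch`, the returned dictionary shape, and the assumption that each `headers` entry is a single `Name: value` string are illustrative and not defined by this specification.

```python
def parse_headers(header_lines):
    """Turn a list of "Name: value" strings into a dict (assumed format)."""
    headers = {}
    for line in header_lines or []:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers


def plan_fetch(object_id, access_method, out_of_band_headers=None):
    """Decide the next request for one AccessMethod record.

    Hypothetical helper: returns a description of the request instead of
    performing it, so the sketch needs no network access.
    """
    access_url = access_method.get("access_url")
    if access_url:
        # Approaches 1 to 3: fetch directly, adding server-supplied headers
        # and any credentials the caller obtained out-of-band.
        headers = parse_headers(access_url.get("headers"))
        headers.update(out_of_band_headers or {})
        return {"action": "fetch_bytes",
                "url": access_url["url"],
                "headers": headers}
    access_id = access_method.get("access_id")
    if access_id:
        # Approach 4: ask the server to generate the access_url first.
        return {"action": "call_access_endpoint",
                "path": f"/objects/{object_id}/access/{access_id}"}
    raise ValueError("access method has neither access_url nor access_id")


if __name__ == "__main__":
    public = {"type": "https",
              "access_url": {"url": "https://data.example.org/f"}}
    signed = {"type": "s3", "access_id": "abc123"}
    print(plan_fetch("314159", public))
    print(plan_fetch("314159", signed))
```

When a record carries both fields, this sketch prefers the `access_url`; a real client might choose differently.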
## Authentication ### Discovery The APIs to fetch [DrsObjects](#tag/DrsObjectModel) and [AccessURLs](#tag/AccessURLModel) may require authorization. The authorization mode may vary between DRS objects hosted by a service. The authorization mode may vary between the APIs to fetch a [DrsObject](#tag/DrsObjectModel) and an associated [AccessURL](#tag/AccessURLModel). Implementers should indicate how to authenticate to fetch a [DrsObject](#tag/DrsObjectModel) by implementing the [OptionsObject](#operation/OptionsObject) API. Implementers should indicate how to authenticate to fetch an [AccessURL](#tag/AccessURLModel) within a [DrsObject](#tag/DrsObjectModel). ### Modes #### BasicAuth A valid authorization token must be passed in the 'Authorization' header, e.g. "Basic ${token_string}" | Security Scheme Type | HTTP | |----------------------|------| | **HTTP Authorization Scheme** | basic | #### BearerAuth A valid authorization token must be passed in the 'Authorization' header, e.g. "Bearer ${token_string}" | Security Scheme Type | HTTP | |----------------------|------| | **HTTP Authorization Scheme** | bearer | #### PassportAuth A valid authorization [GA4GH Passport](https://github.com/ga4gh-duri/ga4gh-duri.github.io/tree/master/researcher_ids) token must be passed in the body of a POST request | Security Scheme Type | HTTP | |----------------------|------| | **HTTP POST** | tokens[] | - name: Objects - name: Upload Request description: > # Upload Requests and Object Registration > **Optional Functionality**: Upload and object registration are optional DRS extensions. Clients should check `/service-info` for `uploadRequestSupported` and `objectRegistrationSupported` before attempting to use these endpoints. The DRS upload and object registration endpoints allow clients to negotiate with servers on mutually convenient storage backends and then register uploads as DRS objects through a three-phase workflow: 1. 
**Request Upload URLs**: POST `/upload-request` with file metadata to receive upload methods and credentials 2. **Upload Files**: Use returned URLs and credentials to upload files to storage using storage provider specific upload mechanisms. DRS is not involved in this step at all, DRS simply enables clients and servers to agree on a mutually convenient storage service. 3. **Register Objects**: POST `/objects/register` to register DRS objects with the server This approach separates storage service and credential negotiation from file transfer and object registration, supporting a vendor-neutral means of sharing data in a DRS network. The `/objects/register` endpoint can be used independently to register existing data without using the `/upload-request` endpoint, and servers can choose to only support object registration and not uploads by setting the `uploadRequestSupported` and `objectRegistrationSupported` flags appropriately in `/service-info`. Upload requests and object registration endpoints only support bulk requests to simplify implementation and reflect real-world usage patterns. Bioinformatics workflows often involve uploading multiple related files together (e.g., BAM and VCF files with their indices, or analysis result sets), making bulk operations a natural fit. Single files/objects are handled as lists with one element. Implementations of the `/objects/register` endpoint SHOULD implement transaction semantics so that either all of the objects included in the request are successfully registered or none of them are, and clients should be robust to this behaviour. Transaction semantics for the `/upload-request` are encouraged but not required due to the variety and complexity of data transfer technologies. The `/upload-request` endpoint does not result in any state that needs to be maintained on the DRS server (intermediate DRS object IDs etc.) 
it is simply a means for a server to provide details of where a client can upload data, and the server should ensure that it trusts the client before providing such details (e.g. with appropriate authentication and authorisation before processing the request). This means that if uploads fail and there is no later call to `/objects/register` there is no DRS state to manage, simplifying server implementation. However, servers SHOULD ensure that any data from unsuccessful uploads (e.g. incomplete multi-part uploads) are cleaned up, for example by using lifecycle configuration in the backend storage. There is _no_ means of requiring that a client ultimately registers a DRS object pointing at data uploaded, and so servers should consider implementing some form of storage "garbage collection". A straightforward approach is to set a short lifecycle policy on the upload location and move uploaded data that is later registered as DRS objects to other locations, updating object `access_methods` accordingly. Servers should also implement some means of constraining upload size (quotas etc.) to protect against accidental or malicious unconstrained uploads. Servers can choose to validate that the uploads match the claimed object size when `/objects/register` is called, and should advertise this behaviour with the `validateFileSizes` flag in `/service-info`. The `/upload-request` endpoint can return one or more `upload_methods` of different types for each requested file, and backend specific details such as bucket names, object keys and credentials are supplied in a generic `upload_details` field. 
A straightforward implementation might return a single time-limited pre-signed POST URL as the `post_url` for an `upload_method` of type `https` which incorporates authentication into the URL, but because DRS is often used for large files such as BAMs and CRAMs this specification also supports more sophisticated upload approaches implemented by cloud storage backends such as multi-part uploads, automatic retries etc. The `upload_details` field can be used to include bucket names, keys and temporary credentials that can be used in native clients and SDKs. This offers a natural way to adapt this protocol to new storage technologies. Refer to the examples below for some suggested implementations. ## Service Discovery Check `/service-info` for upload capabilities: ```json { "drs": { "uploadRequestSupported": true, "objectRegistrationSupported": true, "supportedUploadMethodTypes": ["s3", "https"], "maxUploadSize": 5368709120, "maxUploadRequestLength": 50, "maxRegisterRequestLength": 50, "validateChecksums": true, "validateFileSizes": false, "relatedFileStorageSupported": true } } ``` Upload related fields: - `uploadRequestSupported`: Upload request operations available via `/upload-request` - `objectRegistrationSupported`: Object registration operations available via `/objects/register` - `supportedUploadMethodTypes`: Available storage backends - `maxUploadSize`: File size limit (bytes) - `maxUploadRequestLength`: Files per request limit for upload requests - `maxRegisterRequestLength`: Candidate objects per request limit for registration - `validateChecksums`/`validateFileSizes`: Server validation behavior - `relatedFileStorageSupported`: Files from same upload request will be stored under common prefixes ## Upload Methods Upon receipt of a request for an upload method for a specific file, the server will respond with an array of `upload_methods`, each with associated `type` and corresponding `upload_details` with upload locations, temporary credentials etc. 
These details are specific to backend implementations. Example storage backends: - **https**: Presigned POST URLs for HTTP uploads - **s3**: Direct S3 upload with temporary AWS credentials for an IAM session policy. - **gs**: Google Cloud Storage with OAuth2 tokens - **ftp/sftp**: Traditional file transfer protocols using negotiated credentials Servers may return a subset of advertised methods based on file characteristics, for example they may choose to store large objects such as WGS BAM files in different backends to small csv files. Clients can request specific upload methods in the initial request. ## Related File Storage (Optional) Servers MAY support storing files from the same upload request under common prefixes, enabling bioinformatics workflows that expect co-located files: - **CRAM + CRAI**: Alignment files with index files - **VCF + TBI**: Variant files with tabix indexes - **FASTQ.ora + ORADATA.tar.gz**: Compressed files with associated reference data Check `relatedFileStorageSupported` in service-info or examine upload URLs for common prefixes. ## Object Registration After upload, clients can register files in bulk as DRS objects using POST `/objects/register`. Registration is all-or-nothing. If any candidate object fails to be registered in the server, the entire request fails and no objects are registered. **Candidate DRS object requirements**: - Complete metadata (name, size, checksums, MIME type) - Access methods pointing to file locations - Valid authorization (if required) - Do not include fields managed by the DRS server (id, self_uri, timestamps) Upon receipt of candidate objects for registration the server will create unique object IDs and return complete DRS objects. Note that the server is not obliged to retain the client-supplied `access_methods` and is free to move data to different locations/backends once the object is registered. 
This means that a server can choose to receive uploads in an untrusted "dropzone", with hard quotas and additional security, and then move them to more permanent storage once the DRS object is registered and any validation is successful. Clients SHOULD NOT cache the response from `/objects/register` as the `access_methods` may change after registration. The `/objects/register` endpoint can also be used independently to register existing data that is already stored in accessible locations, without using the `/upload-request` workflow. This is useful for registering pre-existing datasets or files uploaded through other means. Servers may choose only to support registration and not uploads, and should advertise this in `/service-info` ## Authentication & Validation **Authentication**: Supports GA4GH Passports, Basic auth, and Bearer tokens. **Checksums**: Required for all files (SHA-256, MD5, or IANA-registered algorithms). Servers MAY validate checksums and file sizes as advertised in service-info flags. ## Error Handling **Client Errors (4xx)**: - Invalid metadata (400) - Missing auth (401) - Insufficient permissions (403) **Server Errors (5xx)**: - Storage unavailable (500) - Capacity limits (503) ## Best Practices **Clients**: Check service-info first, calculate checksums, select supported upload methods, be robust to failed object registration **Servers**: Use short-lived tightly scoped credentials, support multiple upload methods, implement rate limiting, ensure unique storage backend names to avoid inadvertent overwrites (e.g. using UUIDs), ensure that quotas are enforced and incomplete or unregistered uploads are deleted **Security**: Time and scope-limited credentials, single-use URLs, logging for audit ## Security Considerations **Credential Scoping**: Implementers SHOULD scope upload credentials to the minimum necessary permissions and duration. 
Credentials should: - Allow write access only to the specific upload URL/path provided - Have the shortest practical expiration time (e.g. 15 minutes to 1 hour) - Be restricted to the specific file size and content type when possible - Not grant broader storage access beyond the intended upload location This principle of least privilege reduces security exposure if credentials are compromised or misused. ## Example Workflows ### Simple HTTPS Upload Upload Request: ```http POST /upload-request Content-Type: application/json { "requests": [ { "name": "variants.vcf", "size": 52428800, "mime_type": "text/plain", "checksums": [ { "checksum": "5d41402abc4b2a76b9719d911017c592", "type": "md5" } ], "upload_method_types": ["https"] } ] } ``` Response: ```json { "responses": [ { "name": "variants.vcf", "size": 52428800, "mime_type": "text/plain", "checksums": [ { "checksum": "5d41402abc4b2a76b9719d911017c592", "type": "md5" } ], "upload_methods": [ { "type": "https", "access_url": { "url": "https://uploads.example.org/variants.vcf" }, "upload_details": { "post_url": { "url": "https://uploads.example.org/presigned-upload?signature=FAKE_SIG", "headers": ["Header1", "Header2"] } } } ] } ] } ``` Upload via HTTPS: ```bash # Simple PUT upload to presigned POST URL curl -X PUT "https://uploads.example.org/presigned-upload?signature=FAKE_SIG" -H "Header1" -H "Header2" \ --data-binary @variants.vcf ``` Register DRS Object: ```http POST /objects/register Content-Type: application/json { "candidates": [ { "name": "variants.vcf", "size": 52428800, "mime_type": "text/plain", "checksums": [ { "checksum": "5d41402abc4b2a76b9719d911017c592", "type": "md5" } ], "access_methods": [ { "type": "https", "access_url": { "url": "https://uploads.example.org/variants.vcf" } } ], "description": "Variant calls in VCF format" } ] } ``` Response: ```json { "objects": [ { "id": "drs_obj_f6e5d4c3b2a1", "self_uri": "drs://drs.example.org/drs_obj_f6e5d4c3b2a1", "name": "variants.vcf", "size": 52428800, 
"mime_type": "text/plain", "created_time": "2024-01-15T10:45:00Z", "updated_time": "2024-01-15T10:45:00Z", "version": "1.0", "checksums": [ { "checksum": "5d41402abc4b2a76b9719d911017c592", "type": "md5" } ], "access_methods": [ { "type": "https", "access_url": { "url": "https://uploads.example.org/variants.vcf" } } ], "description": "Variant calls in VCF format" } ] } ``` ### S3 Bulk Upload (BAM + Index) Request Upload Methods for Related Files ```http POST /upload-request Content-Type: application/json { "requests": [ { "name": "sample.bam", "size": 1073741824, "mime_type": "application/octet-stream", "checksums": [ { "checksum": "d41d8cd98f00b204e9800998ecf8427e", "type": "md5" } ], "upload_method_types": ["s3"] }, { "name": "sample.bam.bai", "size": 2097152, "mime_type": "application/octet-stream", "checksums": [ { "checksum": "098f6bcd4621d373cade4e832627b4f6", "type": "md5" } ], "upload_method_types": ["s3"] } ] } ``` Response: ```json { "responses": [ { "name": "sample.bam", "size": 1073741824, "mime_type": "application/octet-stream", "checksums": [ { "checksum": "d41d8cd98f00b204e9800998ecf8427e", "type": "md5" } ], "upload_methods": [ { "type": "s3", "access_url": { "url": "s3://genomics-uploads/x7k9m/sample.bam" }, "upload_details": { "bucket": "genomics-uploads", "key": "x7k9m/sample.bam", "access_key_id": "FAKE_ACCESS_KEY_123", "secret_access_key": "FAKE_SECRET_KEY_456", "session_token": "FAKE_SESSION_TOKEN_789", "expires_at": "2024-01-15T12:00:00Z" } } ] }, { "name": "sample.bam.bai", "size": 2097152, "mime_type": "application/octet-stream", "checksums": [ { "checksum": "098f6bcd4621d373cade4e832627b4f6", "type": "md5" } ], "upload_methods": [ { "type": "s3", "access_url": { "url": "s3://genomics-uploads/x7k9m/sample.bam.bai" }, "upload_details": { "bucket": "genomics-uploads", "key": "x7k9m/sample.bam.bai", "access_key_id": "FAKE_ACCESS_KEY_123", "secret_access_key": "FAKE_SECRET_KEY_456", "session_token": "FAKE_SESSION_TOKEN_789", "expires_at": 
"2024-01-15T12:00:00Z" } } ] } ] } ``` Upload Both Files to S3: ```bash # Upload BAM and index files using the supplied credentials (note common prefix) export AWS_ACCESS_KEY_ID=... aws s3 cp sample.bam s3://genomics-uploads/x7k9m/sample.bam aws s3 cp sample.bam.bai s3://genomics-uploads/x7k9m/sample.bam.bai ``` Register Both DRS Objects: ```http POST /objects/register Content-Type: application/json { "candidates": [ { "name": "sample.bam", "size": 1073741824, "mime_type": "application/octet-stream", "checksums": [ { "checksum": "d41d8cd98f00b204e9800998ecf8427e", "type": "md5" } ], "access_methods": [ { "type": "s3", "access_id": "s3", "access_url": { "url": "s3://genomics-uploads/x7k9m/sample.bam" } } ], "description": "BAM alignment file" }, { "name": "sample.bam.bai", "size": 2097152, "mime_type": "application/octet-stream", "checksums": [ { "checksum": "098f6bcd4621d373cade4e832627b4f6", "type": "md5" } ], "access_methods": [ { "type": "s3", "access_id": "s3", "access_url": { "url": "s3://genomics-uploads/x7k9m/sample.bam.bai" } } ], "description": "BAM index file" } ] } ``` Response: ```json { "objects": [ { "id": "drs_obj_a1b2c3d4e5f6", "self_uri": "drs://drs.example.org/drs_obj_a1b2c3d4e5f6", "name": "sample.bam", "size": 1073741824, "mime_type": "application/octet-stream", "created_time": "2024-01-15T10:30:00Z", "updated_time": "2024-01-15T10:30:00Z", "version": "1.0", "checksums": [ { "checksum": "d41d8cd98f00b204e9800998ecf8427e", "type": "md5" } ], "access_methods": [ { "type": "s3", "access_id": "s3", "access_url": { "url": "s3://genomics-uploads/x7k9m/sample.bam" } } ], "description": "BAM alignment file" }, { "id": "drs_obj_b2c3d4e5f6a1", "self_uri": "drs://drs.example.org/drs_obj_b2c3d4e5f6a1", "name": "sample.bam.bai", "size": 2097152, "mime_type": "application/octet-stream", "created_time": "2024-01-15T10:30:00Z", "updated_time": "2024-01-15T10:30:00Z", "version": "1.0", "checksums": [ { "checksum": "098f6bcd4621d373cade4e832627b4f6", "type": 
"md5" } ], "access_methods": [ { "type": "s3", "access_id": "s3", "access_url": { "url": "s3://genomics-uploads/x7k9m/sample.bam.bai" } } ], "description": "BAM index file" } ] } ``` - name: Service Info - name: AccessMethodModel x-displayName: AccessMethod description: | - name: AccessURLModel x-displayName: AccessURL description: | - name: ChecksumModel x-displayName: Checksum description: | - name: ContentsObjectModel x-displayName: ContentsObject description: | - name: DrsObjectModel x-displayName: DrsObject description: | - name: DrsObjectCandidateModel x-displayName: DrsObjectCandidate description: | - name: ErrorModel x-displayName: Error description: | - name: UploadRequestModel x-displayName: UploadRequest description: | - name: UploadResponseModel x-displayName: UploadResponse description: | - name: UploadRequestObjectModel x-displayName: UploadRequestObject description: | - name: UploadResponseObjectModel x-displayName: UploadResponseObject description: | - name: UploadMethodModel x-displayName: UploadMethod description: | - name: DeleteRequestModel x-displayName: DeleteRequest description: | - name: BulkDeleteRequestModel x-displayName: BulkDeleteRequest description: | - name: DeleteResultModel x-displayName: DeleteResult description: | - name: BulkDeleteResponseModel x-displayName: BulkDeleteResponse description: | - name: Motivation description: >
Data sharing requires portable data, consistent with the FAIR data principles (findable, accessible, interoperable, reusable). Today’s researchers and clinicians are surrounded by potentially useful data, but often need bespoke tools and processes to work with each dataset. Today’s data publishers don’t have a reliable way to make their data useful to all (and only) the people they choose. And today’s data controllers are tasked with implementing standard controls of non-standard mechanisms for data access. Figure 1: there’s an ocean of data, with many different tools to drink from it, but no guarantee that any tool will work with any subset of the data
We need a standard way for data producers to make their data available to data consumers, that supports the control needs of the former and the access needs of the latter. And we need it to be interoperable, so anyone who builds access tools and systems can be confident they’ll work with all the data out there, and anyone who publishes data can be confident it will work with all the tools out there. Figure 2: by defining a standard Data Repository API, and adapting tools to use it, every data publisher can now make their data useful to every data consumer
We envision a world where:
  • there are many many data consumers, working in research and in care, who can use the tools of their choice to access any and all data that they have permission to see
  • there are many data access tools and platforms, supporting discovery, visualization, analysis, and collaboration
  • there are many data repositories, each with their own policies and characteristics, which can be accessed by a variety of tools
  • there are many data publishing tools and platforms, supporting a variety of data lifecycles and formats
  • there are many many data producers, generating data of all types, who can use the tools of their choice to make their data as widely available as is appropriate
Figure 3: a standard Data Repository API enables an ecosystem of data producers and consumers
This spec defines a standard **Data Repository Service (DRS) API** (“the yellow box”), to enable that ecosystem of data producers and consumers. Our goal is that the only thing data consumers need to know about a data repo is *\"here’s the DRS endpoint to access it\"*, and the only thing data publishers need to know to tap into the world of consumption tools is *\"here’s how to tell it where my DRS endpoint lives\"*. ## Federation The world’s biomedical data is controlled by groups with very different policies and restrictions on where their data lives and how it can be accessed. A primary purpose of DRS is to support unified access to disparate and distributed data. (As opposed to the alternative centralized model of "let’s just bring all the data into one single data repository”, which would be technically easier but is no more realistic than “let’s just bring all the websites into one single web host”.) In a DRS-enabled world, tool builders don’t have to worry about where the data their tools operate on lives — they can count on DRS to give them access. And tool users only need to know which DRS server is managing the data they need, and whether they have permission to access it; they don’t have to worry about how to physically get access to, or (worse) make a copy of the data. For example, if I have appropriate permissions, I can run a pooled analysis where I run a single tool across data managed by different DRS servers, potentially in different locations. - name: Working With Compound Objects description: > ## Compound Objects The DRS API supports access to data objects, with each `DrsObject` representing a single opaque blob of bytes. Much content (e.g. VCF files) is well represented as a single atomic `DrsObject`. Some content, however (e.g. DICOM images) is best represented as a compound object consisting of a structured collection of atomic `DrsObject`s. 
In both cases, DRS isn't aware of the semantics of the objects it serves -- understanding those semantics is the responsibility of the applications that call DRS. Common examples of compound objects in biomedicine include: * BAM+BAI genomic reads, with a small index (the BAI object) to large data (the BAM object), each object using a well-defined file format. * DICOM images, with a contents object pointing to one or more raw image objects, each containing pixels from different aspects of a single logical biomedical image (e.g. different z-coordinates) * studies, with a single table of contents listing multiple objects of various types that were generated together and are meant to be processed together ## Best Practice: Manifests As with atomic objects, DRS applications and servers are expected to agree on the semantics of compound objects using non-DRS mechanisms. The recommended best practice for representing a particular compound object type is: 1. Define a manifest file syntax, which contains the DRS IDs of the constituent atomic objects, plus type-specific information about the relationship between those constituents. * Manifest file syntax isn't prescribed by the spec, but we expect manifests will often be JSON files. * For example, for a BAM+BAI pair the manifest file could contain two key-value pairs mapping the type of each constituent file to its DRS ID. 2. Make manifest objects and their constituent objects available using standard DRS mechanisms -- each object is referenced via its own DRS ID, just like any other atomic object. * For example, for a BAM+BAI pair, there would be three DRS IDs -- one for the manifest, one for the BAM, and one for the BAI. 3. Document the expected client logic for processing compound objects of interest. This logic typically consists of using standard DRS mechanisms to fetch the manifest, parsing its syntax, extracting the DRS IDs of constituent objects, and using standard DRS mechanisms to fetch the constituents as needed. 
* In some cases the application will always want to fetch all of the constituents; in other cases it may want to initially fetch a subset, and only fetch the others on demand. For example, a DICOM image viewer may only want to fetch the layers that are being rendered. - name: Background Notes on DRS URIs description: > ## Design Motivation DRS URIs are aligned with the [FAIR data principles](https://www.nature.com/articles/sdata201618) and the [Joint Declaration of Data Citation Principles](https://www.nature.com/articles/sdata20182): both hostname-based and compact identifier-based URIs provide globally unique, machine-resolvable, persistent identifiers for data. * We require all URIs to begin with `drs://` as a signal to humans and systems consuming these URIs that the response they will ultimately receive, after transforming the URI to a fetchable URL, will be a DRS JSON packet. This signal differentiates DRS URIs from the wide variety of other entities (HTML documents, PDFs, ontology notes, etc.) that can be represented by compact identifiers. * We support hostname-based URIs because of their simplicity and efficiency for server and client implementers. * We support compact identifier-based URIs, and the meta-resolver services of identifiers.org and n2t.net (Name-to-Thing), because of the wide adoption of compact identifiers in the research community, as detailed by [Wimalaratne et al (2018)](https://www.nature.com/articles/sdata201829) in "Uniform resolution of compact identifiers for biomedical data." - name: Compact Identifier-Based URIs description: > **Note: Identifiers.org/n2t.net API Changes** The examples below show the current API interactions with [n2t.net](https://n2t.net/e/compact_ids.html) and [identifiers.org](https://docs.identifiers.org/), which may change over time. Please refer to the documentation from each site for the most up-to-date information. 
We will make best efforts to keep the DRS specification current but DRS clients MUST maintain their ability to use either the identifiers.org or n2t.net APIs to resolve compact identifier-based DRS URIs. ## Registering a DRS Server on a Meta-Resolver See the documentation on the [n2t.net](https://n2t.net/e/compact_ids.html) and [identifiers.org](https://docs.identifiers.org/) meta-resolvers for adding your own compact identifier type and registering your DRS server as a resolver. You can register new prefixes (or mirrors by adding resource provider codes) for free using a simple online form. For more information see [More Background on Compact Identifiers](./more-background-on-compact-identifiers.html). ## Calling Meta-Resolver APIs for Compact Identifier-Based DRS URIs Clients resolving Compact Identifier-based URIs need to convert a prefix (e.g. “drs.42”) into a URL pattern. They can do so by calling either the identifiers.org or the n2t.net API, since the two meta-resolvers keep their mapping databases in sync. ### Calling the identifiers.org API as a Client It takes two API calls to get the URL pattern. 1. The client makes a GET request to identifiers.org to find information about the prefix: ``` GET https://registry.api.identifiers.org/restApi/namespaces/search/findByPrefix?prefix=drs.42 ``` This request returns a JSON structure including various URLs containing an embedded namespace id, such as: ``` "namespace" : { "href":"https://registry.api.identifiers.org/restApi/namespaces/1234" } ``` 2. 
The client extracts the namespace id (in this example 1234), and uses it to make a second GET request to identifiers.org to find information about the namespace: ``` GET https://registry.api.identifiers.org/restApi/resources/search/findAllByNamespaceId?id=1234 ``` This request returns a JSON structure including a `urlPattern` field, whose value is a URL pattern containing a `{$id}` parameter, such as: ``` "urlPattern" : "https://drs.myexample.org/ga4gh/drs/v1/objects/{$id}" ``` ### Calling the n2t.net API as a Client It takes one API call to get the URL pattern. The client makes a GET request to n2t.net to find information about the namespace. (Note the trailing colon.) ``` GET https://n2t.net/drs.42: ``` This request returns a text structure including a redirect field, whose value is a URL pattern containing an `$id` parameter, such as: ``` redirect: https://drs.myexample.org/ga4gh/drs/v1/objects/$id ``` ## Caching with Compact Identifiers Identifiers.org/n2t.net compact identifier resolver records do not change frequently, which makes it practical to cache resolver records and their URL patterns for performance reasons. Builders of systems that use compact identifier-based DRS URIs should cache prefix resolver records from identifiers.org/n2t.net and periodically refresh the records (for example, every 24 hours). This approach will reduce the burden on these community services since we anticipate many DRS URIs will be regularly resolved in workflow systems. Alternatively, system builders may decide to directly mirror the registries themselves; instructions are provided on the identifiers.org/n2t.net websites. ## Security with Compact Identifiers As mentioned earlier, identifiers.org/n2t.net performs some basic verification of new prefixes and provider code mirror registrations on their sites. 
However, builders of systems that consume and resolve DRS URIs may have certain security compliance requirements and regulations that prohibit relying on an external site for resolving compact identifiers. In this case, systems under these security and compliance constraints may wish to whitelist certain compact identifier resolvers and/or vet records from identifiers.org/n2t.net before enabling in their systems. ## Accession Encoding to Valid DRS IDs The compact identifier format used by identifiers.org/n2t.net does not percent-encode reserved URI characters but, instead, relies on the first ":" character to separate prefix from accession. Since these accessions can contain any characters, and characters like "/" will interfere with DRS API calls, you *must* percent encode the accessions extracted from DRS compact identifier-based URIs when using as DRS IDs in subsequent DRS GET requests. An easy way for a DRS client to handle this is to get the initial DRS object JSON response from whatever redirects the compact identifier resolves to, then look for the `self_uri` in the JSON, which will give you the correctly percent-encoded DRS ID for subsequent DRS API calls such as the `access` method. ## Additional Examples For additional examples, see the document [More Background on Compact Identifiers](./more-background-on-compact-identifiers.html). - name: Hostname-Based URIs description: > ## Encoding DRS IDs In hostname-based DRS URIs, the ID is always percent-encoded to ensure special characters do not interfere with subsequent DRS endpoint calls. As such, ":" is not allowed in the URI and is a convenient way of differentiating from a compact identifier-based DRS URI. Also, if a given DRS service implementation uses compact identifier accessions as their DRS IDs, they must be percent encoded before using them as DRS IDs in hostname-based DRS URIs and subsequent GET requests to a DRS service endpoint. 
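The encoding rules above can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the function names and the sample accessions are hypothetical, and the two placeholder forms (`{$id}` and `$id`) are the ones shown in the identifiers.org and n2t.net examples in this document.

```python
from urllib.parse import quote


def encode_drs_id(raw_id: str) -> str:
    """Percent-encode a raw ID or accession for use as a DRS ID.

    safe="" means "/" and ":" are encoded as well; quote() always leaves
    letters, digits, "-", ".", "_" and "~" untouched (RFC 3986 unreserved).
    Note: an ID that is already percent-encoded would be encoded twice.
    """
    return quote(raw_id, safe="")


def fill_url_pattern(url_pattern: str, accession: str) -> str:
    """Substitute an encoded accession into a resolver URL pattern."""
    encoded = encode_drs_id(accession)
    # Check "{$id}" first because "$id" is a substring of it.
    for placeholder in ("{$id}", "$id"):
        if placeholder in url_pattern:
            return url_pattern.replace(placeholder, encoded)
    raise ValueError("URL pattern has no id placeholder")


def hostname_based_uri(hostname: str, raw_id: str) -> str:
    """Build a hostname-based DRS URI with a percent-encoded ID."""
    return f"drs://{hostname}/{encode_drs_id(raw_id)}"
```

For example, `encode_drs_id("2024/abc:1")` yields `2024%2Fabc%3A1`, so neither the `/` nor the `:` can be mistaken for a path separator or a compact identifier prefix delimiter.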
- name: GA4GH Service Registry description: > The [GA4GH Service Registry API specification](https://github.com/ga4gh-discovery/ga4gh-service-registry) allows information about GA4GH-compliant web services, including DRS services, to be aggregated into registries and made available via a standard API. The following considerations should be followed when registering DRS services within a service registry. * The DRS service attributes returned by `/service-info` (i.e. `id`, `name`, `description`, etc.) should have the same values as the registry entry for that service. * The value of the `type` object's `artifact` property should be `drs` (i.e. the same as it appears in `service-info`) * Each entry in a Service Registry must have a `url`, indicating the base URL to the web service. For DRS services, the registered `url` must include everything up to the standardized `/ga4gh/drs/v1` path. Clients should be able to assume that: + Adding `/ga4gh/drs/v1/objects/{object_id}` to the registered `url` will hit the `DrsObject` endpoint + Adding `/ga4gh/drs/v1/service-info` to the registered `url` will hit the Service Info endpoint Example listing of a DRS API registration from a service registry's `/services` endpoint: ``` [ { "id": "com.example.drs", "name": "Example DRS API", "type": { "group": "org.ga4gh", "artifact": "drs", "version": "1.5.0" }, "description": "The Data Repository Service (DRS) API ...", "organization": { "id": "com.example", "name": "Example Company" }, "contactUrl": "mailto:support@example.com", "documentationUrl": "https://docs.example.com/docs/drs", "createdAt": "2021-08-09T00:00:00Z", "updatedAt": "2021-08-09T12:30:00Z", "environment": "production", "version": "1.13.4", "url": "https://drs-service.example.com" } ] ``` - name: Upload Requests and Object Registration description: > # Upload Requests and Object Registration > **Optional Functionality**: Upload and object registration are optional DRS extensions. 
Clients should check `/service-info` for `uploadRequestSupported` and `objectRegistrationSupported` before attempting to use these endpoints. The DRS upload and object registration endpoints allow clients to negotiate with servers on mutually convenient storage backends and then register uploads as DRS objects through a three-phase workflow: 1. **Request Upload URLs**: POST `/upload-request` with file metadata to receive upload methods and credentials 2. **Upload Files**: Use returned URLs and credentials to upload files to storage using storage-provider-specific upload mechanisms. DRS is not involved in this step at all; it simply enables clients and servers to agree on a mutually convenient storage service. 3. **Register Objects**: POST `/objects/register` to register DRS objects with the server This approach separates storage service and credential negotiation from file transfer and object registration, supporting a vendor-neutral means of sharing data in a DRS network. The `/objects/register` endpoint can be used independently to register existing data without using the `/upload-request` endpoint, and servers can choose to only support object registration and not uploads by setting the `uploadRequestSupported` and `objectRegistrationSupported` flags appropriately in `/service-info`. Upload requests and object registration endpoints only support bulk requests to simplify implementation and reflect real-world usage patterns. Bioinformatics workflows often involve uploading multiple related files together (e.g., BAM and VCF files with their indices, or analysis result sets), making bulk operations a natural fit. Single files/objects are handled as lists with one element. Implementations of the `/objects/register` endpoint SHOULD implement transaction semantics so that either all of the objects included in the request are successfully registered or none of them are, and clients should be robust to this behaviour. 
Transaction semantics for the `/upload-request` endpoint are encouraged but not required due to the variety and complexity of data transfer technologies. The `/upload-request` endpoint does not result in any state that needs to be maintained on the DRS server (intermediate DRS object IDs etc.). It is simply a means for a server to provide details of where a client can upload data, and the server should ensure that it trusts the client before providing such details (e.g. with appropriate authentication and authorisation before processing the request). This means that if uploads fail and there is no later call to `/objects/register` there is no DRS state to manage, simplifying server implementation. However, servers SHOULD ensure that any data from unsuccessful uploads (e.g. incomplete multi-part uploads) are cleaned up, for example by using lifecycle configuration in the backend storage. There is _no_ means of requiring that a client ultimately registers a DRS object pointing at data uploaded, and so servers should consider implementing some form of storage "garbage collection". A straightforward approach is to set a short lifecycle policy on the upload location and move uploaded data that is later registered as DRS objects to other locations, updating object `access_methods` accordingly. Servers should also implement some means of constraining upload size (quotas etc.) to protect against accidental or malicious unconstrained uploads. Servers can choose to validate that the uploads match the claimed object size when `/objects/register` is called, and should advertise this behaviour with the `validateFileSizes` flag in `/service-info`. The `/upload-request` endpoint can return one or more `upload_methods` of different types for each requested file, and backend-specific details such as bucket names, object keys and credentials are supplied in a generic `upload_details` field. 
A straightforward implementation might return a single time-limited pre-signed POST URL as the `post_url` for an `upload_method` of type `https` which incorporates authentication into the URL, but because DRS is often used for large files such as BAMs and CRAMs this specification also supports more sophisticated upload approaches implemented by cloud storage backends such as multi-part uploads, automatic retries etc. The `upload_details` field can be used to include bucket names, keys and temporary credentials that can be used in native clients and SDKs. This offers a natural way to adapt this protocol to new storage technologies. Refer to the examples below for some suggested implementations. ## Service Discovery Check `/service-info` for upload capabilities: ```json { "drs": { "uploadRequestSupported": true, "objectRegistrationSupported": true, "supportedUploadMethodTypes": ["s3", "https"], "maxUploadSize": 5368709120, "maxUploadRequestLength": 50, "maxRegisterRequestLength": 50, "validateChecksums": true, "validateFileSizes": false, "relatedFileStorageSupported": true } } ``` Upload-related fields: - `uploadRequestSupported`: Upload request operations available via `/upload-request` - `objectRegistrationSupported`: Object registration operations available via `/objects/register` - `supportedUploadMethodTypes`: Available storage backends - `maxUploadSize`: File size limit (bytes) - `maxUploadRequestLength`: Files per request limit for upload requests - `maxRegisterRequestLength`: Candidate objects per request limit for registration - `validateChecksums`/`validateFileSizes`: Server validation behavior - `relatedFileStorageSupported`: Files from same upload request will be stored under common prefixes ## Upload Methods Upon receipt of a request for an upload method for a specific file, the server will respond with an array of `upload_methods`, each with associated `type` and corresponding `upload_details` with upload locations, temporary credentials etc. 
These details are specific to backend implementations. Example storage backends: - **https**: Presigned POST URLs for HTTP uploads - **s3**: Direct S3 upload with temporary AWS credentials for an IAM session policy. - **gs**: Google Cloud Storage with OAuth2 tokens - **ftp/sftp**: Traditional file transfer protocols using negotiated credentials Servers may return a subset of advertised methods based on file characteristics; for example, they may choose to store large objects such as WGS BAM files in different backends from small CSV files. Clients can request specific upload methods in the initial request. ## Related File Storage (Optional) Servers MAY support storing files from the same upload request under common prefixes, enabling bioinformatics workflows that expect co-located files: - **CRAM + CRAI**: Alignment files with index files - **VCF + TBI**: Variant files with tabix indexes - **FASTQ.ora + ORADATA.tar.gz**: Compressed files with associated reference data Check `relatedFileStorageSupported` in service-info or examine upload URLs for common prefixes. ## Object Registration After upload, clients can register files in bulk as DRS objects using POST `/objects/register`. Registration is all-or-nothing. If any candidate object fails to be registered in the server, the entire request fails and no objects are registered. **Candidate DRS object requirements**: - Complete metadata (name, size, checksums, MIME type) - Access methods pointing to file locations - Valid authorization (if required) - Do not include fields managed by the DRS server (id, self_uri, timestamps) Upon receipt of candidate objects for registration the server will create unique object IDs and return complete DRS objects. Note that the server is not obliged to retain the client-supplied `access_methods` and is free to move data to different locations/backends once the object is registered. 
This means that a server can choose to receive uploads in an untrusted "dropzone", with hard quotas and additional security, and then move them to more permanent storage once the DRS object is registered and any validation is successful. Clients SHOULD NOT cache the response from `/objects/register` as the `access_methods` may change after registration. The `/objects/register` endpoint can also be used independently to register existing data that is already stored in accessible locations, without using the `/upload-request` workflow. This is useful for registering pre-existing datasets or files uploaded through other means. Servers may choose only to support registration and not uploads, and should advertise this in `/service-info`. ## Authentication & Validation **Authentication**: Supports GA4GH Passports, Basic auth, and Bearer tokens. **Checksums**: Required for all files (SHA-256, MD5, or IANA-registered algorithms). Servers MAY validate checksums and file sizes as advertised in service-info flags. ## Error Handling **Client Errors (4xx)**: - Invalid metadata (400) - Missing auth (401) - Insufficient permissions (403) **Server Errors (5xx)**: - Storage unavailable (500) - Capacity limits (503) ## Best Practices **Clients**: Check service-info first, calculate checksums, select supported upload methods, be robust to failed object registration **Servers**: Use short-lived tightly scoped credentials, support multiple upload methods, implement rate limiting, ensure unique storage backend names to avoid inadvertent overwrites (e.g. using UUIDs), ensure that quotas are enforced and incomplete or unregistered uploads are deleted **Security**: Time- and scope-limited credentials, single-use URLs, logging for audit ## Security Considerations **Credential Scoping**: Implementers SHOULD scope upload credentials to the minimum necessary permissions and duration. 
Credentials should: - Allow write access only to the specific upload URL/path provided - Have the shortest practical expiration time (e.g. 15 minutes to 1 hour) - Be restricted to the specific file size and content type when possible - Not grant broader storage access beyond the intended upload location This principle of least privilege reduces security exposure if credentials are compromised or misused. ## Example Workflows ### Simple HTTPS Upload Upload Request: ```http POST /upload-request Content-Type: application/json { "requests": [ { "name": "variants.vcf", "size": 52428800, "mime_type": "text/plain", "checksums": [ { "checksum": "5d41402abc4b2a76b9719d911017c592", "type": "md5" } ], "upload_method_types": ["https"] } ] } ``` Response: ```json { "responses": [ { "name": "variants.vcf", "size": 52428800, "mime_type": "text/plain", "checksums": [ { "checksum": "5d41402abc4b2a76b9719d911017c592", "type": "md5" } ], "upload_methods": [ { "type": "https", "access_url": { "url": "https://uploads.example.org/variants.vcf" }, "upload_details": { "post_url": { "url": "https://uploads.example.org/presigned-upload?signature=FAKE_SIG", "headers": ["Header1", "Header2"] } } } ] } ] } ``` Upload via HTTPS: ```bash # Simple PUT upload to presigned POST URL curl -X PUT "https://uploads.example.org/presigned-upload?signature=FAKE_SIG" -H "Header1" -H "Header2" \ --data-binary @variants.vcf ``` Register DRS Object: ```http POST /objects/register Content-Type: application/json { "candidates": [ { "name": "variants.vcf", "size": 52428800, "mime_type": "text/plain", "checksums": [ { "checksum": "5d41402abc4b2a76b9719d911017c592", "type": "md5" } ], "access_methods": [ { "type": "https", "access_url": { "url": "https://uploads.example.org/variants.vcf" } } ], "description": "Variant calls in VCF format" } ] } ``` Response: ```json { "objects": [ { "id": "drs_obj_f6e5d4c3b2a1", "self_uri": "drs://drs.example.org/drs_obj_f6e5d4c3b2a1", "name": "variants.vcf", "size": 52428800, 
"mime_type": "text/plain", "created_time": "2024-01-15T10:45:00Z", "updated_time": "2024-01-15T10:45:00Z", "version": "1.0", "checksums": [ { "checksum": "5d41402abc4b2a76b9719d911017c592", "type": "md5" } ], "access_methods": [ { "type": "https", "access_url": { "url": "https://uploads.example.org/variants.vcf" } } ], "description": "Variant calls in VCF format" } ] } ``` ### S3 Bulk Upload (BAM + Index) Request Upload Methods for Related Files ```http POST /upload-request Content-Type: application/json { "requests": [ { "name": "sample.bam", "size": 1073741824, "mime_type": "application/octet-stream", "checksums": [ { "checksum": "d41d8cd98f00b204e9800998ecf8427e", "type": "md5" } ], "upload_method_types": ["s3"] }, { "name": "sample.bam.bai", "size": 2097152, "mime_type": "application/octet-stream", "checksums": [ { "checksum": "098f6bcd4621d373cade4e832627b4f6", "type": "md5" } ], "upload_method_types": ["s3"] } ] } ``` Response: ```json { "responses": [ { "name": "sample.bam", "size": 1073741824, "mime_type": "application/octet-stream", "checksums": [ { "checksum": "d41d8cd98f00b204e9800998ecf8427e", "type": "md5" } ], "upload_methods": [ { "type": "s3", "access_url": { "url": "s3://genomics-uploads/x7k9m/sample.bam" }, "upload_details": { "bucket": "genomics-uploads", "key": "x7k9m/sample.bam", "access_key_id": "FAKE_ACCESS_KEY_123", "secret_access_key": "FAKE_SECRET_KEY_456", "session_token": "FAKE_SESSION_TOKEN_789", "expires_at": "2024-01-15T12:00:00Z" } } ] }, { "name": "sample.bam.bai", "size": 2097152, "mime_type": "application/octet-stream", "checksums": [ { "checksum": "098f6bcd4621d373cade4e832627b4f6", "type": "md5" } ], "upload_methods": [ { "type": "s3", "access_url": { "url": "s3://genomics-uploads/x7k9m/sample.bam.bai" }, "upload_details": { "bucket": "genomics-uploads", "key": "x7k9m/sample.bam.bai", "access_key_id": "FAKE_ACCESS_KEY_123", "secret_access_key": "FAKE_SECRET_KEY_456", "session_token": "FAKE_SESSION_TOKEN_789", "expires_at": 
"2024-01-15T12:00:00Z" } } ] } ] } ``` Upload Both Files to S3: ```bash # Upload BAM and index files using the supplied credentials (note common prefix) export AWS_ACCESS_KEY_ID=... aws s3 cp sample.bam s3://genomics-uploads/x7k9m/sample.bam aws s3 cp sample.bam.bai s3://genomics-uploads/x7k9m/sample.bam.bai ``` Register Both DRS Objects: ```http POST /objects/register Content-Type: application/json { "candidates": [ { "name": "sample.bam", "size": 1073741824, "mime_type": "application/octet-stream", "checksums": [ { "checksum": "d41d8cd98f00b204e9800998ecf8427e", "type": "md5" } ], "access_methods": [ { "type": "s3", "access_id": "s3", "access_url": { "url": "s3://genomics-uploads/x7k9m/sample.bam" } } ], "description": "BAM alignment file" }, { "name": "sample.bam.bai", "size": 2097152, "mime_type": "application/octet-stream", "checksums": [ { "checksum": "098f6bcd4621d373cade4e832627b4f6", "type": "md5" } ], "access_methods": [ { "type": "s3", "access_id": "s3", "access_url": { "url": "s3://genomics-uploads/x7k9m/sample.bam.bai" } } ], "description": "BAM index file" } ] } ``` Response: ```json { "objects": [ { "id": "drs_obj_a1b2c3d4e5f6", "self_uri": "drs://drs.example.org/drs_obj_a1b2c3d4e5f6", "name": "sample.bam", "size": 1073741824, "mime_type": "application/octet-stream", "created_time": "2024-01-15T10:30:00Z", "updated_time": "2024-01-15T10:30:00Z", "version": "1.0", "checksums": [ { "checksum": "d41d8cd98f00b204e9800998ecf8427e", "type": "md5" } ], "access_methods": [ { "type": "s3", "access_id": "s3", "access_url": { "url": "s3://genomics-uploads/x7k9m/sample.bam" } } ], "description": "BAM alignment file" }, { "id": "drs_obj_b2c3d4e5f6a1", "self_uri": "drs://drs.example.org/drs_obj_b2c3d4e5f6a1", "name": "sample.bam.bai", "size": 2097152, "mime_type": "application/octet-stream", "created_time": "2024-01-15T10:30:00Z", "updated_time": "2024-01-15T10:30:00Z", "version": "1.0", "checksums": [ { "checksum": "098f6bcd4621d373cade4e832627b4f6", "type": 
"md5" } ], "access_methods": [ { "type": "s3", "access_id": "s3", "access_url": { "url": "s3://genomics-uploads/x7k9m/sample.bam.bai" } } ], "description": "BAM index file" } ] } ``` - name: Object Deletion description: > # Object Deletion > **Optional Functionality**: Delete support is an **optional** extension to the DRS API. Not all DRS servers are required to implement delete functionality. Clients should check for the availability of delete endpoints before attempting to use them. DRS delete functionality allows suitably authenticated clients to request that DRS objects are removed from the server and, optionally, to request that the server attempt to delete the underlying data. Servers should ensure that they trust clients from whom they receive delete requests, and may choose to implement "soft" deletes to minimise the risk of accidental or malicious requests. The DRS specification does not currently provide explicit support for soft deletes. Because delete support is optional, servers operating in untrusted environments may choose not to support delete operations at all. In combination with the `/objects/register` endpoint, metadata only delete requests offer a means for clients to update DRS metadata without affecting the underlying data, and without introducing additional update operations which would complicate server implementation. Clients can express a preference that the underlying data referred to by the deleted DRS object(s) is deleted with the `delete_storage_data` parameter. Servers are free to interpret this as they choose, and can advertise whether they support it at all with the `deleteStorageDataSupported` flag. Servers that choose to attempt to honour the request need not perform this operation synchronously and may, for example, register the file for later deletion or implement soft deletion with versioning etc. 
Implementations may also choose to ensure that no other DRS object registered in the server refers to the underlying data before deleting. Servers may not have the necessary permissions to delete the data from the backend even if they would like to do so, or may encounter errors when they attempt deletion. In the case that a DRS object refers to data stored in multiple backends (e.g. has multiple `access_methods`) the server may attempt to delete the data from all or only some of the backends. For these reasons clients MUST NOT depend on the server deleting the underlying storage data, even if the server advertises `deleteStorageDataSupported` and the client sets the `delete_storage_data` flag. In situations where the DRS server controls the storage backend, DRS delete support offers a convenient vendor-neutral way for clients to update and delete DRS objects and corresponding data. For bulk deletes using the `/objects/delete` endpoint the server SHOULD implement transaction semantics: if any object fails validation or deletion, the entire request should fail, no objects should be deleted, and no attempt should be made to delete from underlying storage for any object. ## Design principles - **Optional**: Delete support is completely optional - **Safety**: Preserves underlying data in storage unless explicitly requested - **Backward compatible**: No impact on existing DRS functionality - **Flexible authentication**: Supports GA4GH Passports, Bearer tokens, API keys - **Use PUT rather than DELETE**: GA4GH Passports require request bodies, which DELETE methods don't reliably support across all HTTP infrastructure. PUT ensures broad compatibility.
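The all-or-nothing behaviour recommended for `/objects/delete` can be illustrated with a minimal in-memory sketch. It is not part of the specification: `bulk_delete`, the `store` dictionary and the `can_delete` callback are hypothetical names chosen for illustration, and the returned status codes simply mirror the error responses listed for the delete endpoints.

```python
from typing import Callable, Dict, List


def bulk_delete(
    store: Dict[str, dict],
    object_ids: List[str],
    can_delete: Callable[[str], bool],
    max_bulk_delete_length: int = 100,
) -> int:
    """Delete metadata for every ID or for none; return an HTTP-style status.

    `store` stands in for the server's metadata database (an assumption).
    """
    if len(object_ids) > max_bulk_delete_length:
        return 413  # request exceeds the advertised bulk limit

    # Validate the whole batch before touching any object.
    for object_id in object_ids:
        if object_id not in store:
            return 404  # nothing has been deleted at this point
        if not can_delete(object_id):
            return 403  # nothing has been deleted at this point

    # Every object passed validation, so the deletes can be applied.
    for object_id in object_ids:
        store.pop(object_id, None)  # pop tolerates duplicate IDs in the request
    return 204
```

A real server would typically wrap the second loop in a database transaction so that a failure part-way through can also be rolled back.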
## Service Discovery Check `/service-info` for delete capabilities: ```json { "drs": { "uploadRequestSupported": true, "objectRegistrationSupported": true, "supportedUploadMethodTypes": ["s3", "https"], "relatedFileStorageSupported": true, "deleteSupported": true, "maxBulkDeleteLength": 100, "deleteStorageDataSupported": true } } ``` - **`deleteSupported`**: Whether server supports deletion - **`maxBulkDeleteLength`**: Maximum objects per bulk delete request - **`deleteStorageDataSupported`**: Whether server can attempt to delete underlying storage files ### Single Object Delete: `PUT /objects/{object_id}/delete` ```bash curl -X PUT "https://drs.example.org/objects/drs_object_123456/delete" \ -H "Content-Type: application/json" \ -d '{"passports": ["..."], "delete_storage_data": false}' # Response: 204 No Content (indicates metadata deletion success only) ``` **Note**: HTTP responses indicate metadata deletion status only. Storage deletion (`delete_storage_data: true`) is a best-effort attempt with no guarantee of success. ### Bulk Object Delete: `PUT /objects/delete` ```bash curl -X PUT "https://drs.example.org/objects/delete" \ -H "Content-Type: application/json" \ -d '{ "bulk_object_ids": ["obj_1", "obj_2", "obj_3"], "passports": ["..."], "delete_storage_data": false }' # Response: 204 No Content (all metadata deleted) or 4xx error (no objects deleted) ``` ## Authentication **GA4GH Passports** (in request body): ```json {"passports": ["eyJhbGci..."], "delete_storage_data": false} ``` **Bearer Tokens** (in headers): ```bash curl -H "Authorization: Bearer token" -d '{"delete_storage_data": false}' ... ``` ## Underlying Storage Data **Important**: Storage data deletion is never guaranteed. Even when `delete_storage_data: true` is requested and the server supports it, the actual deletion may fail due to permissions, network issues, or storage service errors. Clients should not depend on storage deletion success.
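Because delete support, the bulk limit and storage deletion are all advertised capabilities, a client may want to check them before building a request. The sketch below is one possible client-side check: the function name `build_bulk_delete_body` is hypothetical, the `service_info` argument is assumed to be the already-parsed `/service-info` response, and authentication (passports or bearer token) is left out.

```python
from typing import Any, Dict, List


def build_bulk_delete_body(
    service_info: Dict[str, Any],
    object_ids: List[str],
    delete_storage_data: bool = False,
) -> Dict[str, Any]:
    """Return a request body for PUT /objects/delete, or raise ValueError."""
    drs = service_info.get("drs", {})

    if not drs.get("deleteSupported", False):
        raise ValueError("server does not advertise delete support")

    limit = drs.get("maxBulkDeleteLength")
    if limit is not None and len(object_ids) > limit:
        raise ValueError(f"{len(object_ids)} objects exceeds maxBulkDeleteLength={limit}")

    if delete_storage_data and not drs.get("deleteStorageDataSupported", False):
        raise ValueError("server does not advertise storage data deletion")

    # Default to the safe, metadata-only form unless explicitly requested.
    return {
        "bulk_object_ids": list(object_ids),
        "delete_storage_data": delete_storage_data,
    }
```

Even when this check passes and `delete_storage_data` is `True`, the caller should still treat storage deletion as best effort.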
Clients can request that the server attempts to delete the underlying data referred to by the DRS object using the `delete_storage_data` parameter. **`delete_storage_data: false`** (default): Removes DRS object metadata only, preserves underlying storage files **`delete_storage_data: true`**: Removes metadata AND requests server attempt to delete underlying storage files (requires `deleteStorageDataSupported: true`, **success not guaranteed**) ## Update Pattern Rather than introducing additional operations and endpoints for updating DRS objects, servers can allow clients to use the metadata-only deletion and object registration endpoints to create a new DRS object with updated metadata while leaving the underlying data in place. Note that DRS only supports updating the `access_methods` and adding additional checksums to existing objects while maintaining the same DRS object ID. Any other changes to an object will require deletion and re-registration. **Metadata update steps:** 1. Delete metadata only: `PUT /objects/{id}/delete` with `delete_storage_data: false` 2. 
Re-register object: `POST /objects/register` with updated metadata ```bash # Delete metadata (preserves storage) curl -X PUT ".../objects/obj_123/delete" -d '{"delete_storage_data": false}' # Re-register with updates curl -X POST ".../objects/register" -d '{"candidates": [{"name": "updated.txt", ...}]}' ``` ## Error Responses - **400**: Unsupported storage deletion or invalid request parameters - **403**: Insufficient permissions for any object in the request - **404**: Any object not found or delete endpoints not supported by server - **413**: Bulk request exceeds `maxBulkDeleteLength` limit ## Examples **Metadata Update:** ```bash curl ".../service-info" # Check capabilities curl -X PUT ".../objects/obj_123/delete" -d '{"delete_storage_data": false}' curl -X POST ".../objects/register" -d '{"candidates": [{"name": "updated.vcf", ...}]}' ``` **Complete Removal:** ```bash curl -X PUT ".../objects/obj_456/delete" -H "Authorization: Bearer token" \ -d '{"delete_storage_data": true}' ``` **Bulk Delete (Atomic):** ```bash curl -X PUT ".../objects/delete" -d '{ "bulk_object_ids": ["obj_1", "obj_2"], "passports": ["..."], "delete_storage_data": false }' # All objects deleted or none deleted (transactional) ``` ## Best Practices **Clients:** Check service-info, default to safe deletion, handle transactional failures, respect limits, confirm destructive operations, do not rely on underlying storage deletion **Servers:** Advertise capabilities, validate permissions, implement atomic transactions, implement limits, use versioning to avoid inadvertent deletion of critical data.
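The metadata update pattern described in this section (a metadata-only delete followed by re-registration) can also be expressed as an ordered request plan. This is only a sketch under stated assumptions: `plan_metadata_update` is a hypothetical helper, it produces request descriptions rather than sending anything, and it percent-encodes the object ID on the assumption that IDs may contain characters outside the unreserved set.

```python
import json
from typing import Any, Dict, List, Tuple
from urllib.parse import quote


def plan_metadata_update(
    object_id: str, candidate: Dict[str, Any]
) -> List[Tuple[str, str, str]]:
    """Return (method, path, JSON body) tuples for the two-step update."""
    encoded_id = quote(object_id, safe="")  # percent-encode reserved characters

    # Step 1: remove the DRS metadata only, leaving the stored bytes in place.
    delete_step = (
        "PUT",
        f"/objects/{encoded_id}/delete",
        json.dumps({"delete_storage_data": False}),
    )
    # Step 2: register a candidate carrying the corrected metadata.
    register_step = (
        "POST",
        "/objects/register",
        json.dumps({"candidates": [candidate]}),
    )
    return [delete_step, register_step]
```

Since registration creates a new DRS object, callers should expect the server to return a new object ID in the registration response rather than reuse the deleted one.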
## Security Considerations - **Authentication**: Validate GA4GH Passports and Bearer tokens - **HTTPS Required**: Protect credentials in transit - **Rate Limiting**: Prevent abuse of delete endpoints - **Input Validation**: Sanitize all request parameters ## Backward Compatibility Delete functionality is designed to be backward compatible: - **No Impact on Existing Endpoints**: All existing DRS endpoints remain unchanged - **Optional Implementation**: Servers can ignore delete functionality entirely - **Graceful Degradation**: Clients receive 404 responses when delete is not supported - **Safe Defaults**: New fields in service-info have safe default values, and requests default to leaving underlying data in place. - name: Access Method Update description: > # Access Method Updates > **Optional Functionality**: Access method updates are optional extensions to the DRS API. Not all DRS servers implement this functionality. Clients should check `/service-info` for `accessMethodUpdateSupported` before attempting to use these endpoints. Access method update endpoints allow authorized clients to modify how existing DRS objects can be accessed without changing the core object metadata (size, checksums, name). This is useful for storage migrations, adding mirrors, or updating URLs. These endpoints overwrite the existing access methods for an object. Clients that want to add access methods to the existing ones should first retrieve the current access methods and include them in the update request along with the new methods.
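Because an update replaces the full list of access methods, a client that only wants to add a mirror has to resend the existing methods as well. The following sketch shows one way to build such a request body. `merge_access_methods` is a hypothetical helper, and treating two methods as duplicates when their `type`, `access_id` and URL all match is an assumption of this example, not a rule from the specification.

```python
from typing import Any, Dict, List, Tuple

AccessMethod = Dict[str, Any]


def _identity(method: AccessMethod) -> Tuple[Any, Any, Any]:
    """Key used to detect duplicates (an assumption of this sketch)."""
    url = (method.get("access_url") or {}).get("url")
    return (method.get("type"), method.get("access_id"), url)


def merge_access_methods(
    current: List[AccessMethod], additions: List[AccessMethod]
) -> Dict[str, List[AccessMethod]]:
    """Return an update body containing current methods plus new ones."""
    merged = [dict(method) for method in current]
    seen = {_identity(method) for method in current}
    for method in additions:
        key = _identity(method)
        if key not in seen:
            merged.append(dict(method))
            seen.add(key)
    return {"access_methods": merged}
```

The `current` list would normally come from a prior request for the object's metadata, and the returned dictionary is what the client would serialize as the body of the update request.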
## Use Cases - **Storage Migration**: Move data between storage providers while keeping the same DRS object - **Mirror Addition**: Add additional regional access points, or alternative protocols - **URL Refresh**: Update changed domain names - **Access Optimization**: Add or remove access methods based on performance or cost ## Design Principles - **Optional**: Access method update support is completely optional - **Immutable Core**: Only access methods can be updated - size, checksums, name remain unchanged - **Atomic Bulk Operations**: All updates succeed or all fail (transactional) - **Optional Validation**: Servers MAY validate that new access methods point to the same data - **Flexible Authentication**: Supports GA4GH Passports, Bearer tokens, API keys ## Service Discovery Check `/service-info` for access method update capabilities: ```json { "drs": { "accessMethodUpdateSupported": true, "maxBulkAccessMethodUpdateLength": 100, "validateAccessMethods": false } } ``` - **`accessMethodUpdateSupported`**: Whether the server supports access method updates - **`maxBulkAccessMethodUpdateLength`**: Maximum objects per bulk update request - **`validateAccessMethods`**: Whether the server validates access methods ## Single Object Update Update access methods for a single DRS object: ```bash curl -X PUT "https://drs.example.org/objects/obj_123/access-methods" \ -H "Content-Type: application/json" \ -d '{ "access_methods": [ { "type": "https", "access_url": { "url": "https://new-location.com/data/file.bam" } }, { "type": "s3", "access_id": "s3", "access_url": { "url": "s3://new-bucket/migrated/file.bam" } } ] }' ``` ## Bulk Object Update Update access methods for multiple objects atomically: ```bash curl -X PUT "https://drs.example.org/objects/access-methods" \ -H "Content-Type: application/json" \ -d '{ "updates": [ { "object_id": "obj_123", "access_methods": [ { "type": "https", "access_url": {"url": "https://new-location.com/file1.bam"} } ] }, { "object_id": "obj_456",
"access_methods": [ { "type": "s3", "access_url": {"url": "s3://new-bucket/file2.vcf"} } ] } ] }' ``` ## Authentication **GA4GH Passports** (in request body): ```json { "access_methods": [...], "passports": ["eyJhbGci..."] } ``` **Bearer Tokens** (in headers): ```bash curl -H "Authorization: Bearer token" -d '{"access_methods": [...]}' ... ``` ## Validation Servers MAY validate that new access methods point to the same data by checking file availability, checksums or file content. Validation behavior is advertised in `validateAccessMethods` service-info field. ## Error Responses - **400**: Invalid access methods or validation failure - **401**: Authentication required - **403**: Insufficient permissions for object(s) - **404**: Object not found or access method updates not supported - **413**: Bulk request exceeds `maxBulkAccessMethodUpdateLength` limit ## Examples **Storage Migration:** ```bash # Update single object after migration curl -X PUT "https://drs.example.org/objects/obj_123/access-methods" \ -d '{"access_methods": [{"type": "s3", "access_url": {"url": "s3://new-bucket/file.bam"}}]}' ``` **Bulk Migration:** ```bash # Migrate multiple objects atomically curl -X PUT "https://drs.example.org/objects/access-methods" \ -d '{ "updates": [ {"object_id": "obj_1", "access_methods": [...]}, {"object_id": "obj_2", "access_methods": [...]} ] }' ``` ## Best Practices **Clients**: Check service-info first, handle atomic transaction failures, respect bulk limits, verify permissions **Servers**: Advertise capabilities clearly, implement atomic transactions for bulk operations, validate permissions, consider optional validation for data integrity ## Backward Compatibility Access method update functionality is designed to be backward compatible: - **No Impact on Existing Endpoints**: All existing DRS endpoints remain unchanged - **Optional Implementation**: Servers can ignore this functionality entirely - **Graceful Degradation**: Clients receive 404 responses when not 
supported - **Safe Defaults**: New service-info fields have safe default values - name: Checksum Additions description: > # Add additional checksum > **Optional Functionality**: Checksum additions are optional extensions to the DRS API. Not all DRS servers are required to implement this functionality. Clients should check `/service-info` for `checksumAdditionSupported` before attempting to use these endpoints. Checksum addition endpoints allow authorized clients to add checksums to existing DRS objects. This is useful for servers that rely on objects using a specific checksum type, e.g. SHA-256, and where objects are not guaranteed to have this checksum at creation time, e.g. objects may be created with an MD5 checksum only. The server MAY choose to validate checksums and return errors for mismatches; this behaviour is advertised in the `validateChecksums` field in `/service-info`. These endpoints only support adding checksums. A client SHOULD NOT attempt to update the value of an existing checksum or to remove a checksum. If a client attempts to update an existing checksum the server behaviour is implementation-dependent: servers MAY simply ignore the request, or MAY return a 4XX error to the client. Servers MUST NOT change any existing checksums. If an incorrect checksum has been registered then clients should delete the existing DRS object (if supported by the server) and register a new DRS object with the correct metadata. This ensures that a single DRS object ID _always_ points to the same object.
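The add-only rule can be sketched from the server's point of view as follows. This is an illustrative assumption about one way to implement it, not prescribed behaviour: `add_checksums` and its `reject_conflicts` flag are hypothetical names, and the flag switches between the two permitted responses to a conflicting type (returning an error or silently ignoring it).

```python
from typing import Dict, List

Checksum = Dict[str, str]


def add_checksums(
    existing: List[Checksum],
    additions: List[Checksum],
    reject_conflicts: bool = True,
) -> List[Checksum]:
    """Return a new checksum list; existing entries are never modified."""
    present = {checksum["type"] for checksum in existing}
    accepted: List[Checksum] = []

    for checksum in additions:
        if checksum["type"] in present:
            if reject_conflicts:
                # A server would map this to a 4XX response.
                raise ValueError(f"checksum type {checksum['type']!r} already registered")
            continue  # ignore the conflicting entry
        present.add(checksum["type"])
        accepted.append(dict(checksum))

    # Build a fresh list so a rejected request leaves `existing` untouched.
    return [dict(checksum) for checksum in existing] + accepted
```

Because the function raises before returning anything, a rejected request changes nothing, which matches the atomic behaviour expected of these endpoints.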
## Design Principles - **Optional**: Checksum addition support is completely optional - **Object Immutability**: Existing checksums cannot be changed - **Atomic Bulk Operations**: All additions succeed or all fail (transactional) - **Flexible Authentication**: Supports GA4GH Passports, Bearer tokens, API keys ## Service Discovery Check `/service-info` for checksum addition capabilities: ```json { "drs": { "checksumAdditionSupported": true, "maxBulkChecksumAdditionLength": 100, "verifyChecksums": true } } ``` - **`checksumAdditionSupported`**: Whether server supports checksum addition - **`maxBulkChecksumAdditionLength`**: Maximum objects per bulk addition request - **`verifyChecksums`**: Whether server validates new checksums ## Single Object Checksum Addition Add a checksum to a single DRS object: ```bash curl -X PUT "https://drs.example.org/objects/obj_123/checksums" \ -H "Content-Type: application/json" \ -d '{ "checksums": [ { "checksum": "2320831154385267afee81d0d837473280117763f4acd426b3735c37a0500482", "type": "sha256" } ] }' ``` ## Bulk Object Checksum Addition Add checksums for multiple objects atomically: ```bash curl -X PUT "https://drs.example.org/objects/checksums" \ -H "Content-Type: application/json" \ -d '{ "additions": [ { "object_id": "obj_123", "checksums": [ { "checksum": "2320831154385267afee81d0d837473280117763f4acd426b3735c37a0500482", "type": "sha256" } ] }, { "object_id": "obj_456", "checksums": [ { "checksum": "23d50c6804a8b198f7fe4ff11d4518fb46d8d8d1337c6b9aa0fbad7bb90b3d32", "type": "sha256" } ] } ] }' ``` ## Authentication **GA4GH Passports** (in request body): ```json { "additions": [...], "passports": ["eyJhbGci..."] } ``` **Bearer Tokens** (in headers): ```bash curl -H "Authorization: Bearer token" -d '{"additions": [...]}' ... ``` ## Validation Servers MAY validate that new checksums match the underlying objects; this behaviour is advertised in the `validateChecksums` service-info field.
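As a rough illustration of what such validation might involve, the sketch below recomputes a digest over the object bytes and compares it with the submitted value. It assumes the bytes are available in memory and handles only the two checksum types that appear in this document's examples (`md5` and `sha256`); `checksum_matches` is a hypothetical name, and a real server would more likely stream data from its storage backend.

```python
import hashlib
from typing import Dict

# Checksum types used in this document's examples, mapped to hashlib constructors.
_ALGORITHMS = {"md5": hashlib.md5, "sha256": hashlib.sha256}


def checksum_matches(data: bytes, checksum: Dict[str, str]) -> bool:
    """Return True if `data` hashes to the hex digest in `checksum`."""
    try:
        algorithm = _ALGORITHMS[checksum["type"]]
    except KeyError:
        raise ValueError(f"unsupported checksum type: {checksum.get('type')!r}")
    digest = algorithm(data).hexdigest()
    # Hex digests are compared case-insensitively.
    return digest == checksum["checksum"].lower()
```

A server using a check like this would reject the addition, for example with a 400 response, when the function returns `False`.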
## Error Responses - **400**: Invalid checksums or validation failure - **401**: Authentication required - **403**: Insufficient permissions for object(s) - **404**: Object not found or checksum additions not supported - **413**: Bulk request exceeds `maxBulkChecksumAdditionLength` limit ## Best Practices **Clients**: Check service-info first, handle atomic transaction failures, respect bulk limits, verify permissions **Servers**: Advertise capabilities clearly, implement atomic transactions for bulk operations, validate permissions, consider optional validation for data integrity ## Backward Compatibility Checksum addition functionality is designed to be backward compatible: - **No Impact on Existing Endpoints**: All existing DRS endpoints remain unchanged - **Optional Implementation**: Servers can ignore this functionality entirely - **Graceful Degradation**: Clients receive 404 responses when not supported x-tagGroups: - name: Overview tags: - Introduction - DRS API Principles - Authorization & Authentication - name: Operations tags: - Objects - Upload Request - Service Info - name: Models tags: - AccessMethodModel - AccessURLModel - ChecksumModel - ContentsObjectModels - DrsObjectModel - DrsObjectCandidateModel - ErrorModel - UploadRequestModel - UploadResponseModel - UploadRequestObjectModel - UploadResponseObjectModel - UploadMethodModel - DeleteRequestModel - BulkDeleteRequestModel - DeleteResultModel - BulkDeleteResponseModel - name: Appendices tags: - Motivation - Working With Compound Objects - Background Notes on DRS URIs - Compact Identifier-Based URIs - Hostname-Based URIs - GA4GH Service Registry - Upload Requests and Object Registration - Object Deletion - Access Method Update - Checksum Additions paths: /service-info: get: summary: Retrieve information about this service description: >- Returns information about the DRS service along with stats pertaining to total object count and cumulative size in bytes.
Also indicates whether the server supports optional upload and delete operations and which methods are available. Extends the [v1.0.0 GA4GH Service Info specification](https://github.com/ga4gh-discovery/ga4gh-service-info) as the standardized format for GA4GH web services to self-describe. According to the [service-info type registry](https://github.com/ga4gh/TASC/blob/master/service-info/ga4gh-service-info.json) maintained by the [Technical Alignment Sub Committee (TASC)](https://github.com/ga4gh/TASC), a DRS service MUST have: * a `type.group` value of `org.ga4gh` * a `type.artifact` value of `drs` **Example 1: Server with upload and delete capabilities** ``` { "id": "com.example.drs", "description": "Serves data according to DRS specification", ... "type": { "group": "org.ga4gh", "artifact": "drs", "version": "1.5" } ... "drs":{ "maxBulkRequestLength": 200, "objectCount": 774560, "totalObjectSize": 4018437188907752, "uploadRequestSupported": true, "objectRegistrationSupported": true, "supportedUploadMethodTypes": ["s3", "https", "gs"], "maxUploadSize": 5368709120, "maxUploadRequestLength": 50, "validateChecksums": true, "validateFileSizes": false, "relatedFileStorageSupported": true, "deleteSupported": true, "maxBulkDeleteLength": 100, "deleteStorageDataSupported": true } } ``` **Example 2: Read-only server (no upload or delete)** ``` { "id": "com.example.readonly-drs", "description": "Read-only DRS service", ... "type": { "group": "org.ga4gh", "artifact": "drs", "version": "1.5" } ... "drs":{ "maxBulkRequestLength": 500, "objectCount": 1250000, "totalObjectSize": 8500000000000000 } } ``` **Example 3: Server with metadata-only delete capability** ``` { "id": "com.example.metadata-drs", "description": "DRS service with metadata-only delete", ... "type": { "group": "org.ga4gh", "artifact": "drs", "version": "1.5" } ... 
"drs":{ "maxBulkRequestLength": 200, "objectCount": 500000, "totalObjectSize": 2500000000000000, "deleteSupported": true, "maxBulkDeleteLength": 50, "deleteStorageDataSupported": false } } ``` See the [Service Registry Appendix](#tag/GA4GH-Service-Registry) for more information on how to register a DRS service with a service registry. operationId: GetServiceInfo responses: '200': $ref: '#/components/responses/200ServiceInfo' '500': $ref: '#/components/responses/500InternalServerError' tags: - Service Info /objects/{object_id}: options: summary: Get Authorization info about a DrsObject. security: - {} description: >- Returns a list of `Authorizations` that can be used to determine how to authorize requests to `GetObject` or `PostObject`. operationId: OptionsObject parameters: - $ref: '#/components/parameters/ObjectId' responses: '200': $ref: '#/components/responses/200OkAuthorizations' '204': $ref: '#/components/responses/AuthorizationsNotSupported' '400': $ref: '#/components/responses/400BadRequest' '404': $ref: '#/components/responses/404NotFoundDrsObject' '405': $ref: '#/components/responses/AuthorizationsNotSupported' '500': $ref: '#/components/responses/500InternalServerError' tags: - Objects x-swagger-router-controller: ga4gh.drs.server get: summary: Get info about a DrsObject. description: >- Returns object metadata, and a list of access methods that can be used to fetch object bytes. 
operationId: GetObject parameters: - $ref: '#/components/parameters/ObjectId' - $ref: '#/components/parameters/Expand' responses: '200': $ref: '#/components/responses/200OkDrsObject' '202': $ref: '#/components/responses/202Accepted' '400': $ref: '#/components/responses/400BadRequest' '401': $ref: '#/components/responses/401Unauthorized' '403': $ref: '#/components/responses/403Forbidden' '404': $ref: '#/components/responses/404NotFoundDrsObject' '500': $ref: '#/components/responses/500InternalServerError' tags: - Objects x-swagger-router-controller: ga4gh.drs.server post: summary: Get info about a DrsObject through POST'ing a Passport. description: >- Returns object metadata and a list of access methods that can be used to fetch object bytes. Method is a POST to accommodate a JWT GA4GH Passport sent in the request body in order to authorize access. **Note**: To upload new files and register them as DRS objects, use the `/upload-request` endpoint to obtain upload methods and temporary credentials, then use POST `/objects/register` endpoint to register multiple objects at once. Note that upload functionality is optional and not all DRS servers implement the upload endpoints. operationId: PostObject security: - PassportAuth: [] responses: '200': $ref: '#/components/responses/200OkDrsObject' '202': $ref: '#/components/responses/202Accepted' '400': $ref: '#/components/responses/400BadRequest' '401': $ref: '#/components/responses/401Unauthorized' '403': $ref: '#/components/responses/403Forbidden' '404': $ref: '#/components/responses/404NotFoundAccess' '500': $ref: '#/components/responses/500InternalServerError' tags: - Objects x-swagger-router-controller: ga4gh.drs.server parameters: - $ref: '#/components/parameters/ObjectId' requestBody: $ref: '#/components/requestBodies/PostObjectBody' /objects: options: summary: Get Authorization info about multiple DrsObjects. 
security: - {} description: >- Returns a structure that contains for each DrsObjects a list of `Authorizations` that can be used to determine how to authorize requests to `GetObject` or `PostObject` (or bulk equivalents). operationId: OptionsBulkObject responses: '200': $ref: '#/components/responses/200OkBulkAuthorizations' '204': $ref: '#/components/responses/AuthorizationsNotSupported' '400': $ref: '#/components/responses/400BadRequest' '404': $ref: '#/components/responses/404NotFoundDrsObject' '405': $ref: '#/components/responses/AuthorizationsNotSupported' '413': $ref: '#/components/responses/413RequestTooLarge' '500': $ref: '#/components/responses/500InternalServerError' tags: - Objects x-swagger-router-controller: ga4gh.drs.server post: summary: Get info about multiple DrsObjects with an optional Passport(s). description: >- Returns an array of object metadata and access methods for the specified object IDs. The request is limited to use passports (one or more) or a single bearer token, so make sure your bulk request is for objects that all use the same passports/token. **Note**: To register new DRS objects, use the dedicated `/objects/register` endpoint. 
operationId: GetBulkObjects security: - PassportAuth: [] parameters: - $ref: '#/components/parameters/Expand' responses: '200': $ref: '#/components/responses/200OkDrsObjects' '202': $ref: '#/components/responses/202Accepted' '400': $ref: '#/components/responses/400BadRequest' '401': $ref: '#/components/responses/401Unauthorized' '403': $ref: '#/components/responses/403Forbidden' '404': $ref: '#/components/responses/404NotFoundDrsObject' '413': $ref: '#/components/responses/413RequestTooLarge' '500': $ref: '#/components/responses/500InternalServerError' tags: - Objects x-swagger-router-controller: ga4gh.drs.server requestBody: $ref: '#/components/requestBodies/BulkObjectBody' /objects/{object_id}/access/{access_id}: get: summary: Get a URL for fetching bytes description: >- Returns a URL that can be used to fetch the bytes of a `DrsObject`. This method only needs to be called when using an `AccessMethod` that contains an `access_id` (e.g., for servers that use signed URLs for fetching object bytes). operationId: GetAccessURL responses: '200': $ref: '#/components/responses/200OkAccess' '202': $ref: '#/components/responses/202Accepted' '400': $ref: '#/components/responses/400BadRequest' '401': $ref: '#/components/responses/401Unauthorized' '403': $ref: '#/components/responses/403Forbidden' '404': $ref: '#/components/responses/404NotFoundAccess' '500': $ref: '#/components/responses/500InternalServerError' tags: - Objects x-swagger-router-controller: ga4gh.drs.server parameters: - $ref: '#/components/parameters/ObjectId' - $ref: '#/components/parameters/AccessId' post: summary: Get a URL for fetching bytes through POST'ing a Passport description: >- Returns a URL that can be used to fetch the bytes of a `DrsObject`. This method only needs to be called when using an `AccessMethod` that contains an `access_id` (e.g., for servers that use signed URLs for fetching object bytes). 
Method is a POST to accommodate a JWT GA4GH Passport sent in the formData in order to authorize access. operationId: PostAccessURL security: - PassportAuth: [] responses: '200': $ref: '#/components/responses/200OkAccess' '202': $ref: '#/components/responses/202Accepted' '400': $ref: '#/components/responses/400BadRequest' '401': $ref: '#/components/responses/401Unauthorized' '403': $ref: '#/components/responses/403Forbidden' '404': $ref: '#/components/responses/404NotFoundAccess' '500': $ref: '#/components/responses/500InternalServerError' tags: - Objects x-swagger-router-controller: ga4gh.drs.server parameters: - $ref: '#/components/parameters/ObjectId' - $ref: '#/components/parameters/AccessId' requestBody: $ref: '#/components/requestBodies/Passports' /objects/access: post: summary: >- Get URLs for fetching bytes from multiple objects with an optional Passport(s). description: >- Returns an array of URL objects that can be used to fetch the bytes of multiple `DrsObject`s. This method only needs to be called when using an `AccessMethod` that contains an `access_id` (e.g., for servers that use signed URLs for fetching object bytes). Currently this is limited to use passports (one or more) or a single bearer token, so make sure your bulk request is for objects that all use the same passports/token. 
operationId: GetBulkAccessURL security: - PassportAuth: [] responses: '200': $ref: '#/components/responses/200OkAccesses' '202': $ref: '#/components/responses/202Accepted' '400': $ref: '#/components/responses/400BadRequest' '401': $ref: '#/components/responses/401Unauthorized' '403': $ref: '#/components/responses/403Forbidden' '404': $ref: '#/components/responses/404NotFoundAccess' '413': $ref: '#/components/responses/413RequestTooLarge' '500': $ref: '#/components/responses/500InternalServerError' tags: - Objects x-swagger-router-controller: ga4gh.drs.server requestBody: required: true content: application/json: schema: $ref: '#/components/schemas/BulkObjectAccessId' /objects/register: post: summary: Register DRS objects description: >- **Optional Endpoint**: This endpoint is not required for DRS server implementations. Not all DRS servers support object registration. Registers one or more "candidate" DRS objects with the server. If it accepts the request, the server will create unique object IDs for each registered object and return them in fully-formed DRS objects in response. This endpoint can be used after uploading files using methods negotiated with the `/upload-request` endpoint to register the uploaded files as DRS objects, or to register existing data. The request body should contain candidate DRS objects with all required metadata including access methods that correspond to the upload methods used during file upload. **RECOMMENDED - Transactional Behavior**: Registration operations SHOULD be atomic transactions. If ANY candidate object fails validation or registration, the ENTIRE request SHOULD fail and NO objects SHOULD be registered. Servers SHOULD implement this as an all-or-nothing operation to ensure data consistency, but MAY implement partial registration with appropriate error reporting if transactional behavior is not feasible. **Authentication**: GA4GH Passports can be provided in the request body for authorization. 
Bearer tokens can be supplied in headers. **Server Responsibilities**: - SHOULD treat registration as an atomic transaction (all succeed or all fail) - SHOULD validate ALL candidate objects before registering ANY - Create unique object IDs for each registered object - Add timestamps (created_time, updated_time) - SHOULD roll back any partial changes if any candidate fails validation **Client Responsibilities**: - Provide required DRS object metadata for all candidates - Include access methods corresponding to uploaded file locations - Ensure checksums match uploaded file content - Handle potential failure of entire batch if any single object is invalid operationId: RegisterObjects security: - {} - BasicAuth: [] - BearerAuth: [] - PassportAuth: [] requestBody: $ref: '#/components/requestBodies/RegisterObjectsBody' responses: '201': $ref: '#/components/responses/201ObjectsCreated' '400': $ref: '#/components/responses/400BadRequest' '401': $ref: '#/components/responses/401Unauthorized' '403': $ref: '#/components/responses/403Forbidden' '413': $ref: '#/components/responses/413RequestTooLarge' '500': $ref: '#/components/responses/500InternalServerError' tags: - Register Objects x-swagger-router-controller: ga4gh.drs.server x-codegen-request-body-name: body /objects/{object_id}/access-methods: put: summary: Update access methods for a DRS object description: >- **Optional Endpoint**: Not all DRS servers support access method updates. Update the access methods for an existing DRS object. Only access methods are modified - core object metadata (size, checksums, name) remains unchanged. Servers MAY validate that new access methods point to the same data. Note that existing access methods are overwritten, if clients want to add additional access methods they should first retrieve the current methods and include them along with the new methods in this request. **Authentication**: GA4GH Passports can be provided in the request body. 
operationId: updateObjectAccessMethods parameters: - name: object_id in: path required: true schema: type: string description: DRS object identifier requestBody: $ref: '#/components/requestBodies/AccessMethodUpdateBody' responses: '200': $ref: '#/components/responses/200AccessMethodUpdate' '400': $ref: '#/components/responses/400BadRequest' '401': $ref: '#/components/responses/401Unauthorized' '403': $ref: '#/components/responses/403Forbidden' '404': $ref: '#/components/responses/404NotFoundDrsObject' '500': $ref: '#/components/responses/500InternalServerError' security: - {} - BasicAuth: [] - BearerAuth: [] - PassportAuth: [] tags: - Objects /objects/access-methods: put: summary: Bulk update access methods for multiple DRS objects description: >- **Optional Endpoint**: Not all DRS servers support access method updates. Update access methods for multiple DRS objects in a single atomic transaction. If ANY object fails to update, the ENTIRE request fails and NO objects are updated. Only access methods are modified - core object metadata remains unchanged. Note that existing access methods are overwritten, if clients want to add additional access methods they should first retrieve the current methods and include them along with the new methods in this request. **Authentication**: GA4GH Passports can be provided in the request body. 
operationId: bulkUpdateAccessMethods requestBody: $ref: '#/components/requestBodies/BulkAccessMethodUpdateBody' responses: '200': $ref: '#/components/responses/200BulkAccessMethodUpdate' '400': $ref: '#/components/responses/400BadRequest' '401': $ref: '#/components/responses/401Unauthorized' '403': $ref: '#/components/responses/403Forbidden' '404': $ref: '#/components/responses/404NotFoundDrsObject' '413': $ref: '#/components/responses/413RequestTooLarge' '500': $ref: '#/components/responses/500InternalServerError' security: - {} - BasicAuth: [] - BearerAuth: [] - PassportAuth: [] tags: - Objects /objects/{object_id}/checksums: put: summary: Add additional checksums for a DRS object description: >- **Optional Endpoint**: Not all DRS servers support checksum addition. Add additional checksums for a DRS object. This endpoint does not support altering existing checksums, only adding additional checksum types that aren't already present in the DRS object. If a client attempts to update an existing checksum type the server MAY ignore the update or MAY return an error. **Authentication**: GA4GH Passports can be provided in the request body. operationId: addChecksums parameters: - name: object_id in: path required: true schema: type: string description: DRS object identifier requestBody: $ref: '#/components/requestBodies/ChecksumAdditionBody' responses: '200': $ref: '#/components/responses/200ChecksumAddition' '400': $ref: '#/components/responses/400BadRequest' '401': $ref: '#/components/responses/401Unauthorized' '403': $ref: '#/components/responses/403Forbidden' '404': $ref: '#/components/responses/404NotFoundDrsObject' '413': $ref: '#/components/responses/413RequestTooLarge' '500': $ref: '#/components/responses/500InternalServerError' security: - {} - BasicAuth: [] - BearerAuth: [] - PassportAuth: [] tags: - Objects /objects/checksum/{checksum}: get: summary: Get DRS objects that are a match for the checksum. 
description: >- Returns an array of `DRSObjects` that match a given checksum. Because the checksum type is not provided, the checksum is compared against all checksum types. operationId: GetObjectsByChecksum security: - PassportAuth: [] parameters: - $ref: '#/components/parameters/Checksum' responses: '200': $ref: '#/components/responses/200OkDrsObjects' '202': $ref: '#/components/responses/202Accepted' '400': $ref: '#/components/responses/400BadRequest' '401': $ref: '#/components/responses/401Unauthorized' '403': $ref: '#/components/responses/403Forbidden' '404': $ref: '#/components/responses/404NotFoundDrsObject' '500': $ref: '#/components/responses/500InternalServerError' tags: - Objects x-swagger-router-controller: ga4gh.drs.server /objects/checksums: put: summary: Add additional checksums for multiple DRS objects description: >- **Optional Endpoint**: Not all DRS servers support checksum addition. Add additional checksums for multiple DRS objects in a single atomic transaction. If ANY object fails to update, the ENTIRE request fails and NO objects are updated. This endpoint does not support altering existing checksums, only adding additional checksum types that aren't already present in the DRS object. If a client attempts to update an existing checksum type the server MAY ignore the update or MAY return an error. **Authentication**: GA4GH Passports can be provided in the request body.
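Since a server may reject or ignore checksum types that an object already has, a client can filter its candidates before sending them. The following is a minimal Python sketch, assuming checksums are dicts shaped like the `Checksum` schema (`checksum` and `type` keys) and that types are compared as exact strings; `select_new_checksums` is a hypothetical helper, not part of DRS.

```python
def select_new_checksums(existing, candidates):
    """Keep only candidate checksums whose type the object does not have yet.

    Types are compared as exact strings (an assumption of this sketch).
    If the candidates repeat a type, only the first occurrence is kept.
    """
    known_types = {c["type"] for c in existing}
    selected = []
    for candidate in candidates:
        if candidate["type"] not in known_types:
            selected.append(candidate)
            known_types.add(candidate["type"])
    return selected


# Checksums already recorded on the object (example values).
existing = [{"type": "md5", "checksum": "72794b6d"}]
# The md5 entry would alter an existing type, so it is dropped.
candidates = [
    {"type": "md5", "checksum": "5e089d29"},
    {"type": "sha-256", "checksum": "f7a29a04"},
]
to_add = select_new_checksums(existing, candidates)
```

Filtering on the client side matters most for the bulk endpoint, where one rejected object fails the whole atomic request.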
operationId: bulkAddChecksums requestBody: $ref: '#/components/requestBodies/BulkChecksumAdditionBody' responses: '200': $ref: '#/components/responses/200BulkChecksumAddition' '400': $ref: '#/components/responses/400BadRequest' '401': $ref: '#/components/responses/401Unauthorized' '403': $ref: '#/components/responses/403Forbidden' '404': $ref: '#/components/responses/404NotFoundDrsObject' '413': $ref: '#/components/responses/413RequestTooLarge' '500': $ref: '#/components/responses/500InternalServerError' security: - {} - BasicAuth: [] - BearerAuth: [] - PassportAuth: [] tags: - Objects /objects/{object_id}/delete: put: summary: Delete a DRS object (optional endpoint) description: >- **Optional Endpoint**: This endpoint is not required for DRS server implementations. Not all DRS servers support delete functionality. Deletes a DRS object by ID. This operation removes the DRS object metadata and optionally attempts to delete the underlying storage data based on the `delete_storage_data` parameter and server capabilities. By default, only DRS object metadata is deleted while preserving underlying storage data. To attempt storage data deletion, clients must explicitly set delete_storage_data to true and the server must support storage data deletion (advertised via `deleteStorageDataSupported` in service-info). Servers will make a best effort attempt to delete storage data, but success is not guaranteed. This endpoint uses an HTTP PUT rather than DELETE to accommodate GA4GH Passport authentication in the request body, ensuring compatibility across all HTTP clients and proxies. 
**Important**: A 204 No Content response indicates metadata deletion success only, not storage deletion success. Storage deletion is not guaranteed to complete synchronously, if it occurs at all. operationId: DeleteObject security: - {} - BasicAuth: [] - BearerAuth: [] - PassportAuth: [] parameters: - $ref: '#/components/parameters/ObjectId' requestBody: $ref: '#/components/requestBodies/DeleteBody' responses: '204': $ref: '#/components/responses/204DeleteSuccess' '400': $ref: '#/components/responses/400BadRequestDelete' '401': $ref: '#/components/responses/401Unauthorized' '403': $ref: '#/components/responses/403ForbiddenDelete' '404': $ref: '#/components/responses/404NotFoundDelete' '500': $ref: '#/components/responses/500InternalServerError' tags: - Objects x-swagger-router-controller: ga4gh.drs.server x-codegen-request-body-name: body /objects/delete: put: summary: Delete multiple DRS objects description: >- **Optional Endpoint**: This endpoint is not required for DRS server implementations. Not all DRS servers support delete functionality. Delete multiple DRS objects in a single atomic transaction. If ANY object fails to be deleted, the ENTIRE request fails and NO objects are deleted. This ensures data consistency and prevents partial deletion scenarios. **RECOMMENDED - Transactional Behavior**: Deletion operations SHOULD be atomic transactions. If ANY object fails validation or deletion, the ENTIRE request SHOULD fail and NO objects SHOULD be deleted. Servers SHOULD implement this as an all-or-nothing operation to ensure data consistency, but MAY implement partial deletion with appropriate error reporting if transactional behavior is not feasible. **Authentication**: GA4GH Passports can be provided in the request body for authorization. **Storage Data Deletion**: The `delete_storage_data` parameter controls whether the server will attempt to delete underlying storage files along with DRS metadata. This defaults to false for safety.
Servers will make a best effort attempt to delete storage data, but success is not guaranteed. **Server Responsibilities**: - SHOULD treat deletion as an atomic transaction (all succeed or all fail) - SHOULD validate ALL object IDs exist and are accessible before deleting ANY - SHOULD roll back any partial changes if any object fails deletion - SHOULD return 400 if any object ID is invalid or inaccessible when using transactional behavior **Client Responsibilities**: - Provide valid object IDs for all objects to be deleted - Handle potential failure of entire batch if any single object cannot be deleted - Check service-info for `maxBulkDeleteLength` limits before making requests operationId: bulkDeleteObjects tags: - Objects requestBody: $ref: '#/components/requestBodies/BulkDeleteBody' responses: '204': $ref: '#/components/responses/204DeleteSuccess' '400': $ref: '#/components/responses/400BadRequestDelete' '401': $ref: '#/components/responses/401Unauthorized' '403': $ref: '#/components/responses/403ForbiddenDelete' '404': $ref: '#/components/responses/404NotFoundDelete' '413': $ref: '#/components/responses/413RequestTooLarge' '500': $ref: '#/components/responses/500InternalServerError' security: - {} - BasicAuth: [] - BearerAuth: [] - PassportAuth: [] x-codegen-request-body-name: body /upload-request: post: summary: Request upload methods for files description: >- **Optional Endpoint**: This endpoint is not required for DRS server implementations. Not all DRS servers support upload functionality. Request upload method details and temporary credentials for uploading one or more files to an underlying storage service. This endpoint allows clients to obtain the necessary information to upload files before they are registered as DRS objects. **Discovery**: Before using this endpoint, clients should check the `/service-info` endpoint to determine if upload operations are supported. 
Look for `drs.uploadRequestSupported: true` and `drs.supportedUploadMethodTypes` to understand which upload methods are available. Also check `drs.maxUploadSize` and `drs.maxUploadRequestLength` for server limits. **Usage Flow:** 1. **Discovery**: Client checks `/service-info` endpoint to confirm upload support (`drs.uploadRequestSupported: true`) and available methods (`drs.supportedUploadMethodTypes`) 2. Client sends an upload request with file metadata (name, size, checksums, MIME type) and preferred upload method 3. Server responds with available upload methods (S3, HTTPS, Google Cloud Storage, etc.) and temporary credentials 4. Client selects one or more upload methods from the response and uses the corresponding credentials to upload the file to the storage service 5. Once uploaded, the client registers the files as DRS objects with a POST request to `/objects/register`, including access methods that correspond to the upload methods used. The server returns fully formed DRS objects with server-minted unique IDs. 6. The registered DRS object is accessible through DRS API read endpoints **Authentication:** The endpoint supports multiple authentication methods including GA4GH Passport tokens sent in the request body. Passport tokens enable fine-grained authorization based on data access policies. **Upload Methods**: Response may include multiple options (s3, https, gs, ftp/sftp) for flexibility. Note that servers may return a subset of their advertised `supportedUploadMethodTypes` based on file-specific factors such as file type, size, or server policies. **File Integrity**: All requests must include at least one checksum per file (SHA-256, MD5, or other IANA-registered algorithms). **Server Validation**: Servers MAY validate checksums/sizes but are not required to. Check service-info for validation behavior. Servers do not validate MIME types against actual file content - clients are responsible for providing accurate MIME type information.
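The discovery step can be sketched as a client-side pre-flight check. The following minimal Python example assumes the `/service-info` response has already been fetched and parsed into a dict; `check_upload_request` is a hypothetical helper, and treating a missing `supportedUploadMethodTypes` list as "nothing advertised" is an assumption of this sketch rather than a DRS rule.

```python
def check_upload_request(service_info, file_sizes, method_type):
    """Return a list of problems; an empty list means the request looks fine."""
    drs = service_info.get("drs", {})
    # A missing flag is treated like false, matching the documented default.
    if not drs.get("uploadRequestSupported", False):
        return ["upload requests are not supported"]

    problems = []
    if method_type not in drs.get("supportedUploadMethodTypes", []):
        problems.append(f"upload method {method_type!r} is not advertised")

    # No maxUploadSize means no explicit size limit.
    max_size = drs.get("maxUploadSize")
    if max_size is not None:
        for size in file_sizes:
            if size > max_size:
                problems.append(f"file of {size} bytes exceeds maxUploadSize")

    # maxUploadRequestLength falls back to maxBulkRequestLength when absent.
    max_files = drs.get("maxUploadRequestLength", drs.get("maxBulkRequestLength"))
    if max_files is not None and len(file_sizes) > max_files:
        problems.append(f"{len(file_sizes)} files exceed the limit of {max_files}")
    return problems


# Example service-info content (illustrative values only).
info = {
    "drs": {
        "uploadRequestSupported": True,
        "supportedUploadMethodTypes": ["s3", "https"],
        "maxUploadSize": 1000,
        "maxBulkRequestLength": 2,
    }
}
ok = check_upload_request(info, [10, 20], "s3")
bad = check_upload_request(info, [10, 2000, 5], "gs")
```

Even when this check passes, the server may still return only a subset of the advertised methods for a particular file, so the client should select from the methods actually present in the response.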
operationId: PostUploadRequest security: - {} - BasicAuth: [] - BearerAuth: [] - PassportAuth: [] requestBody: $ref: '#/components/requestBodies/UploadRequestBody' responses: '200': $ref: '#/components/responses/200UploadRequest' '400': $ref: '#/components/responses/400BadRequest' '401': $ref: '#/components/responses/401Unauthorized' '403': $ref: '#/components/responses/403Forbidden' '500': $ref: '#/components/responses/500InternalServerError' tags: - Upload Request components: securitySchemes: BasicAuth: type: http scheme: basic description: > A valid authorization token must be passed in the 'Authorization' header, e.g. "Basic ${token_string}" BearerAuth: type: http scheme: bearer description: >- A valid authorization token must be passed in the 'Authorization' header, e.g. "Bearer ${token_string}" PassportAuth: type: http scheme: bearer x-in: body bearerFormat: JWT description: >- A valid GA4GH Passport must be passed in the body of an HTTP POST request as a tokens[] array. schemas: ServiceType: description: Type of a GA4GH service type: object required: - group - artifact - version properties: group: type: string description: >- Namespace in reverse domain name format. Use `org.ga4gh` for implementations compliant with official GA4GH specifications. For services with custom APIs not standardized by GA4GH, or implementations diverging from official GA4GH specifications, use a different namespace (e.g. your organization's reverse domain name). example: org.ga4gh artifact: type: string description: >- Name of the API or GA4GH specification implemented. Official GA4GH types should be assigned as part of standards approval process. Custom artifacts are supported. example: beacon version: type: string description: >- Version of the API or specification. GA4GH specifications use semantic versioning. 
example: 1.0.0 Service: description: GA4GH service type: object required: - id - name - type - organization - version properties: id: type: string description: >- Unique ID of this service. Reverse domain name notation is recommended, though not required. The identifier should attempt to be globally unique so it can be used in downstream aggregator services e.g. Service Registry. example: org.ga4gh.myservice name: type: string description: Name of this service. Should be human readable. example: My project type: $ref: '#/components/schemas/ServiceType' description: type: string description: >- Description of the service. Should be human readable and provide information about the service. example: This service provides... organization: type: object description: Organization providing the service required: - name - url properties: name: type: string description: Name of the organization responsible for the service example: My organization url: type: string format: uri description: URL of the website of the organization (RFC 3986 format) example: https://example.com contactUrl: type: string format: uri description: >- URL of the contact for the provider of this service, e.g. a link to a contact form (RFC 3986 format), or an email (RFC 2368 format). example: mailto:support@example.com documentationUrl: type: string format: uri description: >- URL of the documentation of this service (RFC 3986 format). This should help someone learn how to use your service, including any specifics required to access data, e.g. authentication. 
example: https://docs.myservice.example.com createdAt: type: string format: date-time description: >- Timestamp describing when the service was first deployed and available (RFC 3339 format) example: '2019-06-04T12:58:19Z' updatedAt: type: string format: date-time description: >- Timestamp describing when the service was last updated (RFC 3339 format) example: '2019-06-04T12:58:19Z' environment: type: string description: >- Environment the service is running in. Use this to distinguish between production, development and testing/staging deployments. Suggested values are prod, test, dev, staging. However this is advised and not enforced. example: test version: type: string description: >- Version of the service being described. Semantic versioning is recommended, but other identifiers, such as dates or commit hashes, are also allowed. The version should be changed whenever the service is updated. example: 1.0.0 DrsService: type: object required: - type - maxBulkRequestLength properties: maxBulkRequestLength: type: integer description: >- DEPRECATED - In 2.0 this will move to under the drs section of service info and not at the root level. The max length the bulk request endpoints can handle (>= 1) before generating a 413 error e.g. how long can the arrays bulk_object_ids and bulk_object_access_ids be for this server. type: type: object required: - artifact properties: artifact: type: string enum: - drs example: drs drs: type: object required: - maxBulkRequestLength properties: maxBulkRequestLength: type: integer description: >- The max length the bulk request endpoints can handle (>= 1) before generating a 413 error e.g. how long can the arrays bulk_object_ids and bulk_object_access_ids be for this server. objectCount: type: integer description: The total number of objects in this DRS service. totalObjectSize: type: integer description: >- The total size of all objects in this DRS service in bytes. 
As a general best practice, file bytes are counted for each unique file and not cloud mirrors or other redundant copies. uploadRequestSupported: type: boolean description: >- Indicates whether this DRS server supports upload request operations via the `/upload-request` endpoint. If true, clients can request upload methods and credentials for uploading files. If false or missing, the server does not support upload request coordination. default: false objectRegistrationSupported: type: boolean description: >- Indicates whether this DRS server supports object registration operations via the `/objects/register` endpoint. If true, clients can register uploaded files or existing data as DRS objects. If false or missing, the server does not support object registration. default: false supportedUploadMethodTypes: type: array items: type: string enum: - s3 - gs - https - ftp - sftp description: >- List of upload methods supported by this DRS server. Only present when uploadRequestSupported is true. Clients can use this information to determine which upload methods are available before making upload requests. - **s3**: Direct S3 upload with temporary AWS credentials - **gs**: Google Cloud Storage upload with access tokens - **https**: Presigned POST URL for HTTP uploads - **ftp**: File Transfer Protocol uploads - **sftp**: Secure File Transfer Protocol uploads - **gsiftp**: GridFTP secure file transfer - **globus**: Globus transfer service for high-performance data movement maxUploadSize: type: integer format: int64 description: >- Maximum file size in bytes that can be uploaded via the upload endpoints. Only present when uploadRequestSupported is true. If not specified, there is no explicit size limit. maxUploadRequestLength: type: integer description: >- Maximum number of files that can be included in a single upload request. Only present when uploadRequestSupported is true. If not specified, defaults to the same value as maxBulkRequestLength. 
maxRegisterRequestLength: type: integer description: >- Maximum number of candidate objects that can be included in a single registration request. Only present when objectRegistrationSupported is true. If not specified, defaults to the same value as maxBulkRequestLength. validateChecksums: type: boolean description: >- Indicates whether this DRS server validates file checksums against the provided metadata. If true, the server will verify that uploaded and registered files match their declared checksums and may reject objects with mismatches. If false or missing, the server does not perform checksum validation and relies on client-provided metadata. Only present when at least one of uploadRequestSupported, objectRegistrationSupported, or checksumAdditionSupported is true. default: false validateFileSizes: type: boolean description: >- Indicates whether this DRS server validates file sizes against the provided metadata. If true, the server will verify that uploaded files match their declared sizes and may reject uploads with mismatches. If false or missing, the server does not perform file size validation and relies on client-provided metadata. Only present when uploadRequestSupported or objectRegistrationSupported is true. default: false validateAccessMethods: type: boolean description: >- Indicates whether this DRS server validates access methods by following the URLs to check that they resolve to the expected objects (e.g. by checking that the file sizes and checksums match). If true, the server will attempt to verify checksums/content before accepting access methods. If false or missing, the server trusts client-provided access methods without validation. Only present when at least one of objectRegistrationSupported or accessMethodUpdateSupported is true. default: false relatedFileStorageSupported: type: boolean description: >- Indicates whether this DRS server supports storing files from the same upload request under a common prefix or folder structure.
If true, the server will organize related files together in storage, enabling bioinformatics workflows that expect co-located files (e.g., CRAM + CRAI, VCF + TBI). If false or missing, the server may distribute files across different storage locations or prefixes. Only present when uploadRequestSupported is true. This feature is particularly valuable for genomics tools like samtools that expect index files to be co-located with data files. default: false deleteSupported: type: boolean description: >- Indicates whether this DRS server supports delete operations via the delete endpoints. If true, clients can delete DRS objects using PUT requests to `/objects/{object_id}/delete` and `/objects/delete`. If false or missing, the server does not support delete operations and will return 404 for delete endpoint requests. Like upload functionality, delete support is entirely optional and servers remain DRS compliant without it. default: false maxBulkDeleteLength: type: integer description: >- Maximum number of objects that can be deleted in a single bulk delete request via `/objects/delete`. Only present when deleteSupported is true. If not specified when delete is supported, defaults to the same value as maxBulkRequestLength. Servers may enforce lower limits for delete operations compared to other bulk operations for safety reasons. deleteStorageDataSupported: type: boolean description: >- Indicates whether this DRS server supports attempting to delete underlying storage data when clients request it. If true, the server will attempt to delete both metadata and storage files when `delete_storage_data: true` is specified in delete requests. If false or missing, the server only supports metadata deletion regardless of client request, preserving underlying storage data. Only present when deleteSupported is true. This is a capability flag indicating what the server can attempt, not a default behavior setting. 
Note: Storage deletion attempts may fail due to permissions, network issues, or storage service errors. default: false accessMethodUpdateSupported: type: boolean description: >- Indicates whether this DRS server supports updating access methods for existing objects. If true, clients can update access methods using `/objects/{object_id}/access-methods` and `/objects/access-methods` endpoints. If false or missing, the server does not support access method updates. default: false maxBulkAccessMethodUpdateLength: type: integer description: >- Maximum number of objects that can be updated in a single bulk access method update request. Only present when accessMethodUpdateSupported is true. If not specified, defaults to maxBulkRequestLength. checksumAdditionSupported: type: boolean description: >- Indicates whether this DRS server supports adding new checksums for existing objects. If true, clients can add checksums using `/objects/{object_id}/checksums` and `/objects/checksums` endpoints. If false or missing, the server does not support checksum addition. default: false maxBulkChecksumAdditionLength: type: integer description: >- Maximum number of objects that can be updated in a single bulk checksum addition request. Only present when checksumAdditionSupported is true. If not specified, defaults to maxBulkRequestLength. fetchByChecksumSupported: type: boolean description: >- Indicates whether this DRS server supports fetching objects by checksum. If true, clients can fetch DRS objects using `/objects/checksum/{checksum}`, noting that it is possible for multiple objects to have the same checksum. If false or missing, the server does not support fetching by checksum. default: false Error: type: object description: An object that can optionally include information about the error. properties: msg: type: string description: A detailed error message. status_code: type: integer description: The integer representing the HTTP status code (e.g. 200, 404).
Checksum: type: object required: - checksum - type properties: checksum: type: string description: The hex-string encoded checksum for the data type: type: string description: >- The digest method used to create the checksum. The value (e.g. `sha-256`) SHOULD be listed as `Hash Name String` in the [IANA Named Information Hash Algorithm Registry](https://www.iana.org/assignments/named-information/named-information.xhtml#hash-alg). Other values MAY be used, as long as implementors are aware of the issues discussed in [RFC6920](https://tools.ietf.org/html/rfc6920#section-9.4). GA4GH may provide more explicit guidance for use of non-IANA-registered algorithms in the future. Until then, if implementers do choose such an algorithm (e.g. because it's implemented by their storage provider), they SHOULD use an existing standard `type` value such as `md5`, `etag`, `crc32c`, `trunc512`, or `sha1`. example: sha-256 AccessURL: type: object required: - url properties: url: type: string description: >- A fully resolvable URL that can be used to fetch the actual object bytes. headers: type: array items: type: string description: >- An optional list of headers to include in the HTTP request to `url`. These headers can be used to provide auth tokens required to fetch the object bytes. example: - 'Authorization: Basic Z2E0Z2g6ZHJz' Authorizations: type: object properties: drs_object_id: type: string supported_types: type: array items: type: string enum: - None - BasicAuth - BearerAuth - PassportAuth description: >- An optional list of supported authorization types. More than one can be supported and tried in sequence. Defaults to `None` if empty or missing. passport_auth_issuers: type: array items: type: string description: >- If authorizations contain `PassportAuth` this is a required list of visa issuers (as found in a visa's `iss` claim) that may authorize access to this object. The caller must only provide passports that contain visas from this list.
It is strongly recommended that the caller validate that it is appropriate to send the requested passport/visa to the DRS server to mitigate attacks by malicious DRS servers requesting credentials they should not have. bearer_auth_issuers: type: array items: type: string description: >- If authorizations contain `BearerAuth` this is an optional list of issuers that may authorize access to this object. The caller must provide a token from one of these issuers. If this is empty or missing it is assumed the caller knows which token to send via other means. It is strongly recommended that the caller validate that it is appropriate to send the requested token to the DRS server to mitigate attacks by malicious DRS servers requesting credentials they should not have. AccessMethod: type: object required: - type properties: type: type: string enum: - s3 - gs - ftp - gsiftp - globus - htsget - https - file description: Type of the access method. access_url: allOf: - $ref: '#/components/schemas/AccessURL' - description: >- An `AccessURL` that can be used to fetch the actual object bytes. Note that at least one of `access_url` and `access_id` must be provided. access_id: type: string description: >- An arbitrary string to be passed to the `/access` method to get an `AccessURL`. This string must be unique within the scope of a single object. Note that at least one of `access_url` and `access_id` must be provided. cloud: type: string description: >- Name of the cloud service provider that the object belongs to. If the cloud service is Amazon Web Services, Google Cloud Platform or Azure the values should be `aws`, `gcp`, or `azure` respectively. example: aws, gcp, or azure region: type: string description: >- Name of the region in the cloud service provider that the object belongs to. example: us-east-1 available: type: boolean description: >- Availability of file in the cloud. This label defines if this file is immediately accessible via DRS.
Any delay or requirement of thawing mechanism if the file is in offline/archival storage is classified as false, meaning it is unavailable. example: true authorizations: allOf: - $ref: '#/components/schemas/Authorizations' - description: >- When `access_id` is provided, `authorizations` provides information about how to authorize the `/access` method. ContentsObject: type: object required: - name properties: name: type: string description: >- A name declared by the bundle author that must be used when materialising this object, overriding any name directly associated with the object itself. The name must be unique within the containing bundle. This string is made up of uppercase and lowercase letters, decimal digits, hyphen, period, and underscore [A-Za-z0-9.-_]. See http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_282[portable filenames]. id: type: string description: >- A DRS identifier of a `DrsObject` (either a single blob or a nested bundle). If this ContentsObject is an object within a nested bundle, then the id is optional. Otherwise, the id is required. drs_uri: type: array description: >- A list of full DRS identifier URI paths that may be used to obtain the object. These URIs may be external to this DRS instance. example: - drs://drs.example.org/314159 items: type: string contents: type: array description: >- If this ContentsObject describes a nested bundle and the caller specified "?expand=true" on the request, then this contents array must be present and describe the objects within the nested bundle. items: $ref: '#/components/schemas/ContentsObject' DrsObject: type: object required: - id - self_uri - size - created_time - checksums properties: id: type: string description: An identifier unique to this `DrsObject` name: type: string description: >- A string that can be used to name a `DrsObject`. This string is made up of uppercase and lowercase letters, decimal digits, hyphen, period, and underscore [A-Za-z0-9.-_]. 
See http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_282[portable filenames]. self_uri: type: string description: >- A drs:// hostname-based URI, as defined in the DRS documentation, that tells clients how to access this object. The intent of this field is to make DRS objects self-contained, and therefore easier for clients to store and pass around. For example, if you arrive at this DRS JSON by resolving a compact identifier-based DRS URI, the `self_uri` presents you with a hostname and properly encoded DRS ID for use in subsequent `access` endpoint calls. example: drs://drs.example.org/314159 size: type: integer format: int64 description: >- For blobs, the blob size in bytes. For bundles, the cumulative size, in bytes, of items in the `contents` field. created_time: type: string format: date-time description: >- Timestamp of content creation in RFC3339. (This is the creation time of the underlying content, not of the JSON object.) updated_time: type: string format: date-time description: >- Timestamp of content update in RFC3339, identical to `created_time` in systems that do not support updates. (This is the update time of the underlying content, not of the JSON object.) version: type: string description: >- A string representing a version. (Some systems may use checksum, a RFC3339 timestamp, or an incrementing version number.) mime_type: type: string description: A string providing the mime-type of the `DrsObject`. example: application/json checksums: type: array minItems: 1 items: $ref: '#/components/schemas/Checksum' description: >- The checksum of the `DrsObject`. At least one checksum must be provided. For blobs, the checksum is computed over the bytes in the blob. For bundles, the checksum is computed over a sorted concatenation of the checksums of its top-level contained objects (not recursive, names not included). 
The list of checksums is sorted alphabetically (by hex code) before concatenation, and a further checksum is computed over the concatenated value. For example, if a bundle contains blobs with the following checksums: md5(blob1) = 72794b6d md5(blob2) = 5e089d29 Then the checksum of the bundle is: md5( concat( sort( md5(blob1), md5(blob2) ) ) ) = md5( concat( sort( 72794b6d, 5e089d29 ) ) ) = md5( concat( 5e089d29, 72794b6d ) ) = md5( 5e089d2972794b6d ) = f7a29a04 access_methods: type: array minItems: 1 items: $ref: '#/components/schemas/AccessMethod' description: >- The list of access methods that can be used to fetch the `DrsObject`. Required for single blobs; optional for bundles. contents: type: array description: >- If not set, this `DrsObject` is a single blob. If set, this `DrsObject` is a bundle containing the listed `ContentsObject`s (some of which may be further nested). items: $ref: '#/components/schemas/ContentsObject' description: type: string description: A human-readable description of the `DrsObject`. aliases: type: array items: type: string description: >- A list of strings that can be used to find other metadata about this `DrsObject` from external metadata sources. These aliases can be used to represent secondary accession numbers or external GUIDs. summary: type: object description: A summary of what was resolved. properties: requested: type: integer description: Number of items requested. resolved: type: integer description: Number of objects resolved. unresolved: type: integer description: Number of objects not resolved. unresolved: type: array description: Error codes for unresolved DRS objects, each listed with the affected object IDs. 
items: type: object properties: error_code: type: integer object_ids: type: array items: type: string BulkObjectAccessId: type: object description: The object that contains object_id/access_id tuples properties: passports: type: array items: type: string bulk_object_access_ids: type: array items: type: object properties: bulk_object_id: type: string description: DRS object ID bulk_access_ids: type: array description: DRS object access IDs items: type: string BulkAccessURL: type: object required: - url properties: drs_object_id: type: string drs_access_id: type: string url: type: string description: >- A fully resolvable URL that can be used to fetch the actual object bytes. headers: type: array items: type: string description: >- An optional list of headers to include in the HTTP request to `url`. These headers can be used to provide auth tokens required to fetch the object bytes. example: 'Authorization: Basic Z2E0Z2g6ZHJz' DrsObjectCandidate: type: object required: - size - checksums properties: name: type: string description: >- A string that can be used to name a `DrsObject`. This string is made up of uppercase and lowercase letters, decimal digits, hyphen, period, and underscore [A-Za-z0-9.-_]. See [portable filenames](http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_282). size: type: integer format: int64 description: >- For blobs, the blob size in bytes. For bundles, the cumulative size, in bytes, of items in the `contents` field. version: type: string description: >- A string representing a version. (Some systems may use a checksum, an RFC3339 timestamp, or an incrementing version number.) mime_type: type: string description: A string providing the mime-type of the `DrsObject`. example: application/json checksums: type: array minItems: 1 items: $ref: '#/components/schemas/Checksum' description: >- The checksum of the `DrsObject`. At least one checksum must be provided. For blobs, the checksum is computed over the bytes in the blob. 
For bundles, the checksum is computed over a sorted concatenation of the checksums of its top-level contained objects (not recursive, names not included). The list of checksums is sorted alphabetically (by hex code) before concatenation, and a further checksum is computed over the concatenated value. For example, if a bundle contains blobs with the following checksums: md5(blob1) = 72794b6d md5(blob2) = 5e089d29 Then the checksum of the bundle is: md5( concat( sort( md5(blob1), md5(blob2) ) ) ) = md5( concat( sort( 72794b6d, 5e089d29 ) ) ) = md5( concat( 5e089d29, 72794b6d ) ) = md5( 5e089d2972794b6d ) = f7a29a04 access_methods: type: array minItems: 1 items: $ref: '#/components/schemas/AccessMethod' description: >- The list of access methods that can be used to fetch the `DrsObject`. Required for single blobs; optional for bundles. contents: type: array description: >- If not set, this `DrsObject` is a single blob. If set, this `DrsObject` is a bundle containing the listed `ContentsObject`s (some of which may be further nested). items: $ref: '#/components/schemas/ContentsObject' description: type: string description: A human-readable description of the `DrsObject`. aliases: type: array items: type: string description: >- A list of strings that can be used to find other metadata about this `DrsObject` from external metadata sources. These aliases can be used to represent secondary accession numbers or external GUIDs. 
AccessMethodUpdateRequest: type: object required: - access_methods properties: access_methods: type: array items: $ref: '#/components/schemas/AccessMethod' minItems: 1 description: New access methods for the DRS object passports: type: array items: type: string description: Optional GA4GH Passport JWTs for authorization BulkAccessMethodUpdateRequest: type: object required: - updates properties: updates: type: array items: type: object required: - object_id - access_methods properties: object_id: type: string description: DRS object ID to update access_methods: type: array items: $ref: '#/components/schemas/AccessMethod' minItems: 1 description: New access methods for this object minItems: 1 description: Array of access method updates to perform passports: type: array items: type: string description: Optional GA4GH Passport JWTs for authorization ChecksumAdditionRequest: type: object required: - checksums properties: checksums: type: array items: $ref: '#/components/schemas/Checksum' minItems: 1 description: New checksums for the DRS object passports: type: array items: type: string description: Optional GA4GH Passport JWTs for authorization BulkChecksumAdditionRequest: type: object required: - additions properties: additions: type: array items: type: object required: - object_id - checksums properties: object_id: type: string description: DRS object ID to add checksums for checksums: type: array items: $ref: '#/components/schemas/Checksum' minItems: 1 description: New checksums for this object minItems: 1 description: Array of checksum additions passports: type: array items: type: string description: Optional GA4GH Passport JWTs for authorization DeleteRequest: type: object description: Request body for single object delete operations properties: passports: type: array items: type: string example: >- eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.JJ5rN0ktP0qwyZmIPpxmF_p7JsxAZH6L6brUxtad3CM description: >- the encoded JWT GA4GH Passport that 
contains embedded Visas. The overall JWT is signed as are the individual Passport Visas. delete_storage_data: type: boolean default: false description: >- If true, delete both DRS object metadata and underlying storage data (follows server's deleteStorageDataSupported capability). If false (default), only delete DRS object metadata while preserving underlying storage data. Clients must explicitly set this to true to enable storage data deletion, ensuring intentional choice for this potentially destructive operation. BulkDeleteRequest: type: object description: Request body for bulk delete operations required: - bulk_object_ids properties: bulk_object_ids: type: array items: type: string description: Array of DRS object IDs to delete example: - drs_object_123456 - drs_object_789012 - drs_object_345678 passports: type: array items: type: string example: >- eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.JJ5rN0ktP0qwyZmIPpxmF_p7JsxAZH6L6brUxtad3CM description: >- the encoded JWT GA4GH Passport that contains embedded Visas. The overall JWT is signed as are the individual Passport Visas. delete_storage_data: type: boolean default: false description: >- If true, delete both DRS object metadata and underlying storage data (follows server's deleteStorageDataSupported capability). If false (default), only delete DRS object metadata while preserving underlying storage data. Clients must explicitly set this to true to enable storage data deletion, ensuring intentional choice for this potentially destructive operation. 
UploadRequestObject: type: object required: - name - size - mime_type - checksums properties: name: type: string description: The name of the file to upload size: type: integer format: int64 description: Size of the file in bytes mime_type: type: string description: MIME type of the file checksums: type: array items: $ref: '#/components/schemas/Checksum' minItems: 1 description: Array of checksums for file integrity verification description: type: string description: Optional description of the file aliases: type: array items: type: string description: Optional array of alternative names for the file upload_method_types: type: array items: type: string description: >- Optional array of requested upload method types which should match those published by the server in the `supportedUploadMethodsTypes` field in the `/service-info` response. Servers SHALL try to honour requests, but the server may not be able to offer the requested upload type for specific file types/sizes etc. The server MAY return a 400 error if it cannot honour the request, or MAY return an alternative supported upload method. UploadRequest: type: object required: - requests properties: requests: type: array items: $ref: '#/components/schemas/UploadRequestObject' minItems: 1 description: Array of upload requests for files passports: type: array items: type: string description: Optional array of GA4GH Passport JWTs for authorization UploadMethod: type: object required: - type - access_url properties: type: type: string enum: - s3 - gs - https - ftp - sftp - gsiftp - globus description: >- Type of upload method. Implementations MAY support any subset of these types. The 'https' type can be used to return a presigned POST URL and is expected to be the most common implementation for typical file uploads. This method provides a simple HTTP POST interface that works with standard web clients. 
The 's3' type is primarily intended for uploads of large files that benefit from the multipart uploads and automatic retries implemented in AWS libraries. This method provides direct access to S3-specific upload capabilities. Other common implementations include 'gs' for Google Cloud Storage and 'sftp' for SSH File Transfer Protocol uploads. access_url: allOf: - $ref: '#/components/schemas/AccessURL' - description: >- An `AccessURL` that specifies where the file will be accessible after upload. This URL will be used as the access_url in the eventual DRS object, ensuring consistency between upload and retrieval operations. Note that `upload_details` may contain additional URLs, such as presigned POST URLs for uploading files; these may differ from the `access_url`. region: type: string description: >- Cloud region for the upload location. Optional for non-cloud storage types. example: us-east-1 upload_details: type: object additionalProperties: true description: >- A dictionary of storage-provider-specific configuration details that vary by upload method type. The contents and structure depend on the specific upload method being used. 
UploadResponseObject: type: object required: - name - size - mime_type - checksums properties: name: type: string description: The name of the file size: type: integer format: int64 description: Size of the file in bytes mime_type: type: string description: MIME type of the file checksums: type: array items: $ref: '#/components/schemas/Checksum' minItems: 1 description: Array of checksums for file integrity verification description: type: string description: Optional description of the file aliases: type: array items: type: string description: Optional array of alternative names upload_methods: type: array items: $ref: '#/components/schemas/UploadMethod' description: Available methods for uploading this file UploadResponse: type: object required: - responses properties: responses: type: array items: $ref: '#/components/schemas/UploadResponseObject' description: List of upload responses for the requested files responses: 200ServiceInfo: description: Retrieve info about the DRS service content: application/json: schema: allOf: - $ref: '#/components/schemas/Service' - $ref: '#/components/schemas/DrsService' 500InternalServerError: description: An unexpected error occurred. content: application/json: schema: $ref: '#/components/schemas/Error' 200OkDrsObject: description: The `DrsObject` was found successfully content: application/json: schema: $ref: '#/components/schemas/DrsObject' 202Accepted: description: > The operation is delayed and will continue asynchronously. The client should retry this same request after the delay specified by Retry-After header. headers: Retry-After: description: > Delay in seconds. The client should retry this same request after waiting for this duration. To simplify client response processing, this must be an integral relative time in seconds. This value SHOULD represent the minimum duration the client should wait before attempting the operation again with a reasonable expectation of success. 
When it is not feasible for the server to determine the actual expected delay, the server may return a brief, fixed value instead. schema: type: integer format: int64 400BadRequest: description: The request is malformed. content: application/json: schema: $ref: '#/components/schemas/Error' 401Unauthorized: description: The request is unauthorized. content: application/json: schema: $ref: '#/components/schemas/Error' 403Forbidden: description: The requester is not authorized to perform this action. content: application/json: schema: $ref: '#/components/schemas/Error' 404NotFoundDrsObject: description: The requested `DrsObject` wasn't found. content: application/json: schema: $ref: '#/components/schemas/Error' 404NotFoundAccess: description: The requested `AccessURL` wasn't found. content: application/json: schema: $ref: '#/components/schemas/Error' 200OkAuthorizations: description: '`Authorizations` were found successfully' content: application/json: schema: $ref: '#/components/schemas/Authorizations' AuthorizationsNotSupported: description: '`Authorizations` are not supported for this object. Default to `None`.' 200OkDrsObjects: description: The `DrsObject`s were found successfully content: application/json: schema: type: object properties: summary: $ref: '#/components/schemas/summary' unresolved_drs_objects: $ref: '#/components/schemas/unresolved' resolved_drs_object: type: array items: $ref: '#/components/schemas/DrsObject' 413RequestTooLarge: description: The bulk request is too large. content: application/json: schema: $ref: '#/components/schemas/Error' examples: bulk_limit_exceeded: summary: Bulk delete limit exceeded description: >- Request contains more objects than the server's maximum bulk delete limit value: msg: >- Bulk delete request contains 150 objects but server maximum is 100. Check maxBulkDeleteLength in service-info. 
status_code: 413 request_size_too_large: summary: Request payload too large description: The overall request payload exceeds server limits value: msg: Request payload size exceeds server limit of 1MB status_code: 413 200OkBulkAuthorizations: description: '`Authorizations` were found successfully' content: application/json: schema: type: object properties: summary: $ref: '#/components/schemas/summary' unresolved_drs_objects: $ref: '#/components/schemas/unresolved' resolved_drs_object: type: array items: $ref: '#/components/schemas/Authorizations' 200OkAccess: description: The `AccessURL` was found successfully content: application/json: schema: $ref: '#/components/schemas/AccessURL' 200OkAccesses: description: The `AccessURL`s were found successfully content: application/json: schema: type: object properties: summary: $ref: '#/components/schemas/summary' unresolved_drs_objects: $ref: '#/components/schemas/unresolved' resolved_drs_object_access_urls: type: array items: $ref: '#/components/schemas/BulkAccessURL' 201ObjectsCreated: description: >- DRS objects were successfully registered as an atomic transaction. Returns the complete DRS objects with server-minted IDs and timestamps. All candidate objects were validated and registered together; if any had failed, none would have been registered. 
content: application/json: schema: type: object required: - objects properties: objects: type: array items: $ref: '#/components/schemas/DrsObject' description: >- Array of registered DRS objects in the same order as the candidates in the request examples: single_object_created: summary: Single object registered description: Response after registering one DRS object value: objects: - id: drs_obj_a1b2c3d4e5f6 self_uri: drs://drs.example.org/drs_obj_a1b2c3d4e5f6 name: sample_data.vcf size: 1048576 mime_type: text/plain created_time: '2024-01-15T10:30:00Z' updated_time: '2024-01-15T10:30:00Z' version: '1.0' checksums: - checksum: >- e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 type: sha-256 access_methods: - type: s3 access_url: url: s3://my-bucket/uploads/sample_data.vcf description: Variant call format file for sample analysis multiple_objects_created: summary: Multiple objects registered description: Response after registering multiple DRS objects value: objects: - id: drs_obj_a1b2c3d4e5f6 self_uri: drs://drs.example.org/drs_obj_a1b2c3d4e5f6 name: genome_assembly.fasta size: 3221225472 mime_type: text/plain created_time: '2024-01-15T09:00:00Z' updated_time: '2024-01-15T09:00:00Z' version: '1.0' checksums: - checksum: >- a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3 type: sha-256 access_methods: - type: s3 access_url: url: s3://genomics-bucket/assemblies/hg38.fasta description: Human genome reference assembly - id: drs_obj_f6e5d4c3b2a1 self_uri: drs://drs.example.org/drs_obj_f6e5d4c3b2a1 name: annotations.gff3 size: 524288000 mime_type: text/plain created_time: '2024-01-15T09:15:00Z' updated_time: '2024-01-15T09:15:00Z' version: '1.0' checksums: - checksum: >- b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9 type: sha-256 access_methods: - type: https access_url: url: https://data.example.org/files/annotations.gff3 description: Gene annotations in GFF3 format 200AccessMethodUpdate: description: >- Access methods 
successfully updated. Returns the updated DRS object with new access methods and updated timestamp. content: application/json: schema: $ref: '#/components/schemas/DrsObject' 200BulkAccessMethodUpdate: description: >- Access methods successfully updated for all objects. Returns updated DRS objects with new access methods and updated timestamps. content: application/json: schema: type: object required: - objects properties: objects: type: array items: $ref: '#/components/schemas/DrsObject' description: Array of updated DRS objects 200ChecksumAddition: description: >- Checksums successfully added. Returns the updated DRS object with new checksums and updated timestamp. content: application/json: schema: $ref: '#/components/schemas/DrsObject' 200BulkChecksumAddition: description: >- Checksums added for all objects. Returns updated DRS objects with new checksums and updated timestamps. content: application/json: schema: type: object required: - objects properties: objects: type: array items: $ref: '#/components/schemas/DrsObject' description: Array of updated DRS objects 204DeleteSuccess: description: >- All DRS objects were successfully deleted. For bulk operations, this indicates that the entire atomic transaction completed successfully - all requested objects have been deleted. Storage data deletion (if requested) was attempted but success is not guaranteed. 400BadRequestDelete: description: >- The delete request is malformed or contains unsupported parameters (e.g., delete_storage_data: true when server doesn't support storage data deletion). content: application/json: schema: $ref: '#/components/schemas/Error' examples: unsupported_storage_deletion: summary: Storage data deletion not supported description: >- Client requested storage data deletion but server doesn't support it value: msg: >- Server does not support storage data deletion. Set delete_storage_data to false or omit the parameter. 
status_code: 400 invalid_request_format: summary: Malformed request body description: Request body contains invalid JSON or missing required fields value: msg: >- Invalid request body: bulk_object_ids is required for bulk delete operations status_code: 400 empty_object_list: summary: Empty object ID list description: Bulk delete request with empty object ID array value: msg: bulk_object_ids cannot be empty status_code: 400 403ForbiddenDelete: description: The client is not authorized to delete the requested DRS object. content: application/json: schema: $ref: '#/components/schemas/Error' examples: insufficient_permissions: summary: Insufficient delete permissions description: Client lacks permission to delete the specified object value: msg: Client lacks delete permission for object drs_object_123456 status_code: 403 invalid_passport: summary: Invalid GA4GH Passport description: Provided GA4GH Passport is invalid or expired value: msg: Invalid or expired GA4GH Passport provided status_code: 403 missing_visa: summary: Missing required visa description: GA4GH Passport lacks required visa for delete operation value: msg: >- GA4GH Passport does not contain required visa for delete operation on this object status_code: 403 404NotFoundDelete: description: >- The requested DRS object for deletion wasn't found, or delete endpoints are not supported by this server. 
content: application/json: schema: $ref: '#/components/schemas/Error' examples: object_not_found: summary: DRS object not found description: The specified DRS object does not exist value: msg: DRS object drs_object_123456 does not exist status_code: 404 delete_not_supported: summary: Delete operations not supported description: This server does not support delete operations value: msg: Delete operations are not supported by this server status_code: 404 endpoint_not_found: summary: Delete endpoint not available description: Delete endpoints are not implemented on this server value: msg: >- The requested endpoint /objects/delete is not available on this server status_code: 404 200UploadRequest: description: >- Upload request processed successfully. Returns upload methods and temporary credentials for the requested files. content: application/json: schema: $ref: '#/components/schemas/UploadResponse' examples: s3_upload: summary: S3 upload method response description: Response with S3 upload method and temporary credentials value: responses: - name: sample_data.vcf size: 1048576 mime_type: text/plain checksums: - checksum: >- e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 type: sha-256 description: Variant call format file for sample analysis aliases: - sample_001_variants - vcf_batch_2024 upload_methods: - type: s3 access_url: url: >- https://my-bucket.s3.amazonaws.com/uploads/drs_object_123456 region: us-east-1 upload_details: bucket: my-bucket key: uploads/drs_object_123456 access_key_id: AKIAIOSFODNN7EXAMPLE secret_access_key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY session_token: >- AQoEXAMPLEH4aoAH0gNCAPyJxz4BlCFFxWNE1OPTgk5TthT+FvwqnKwRcOIfrRh3c/LTo6UDdyJwOOvEVPvLXCrrrUtdnniCEXAMPLE/IvU1dYUg2RVAJBanLiHb4IgRmpRV3zrkuWJOgQs8IZZaIv2BXIa2R4OlgkBN9bkUDNCJiBeb/AXlzBBko7b15fjrBs2+cTQtpZ3CYWFXG8C5zqx37wnOE49mRl/+OtkIKGO7fAE expires_at: '2024-01-01T12:00:00Z' https_upload: summary: HTTPS upload method response description: Response with HTTPS presigned 
POST URL for direct upload value: responses: - name: genome_assembly.fasta size: 3221225472 mime_type: text/plain checksums: - checksum: >- a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3 type: sha-256 - checksum: 098f6bcd4621d373cade4e832627b4f6 type: md5 description: Human genome reference assembly aliases: - hg38_reference upload_methods: - type: https access_url: url: >- https://upload.example.org/v1/files/drs_object_789012 upload_details: post_url: >- https://upload.example.org/v1/files/drs_object_789012?signature=abc123 multiple_methods: summary: Multiple upload methods response description: Response offering multiple upload method options for flexibility value: responses: - name: annotations.gff3 size: 524288000 mime_type: text/plain checksums: - checksum: >- b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9 type: sha-256 description: Gene annotations in GFF3 format upload_methods: - type: s3 access_url: url: >- https://genomics-bucket.s3.us-west-2.amazonaws.com/uploads/drs_object_345678 region: us-west-2 upload_details: bucket: genomics-bucket key: uploads/drs_object_345678 access_key_id: AKIAI44QH8DHBEXAMPLE secret_access_key: je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY session_token: temporary_session_token_here expires_at: '2024-01-01T12:00:00Z' - type: https access_url: url: >- https://upload-api.example.org/files/drs_object_345678 upload_details: post_url: >- https://upload-api.example.org/files/drs_object_345678?token=upload_token_12345 - type: gs access_url: url: >- https://storage.googleapis.com/genomics-uploads/drs_object_345678 region: us-central1 upload_details: bucket: genomics-uploads key: drs_object_345678 access_token: >- ya29.AHES6ZRVmB7fkLtd1XTmq6mo0S1wqZZi3-Lh_s-6Uw7p8vtgSwg expires_at: '2024-01-01T12:00:00Z' parameters: ObjectId: in: path name: object_id required: true description: '`DrsObject` identifier' schema: type: string Expand: in: query name: expand schema: type: boolean example: false description: >- 
If false and the object_id refers to a bundle, then the ContentsObject array contains only those objects directly contained in the bundle. That is, if the bundle contains other bundles, those other bundles are not recursively included in the result. If true and the object_id refers to a bundle, then the entire set of objects in the bundle is expanded. That is, if the bundle contains other bundles, then those other bundles are recursively expanded and included in the result. Recursion continues through the entire sub-tree of the bundle. If the object_id refers to a blob, then the query parameter is ignored. AccessId: in: path name: access_id required: true description: An `access_id` from the `access_methods` list of a `DrsObject` schema: type: string Checksum: in: path name: checksum required: true description: A `checksum` value from the `checksums` list of a `DrsObject` schema: type: string requestBodies: PostObjectBody: required: true content: application/json: schema: type: object properties: expand: type: boolean example: false description: >- If false and the object_id refers to a bundle, then the ContentsObject array contains only those objects directly contained in the bundle. That is, if the bundle contains other bundles, those other bundles are not recursively included in the result. If true and the object_id refers to a bundle, then the entire set of objects in the bundle is expanded. That is, if the bundle contains other bundles, then those other bundles are recursively expanded and included in the result. Recursion continues through the entire sub-tree of the bundle. If the object_id refers to a blob, then this property is ignored. passports: type: array items: type: string example: >- eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.JJ5rN0ktP0qwyZmIPpxmF_p7JsxAZH6L6brUxtad3CM description: >- the encoded JWT GA4GH Passport that contains embedded Visas. The overall JWT is signed as are the individual Passport Visas. 
examples: retrieve_with_auth: summary: Retrieve object with authentication description: Request object metadata with passport authentication value: expand: false passports: - >- eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.JJ5rN0ktP0qwyZmIPpxmF_p7JsxAZH6L6brUxtad3CM retrieve_expanded_bundle: summary: Retrieve expanded bundle with authentication description: Request expanded bundle contents with passport authentication value: expand: true passports: - >- eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.JJ5rN0ktP0qwyZmIPpxmF_p7JsxAZH6L6brUxtad3CM - >- eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.additional_passport_signature BulkObjectBody: required: true content: application/json: schema: type: object required: - bulk_object_ids properties: passports: type: array items: type: string example: >- eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.JJ5rN0ktP0qwyZmIPpxmF_p7JsxAZH6L6brUxtad3CM description: >- the encoded JWT GA4GH Passport that contains embedded Visas. The overall JWT is signed as are the individual Passport Visas. 
bulk_object_ids: type: array items: type: string minItems: 1 description: An array of ObjectIDs to retrieve metadata for examples: bulk_retrieve: summary: Bulk retrieve objects description: >- Retrieve metadata for multiple existing DRS objects using their IDs value: passports: - >- eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.JJ5rN0ktP0qwyZmIPpxmF_p7JsxAZH6L6brUxtad3CM bulk_object_ids: - drs_object_123456 - drs_object_789012 - drs_object_345678 bulk_retrieve_no_auth: summary: Bulk retrieve without authentication description: Retrieve metadata for public DRS objects value: bulk_object_ids: - drs_object_public_123 - drs_object_public_456 Passports: required: true content: application/json: schema: type: object properties: passports: type: array items: type: string example: >- eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.JJ5rN0ktP0qwyZmIPpxmF_p7JsxAZH6L6brUxtad3CM description: >- the encoded JWT GA4GH Passport that contains embedded Visas. The overall JWT is signed as are the individual Passport Visas. 
    RegisterObjectsBody:
      description: Request body for registering DRS objects after upload
      required: true
      content:
        application/json:
          schema:
            type: object
            required:
              - candidates
            properties:
              candidates:
                type: array
                items:
                  $ref: '#/components/schemas/DrsObjectCandidate'
                minItems: 1
                description: >-
                  Array of DRS object candidates to register (server will mint
                  IDs and timestamps)
              passports:
                type: array
                items:
                  type: string
                description: Optional array of GA4GH Passport JWTs for authorization
          examples:
            single_object_registration:
              summary: Register a single object
              description: Register one DRS object after upload
              value:
                candidates:
                  - name: sample_data.vcf
                    size: 1048576
                    mime_type: text/plain
                    checksums:
                      - checksum: >-
                          e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
                        type: sha-256
                    description: Variant call format file for sample analysis
                    access_methods:
                      - type: s3
                        access_url:
                          url: s3://my-bucket/uploads/sample_data.vcf
                passports:
                  - eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
            bulk_object_registration:
              summary: Register multiple objects
              description: Register multiple DRS objects in a single request
              value:
                candidates:
                  - name: genome_assembly.fasta
                    size: 3221225472
                    mime_type: text/plain
                    checksums:
                      - checksum: >-
                          a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3
                        type: sha-256
                    description: Human genome reference assembly
                    access_methods:
                      - type: s3
                        access_url:
                          url: s3://genomics-bucket/assemblies/hg38.fasta
                  - name: annotations.gff3
                    size: 524288000
                    mime_type: text/plain
                    checksums:
                      - checksum: >-
                          b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
                        type: sha-256
                    description: Gene annotations in GFF3 format
                    access_methods:
                      - type: https
                        access_url:
                          url: https://data.example.org/files/annotations.gff3
    AccessMethodUpdateBody:
      description: Request body for updating access methods of a DRS object
      required: true
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/AccessMethodUpdateRequest'
    BulkAccessMethodUpdateBody:
      description: Request body for bulk updating access methods of multiple DRS objects
      required: true
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/BulkAccessMethodUpdateRequest'
    ChecksumAdditionBody:
      description: Request body for adding checksums to a DRS object
      required: true
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/ChecksumAdditionRequest'
    BulkChecksumAdditionBody:
      description: Request body for bulk adding checksums to multiple DRS objects
      required: true
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/BulkChecksumAdditionRequest'
    DeleteBody:
      required: false
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/DeleteRequest'
          examples:
            metadata_only_delete:
              summary: Delete metadata only (default)
              description: >-
                Delete DRS object metadata while preserving underlying storage
                data. This is the default and safest option.
              value:
                passports:
                  - >-
                    eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.JJ5rN0ktP0qwyZmIPpxmF_p7JsxAZH6L6brUxtad3CM
                delete_storage_data: false
            full_delete:
              summary: Delete metadata and storage data
              description: >-
                Delete both DRS object metadata and underlying storage data
                (requires server support via deleteStorageDataSupported)
              value:
                passports:
                  - >-
                    eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.JJ5rN0ktP0qwyZmIPpxmF_p7JsxAZH6L6brUxtad3CM
                delete_storage_data: true
            no_auth_delete:
              summary: Delete without authentication
              description: >-
                Delete operation without GA4GH Passport authentication (for
                public objects or when using Bearer token in headers)
              value:
                delete_storage_data: false
            minimal_request:
              summary: Minimal delete request
              description: >-
                Simplest delete request with no authentication and default
                behavior (metadata only)
              value: {}
            multiple_passports:
              summary: Multiple GA4GH Passports
              description: >-
                Delete request with multiple GA4GH Passports for complex
                authorization scenarios
              value:
                passports:
                  - >-
                    eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.JJ5rN0ktP0qwyZmIPpxmF_p7JsxAZH6L6brUxtad3CM
                  - >-
                    eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.AbCdEfGhIjKlMnOpQrStUvWxYz
                delete_storage_data: false
            update_workflow:
              summary: Safe update workflow
              description: >-
                Delete metadata only to enable safe update pattern (delete
                metadata, then re-register with new metadata)
              value:
                passports:
                  - >-
                    eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.JJ5rN0ktP0qwyZmIPpxmF_p7JsxAZH6L6brUxtad3CM
                delete_storage_data: false
    BulkDeleteBody:
      required: true
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/BulkDeleteRequest'
          examples:
            bulk_metadata_delete:
              summary: Bulk delete metadata only
              description: >-
                Delete multiple DRS objects metadata while preserving
                underlying storage data (default and safest option)
              value:
                bulk_object_ids:
                  - drs_object_123456
                  - drs_object_789012
                  - drs_object_345678
                passports:
                  - >-
                    eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.JJ5rN0ktP0qwyZmIPpxmF_p7JsxAZH6L6brUxtad3CM
                delete_storage_data: false
            bulk_full_delete:
              summary: Bulk delete metadata and storage data
              description: >-
                Delete both metadata and storage data for multiple objects
                (requires server support via deleteStorageDataSupported)
              value:
                bulk_object_ids:
                  - drs_object_123456
                  - drs_object_789012
                passports:
                  - >-
                    eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.JJ5rN0ktP0qwyZmIPpxmF_p7JsxAZH6L6brUxtad3CM
                delete_storage_data: true
            bulk_no_auth_delete:
              summary: Bulk delete without authentication
              description: >-
                Bulk delete operation without GA4GH Passport authentication
                (for public objects or when using Bearer token in headers)
              value:
                bulk_object_ids:
                  - drs_object_123456
                  - drs_object_789012
                delete_storage_data: false
            large_bulk_delete:
              summary: Large bulk delete operation
              description: >-
                Delete many objects in a single request (check
                maxBulkDeleteLength in service-info for limits)
              value:
                bulk_object_ids:
                  - drs_object_001
                  - drs_object_002
                  - drs_object_003
                  - drs_object_004
                  - drs_object_005
                  - drs_object_006
                  - drs_object_007
                  - drs_object_008
                  - drs_object_009
                  - drs_object_010
                passports:
                  - >-
                    eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.JJ5rN0ktP0qwyZmIPpxmF_p7JsxAZH6L6brUxtad3CM
                delete_storage_data: false
            mixed_object_types:
              summary: Mixed object types deletion
              description: >-
                Delete objects with different ID formats and types in a single
                request
              value:
                bulk_object_ids:
                  - drs://example.org/123456
                  - local_object_789
                  - uuid:550e8400-e29b-41d4-a716-446655440000
                  - compact:prefix:identifier
                passports:
                  - >-
                    eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.JJ5rN0ktP0qwyZmIPpxmF_p7JsxAZH6L6brUxtad3CM
                delete_storage_data: false
            minimal_bulk_request:
              summary: Minimal bulk delete request
              description: Simplest bulk delete request with required fields only
              value:
                bulk_object_ids:
                  - drs_object_123456
                  - drs_object_789012
    UploadRequestBody:
      required: true
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/UploadRequest'
          examples:
            single_file:
              summary: Single file upload request
              description: Request upload methods for a single file
              value:
                requests:
                  - name: sample_data.vcf
                    size: 1048576
                    mime_type: text/plain
                    checksums:
                      - checksum: >-
                          e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
                        type: sha-256
                    description: Variant call format file for sample analysis
                    aliases:
                      - sample_001_variants
                      - vcf_batch_2024
                passports:
                  - >-
                    eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJnYTRnaF9wYXNzcG9ydF92MSI6W119.JJ5rN0ktP0qwyZmIPpxmF_p7JsxAZH6L6brUxtad3CM
            multiple_files:
              summary: Multiple files upload request
              description: Request upload methods for multiple files with different types
              value:
                requests:
                  - name: genome_assembly.fasta
                    size: 3221225472
                    mime_type: text/plain
                    checksums:
                      - checksum: >-
                          a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3
                        type: sha-256
                      - checksum: 098f6bcd4621d373cade4e832627b4f6
                        type: md5
                    description: Human genome reference assembly
                    aliases:
                      - hg38_reference
                  - name: annotations.gff3
                    size: 524288000
                    mime_type: text/plain
                    checksums:
                      - checksum: >-
                          b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
                        type: sha-256
                    description: Gene annotations in GFF3 format
                  - name: metadata.json
                    size: 2048
                    mime_type: application/json
                    checksums:
                      - checksum: >-
                          c89e4c5c7f2c8c8e8c8c8c8c8c8c8c8c8c8c8c8c8c8c8c8c8c8c8c8c8c8c8c8c
                        type: sha-256
                    description: Sample metadata and experimental conditions
            no_passports:
              summary: Upload request without authentication
              description: >-
                Request for public upload endpoints that don't require
                authentication
              value:
                requests:
                  - name: public_dataset.csv
                    size: 10240
                    mime_type: text/csv
                    checksums:
                      - checksum: >-
                          d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35
                        type: sha-256
                    description: Public research dataset