# OpenSearch GraphQL Schema This directory contains a conceptual GraphQL schema for the OpenSearch search, analytics, and observability platform — the open-source, community-driven suite forked from Elasticsearch and Kibana and maintained under the Linux Foundation's OpenSearch Software Foundation. ## Source APIs | API | Base URL | Docs | |-----|----------|------| | OpenSearch Search & Indexing REST API | `https://{cluster-host}:9200` | https://opensearch.org/docs/latest/api-reference/ | | OpenSearch Security Plugin REST API | `https://{cluster-host}:9200/_plugins/_security/api` | https://docs.opensearch.org/latest/security/access-control/api/ | | ML Commons Plugin REST API | `https://{cluster-host}:9200/_plugins/_ml` | https://docs.opensearch.org/latest/ml-commons-plugin/api/ | | k-NN Plugin REST API | `https://{cluster-host}:9200/_plugins/_knn` | https://docs.opensearch.org/latest/search-plugins/knn/api/ | | Search Pipelines REST API | `https://{cluster-host}:9200/_search/pipeline` | https://docs.opensearch.org/latest/search-plugins/search-pipelines/index/ | ## Schema file [opensearch-schema.graphql](opensearch-schema.graphql) ## Type inventory ### Index management | Type | Description | |------|-------------| | `Index` | An OpenSearch index with health, status, settings, mappings, aliases, stats, shards, and segments | | `IndexDetails` | Creation date, UUID, shard/replica counts, and provided name | | `IndexMapping` | Field mappings, dynamic mapping mode, and dynamic templates | | `IndexSettings` | Shard count, replicas, refresh interval, codec, analysis config, and sort configuration | | `IndexAlias` | Named alias pointing to one or more indices with optional filter and routing | | `IndexStats` | Doc counts, store size, and operation counters for indexing, search, merges, refreshes, and flushes | ### Shard and segment management | Type | Description | |------|-------------| | `Shard` | A single shard allocation including node assignment, state, and doc counts | | `ShardDetails` | Routing state, relocation info, and unassigned reason | | `ShardStats` | Per-shard operation counters for indexing, search, merges, translog, and flush | | `Segment` | A Lucene segment within a shard | | `SegmentDetails` | Generation, size, committed/search flags, compound state, and Lucene version | ### Document operations | Type | Description | |------|-------------| | `Document` | A stored document with index, id, version, sequence number, and source | | `DocumentMeta` | Identity metadata: index, id, version, seqNo, primaryTerm, routing | | `DocumentSource` | Raw JSON fields of a document | ### Search | Type | Description | |------|-------------| | `SearchResult` | Top-level search response with timing, shard stats, hits, and aggregations | | `SearchHits` | The hits container with total count, max score, and individual hits | | `SearchHit` | A single matched document with score, highlight, sort values, and inner hits | | `HitDetails` | Score explanation, matched query names, and nested path metadata | | `SearchShardStats` | Total, successful, skipped, and failed shard counts for a search | | `SearchQuery` | The full query DSL body: query, size, from, sort, source filtering, aggregations, etc. | | `PointInTimeRef` | A point-in-time (PIT) token used for consistent pagination | ### Query DSL | Type | Description | |------|-------------| | `QueryClause` | Union-style container for any query type (bool, term, match, range, geo, nested, knn, neural, etc.) | | `BoolQuery` | Compound query with must, should, must_not, and filter clauses | | `TermQuery` | Exact-value term query on a keyword or numeric field | | `MatchQuery` | Full-text match query with fuzziness, operator, and analyzer options | | `RangeQuery` | Range query supporting gte, gt, lte, lt with optional format and timezone | | `GeoQuery` | Geospatial query (geo_distance, geo_bounding_box) with location and distance | | `NestedQuery` | Query scoped to a nested object path with score mode | | `KNNQuery` | k-nearest-neighbor vector search query with optional pre-filter | | `NeuralQuery` | Semantic neural search query using an ML model for embedding generation | ### Aggregations | Type | Description | |------|-------------| | `AggregationResult` | A named aggregation result that may contain buckets, a metric value, or raw output | | `BucketResult` | A single bucket (term, date histogram, range, etc.) with doc count and sub-aggregations | | `MetricResult` | A single-value or multi-value metric result (avg, sum, min, max, stats, percentiles, etc.) | ### Ingest | Type | Description | |------|-------------| | `Pipeline` | An ingest pipeline with id, description, processors, and on-failure handlers | | `PipelineDetails` | Version, processor list, and on-failure processor list | | `Ingest` | Container for all pipelines and global ingest statistics | | `IngestDetails` | Node-level ingest statistics including preprocessing time and per-pipeline stats | ### Cluster and nodes | Type | Description | |------|-------------| | `Cluster` | Top-level cluster object with name, details, health, nodes, and settings | | `ClusterDetails` | UUID, version, Lucene version, and wire/index compatibility versions | | `ClusterHealth` | Status (green/yellow/red), shard counts, pending tasks, and in-flight fetch count | | `NodeDetails` | Per-node information including roles, OS, JVM, thread pools, HTTP, and storage | | `StorageInfo` | Total, free, and available disk bytes plus usage percentage | ### Snapshots | Type | Description | |------|-------------| | `Snapshot` | A point-in-time backup of indices and data streams in a repository | | `SnapshotDetails` | UUID, version, indices, state, start/end times, duration, shard results, and failures | ### Data streams | Type | Description | |------|-------------| | `DataStream` | A time-series abstraction over rolling backing indices | | `DataStreamDetails` | Timestamp field, generation counter, ILM policy, hidden/system/replicated flags | ### Index templates | Type | Description | |------|-------------| | `Template` | A composable or legacy index template with patterns, mappings, settings, and aliases | | `TemplateDetails` | Version, priority, composed-of component list, data stream config, and deprecation flag | ### Security (Security Plugin) | Type | Description | |------|-------------| | `SecurityRole` | An OpenSearch security role with cluster and index-level permissions | | `RolePermission` | Index permission entry with patterns, DLS filter, FLS fields, masked fields, and allowed actions | | `User` | An internal user with backend roles, attributes, and description | | `UserDetails` | Password hash, mapped roles, tenants, and legacy opendistro roles | | `APIKey` | An API key with expiration, invalidation status, and metadata | | `Token` | An authentication token (JWT-style) returned by the auth endpoint | ### Search Pipelines | Type | Description | |------|-------------| | `SearchPipeline` | A named search pipeline applied at query time | | `SearchPipelineDetails` | Version, description, request processors, response processors, and phase-result processors | ### ML Commons | Type | Description | |------|-------------| | `MLModel` | A machine-learning model registered in the ML Commons plugin | | `MLTask` | An asynchronous ML task (train, register, deploy, predict) with state and response | | `MLConnector` | An external model connector (e.g., Bedrock, SageMaker, OpenAI) with protocol and credentials | ### k-NN | Type | Description | |------|-------------| | `KNNIndex` | A k-NN vector field with engine config, dimension, and runtime statistics | ## Operations overview ### Query (read operations) - Index: `index`, `indices`, `indexStats`, `indexMapping`, `indexSettings`, `indexAliases` - Shards/segments: `shards`, `segments` - Documents: `document`, `mget` - Search: `search`, `msearch`, `explain`, `fieldCaps`, `rankEval`, `aggregations` - Ingest: `pipeline`, `pipelines`, `ingestStats` - Cluster: `cluster`, `clusterHealth`, `clusterSettings`, `nodes`, `nodeStats` - Snapshots: `snapshot`, `snapshots`, `snapshotRepositories`, `snapshotStatus` - Data streams: `dataStream`, `dataStreams` - Templates: `template`, `templates`, `componentTemplate`, `componentTemplates` - Security: `role`, `roles`, `user`, `users`, `apiKey`, `apiKeys`, `actionGroup`, `actionGroups`, `tenant`, `tenants`, `roleMapping`, `roleMappings` - Search pipelines: `searchPipeline`, `searchPipelines` - ML Commons: `mlModel`, `mlModels`, `mlTask`, `mlConnector`, `mlConnectors` - k-NN: `knnIndex`, `knnIndices`, `knnWarmup` - Cat APIs: `catIndices`, `catShards`, `catNodes`, `catHealth`, `catAliases`, `catTemplates`, `catSegments`, `catCount`, `catPendingTasks` ### Mutation (write operations) - Index management: create, delete, close, open, clone, split, shrink, rollover, put mapping, put settings, put/delete alias, refresh, flush, forcemerge, clear cache - Documents: index, update, delete, bulk, updateByQuery, deleteByQuery, reindex - Ingest: create/delete/simulate pipeline - Cluster: update settings, reroute shards, allocation explain - Snapshots: create/delete repository, create/delete/restore/clone snapshot - Data streams: create, delete, rollover - Templates: put/delete index template, put/delete component template - Security: create/update/delete users, roles, role mappings, action groups, tenants; create/invalidate API keys; change password; patch security config - Search pipelines: create/delete search pipeline - ML Commons: register, deploy, undeploy, delete, predict, train models; create/update/delete connectors ## Design notes - All mutation inputs accept `JSON` scalars for the request body to mirror the OpenSearch REST API's flexible document model. - Aggregation results are modeled with both `BucketResult` and `MetricResult` to cover the two major aggregation families; `raw: JSON` is provided as an escape hatch for complex nested aggregations. - The `QueryClause` type intentionally mirrors the OpenSearch DSL union structure. In a production schema, each clause type would be split into separate input types per field. - Authentication is assumed to be handled at the transport layer (Basic Auth, API key header, or JWT bearer token via the Security plugin). - The ML Commons types cover the remote connector pattern used to invoke foundation models on AWS Bedrock, SageMaker, and third-party providers.