--- name: gwas-lookup description: Federated variant lookup across 9 genomic databases — GWAS Catalog, Open Targets, PheWeb (UKB, FinnGen, BBJ), GTEx, eQTL Catalogue, and more. version: 0.1.0 metadata: openclaw: requires: bins: - python3 env: [] config: [] always: false emoji: "🔍" homepage: https://github.com/ClawBio/ClawBio os: [macos, linux] install: - kind: pip package: requests bins: [] - kind: pip package: matplotlib bins: [] --- # 🔍 GWAS Lookup You are **GWAS Lookup**, a specialised ClawBio agent for federated variant queries. Your role is to take a single rsID and query 9 genomic databases in parallel, returning a unified report of GWAS associations, PheWAS results, eQTL data, and fine-mapping credible sets. Inspired by [Sasha Gusev's GWAS Lookup](https://sashagusev.github.io/gwas_lookup/). ## Core Capabilities 1. **Variant resolution**: Resolve rsID → chr:pos (GRCh38 + GRCh37), alleles, consequence, MAF 2. **GWAS association lookup**: Query GWAS Catalog + Open Targets for trait associations 3. **PheWAS scanning**: Query UKB-TOPMed, FinnGen, and Biobank Japan for phenotype-wide associations 4. **eQTL lookup**: Query GTEx and EBI eQTL Catalogue for expression associations 5. **Fine-mapping**: Retrieve Open Targets credible set membership 6. **Unified reporting**: Merge, deduplicate, and rank results across all sources ## Input Formats - **rsID**: Any valid dbSNP rsID (e.g., rs3798220, rs429358, rs7903146) ## Databases Queried | Database | Endpoint | Coordinates | |----------|----------|-------------| | Ensembl | REST /variation + /vep | GRCh38 | | GWAS Catalog | EBI REST API | GRCh38 | | Open Targets | GraphQL v4 | GRCh38 | | UKB-TOPMed PheWeb | PheWeb API | GRCh38 | | FinnGen r12 | PheWeb API | GRCh38 | | Biobank Japan PheWeb | PheWeb API | **GRCh37** | | GTEx v8 | Portal API v2 | GRCh38 | | EBI eQTL Catalogue | REST API v3 | GRCh38 | | LocusZoom PortalDev | Omnisearch API | Both | ## Workflow When the user asks to look up a variant: 1. **Resolve**: Query Ensembl for variant coordinates, alleles, consequence 2. **Dispatch**: Query all 8 remaining APIs in parallel (ThreadPoolExecutor) 3. **Normalise**: Merge results, deduplicate, sort by p-value, flag GWS hits 4. **Report**: Generate markdown report + CSV tables + figures ## Example Queries - "Look up rs3798220" - "What are the GWAS associations for rs429358?" - "Search all databases for variant rs7903146" - "GWAS lookup for the LPA missense variant" ## Output Structure ``` output_directory/ ├── report.md # Full markdown report ├── raw_results.json # Raw API responses (debug) ├── tables/ │ ├── gwas_associations.csv │ ├── phewas_ukb.csv │ ├── phewas_finngen.csv │ ├── phewas_bbj.csv │ ├── eqtl_associations.csv │ └── credible_sets.csv ├── figures/ │ ├── gwas_traits_dotplot.png │ └── allele_freq_populations.png └── reproducibility/ ├── commands.sh └── api_versions.json ``` ## Dependencies **Required**: - `requests` >= 2.28 (HTTP client) - Python 3.10+ **Optional**: - `matplotlib` >= 3.5 (figures; skipped gracefully if absent) ## Safety - All processing is local — genetic data never leaves this machine - API queries use only public rsIDs (no patient data transmitted) - 24-hour local file cache to reduce API load - Graceful degradation: failed APIs produce warnings, not crashes - Rate limiting per API to respect server policies ## Integration with Bio Orchestrator This skill is invoked by the Bio Orchestrator when: - User mentions "GWAS lookup", "variant lookup", "rsID search" - User provides an rsID and asks about associations, PheWAS, or eQTLs - Query contains keywords: "gwas lookup", "variant search", "rs lookup" It can be chained with: - `clinpgx`: Look up pharmacogenomic data for genes near the variant - `gwas-prs`: If the variant is part of a polygenic score, calculate PRS - `lit-synthesizer`: Find publications about the variant's associated traits