---
name: tooluniverse-precision-medicine-stratification
description: Patient stratification for precision medicine — integrate genomic, clinical, and therapeutic data to split patients into responder/non-responder groups, risk tiers, or treatment-decision groups. Use for stratification-by-biomarker, treatment-selection logic, and personalized therapeutic strategy reports per patient subgroup.
disable-model-invocation: true
---

# Precision Medicine Patient Stratification

Transform patient genomic and clinical profiles into actionable risk stratification, treatment recommendations, and personalized therapeutic strategies.

## Reasoning Before Searching

Stratification means splitting patients into groups that respond differently to a treatment or have different prognoses. Ask these questions before running any tools:

1. **What molecular feature predicts response?** Candidates: somatic mutation (e.g., EGFR L858R), germline variant (e.g., BRCA1 LoF), expression level (e.g., HER2 overexpression), germline pharmacogenomic variant (e.g., CYP2C19 PM), or composite biomarker (e.g., TMB-H + MSI-H).
2. **Is the predictive feature actionable?** Knowing it must change treatment: the drug choice, the dose, or the monitoring plan. A variant with prognostic value but no therapeutic consequence is not a stratification biomarker.
3. **What is the evidence level for the stratifier?** Whether the stratifier is an FDA-approved companion diagnostic (T1) or exploratory (T4) determines how much weight to place on the finding.

Route to the correct Phase 3 path BEFORE running Phase 2 tools — cancer, metabolic, CVD, rare disease, and autoimmune pipelines require different stratifiers.

**LOOK UP, DON'T GUESS**: Never assume a variant is pathogenic or that a gene is relevant to a disease, and never assign metabolizer status without PharmGKB or CPIC evidence.

**KEY PRINCIPLES**:
1. **Report-first** - Create report file FIRST, then populate progressively
2. **Disease-specific logic** - Cancer vs metabolic vs rare disease pipelines diverge at Phase 3
3. **Multi-level integration** - Germline + somatic + expression + clinical data layers
4. **Evidence-graded** - Every finding has an evidence tier (T1-T4)
5. **Quantitative output** - Precision Medicine Risk Score (0-100)
6. **Source-referenced** - Every statement cites the tool/database source
7. **English-first queries** - Always use English terms in tool calls

**Reference files** (same directory):
- `TOOLS_REFERENCE.md` - Tool parameters, response formats, phase-by-phase tool lists
- `SCORING_REFERENCE.md` - Scoring matrices, risk tiers, pathogenicity tables, PGx tables
- `REPORT_TEMPLATE.md` - Output report template, treatment algorithms, completeness requirements
- `EXAMPLES.md` - Six worked examples (cancer, metabolic, NSCLC, CVD, rare, neuro)
- `QUICK_START.md` - Sample prompts and output summary

---

## COMPUTE, DON'T DESCRIBE
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.

## When to Use

Apply when the user asks about patient risk stratification, treatment selection, prognosis prediction, or personalized therapeutic strategy for any disease with genomic/clinical data.

**NOT for** (use other skills instead):
- Single variant interpretation -> `tooluniverse-variant-interpretation`
- Immunotherapy-specific prediction -> `tooluniverse-immunotherapy-response-prediction`
- Drug safety profiling only -> `tooluniverse-adverse-event-detection`
- Target validation -> `tooluniverse-drug-target-validation`
- Clinical trial search only -> `tooluniverse-clinical-trial-matching`
- Drug-drug interaction only -> `tooluniverse-drug-drug-interaction`
- PRS calculation only -> `tooluniverse-polygenic-risk-score`

---

## Input Parsing

### Required
- **Disease/condition**: Free-text disease name
- **At least one of**: Germline variants, somatic mutations, gene list, or clinical biomarkers

### Optional (improves stratification)
- Age, sex, ethnicity, disease stage, comorbidities, prior treatments, family history
- Current medications (for DDI and PGx), stratification goal
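
A minimal sketch of checking the required inputs before any tool call. The `PatientProfile` class and its field names are hypothetical, introduced only for illustration; they are not part of ToolUniverse.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PatientProfile:
    """Hypothetical container for the required and optional inputs."""
    disease: str
    germline_variants: List[str] = field(default_factory=list)
    somatic_mutations: List[str] = field(default_factory=list)
    gene_list: List[str] = field(default_factory=list)
    clinical_biomarkers: Dict[str, str] = field(default_factory=dict)
    # Optional context that improves stratification
    age: Optional[int] = None
    sex: Optional[str] = None
    current_medications: List[str] = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError if the required inputs are missing."""
        if not self.disease.strip():
            raise ValueError("disease/condition is required")
        has_molecular_input = (
            self.germline_variants
            or self.somatic_mutations
            or self.gene_list
            or self.clinical_biomarkers
        )
        if not has_molecular_input:
            raise ValueError(
                "provide at least one of: germline variants, somatic "
                "mutations, gene list, or clinical biomarkers"
            )
```

Calling `PatientProfile(disease="NSCLC", somatic_mutations=["EGFR L858R"]).validate()` passes silently, while a profile with only a disease name raises `ValueError`.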

### Disease Type Classification

Classify into one category (determines Phase 3 routing):

| Category | Examples |
|----------|----------|
| **CANCER** | Breast, lung, colorectal, melanoma |
| **METABOLIC** | Type 2 diabetes, obesity, NAFLD |
| **CARDIOVASCULAR** | CAD, heart failure, AF |
| **NEUROLOGICAL** | Alzheimer, Parkinson, epilepsy |
| **RARE/MONOGENIC** | Marfan, CF, sickle cell, Huntington |
| **AUTOIMMUNE** | RA, lupus, MS, Crohn's |
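
The routing step can be sketched as a lookup from category to Phase 3 path. The function name `route_phase3` is hypothetical. This document defines Phase 3 paths for four categories only, so the sketch returns `None` for NEUROLOGICAL and AUTOIMMUNE rather than guessing a path.

```python
from typing import Optional

# Categories from the classification table
CATEGORIES = {
    "CANCER", "METABOLIC", "CARDIOVASCULAR",
    "NEUROLOGICAL", "RARE/MONOGENIC", "AUTOIMMUNE",
}

# Only the Phase 3 paths that this document actually defines
PHASE3_PATHS = {
    "CANCER": "CANCER PATH",
    "METABOLIC": "METABOLIC PATH",
    "CARDIOVASCULAR": "CVD PATH",
    "RARE/MONOGENIC": "RARE DISEASE PATH",
}


def route_phase3(category: str) -> Optional[str]:
    """Return the Phase 3 path for a category, or None if none is defined."""
    key = category.strip().upper()
    if key not in CATEGORIES:
        raise ValueError(f"unknown disease category: {category!r}")
    return PHASE3_PATHS.get(key)
```

A `None` result signals that the report should state that no dedicated Phase 3 path applies, instead of silently borrowing another category's stratifiers.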

---

## Critical Tool Parameter Notes

See `TOOLS_REFERENCE.md` for full details. Key gotchas:

- **MyGene_query_genes**: param is `query` (NOT `q`)
- **EnsemblVEP_annotate_rsid**: param is `variant_id` (NOT `rsid`)
- **ensembl_lookup_gene**: REQUIRES `species='homo_sapiens'`
- **DrugBank tools**: ALL require 4 params: `query`, `case_sensitive`, `exact_match`, `limit`
- **cBioPortal_get_mutations**: `gene_list` is a STRING (space-separated), not array
- **PubMed_search_articles**: Returns a plain list of dicts, NOT `{articles: [...]}`
- **fda_pharmacogenomic_biomarkers**: Use `limit=1000` for all results
- **gnomAD**: May return "Service overloaded" - skip gracefully
- **OpenTargets**: Always nested `{data: {entity: {field: ...}}}` structure
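
Several of these gotchas can be handled with small pure-Python helpers, sketched below. All function names are hypothetical, the DrugBank default values are assumptions, and matching the gnomAD message by substring is a guess at how the error surfaces. None of this calls ToolUniverse itself.

```python
from typing import Any, Iterable, List, Optional


def gene_list_arg(genes: Iterable[str]) -> str:
    """Build the space-separated string that cBioPortal_get_mutations expects."""
    return " ".join(g.strip() for g in genes if g.strip())


def drugbank_args(query: str, limit: int = 10) -> dict:
    """Supply all four required DrugBank params (default values are assumptions)."""
    return {
        "query": query,
        "case_sensitive": False,
        "exact_match": False,
        "limit": limit,
    }


def unwrap_opentargets(response: Any, entity: str, field: str) -> Optional[Any]:
    """Walk {data: {entity: {field: ...}}}; return None if any level is missing."""
    node = response
    for key in ("data", entity, field):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def pubmed_articles(response: Any) -> List[dict]:
    """Keep the dict entries of a plain-list response; anything else gives []."""
    if isinstance(response, list):
        return [a for a in response if isinstance(a, dict)]
    return []


def gnomad_unavailable(response: Any) -> bool:
    """True if the response looks like a 'Service overloaded' message."""
    return "service overloaded" in str(response).lower()
```

For example, `gene_list_arg(["EGFR", "KRAS"])` yields `"EGFR KRAS"`, and a `True` result from `gnomad_unavailable` means the population-frequency step should be skipped and noted in the report.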

---

## Workflow Overview

```
Phase 1: Disease Disambiguation & Profile Standardization
Phase 2: Genetic Risk Assessment
Phase 3: Disease-Specific Molecular Stratification (routes by disease type)
Phase 4: Pharmacogenomic Profiling
Phase 5: Comorbidity & Drug Interaction Risk
Phase 6: Molecular Pathway Analysis
Phase 7: Clinical Evidence & Guidelines
Phase 8: Clinical Trial Matching
Phase 9: Integrated Scoring & Recommendations
```

---

## Phase 1: Disease Disambiguation & Profile Standardization

1. **Resolve disease to EFO ID** using `OpenTargets_get_disease_id_description_by_name`
2. **Classify disease type** into one of the six categories listed under Disease Type Classification
3. **Parse genomic data** into structured format (gene, variant, type)
4. **Resolve gene IDs** using `MyGene_query_genes` to get Ensembl/Entrez IDs
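
Step 3 can be sketched as a small regex parser that turns free-text entries into (gene, variant, type) records. The `ParsedVariant` tuple, the `parse_variant` function, and the type labels are all assumptions made for illustration. Anything the patterns do not recognize is returned as `unparsed` rather than guessed.

```python
import re
from typing import NamedTuple, Optional


class ParsedVariant(NamedTuple):
    gene: Optional[str]
    variant: str
    type: str


_RSID = re.compile(r"^rs\d+$")
_STAR = re.compile(r"^([A-Z0-9]+)\*(\w+)$")                       # e.g. SLCO1B1*5
_PROTEIN = re.compile(r"^([A-Z0-9]+)\s+(?:p\.)?([A-Z]\d+[A-Z*])$")  # e.g. EGFR L858R


def parse_variant(text: str) -> ParsedVariant:
    """Parse one entry into (gene, variant, type); never infer missing fields."""
    s = text.strip()
    if _RSID.match(s):
        return ParsedVariant(None, s, "rsid")
    m = _STAR.match(s)
    if m:
        return ParsedVariant(m.group(1), "*" + m.group(2), "star_allele")
    m = _PROTEIN.match(s)
    if m:
        change = m.group(2)
        if change.endswith("*"):
            kind = "nonsense"
        elif change[0] == change[-1]:
            kind = "synonymous"
        else:
            kind = "missense"
        return ParsedVariant(m.group(1), change, kind)
    return ParsedVariant(None, s, "unparsed")
```

Entries such as `BRCA1 LoF` come back as `unparsed`, which is the cue to look the variant up instead of assigning a type by hand.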

## Phase 2: Genetic Risk Assessment

1. **Germline variant pathogenicity**: `ClinVar_search_variants`, `EnsemblVEP_annotate_rsid`, `EnsemblVEP_annotate_hgvs`
2. **Gene-disease association**: `OpenTargets_target_disease_evidence`
3. **GWAS polygenic risk**: `gwas_get_associations_for_trait`, `OpenTargets_search_gwas_studies_by_disease`
4. **Population frequency**: `gnomad_get_variant`
5. **Gene constraint**: `gnomad_get_gene_constraints` (pLI, LOEUF scores)

Scoring: See `SCORING_REFERENCE.md` for genetic risk score component (0-35 points).

## Phase 3: Disease-Specific Molecular Stratification

### CANCER PATH
1. **Molecular subtyping**: `cBioPortal_get_mutations`, `HPA_get_cancer_prognostics_by_gene`
2. **TMB/MSI/HRD**: `fda_pharmacogenomic_biomarkers` for FDA cutoffs
3. **Prognostic stratification**: Combine stage + molecular features

### METABOLIC PATH
1. **Genetic risk integration**: `GWAS_search_associations_by_gene`, `OpenTargets_target_disease_evidence`
2. **Complication risk**: Based on HbA1c, duration, existing complications

### CVD PATH
1. **FH gene check**: `ClinVar_search_variants` for LDLR, APOB, PCSK9
2. **Statin PGx**: `PharmGKB_get_clinical_annotations` for SLCO1B1

### RARE DISEASE PATH
1. **Causal variant identification**: `ClinVar_search_variants`
2. **Genotype-phenotype**: `UniProt_get_disease_variants_by_accession`

Scoring: See `SCORING_REFERENCE.md` for disease-specific tables.

## Phase 4: Pharmacogenomic Profiling

1. **Drug-metabolizing enzymes**: `PharmGKB_get_clinical_annotations`, `PharmGKB_get_dosing_guidelines`
2. **FDA PGx biomarkers**: `fda_pharmacogenomic_biomarkers` (use `limit=1000`)
3. **Treatment-specific PGx**: `PharmGKB_get_drug_details`

Scoring: See `SCORING_REFERENCE.md` for PGx risk score (0-10 points).

## Phase 5: Comorbidity & Drug Interaction Risk

1. **Disease overlap**: `OpenTargets_get_associated_targets_by_disease_efoId`
2. **DDI check**: `drugbank_get_drug_interactions_by_drug_name_or_id`, `FDA_get_drug_interactions_by_drug_name`
3. **PGx-amplified DDI**: If PM genotype + CYP inhibitor, flag compounded risk
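
A minimal sketch of the PGx-amplified DDI flag in step 3, under the simplest reading of the rule: any poor-metabolizer (PM) phenotype combined with any current CYP inhibitor is flagged. The function name and both input shapes are hypothetical. The phenotypes and inhibitor lists must come from the Phase 4 and DDI tool results, not from assumptions.

```python
from typing import Dict, Iterable, List

PM_LABELS = {"PM", "POOR METABOLIZER"}


def flag_pgx_amplified_ddi(
    phenotypes: Dict[str, str],
    cyp_inhibitors: Dict[str, Iterable[str]],
) -> List[dict]:
    """Flag each CYP-inhibiting drug when the patient has at least one PM gene.

    phenotypes:     gene -> metabolizer phenotype, e.g. {"CYP2C19": "PM"}
    cyp_inhibitors: drug -> CYP enzymes that the drug inhibits
    """
    pm_genes = sorted(
        gene for gene, pheno in phenotypes.items()
        if pheno.strip().upper() in PM_LABELS
    )
    if not pm_genes:
        return []
    flags = []
    for drug in sorted(cyp_inhibitors):
        inhibited = sorted(cyp_inhibitors[drug])
        if inhibited:  # drug inhibits at least one CYP enzyme
            flags.append({"drug": drug, "inhibits": inhibited, "pm_genes": pm_genes})
    return flags
```

Each flag records both the PM genes and the inhibited enzymes, so the report can show whether the two hit the same pathway or different ones.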

## Phase 6: Molecular Pathway Analysis

1. **Pathway enrichment**: `enrichr_gene_enrichment_analysis` (libs: `KEGG_2021_Human`, `Reactome_2022`, `GO_Biological_Process_2023`)
2. **Reactome mapping**: `ReactomeAnalysis_pathway_enrichment`, `Reactome_map_uniprot_to_pathways`
3. **Network analysis**: `STRING_get_interaction_partners`, `STRING_functional_enrichment`
4. **Druggable targets**: `OpenTargets_get_target_tractability_by_ensemblID`

## Phase 7: Clinical Evidence & Guidelines

1. **Guidelines search**: `PubMed_Guidelines_Search` (fallback: `PubMed_search_articles`)
2. **FDA-approved therapies**: `OpenTargets_get_associated_drugs_by_disease_efoId`, `FDA_get_indications_by_drug_name`
3. **Biomarker-drug evidence**: `civic_search_evidence_items`, `civic_search_assertions`

## Phase 8: Clinical Trial Matching

1. **Biomarker-driven trials**: `search_clinical_trials` with condition + intervention
2. **Precision medicine trials**: `search_clinical_trials` for basket/umbrella trials

## Phase 9: Integrated Scoring & Recommendations

### Score Components (total 0-100)
- **Genetic Risk** (0-35): Pathogenicity + gene-disease association + PRS
- **Clinical Risk** (0-30): Stage/biomarkers/comorbidities
- **Molecular Features** (0-25): Driver mutations, subtypes, actionable targets
- **Pharmacogenomic Risk** (0-10): Metabolizer status, HLA alleles

### Risk Tiers
| Score | Tier | Management |
|-------|------|------------|
| 75-100 | VERY HIGH | Intensive treatment, subspecialty referral, clinical trial |
| 50-74 | HIGH | Aggressive treatment, close monitoring |
| 25-49 | INTERMEDIATE | Standard guideline-based care, PGx-guided dosing |
| 0-24 | LOW | Surveillance, prevention, risk factor modification |
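
The component caps and tier cutoffs above can be combined as in the sketch below. The function names and component keys are hypothetical. How each component score is derived is defined in `SCORING_REFERENCE.md` and is not reproduced here.

```python
from typing import Dict

# Maximum points per component (sums to 100)
COMPONENT_MAX = {
    "genetic": 35,
    "clinical": 30,
    "molecular": 25,
    "pharmacogenomic": 10,
}

# (lower bound, tier), checked from highest to lowest
TIERS = [(75, "VERY HIGH"), (50, "HIGH"), (25, "INTERMEDIATE"), (0, "LOW")]


def total_score(components: Dict[str, float]) -> float:
    """Sum component scores after checking each against its cap."""
    unknown = set(components) - set(COMPONENT_MAX)
    if unknown:
        raise ValueError(f"unknown components: {sorted(unknown)}")
    total = 0
    for name, cap in COMPONENT_MAX.items():
        value = components.get(name, 0)
        if not 0 <= value <= cap:
            raise ValueError(f"{name} must be within 0-{cap}, got {value}")
        total += value
    return total


def risk_tier(score: float) -> str:
    """Map a 0-100 score to its risk tier."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be within 0-100, got {score}")
    for floor, tier in TIERS:
        if score >= floor:
            return tier
    raise AssertionError("unreachable: the last tier has a floor of 0")
```

For instance, component scores of 30, 25, 20, and 5 total 80, which `risk_tier` maps to VERY HIGH.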

### Output
Generate report per `REPORT_TEMPLATE.md`. See `SCORING_REFERENCE.md` for detailed scoring matrices.

---

## Common Use Patterns

See `EXAMPLES.md` for six detailed worked examples:
1. **Cancer + actionable mutation**: Breast cancer, BRCA1, ER+/HER2- -> Score ~55-65 (HIGH)
2. **Metabolic + PGx concern**: T2D, CYP2C19 PM on clopidogrel -> Score ~55-65 (HIGH)
3. **NSCLC comprehensive**: EGFR L858R, TMB 25, PD-L1 80% -> Score ~75-85 (VERY HIGH)
4. **CVD risk**: LDL 190, SLCO1B1*5, family hx MI -> Score ~50-60 (HIGH)
5. **Rare disease**: Marfan, FBN1 variant -> Score ~55-65 (HIGH)
6. **Neurological risk**: APOE e4/e4, family hx Alzheimer's -> Score ~60-72 (HIGH)