---
name: tooluniverse-chemical-safety
description: Comprehensive chemical safety and toxicology assessment integrating ADMET-AI predictions, CTD toxicogenomics, FDA label safety data, DrugBank safety profiles, and STITCH chemical-protein interactions. Performs predictive toxicology (AMES, DILI, LD50, carcinogenicity), organ/system toxicity profiling, chemical-gene-disease relationship mapping, regulatory safety extraction, and environmental hazard assessment. Use when asked about chemical toxicity, drug safety profiling, ADMET properties, environmental health risks, chemical hazard assessment, or toxicogenomic analysis.
---

# Chemical Safety & Toxicology Assessment

Comprehensive chemical safety and toxicology analysis integrating predictive AI models, curated toxicogenomics databases, regulatory safety data, and chemical-biological interaction networks. Generates structured risk assessment reports with evidence grading.

## When to Use This Skill

**Triggers**:
- "Is this chemical toxic?" / "What are the toxicity endpoints for [compound]?"
- "Assess the safety profile of [drug/chemical]"
- "What are the ADMET properties of [SMILES]?"
- "What genes does [chemical] interact with?"
- "What diseases are linked to [chemical] exposure?"
- "Predict toxicity for these molecules"
- "Drug safety assessment for [drug name]"
- "Environmental health risk of [chemical]"
- "Chemical hazard profiling"
- "Toxicogenomic analysis of [compound]"

**Use Cases**:
1. **Predictive Toxicology**: AI-predicted toxicity endpoints (AMES mutagenicity, DILI, LD50, carcinogenicity, skin reactions) for novel compounds via SMILES
2. **ADMET Profiling**: Full absorption, distribution, metabolism, excretion, toxicity characterization
3. **Toxicogenomics**: Chemical-gene interaction mapping, gene-disease associations from CTD
4. **Regulatory Safety**: FDA label warnings, boxed warnings, contraindications, adverse reactions
5. **Drug Safety Assessment**: Combined DrugBank safety + FDA labels + adverse event data
6. **Chemical-Protein Interactions**: STITCH-based chemical-protein binding and interaction networks
7. **Environmental Toxicology**: Chemical-disease associations for environmental contaminants

---

## KEY PRINCIPLES

1. **Report-first approach** - Create report file FIRST, then populate progressively
2. **Tool parameter verification** - Verify params via `get_tool_info` before calling unfamiliar tools
3. **Evidence grading** - Grade all safety claims by evidence strength (T1-T4)
4. **Citation requirements** - Every toxicity finding must have inline source attribution
5. **Mandatory completeness** - All sections must exist with data minimums or explicit "No data" notes
6. **Disambiguation first** - Resolve compound identity (name -> SMILES, CID, ChEMBL ID) before analysis
7. **Negative results documented** - "No toxicity signals found" is data; empty sections are failures
8. **Conservative risk assessment** - When evidence is ambiguous, flag as "requires further investigation"
9. **English-first queries** - Always use English chemical/drug names in tool calls

---

## Evidence Grading System (MANDATORY)

Grade every toxicity claim by evidence strength:

| Tier | Symbol | Criteria | Examples |
|------|--------|----------|----------|
| **T1** | [T1] | Direct human evidence, regulatory finding | FDA boxed warning, clinical trial toxicity, human case reports |
| **T2** | [T2] | Animal studies, validated in vitro | Nonclinical toxicology, AMES positive, animal LD50 |
| **T3** | [T3] | Computational prediction, association data | ADMET-AI prediction, CTD association, QSAR model |
| **T4** | [T4] | Database annotation, text-mined | Literature mention, database entry without validation |

### Required Evidence Grading Locations

Evidence grades MUST appear in:
1. **Executive Summary** - Key toxicity findings graded
2. **Toxicity Predictions** - Every ADMET-AI endpoint with confidence note
3. **Regulatory Safety** - FDA findings marked [T1]
4. **Chemical-Gene Interactions** - CTD data marked by curation status
5. **Risk Assessment** - Final risk classification with supporting evidence tiers

---

## Core Strategy: 8 Research Dimensions

```
Chemical/Drug Query
|
+-- PHASE 0: Compound Disambiguation (ALWAYS FIRST)
|   +-- Resolve name -> SMILES, PubChem CID, ChEMBL ID
|   +-- Get molecular formula, weight, canonical structure
|
+-- PHASE 1: Predictive Toxicology (ADMET-AI)
|   +-- Mutagenicity (AMES)
|   +-- Hepatotoxicity (DILI, ClinTox)
|   +-- Carcinogenicity
|   +-- Acute toxicity (LD50)
|   +-- Skin reactions
|   +-- Stress response pathways
|   +-- Nuclear receptor activity
|
+-- PHASE 2: ADMET Properties
|   +-- Absorption: BBB penetrance, bioavailability
|   +-- Distribution: clearance, volume of distribution
|   +-- Metabolism: CYP interactions (1A2, 2C9, 2C19, 2D6, 3A4)
|   +-- Physicochemical: solubility, lipophilicity, pKa
|
+-- PHASE 3: Toxicogenomics (CTD)
|   +-- Chemical-gene interactions
|   +-- Chemical-disease associations
|   +-- Affected biological pathways
|
+-- PHASE 4: Regulatory Safety (FDA Labels)
|   +-- Boxed warnings (Black Box)
|   +-- Contraindications
|   +-- Adverse reactions
|   +-- Warnings and precautions
|   +-- Nonclinical toxicology
|
+-- PHASE 5: Drug Safety Profile (DrugBank)
|   +-- Toxicity data
|   +-- Contraindications
|   +-- Drug interactions affecting safety
|
+-- PHASE 6: Chemical-Protein Interactions (STITCH)
|   +-- Direct chemical-protein binding
|   +-- Interaction confidence scores
|   +-- Off-target effects
|
+-- PHASE 7: Structural Alerts (ChEMBL)
|   +-- Known toxic substructures (PAINS, Brenk)
|   +-- Structural alert flags
|
+-- SYNTHESIS: Integrated Risk Assessment
    +-- Aggregate all evidence tiers
    +-- Risk classification (Low/Medium/High/Critical)
    +-- Data gaps and recommendations
```

---

## Phase 0: Compound Disambiguation (ALWAYS FIRST)

**CRITICAL**: Resolve compound identity before any analysis.

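
Before any resolution tool is called, the raw query has to be classified so the right route is chosen. The following is a minimal sketch of such a classifier, not part of ToolUniverse: the function name `detect_input_type` and its return labels are hypothetical. It assumes the heuristics this phase describes (numeric means CID, a `CHEMBL` prefix means ChEMBL ID, bond or bracket characters with no spaces mean SMILES). It deliberately ignores the letters `c`, `n`, and `o` because ordinary names contain them too, so a bare SMILES such as `CCO` falls through to the name route.

```python
# Bond, branch, and bracket characters that are common in SMILES
# but unusual in plain compound names.
_SMILES_HINTS = set("=#[]()")


def detect_input_type(raw: str) -> str:
    """Classify a query as 'cid', 'chembl', 'smiles', or 'name' (heuristic only)."""
    text = raw.strip()
    if text.isdigit():
        return "cid"  # PubChem CIDs are purely numeric
    if text.upper().startswith("CHEMBL"):
        return "chembl"
    # SMILES strings have no internal spaces and usually carry structural symbols.
    if " " not in text and any(ch in _SMILES_HINTS for ch in text):
        return "smiles"
    return "name"  # default: resolve through a name lookup


for query in ["1983", "CHEMBL112", "CC(=O)Nc1ccc(O)cc1", "Bisphenol A"]:
    print(query, "->", detect_input_type(query))
```

Because the heuristic can misfire (for example, a name with parentheses and no spaces), the resolved CID should still be treated as the source of truth for identity.
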
### Input Types Handled

| Input Format | Resolution Strategy |
|-------------|---------------------|
| Drug name (e.g., "Aspirin") | PubChem_get_CID_by_compound_name -> get SMILES from properties |
| SMILES string | Use directly for ADMET-AI; resolve to CID for other tools |
| PubChem CID | PubChem_get_compound_properties_by_CID -> get SMILES + name |
| ChEMBL ID | ChEMBL_get_molecule -> get SMILES + properties |

### Resolution Steps

1. **Input detection**: Determine if input is name, SMILES, CID, or ChEMBL ID
   - SMILES: contains typical SMILES characters (=, #, [, ], (, ), c, n, o and no spaces in middle)
   - CID: numeric only
   - ChEMBL: starts with "CHEMBL"
   - Otherwise: treat as compound name
2. **Name to CID**: `PubChem_get_CID_by_compound_name(name=)`
3. **CID to properties**: `PubChem_get_compound_properties_by_CID(cid=)`
4. **Extract SMILES**: Get SMILES from PubChem properties (field: `ConnectivitySMILES`, `CanonicalSMILES`, or `IsomericSMILES` depending on response format)
5. **Store resolved IDs**: Maintain dict with `name`, `smiles`, `cid`, `formula`, `weight`, `inchi`

### Disambiguation Output

```markdown
## Compound Identity

| Property | Value |
|----------|-------|
| **Name** | Acetaminophen |
| **PubChem CID** | 1983 |
| **SMILES** | CC(=O)Nc1ccc(O)cc1 |
| **Formula** | C8H9NO2 |
| **Molecular Weight** | 151.16 |
| **InChI** | InChI=1S/C8H9NO2/... |
```

---

## Phase 1: Predictive Toxicology (ADMET-AI)

**When**: SMILES is available (from Phase 0 or provided directly)

**Objective**: Run comprehensive AI-predicted toxicity endpoints

### Tools Used

All ADMET-AI tools take the same parameter format:

| Tool | Predicted Endpoints | Parameter |
|------|---------------------|-----------|
| `ADMETAI_predict_toxicity` | AMES, Carcinogens_Lagunin, ClinTox, DILI, LD50_Zhu, Skin_Reaction, hERG | `smiles`: list[str] |
| `ADMETAI_predict_stress_response` | Stress response pathway activation (ARE, ATAD5, HSE, MMP, p53) | `smiles`: list[str] |
| `ADMETAI_predict_nuclear_receptor_activity` | AhR, AR, ER, PPARg, Aromatase nuclear receptor activity | `smiles`: list[str] |

### Workflow

1. Call `ADMETAI_predict_toxicity(smiles=[resolved_smiles])`
2. Call `ADMETAI_predict_stress_response(smiles=[resolved_smiles])`
3. Call `ADMETAI_predict_nuclear_receptor_activity(smiles=[resolved_smiles])`
4. For each endpoint, interpret prediction:
   - Classification endpoints: Active (1) = toxic signal, Inactive (0) = no signal
   - Regression endpoints (LD50): Report numerical value with context
   - All predictions graded [T3] (computational prediction)

### Decision Logic

- **Multiple SMILES**: Can batch up to ~10 SMILES in single call
- **Failed prediction**: If ADMET-AI fails, note "prediction unavailable" (don't fail entire report)
- **Confidence**: Note that AI predictions are [T3] evidence, not definitive
- **hERG flag**: If hERG = Active, flag prominently (cardiac safety risk)
- **AMES flag**: If AMES = Active, flag prominently (mutagenicity concern)
- **DILI flag**: If DILI = Active, flag prominently (liver toxicity concern)

### Output Table

```markdown
### Toxicity Predictions [T3]

| Endpoint | Prediction | Interpretation | Concern Level |
|----------|-----------|---------------|---------------|
| AMES Mutagenicity | Inactive | No mutagenic signal | Low |
| Carcinogenicity | Inactive | No carcinogenic signal | Low |
| ClinTox | Active | Clinical toxicity signal | HIGH |
| DILI | Active | Drug-induced liver injury risk | HIGH |
| LD50 (Zhu) | 2.45 log(mg/kg) | ~282 mg/kg (moderate) | Medium |
| Skin Reaction | Inactive | No skin sensitization signal | Low |
| hERG Inhibition | Active | Cardiac arrhythmia risk | HIGH |

*All predictions from ADMET-AI. Evidence tier: [T3] (computational prediction)*
```

---

## Phase 2: ADMET Properties

**When**: SMILES is available

**Objective**: Full ADMET characterization beyond toxicity

### Tools Used

| Tool | Properties Predicted | Parameter |
|------|---------------------|-----------|
| `ADMETAI_predict_BBB_penetrance` | Blood-brain barrier crossing probability | `smiles`: list[str] |
| `ADMETAI_predict_bioavailability` | Oral bioavailability (F20%, F30%) | `smiles`: list[str] |
| `ADMETAI_predict_clearance_distribution` | Clearance, VDss, half-life, PPB | `smiles`: list[str] |
| `ADMETAI_predict_CYP_interactions` | CYP1A2, 2C9, 2C19, 2D6, 3A4 inhibition/substrate | `smiles`: list[str] |
| `ADMETAI_predict_physicochemical_properties` | LogP, LogD, LogS, MW, pKa | `smiles`: list[str] |
| `ADMETAI_predict_solubility_lipophilicity_hydration` | Aqueous solubility, lipophilicity, hydration free energy | `smiles`: list[str] |

### Workflow

1. Call all 6 ADMET tools in parallel (independent calls)
2. Compile results into Absorption / Distribution / Metabolism / Excretion sections
3. Assess Lipinski Rule of 5 compliance from physicochemical properties
4. Flag drug-drug interaction risks from CYP inhibition profiles

### Decision Logic

- **BBB penetrant + toxicity**: If BBB = Yes and any CNS toxicity endpoint active, flag as neurotoxicity risk
- **Low bioavailability**: If F20% = Low, note absorption concerns
- **CYP inhibitor**: If CYP3A4 inhibitor = Yes, flag high DDI risk
- **Lipinski violations**: Count violations and report drug-likeness assessment

### Output Format

```markdown
### ADMET Profile [T3]

#### Absorption
| Property | Value | Interpretation |
|----------|-------|----------------|
| BBB Penetrance | Yes | Crosses blood-brain barrier |
| Bioavailability (F20%) | 85% | Good oral absorption |

#### Distribution
| Property | Value | Interpretation |
|----------|-------|----------------|
| VDss | 1.2 L/kg | Moderate tissue distribution |
| PPB | 92% | Highly protein bound |

#### Metabolism
| CYP Enzyme | Substrate | Inhibitor |
|------------|-----------|-----------|
| CYP1A2 | No | No |
| CYP2C9 | Yes | No |
| CYP2C19 | No | No |
| CYP2D6 | No | No |
| CYP3A4 | Yes | Yes (DDI risk) |

#### Excretion
| Property | Value | Interpretation |
|----------|-------|----------------|
| Clearance | 8.5 mL/min/kg | Moderate clearance |
| Half-life | 6.2 h | Moderate half-life |
```

---

## Phase 3: Toxicogenomics (CTD)

**When**: Compound name is resolved

**Objective**: Map chemical-gene-disease relationships from curated CTD data

### Tools Used

| Tool | Function | Parameter |
|------|----------|-----------|
| `CTD_get_chemical_gene_interactions` | Genes affected by chemical | `input_terms`: str (chemical name) |
| `CTD_get_chemical_diseases` | Diseases linked to chemical exposure | `input_terms`: str (chemical name) |

### Workflow

1. Call `CTD_get_chemical_gene_interactions(input_terms=compound_name)`
2. Call `CTD_get_chemical_diseases(input_terms=compound_name)`
3. Parse gene interactions: extract gene symbols, interaction types (increases/decreases expression, binding, etc.)
4. Parse disease associations: extract disease names, evidence types (marker/mechanism/therapeutic)
5. Identify most affected biological processes from gene list

### Decision Logic

- **Direct evidence vs inferred**: CTD separates curated direct evidence from inferred associations
- **Therapeutic vs toxic**: Disease associations can be therapeutic (drug treats disease) or adverse (chemical causes disease)
- **Gene interaction types**: Distinguish between expression changes, binding, and activity modulation
- **Prioritize marker/mechanism**: These indicate stronger causal evidence than simple associations
- **Grade curated as [T2]**: Direct curated CTD evidence from literature
- **Grade inferred as [T3]**: Computationally inferred associations

### Output Format

```markdown
### Toxicogenomics (CTD) [T2/T3]

#### Chemical-Gene Interactions (Top 20)
| Gene | Interaction | Type | Evidence |
|------|------------|------|----------|
| CYP1A2 | increases expression | mRNA | [T2] curated |
| TP53 | affects activity | protein | [T2] curated |
| ... | ... | ... | ... |

**Total interactions found**: 156
**Top affected pathways**: Xenobiotic metabolism, Apoptosis, DNA damage response

#### Chemical-Disease Associations (Top 10)
| Disease | Association Type | Evidence |
|---------|-----------------|----------|
| Liver Neoplasms | marker/mechanism | [T2] curated |
| Contact Dermatitis | therapeutic | [T2] curated |
| ... | ... | ... |
```

---

## Phase 4: Regulatory Safety (FDA Labels)

**When**: Compound has an approved drug name

**Objective**: Extract regulatory safety information from FDA drug labels

### Tools Used

| Tool | Information Retrieved | Parameter |
|------|---------------------|-----------|
| `FDA_get_boxed_warning_info_by_drug_name` | Black box warnings (most serious) | `drug_name`: str |
| `FDA_get_contraindications_by_drug_name` | Absolute contraindications | `drug_name`: str |
| `FDA_get_adverse_reactions_by_drug_name` | Known adverse reactions | `drug_name`: str |
| `FDA_get_warnings_by_drug_name` | Warnings and precautions | `drug_name`: str |
| `FDA_get_nonclinical_toxicology_info_by_drug_name` | Animal toxicology data | `drug_name`: str |
| `FDA_get_carcinogenic_mutagenic_fertility_by_drug_name` | Carcinogenicity/mutagenicity/fertility data | `drug_name`: str |

### Workflow

1. Call all 6 FDA tools in parallel (independent queries by drug name)
2. Parse and structure each response
3. Prioritize: Boxed Warnings > Contraindications > Warnings > Adverse Reactions
4. Grade FDA label findings as [T1] (regulatory finding), except nonclinical toxicology sections, which report animal data and are graded [T2]

### Decision Logic

- **Boxed warning present**: Flag as CRITICAL safety concern in executive summary
- **No FDA data**: Chemical may not be an approved drug; note "Not an FDA-approved drug" and continue with other phases
- **Multiple warnings**: Categorize by organ system (hepatic, cardiac, renal, CNS, etc.)
- **Nonclinical toxicology**: Grade as [T2] (animal data supporting human risk)

### Output Format

```markdown
### Regulatory Safety (FDA) [T1]

#### Boxed Warning
**PRESENT** - Hepatotoxicity risk with doses >4g/day. Liver failure reported. [T1]

#### Contraindications
- Severe hepatic impairment [T1]
- Known hypersensitivity [T1]

#### Adverse Reactions (by frequency)
| Reaction | Frequency | Severity |
|----------|-----------|----------|
| Nausea | Common (>1%) | Mild |
| Hepatotoxicity | Rare (<0.1%) | Severe |
| ... | ... | ... |

#### Nonclinical Toxicology [T2]
- **Carcinogenicity**: No carcinogenic potential in 2-year rat/mouse studies
- **Mutagenicity**: Negative in Ames assay and in vivo micronucleus test
- **Fertility**: No effects on fertility at doses up to 10x human dose
```

---

## Phase 5: Drug Safety Profile (DrugBank)

**When**: Compound is a known drug

**Objective**: Retrieve curated drug safety data from DrugBank

### Tools Used

| Tool | Information | Parameters |
|------|------------|------------|
| `drugbank_get_safety_by_drug_name_or_drugbank_id` | Toxicity, contraindications | `query`: str, `case_sensitive`: bool, `exact_match`: bool, `limit`: int |

### Workflow

1. Call `drugbank_get_safety_by_drug_name_or_drugbank_id(query=drug_name, case_sensitive=False, exact_match=False, limit=5)`
2. Parse toxicity information, overdose data, contraindications
3. Cross-reference with FDA data from Phase 4

### Decision Logic

- **Toxicity field**: Contains LD50 values, overdose symptoms, organ toxicity data
- **DrugBank ID**: Note if found for cross-referencing
- **Conflict with FDA**: If DrugBank and FDA disagree, note discrepancy and defer to FDA [T1]
- **Not found**: Chemical may not be in DrugBank; continue with other phases

---

## Phase 6: Chemical-Protein Interactions (STITCH)

**When**: Compound can be identified by name or SMILES

**Objective**: Map chemical-protein interaction network for off-target assessment

### Tools Used

| Tool | Function | Parameters |
|------|----------|------------|
| `STITCH_resolve_identifier` | Resolve chemical name to STITCH ID | `identifier`: str, `species`: int (9606=human) |
| `STITCH_get_chemical_protein_interactions` | Get chemical-protein interactions | `identifiers`: list[str], `species`: int, `required_score`: int |
| `STITCH_get_interaction_partners` | Get interaction network | `identifiers`: list[str], `species`: int, `limit`: int |

### Workflow

1. Resolve compound: `STITCH_resolve_identifier(identifier=compound_name, species=9606)`
2. Get interactions: `STITCH_get_chemical_protein_interactions(identifiers=[stitch_id], species=9606, required_score=700)`
3. Identify off-target proteins (not the intended drug target)
4. Flag safety-relevant targets: hERG (cardiac), CYP enzymes (metabolism), nuclear receptors (endocrine)

### Decision Logic

- **High confidence (>900)**: Well-established interaction [T2]
- **Medium confidence (700-900)**: Probable interaction [T3]
- **Low confidence (400-699)**: Possible interaction, needs validation [T4]; only returned when `required_score` is lowered below the 700 used in the workflow
- **Safety-relevant targets**: Flag interactions with known safety targets
- **No STITCH data**: Chemical may be too novel; note and continue

---

## Phase 7: Structural Alerts (ChEMBL)

**When**: ChEMBL molecule ID is available (from Phase 0)

**Objective**: Check for known toxic substructures

### Tools Used

| Tool | Function | Parameters |
|------|----------|------------|
| `ChEMBL_search_compound_structural_alerts` | Find structural alert matches | `molecule_chembl_id`: str, `limit`: int |

### Workflow

1. If ChEMBL ID available: `ChEMBL_search_compound_structural_alerts(molecule_chembl_id=chembl_id, limit=20)`
2. Parse alert types: PAINS (pan-assay interference), Brenk (medicinal chemistry), Glaxo (GSK structural alerts)
3. Categorize severity: Some alerts are informational, others indicate likely toxicity

### Decision Logic

- **PAINS alerts**: May cause false positives in screening; note for medicinal chemistry
- **Brenk alerts**: Known problematic substructures; flag if present
- **No alerts**: Good sign but not definitive proof of safety
- **No ChEMBL ID**: Skip this phase gracefully; note "structural alert analysis not available"

---

## Synthesis: Integrated Risk Assessment (MANDATORY)

**Always the final section**. Integrates all evidence into an actionable risk classification.

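
The risk classification matrix in this section lists criteria per level but not how the levels take precedence over one another. One possible reading, sketched here as an assumption, is a first-match cascade: positive findings are checked from most to least severe, and the absence of findings counts as LOW only if at least 3 phases returned data. `SafetyFindings` and `classify_risk` are hypothetical names used for illustration; they are not ToolUniverse types.

```python
from dataclasses import dataclass, field


@dataclass
class SafetyFindings:
    """Hypothetical summary of per-phase results (illustration only)."""
    boxed_warning: bool = False
    t1_findings: int = 0                  # count of [T1] toxicity findings
    fda_warnings: bool = False
    t2_animal_toxicity: bool = False
    active_admet: set = field(default_factory=set)  # e.g. {"DILI", "hERG"}
    ctd_disease_associations: int = 0     # adverse associations only
    structural_alerts: int = 0
    phases_with_data: int = 0


def classify_risk(f: SafetyFindings) -> str:
    """Return the first risk level whose criteria match (assumed precedence)."""
    if f.boxed_warning or f.t1_findings >= 2 or {"DILI", "hERG"} <= f.active_admet:
        return "CRITICAL"
    if f.fda_warnings or f.t2_animal_toxicity or len(f.active_admet) >= 2:
        return "HIGH"
    if f.active_admet or f.ctd_disease_associations or f.structural_alerts:
        return "MEDIUM"
    # No positive signal: only call it LOW when enough phases returned data.
    if f.phases_with_data < 3:
        return "INSUFFICIENT DATA"
    return "LOW"


example = SafetyFindings(active_admet={"AMES", "ClinTox"}, phases_with_data=5)
print(classify_risk(example))
```

Checking positive findings before the data-sufficiency rule is a design choice: a boxed warning stays CRITICAL even when most other phases came back empty, which matches the conservative risk assessment principle.
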
### Risk Classification Matrix

| Risk Level | Criteria |
|-----------|----------|
| **CRITICAL** | FDA boxed warning present OR multiple [T1] toxicity findings OR active DILI + active hERG |
| **HIGH** | FDA warnings present OR [T2] animal toxicity OR multiple active ADMET endpoints |
| **MEDIUM** | Some [T3] predictions positive OR CTD disease associations OR structural alerts |
| **LOW** | All ADMET endpoints negative AND no FDA/DrugBank safety flags AND no CTD concerns |
| **INSUFFICIENT DATA** | Fewer than 3 phases returned data; cannot make confident assessment |

### Synthesis Template

```markdown
## Integrated Risk Assessment

### Overall Risk Classification: [CRITICAL]

### Evidence Summary
| Dimension | Finding | Evidence Tier | Concern |
|-----------|---------|--------------|---------|
| ADMET Toxicity | DILI active, hERG active | [T3] | HIGH |
| FDA Label | Boxed warning for hepatotoxicity | [T1] | CRITICAL |
| CTD Toxicogenomics | 156 gene interactions, liver neoplasms | [T2] | HIGH |
| DrugBank | Known hepatotoxicity at high doses | [T2] | HIGH |
| STITCH | Binds CYP3A4, hERG | [T3] | MEDIUM |
| Structural Alerts | 2 Brenk alerts | [T3] | MEDIUM |

### Key Safety Concerns
1. **Hepatotoxicity** [T1]: FDA boxed warning + ADMET-AI DILI prediction + CTD liver disease associations
2. **Cardiac Risk** [T3]: ADMET-AI hERG prediction + STITCH hERG interaction
3. **Drug Interactions** [T3]: CYP3A4 substrate/inhibitor, potential DDI risk

### Data Gaps
- [ ] No in vivo genotoxicity data available
- [ ] STITCH interaction scores moderate (700-900)
- [ ] No environmental exposure data

### Recommendations
1. Avoid doses >4g/day (hepatotoxicity threshold) [T1]
2. Monitor liver function in chronic use [T1]
3. Screen for CYP3A4 interactions before co-administration [T3]
4. Consider cardiac monitoring for at-risk patients [T3]
```

---

## Mandatory Completeness Checklist

Before finalizing any report, verify:

- [ ] **Phase 0**: Compound fully disambiguated (SMILES + CID at minimum)
- [ ] **Phase 1**: At least 5 toxicity endpoints reported or "prediction unavailable" noted
- [ ] **Phase 2**: ADMET profile with A/D/M/E sections or "not available" noted
- [ ] **Phase 3**: CTD queried; gene interactions and disease associations reported or "no data in CTD"
- [ ] **Phase 4**: FDA labels queried; results or "not an FDA-approved drug" noted
- [ ] **Phase 5**: DrugBank queried; results or "not found in DrugBank" noted
- [ ] **Phase 6**: STITCH queried; results or "no STITCH data available" noted
- [ ] **Phase 7**: Structural alerts checked or "ChEMBL ID not available" noted
- [ ] **Synthesis**: Risk classification provided with evidence summary
- [ ] **Evidence Grading**: All findings have [T1]-[T4] annotations
- [ ] **Data Gaps**: Explicitly listed in synthesis section

---

## Tool Parameter Reference

**Critical Parameter Notes** (verified from source code):

| Tool | Parameter Name | Type | Notes |
|------|---------------|------|-------|
| All ADMETAI tools | `smiles` | `list[str]` | Always a list, even for single compound |
| All CTD tools | `input_terms` | `str` | Chemical name, MeSH name, CAS RN, or MeSH ID |
| All FDA tools | `drug_name` | `str` | Brand or generic drug name |
| drugbank_get_safety_* | `query`, `case_sensitive`, `exact_match`, `limit` | str, bool, bool, int | All 4 required |
| STITCH_resolve_identifier | `identifier`, `species` | str, int | species=9606 for human |
| STITCH_get_chemical_protein_interactions | `identifiers`, `species`, `required_score` | list[str], int, int | required_score=400 default |
| PubChem_get_CID_by_compound_name | `name` | `str` | Compound name (not SMILES) |
| PubChem_get_compound_properties_by_CID | `cid` | `int` | Numeric CID |
| ChEMBL_search_compound_structural_alerts | `molecule_chembl_id` | `str` | ChEMBL ID (e.g., "CHEMBL112") |

### Response Format Notes

- **ADMET-AI**: Returns `{status: "success", data: {...}}` with prediction values
- **CTD**: Returns list of interaction/association objects
- **FDA**: Returns `{status, data}` with label text
- **DrugBank**: Returns `{data: [...]}` with drug records
- **STITCH**: Returns list of interaction objects with scores
- **PubChem CID lookup**: Returns `{IdentifierList: {CID: [...]}}` (may or may not have `data` wrapper)
- **PubChem properties**: Returns dict with `CID`, `MolecularWeight`, `ConnectivitySMILES`, `IUPACName`

---

## Fallback Strategies

### Compound Resolution
- **Primary**: PubChem by name -> CID -> properties -> SMILES
- **Fallback 1**: ChEMBL search by name -> molecule -> SMILES
- **Fallback 2**: If SMILES provided directly, skip name resolution

### Toxicity Prediction
- **Primary**: All 9 ADMET-AI tools (3 toxicity tools from Phase 1 + 6 ADMET tools from Phase 2)
- **Fallback**: If ADMET-AI fails for a compound, note "prediction failed" and continue with database evidence
- **Note**: ADMET-AI may fail for very large or unusual SMILES

### Regulatory Data
- **Primary**: FDA labels by drug name
- **Fallback**: If FDA returns no data, try alternative drug names (brand vs generic)
- **Note**: Non-drug chemicals (pesticides, industrial) will not have FDA labels

### CTD Data
- **Primary**: Search by common chemical name
- **Fallback**: Try MeSH name if common name fails
- **Note**: Novel compounds may not be in CTD

---

## Common Use Patterns

### Pattern 1: Novel Compound Assessment
```
Input: SMILES string for new molecule
Workflow: Phase 0 (SMILES->CID) -> Phase 1 (toxicity) -> Phase 2 (ADMET) -> Phase 7 (structural alerts) -> Synthesis
Output: Predictive safety profile for novel compound
```

### Pattern 2: Approved Drug Safety Review
```
Input: Drug name (e.g., "Acetaminophen")
Workflow: All phases (0-7 + Synthesis)
Output: Complete safety dossier with regulatory + predictive + database evidence
```

### Pattern 3: Environmental Chemical Risk
```
Input: Chemical name (e.g., "Bisphenol A")
Workflow: Phase 0 -> Phase 1 -> Phase 2 -> Phase 3 (CTD, key for env chemicals) -> Phase 6 -> Synthesis
Output: Environmental health risk assessment focused on gene-disease associations
```

### Pattern 4: Batch Toxicity Screening
```
Input: Multiple SMILES strings
Workflow: Phase 0 -> Phase 1 (batch) -> Phase 2 (batch) -> Comparative table -> Synthesis
Output: Comparative toxicity table ranking compounds by safety
```

### Pattern 5: Toxicogenomic Deep-Dive
```
Input: Chemical name + specific gene or disease interest
Workflow: Phase 0 -> Phase 3 (CTD expanded) -> Literature search -> Synthesis
Output: Detailed chemical-gene-disease mechanistic analysis
```

---

## Output Report Structure

All analyses generate a structured markdown report with progressive sections:

```markdown
# Chemical Safety & Toxicology Report: [Compound Name]

**Generated**: YYYY-MM-DD HH:MM
**Compound**: [Name] | SMILES: [SMILES] | CID: [CID]

## Executive Summary
[2-3 sentence overview with risk classification and key findings, all graded]

## 1. Compound Identity
[Phase 0 results - disambiguation table]

## 2. Predictive Toxicology
[Phase 1 results - ADMET-AI toxicity endpoints]

## 3. ADMET Profile
[Phase 2 results - absorption, distribution, metabolism, excretion]

## 4. Toxicogenomics
[Phase 3 results - CTD chemical-gene-disease relationships]

## 5. Regulatory Safety
[Phase 4 results - FDA label information]

## 6. Drug Safety Profile
[Phase 5 results - DrugBank data]

## 7. Chemical-Protein Interactions
[Phase 6 results - STITCH network]

## 8. Structural Alerts
[Phase 7 results - ChEMBL alerts]

## 9. Integrated Risk Assessment
[Synthesis - risk classification, evidence summary, data gaps, recommendations]

## Appendix: Methods and Data Sources
[Tool versions, databases queried, date of access]
```

---

## Limitations & Known Issues

### Tool-Specific
- **ADMET-AI**: Predictions are computational [T3]; should not replace experimental testing
- **CTD**: Curated but may lag behind latest literature by 6-12 months
- **FDA**: Only covers FDA-approved drugs; not applicable to environmental chemicals or supplements
- **DrugBank**: Primarily drugs; limited coverage of industrial chemicals
- **STITCH**: Score thresholds affect sensitivity; lower scores increase false positives
- **ChEMBL**: Structural alerts require ChEMBL ID; not all compounds have one

### Analysis
- **Novel compounds**: May only have ADMET-AI predictions (no database evidence)
- **Environmental chemicals**: FDA/DrugBank phases will be empty; rely on CTD and ADMET-AI
- **Batch mode**: ADMET-AI can handle batches; other tools require individual queries
- **Species specificity**: Most data is human-centric; animal data noted where applicable

### Technical
- **SMILES validity**: Invalid SMILES will cause ADMET-AI failures
- **Name ambiguity**: Chemical names can be ambiguous; always verify with CID
- **Rate limits**: Some FDA endpoints may rate-limit for rapid queries

---

## Summary

**Chemical Safety & Toxicology Assessment Skill** provides comprehensive safety evaluation by integrating:

1. **Predictive toxicology** (ADMET-AI) - 9 tools covering toxicity, ADMET, physicochemical properties
2. **Toxicogenomics** (CTD) - Chemical-gene-disease relationship mapping
3. **Regulatory safety** (FDA) - 6 tools for label-based safety extraction
4. **Drug safety** (DrugBank) - Curated toxicity and contraindication data
5. **Chemical interactions** (STITCH) - Chemical-protein interaction networks
6. **Structural alerts** (ChEMBL) - Known toxic substructure detection

**Outputs**: Structured markdown report with risk classification, evidence grading, and actionable recommendations

**Best for**: Drug safety assessment, chemical hazard profiling, environmental toxicology, ADMET characterization, toxicogenomic analysis

**Total tools integrated**: 25+ tools across 6 databases
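
The completeness and fallback rules in this skill share one idea: a phase that fails or returns nothing is recorded with an explicit note and never aborts the report. A minimal sketch of that pattern follows. It assumes each phase is wrapped in a zero-argument callable; `run_phases`, `count_phases_with_data`, and the status labels are hypothetical names, and the lambdas stand in for real tool calls.

```python
from typing import Any, Callable, Dict


def run_phases(phases: Dict[str, Callable[[], Any]]) -> Dict[str, Dict[str, Any]]:
    """Run every phase; keep data, an explicit no-data note, or the failure reason."""
    report: Dict[str, Dict[str, Any]] = {}
    for name, fetch in phases.items():
        try:
            data = fetch()
        except Exception as exc:  # one failed tool call must not abort the report
            report[name] = {"status": "unavailable",
                            "note": f"{name}: call failed ({exc})"}
            continue
        if data:
            report[name] = {"status": "ok", "data": data}
        else:
            # Negative results are data: record them instead of leaving a gap.
            report[name] = {"status": "no_data", "note": f"{name}: no data returned"}
    return report


def count_phases_with_data(report: Dict[str, Dict[str, Any]]) -> int:
    """Number of phases that returned usable data (input to INSUFFICIENT DATA)."""
    return sum(1 for entry in report.values() if entry["status"] == "ok")


# Stand-in callables; real code would wrap the tool calls named in each phase.
demo = run_phases({
    "ADMET-AI": lambda: {"AMES": 0, "DILI": 1},
    "FDA": lambda: [],
    "STITCH": lambda: 1 / 0,
})
for phase, entry in demo.items():
    print(phase, entry["status"])
```

The count returned by `count_phases_with_data` can feed the "fewer than 3 phases returned data" rule of the risk classification matrix.
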