--- name: tooluniverse-protein-modification-analysis description: Post-translational modification (PTM) analysis — phosphorylation, ubiquitination, acetylation, glycosylation, methylation. Uses iPTMnet (sites + enzymes), ProtVar (functional consequences), UniProt (baseline), STRING, ELM (linear motifs), MassIVE/ProteomeXchange (experimental). Use for PTM site annotation, kinase-substrate identification, and PTM-disease associations. disable-model-invocation: true --- # Protein Post-Translational Modification Analysis Comprehensive PTM analysis using iPTMnet (primary), ProtVar (functional context), UniProt (baseline), STRING (interactions), ELM (linear motifs), and MassIVE/ProteomeXchange (experimental data). ## LOOK UP DON'T GUESS - PTM sites/enzymes: `iPTMnet_get_ptm_sites` - Functional consequence: `ProtVar_get_function` + `iPTMnet_get_ptm_ppi` - Proteoforms: `iPTMnet_get_proteoforms` - Linear motifs: `ELM_get_instances` ## COMPUTE, DON'T DESCRIBE When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it. ## Domain Reasoning PTMs are context-dependent: same phosphorylation site can activate or inhibit depending on kinase and effectors. Always check: which enzyme, what functional consequence, in what cell context. --- ## KEY PRINCIPLES 1. **Disambiguation first** -- resolve to UniProt accession before iPTMnet calls 2. **iPTMnet is SOAP-style** -- every call requires `operation` parameter 3. **Evidence-graded** -- distinguish experimental (T1) from predicted (T4) 4. **English-first queries** --- ## Workflow ``` Phase 0: Protein Disambiguation → UniProt accession Phase 1: PTM Site Inventory → iPTMnet_get_ptm_sites Phase 2: Proteoform Analysis → iPTMnet_get_proteoforms Phase 3: PTM-Dependent Interactions → iPTMnet_get_ptm_ppi Phase 4: Functional Context → ProtVar_get_function at key sites Phase 4b: Linear Motif Context → ELM_get_instances for SLiM overlap Phase 4c: Experimental Data → MassIVE/ProteomeXchange Phase 5: Synthesis & Report ``` --- ## Phase 0: Disambiguation - `iPTMnet_search(operation="search", search_term="TP53", role="Substrate")` -- find UniProt IDs - If user provides UniProt accession directly, use it - Select human entry if multiple hits ## Phase 1: PTM Sites `iPTMnet_get_ptm_sites(operation="get_ptm_sites", uniprot_id="P04637")` -- returns position, residue, modification type, enzyme, evidence. Group by modification type. Fallback: `UniProt_get_entry_by_accession` PTM annotations. ## Phase 2: Proteoforms `iPTMnet_get_proteoforms(operation="get_proteoforms", uniprot_id=...)` -- distinct PTM combinations. Focus on those with functional/disease annotations if >20. ## Phase 3: PTM-Dependent Interactions `iPTMnet_get_ptm_ppi(operation="get_ptm_ppi", uniprot_id=...)` -- interacting protein, PTM site, effect (enables/disrupts). Supplement with `STRING_get_interaction_partners(identifiers=gene, species=9606, required_score=700)`. ## Phase 4: Functional Context `ProtVar_get_function(accession=..., position=N, variant_aa=AA)` -- domain, active site, binding site, conservation. Grade: active-site PTM > domain-core > disordered region. ## Phase 4b: Linear Motifs (ELM) `ELM_get_instances(operation="get_instances", uniprot_id=..., motif_type="MOD")` -- MOD = modification sites, DEG = degradation signals. Cross-reference with Phase 1 PTM positions. `ELM_list_classes(operation="list_classes")` for motif details. ## Phase 4c: Experimental Data `MassIVE_search_datasets(species="9606")`, `MassIVE_get_dataset(accession="MSV...")` for public MS datasets. --- ## Evidence Grading | Tier | Criteria | |------|----------| | T1 | PTM at validated active/binding site with functional data | | T2 | PTM in structured domain with ProtVar annotation | | T3 | Correlation data only (mass spec detection) | | T4 | Predicted, no experimental validation | --- ## Tool Parameter Reference | Tool | Key Params | |------|-----------| | `iPTMnet_search` | `operation="search"`, `search_term`, `role` | | `iPTMnet_get_ptm_sites` | `operation="get_ptm_sites"`, `uniprot_id` | | `iPTMnet_get_proteoforms` | `operation="get_proteoforms"`, `uniprot_id` | | `iPTMnet_get_ptm_ppi` | `operation="get_ptm_ppi"`, `uniprot_id` | | `ELM_get_instances` | `operation="get_instances"`, `uniprot_id`, `motif_type` | | `ELM_list_classes` | `operation="list_classes"` | | `MassIVE_search_datasets` | `page_size`, `species` | **Critical**: All iPTMnet and ELM tools require `operation` as first parameter (SOAP-style). --- ## Fallbacks | Situation | Fallback | |-----------|----------| | Not in iPTMnet | UniProt PTM/processing annotations | | No PTM-PPI data | STRING general PPI | | No ProtVar data | UniProt domain annotations | | No ELM data | Proceed with iPTMnet/UniProt only | ## Limitations - iPTMnet biased toward well-studied proteins - Proteoform data covers observed combinations only - PTM-PPI: only PTM-specific evidence; more PPIs exist in STRING