---
name: ipsae
description: >
  Binder design ranking using ipSAE (interprotein Score from Aligned Errors).
  Use this skill when: (1) Ranking binder designs for experimental testing,
  (2) Filtering BindCraft or RFdiffusion outputs,
  (3) Comparing AF2/AF3/Boltz predictions,
  (4) Predicting binding success rates,
  (5) Need better ranking than ipTM or iPAE.

  For structure prediction, use chai or alphafold.
  For QC thresholds, use protein-qc.
license: MIT
category: evaluation
tags: [ranking, scoring, binding]
---

# ipSAE Binder Ranking

## Prerequisites

| Requirement | Minimum | Recommended |
|-------------|---------|-------------|
| Python | 3.8+ | 3.10 |
| NumPy | 1.20+ | Latest |
| RAM | 8GB | 16GB |

## Overview

ipSAE (interprotein Score from Aligned Errors) is a scoring function for ranking protein-protein interactions predicted by AlphaFold2, AlphaFold3, and Boltz1. It outperforms ipTM and iPAE for binder design ranking with **1.4x higher precision** in identifying true binders.

**Paper**: [What's wrong with AlphaFold's ipTM score](https://www.biorxiv.org/content/10.1101/2025.02.10.637595v2)

## How to run

### Installation

```bash
git clone https://github.com/DunbrackLab/IPSAE.git
cd IPSAE
pip install numpy
```

### AlphaFold2

```bash
python ipsae.py scores_rank_001.json unrelaxed_rank_001.pdb 15 15
```

### AlphaFold3

```bash
python ipsae.py fold_model_full_data_0.json fold_model_0.cif 10 10
```

### Boltz1

```bash
python ipsae.py pae_model_0.npz model_0.cif 10 10
```

## Key parameters

| Parameter | Description | Recommended |
|-----------|-------------|-------------|
| PAE file | JSON (AF2/AF3) or NPZ (Boltz) | Match predictor |
| Structure file | PDB or CIF structure | Match PAE |
| PAE cutoff | Threshold for contacts | 10-15 |
| Distance cutoff | Max CA-CA distance (Å) | 10-15 |

## Output format

Two output files are generated:

**Chain-pair scores** (`_chains.csv`):

```
chain_A,chain_B,ipSAE_min,pDockQ,pDockQ2,LIS,n_contacts,interface_dist
A,B,0.72,0.65,0.58,0.45,42,8.5
```
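
As a rough illustration, a chain-pair file with the header shown above could be loaded with Python's standard `csv` module. This is only a sketch, assuming the file has exactly those columns; `load_chain_scores` and `NUMERIC_COLUMNS` are hypothetical names and not part of `ipsae.py`.

```python
import csv
import io

# Columns assumed to be numeric, taken from the example header above.
NUMERIC_COLUMNS = ("ipSAE_min", "pDockQ", "pDockQ2", "LIS", "n_contacts", "interface_dist")

def load_chain_scores(handle):
    """Return one dict per chain pair, with numeric columns converted to float."""
    rows = []
    for row in csv.DictReader(handle):
        for col in NUMERIC_COLUMNS:
            row[col] = float(row[col])
        rows.append(row)
    return rows

# Example input copied from the sample above; a real file would be
# opened with open(path, newline="") instead of io.StringIO.
example = io.StringIO(
    "chain_A,chain_B,ipSAE_min,pDockQ,pDockQ2,LIS,n_contacts,interface_dist\n"
    "A,B,0.72,0.65,0.58,0.45,42,8.5\n"
)
scores = load_chain_scores(example)
print(scores[0]["chain_A"], scores[0]["chain_B"], scores[0]["ipSAE_min"])
```

Keeping the chain IDs as strings and converting only the score columns avoids surprises with numeric-looking chain names.
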

**Residue-level scores** (`_residues.csv`):

```
chain,resnum,pSAE,pLDDT
A,45,0.85,92.3
A,67,0.78,88.1
```

## Sample output

### Successful run

```
$ python ipsae.py scores_rank_001.json design_0.pdb 10 10
Processing design_0...
Found 2 chains: A, B
Computing ipSAE scores...

Results written to:
  design_0_chains.csv
  design_0_residues.csv

Summary:
  ipSAE_min: 0.72
  pDockQ: 0.65
  LIS: 0.45
  Interface contacts: 42
```

**What good output looks like:**

- ipSAE_min > 0.61 (primary filter)
- pDockQ > 0.5 (supporting metric)
- Reasonable number of interface contacts (20-100)

## Decision tree

```
Should I use ipSAE?
│
├─ What are you ranking?
│  ├─ Designed binders → ipSAE ✓
│  ├─ Natural complexes → ipTM is fine
│  └─ Single proteins → Not applicable
│
├─ What predictor did you use?
│  ├─ AlphaFold2 → ipSAE ✓
│  ├─ AlphaFold3 → ipSAE ✓
│  ├─ Boltz1 → ipSAE ✓
│  ├─ Chai → ipSAE (use PAE output)
│  └─ ESMFold → Not applicable (no PAE)
│
└─ Why ipSAE over ipTM?
   ├─ Different length constructs → ipSAE ✓
   ├─ Designs with disordered regions → ipSAE ✓
   └─ Standard complexes → Either works
```

## Recommended thresholds

| Metric | Standard | Stringent | Use Case |
|--------|----------|-----------|----------|
| ipSAE_min | > 0.61 | > 0.70 | Primary filter |
| LIS | > 0.35 | > 0.45 | Interface quality |
| pDockQ | > 0.5 | > 0.6 | Supporting |

## Batch processing

```python
import subprocess
import sys
from pathlib import Path

def score_designs(pae_dir, struct_dir, output_dir):
    """Score all designs in a directory (run from the IPSAE checkout)."""
    # NOTE: output_dir is only created here; it is not passed to ipsae.py.
    Path(output_dir).mkdir(parents=True, exist_ok=True)

    for pae_file in sorted(Path(pae_dir).glob("*_scores*.json")):
        # Derive the design name by dropping the score-file suffix.
        name = pae_file.stem.replace("_scores_rank_001", "")
        struct_file = Path(struct_dir) / f"{name}.pdb"

        if not struct_file.exists():
            print(f"Skipping {name}: {struct_file} not found")
            continue

        # Use the same cutoffs (PAE 10, distance 10) for every design.
        result = subprocess.run([
            sys.executable, "ipsae.py",
            str(pae_file), str(struct_file),
            "10", "10"
        ])
        if result.returncode != 0:
            print(f"ipsae.py failed for {name}")
```

---

## Verify

```bash
ls *_chains.csv | wc -l  # Should match number of predictions
```

---

## Troubleshooting

**Low scores for good designs**: Check the PAE and distance cutoffs.

**Missing output**: Verify that the PAE file format matches the predictor.

**Inconsistent scores**: Use the same cutoffs across all designs.

### Error interpretation

| Error | Cause | Fix |
|-------|-------|-----|
| `KeyError: 'pae'` | Wrong PAE format | Check if AF2/AF3/Boltz format |
| `FileNotFoundError` | Structure not found | Verify file paths |
| `ValueError: no contacts` | No interface detected | Check chain IDs, reduce cutoffs |

---

**Next**: Select top designs (ipSAE_min > 0.61) → experimental validation.
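
The selection step can be scripted by applying the recommended thresholds to each design. The sketch below is one possible approach, assuming per-design metrics are already available as plain dictionaries keyed by the metric names used in this document. `THRESHOLDS`, `passes`, and `rank_designs` are hypothetical helper names, and the example scores are made up for illustration.

```python
# Cutoffs copied from the "Recommended thresholds" table.
THRESHOLDS = {
    "standard": {"ipSAE_min": 0.61, "LIS": 0.35, "pDockQ": 0.5},
    "stringent": {"ipSAE_min": 0.70, "LIS": 0.45, "pDockQ": 0.6},
}

def passes(metrics, level="standard"):
    """True if every metric is strictly above its cutoff for the given level."""
    return all(metrics[name] > cutoff for name, cutoff in THRESHOLDS[level].items())

def rank_designs(designs, level="standard"):
    """Keep designs that pass, sorted by ipSAE_min from best to worst."""
    kept = [(name, m) for name, m in designs.items() if passes(m, level)]
    return sorted(kept, key=lambda item: item[1]["ipSAE_min"], reverse=True)

# Illustrative values only, not real results.
designs = {
    "design_0": {"ipSAE_min": 0.72, "LIS": 0.46, "pDockQ": 0.65},
    "design_1": {"ipSAE_min": 0.55, "LIS": 0.40, "pDockQ": 0.52},
    "design_2": {"ipSAE_min": 0.64, "LIS": 0.38, "pDockQ": 0.51},
}
for name, metrics in rank_designs(designs):
    print(name, metrics["ipSAE_min"])
```

Requiring all three metrics is a stricter reading than using ipSAE_min alone as the primary filter; dropping `LIS` and `pDockQ` from `THRESHOLDS` would reproduce the single-metric cut.
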