---
name: cobrapy-metabolic-modeling
description: "Constraint-based (COBRA) analysis of genome-scale metabolic models: FBA, FVA, knockouts, flux sampling, production envelopes, gapfilling, media optimization. Use for strain design, essential gene ID, flux analysis. For kinetic modeling use tellurium; for visualization use Escher."
license: GPL-2.0
---

# COBRApy — Constraint-Based Metabolic Modeling

## Overview

COBRApy is a Python package for constraint-based reconstruction and analysis (COBRA) of genome-scale metabolic models. It provides flux balance analysis (FBA), flux variability analysis (FVA), gene and reaction knockout screens, flux sampling, production envelopes, gapfilling, and media optimization on SBML-format metabolic networks.

## When to Use

- Predicting microbial growth rates under different nutrient conditions (FBA)
- Identifying essential genes or reactions via single and double knockout screens
- Determining flux ranges and alternative optimal solutions (FVA)
- Sampling feasible flux distributions to characterize metabolic flexibility
- Designing minimal growth media or optimizing carbon sources
- Computing production envelopes for metabolic engineering targets
- Gapfilling incomplete draft models using a universal reaction database
- For **kinetic modeling** or **dynamic ODE-based models**, use Tellurium instead
- For **pathway visualization** on metabolic maps, use Escher instead

## Prerequisites

- **Python packages**: `cobra` (includes GLPK solver), `numpy`, `pandas`
- **Optional solvers**: CPLEX or Gurobi (faster for large models, require license)
- **Data**: SBML (.xml), JSON, or YAML metabolic model files; available from BiGG Models, AGORA, or ModelSEED

```bash
pip install cobra
```

## Quick Start

```python
from cobra.io import load_model

model = load_model("textbook")  # E. coli core model
print(f"Model: {model.id} — {len(model.reactions)} rxns, {len(model.metabolites)} mets, {len(model.genes)} genes")

solution = model.optimize()
print(f"Growth rate: {solution.objective_value:.4f} /h")
print(f"Status: {solution.status}")
# Model: e_coli_core — 95 rxns, 72 mets, 137 genes
# Growth rate: 0.8739 /h
# Status: optimal
```

## Core API

### 1. Model I/O

Load bundled models and read/write standard formats.

```python
from cobra.io import load_model, read_sbml_model, write_sbml_model, load_json_model, save_json_model

# Bundled: "textbook" (95 rxns), "ecoli" (2583 rxns), "salmonella"
model = load_model("textbook")
# model = read_sbml_model("my_model.xml")   # from SBML file
# model = load_json_model("my_model.json")  # from JSON file

write_sbml_model(model, "output_model.xml")
save_json_model(model, "output_model.json")
print(f"Saved model: {model.id}")
```

### 2. Model Structure and Components

Access reactions, metabolites, and genes via DictList containers.

```python
from cobra.io import load_model
model = load_model("textbook")

# Inspect a reaction
rxn = model.reactions.get_by_id("PFK")
print(f"Reaction: {rxn.id} — {rxn.name}")
print(f"Equation: {rxn.reaction}")
print(f"Bounds: {rxn.bounds}, GPR: {rxn.gene_reaction_rule}")

# Inspect a metabolite
met = model.metabolites.get_by_id("atp_c")
print(f"Metabolite: {met.id}, Formula: {met.formula}, Compartment: {met.compartment}")

# Query and list exchange reactions
atp_rxns = model.reactions.query("atp", attribute="name")
print(f"ATP-related reactions: {len(atp_rxns)}, Exchange reactions: {len(model.exchanges)}")
```

### 3. Flux Balance Analysis (FBA)

Predict optimal flux distributions by maximizing an objective.

```python
from cobra.io import load_model
from cobra.flux_analysis import pfba

model = load_model("textbook")

# Standard FBA
solution = model.optimize()
print(f"Growth: {solution.objective_value:.4f} /h, Active fluxes: {(solution.fluxes.abs() > 1e-6).sum()}")

# Parsimonious FBA — same growth, minimal total flux
pfba_sol = pfba(model)
print(f"pFBA total flux: {pfba_sol.fluxes.abs().sum():.1f} vs standard: {solution.fluxes.abs().sum():.1f}")
```

```python
# Change objective; slim_optimize for speed
from cobra.io import load_model
model = load_model("textbook")

with model:
    model.objective = "ATPM"
    print(f"Max ATPM flux: {model.optimize().objective_value:.2f}")

print(f"Growth (slim): {model.slim_optimize():.4f}")  # no flux vector, faster
```

### 4. Flux Variability Analysis (FVA)

Determine feasible flux ranges at or near optimality.

```python
from cobra.io import load_model
from cobra.flux_analysis import flux_variability_analysis

model = load_model("textbook")

fva = flux_variability_analysis(model, fraction_of_optimum=1.0)
fva_90 = flux_variability_analysis(model, fraction_of_optimum=0.9)
fva["range"] = fva["maximum"] - fva["minimum"]
fva_90["range"] = fva_90["maximum"] - fva_90["minimum"]
print(f"Mean range at 100%: {fva['range'].mean():.2f}, at 90%: {fva_90['range'].mean():.2f}")
```

```python
# Loopless FVA on specific reactions
from cobra.io import load_model
from cobra.flux_analysis import flux_variability_analysis

model = load_model("textbook")
fva_ll = flux_variability_analysis(
    model, loopless=True, reaction_list=["PFK", "PGI", "FBA", "TPI", "GAPD"],
)
print(fva_ll)
```

### 5. Gene and Reaction Deletions

Screen for essential genes/reactions via knockout simulations.

```python
from cobra.io import load_model
from cobra.flux_analysis import single_gene_deletion, double_gene_deletion

model = load_model("textbook")
wt_growth = model.slim_optimize()

# Single gene deletions
gene_results = single_gene_deletion(model)
gene_results["growth_fraction"] = gene_results["growth"] / wt_growth
essential = gene_results[gene_results["growth_fraction"] < 0.01]
print(f"Essential genes: {len(essential)} / {len(model.genes)}")

# Double deletions (synthetic lethality) — use multiprocessing
double_results = double_gene_deletion(model, processes=4)
print(f"Double deletion results: {double_results.shape}")
```

### 6. Growth Media and Minimal Media

Modify nutrient availability and compute minimal media.

```python
from cobra.io import load_model
from cobra.medium import minimal_medium

model = load_model("textbook")

# View current medium
for rxn_id, flux in sorted(model.medium.items()):
    print(f"  {rxn_id}: {flux}")

# Anaerobic switch via context manager
with model:
    medium = model.medium
    medium["EX_o2_e"] = 0.0
    model.medium = medium
    print(f"Anaerobic growth: {model.slim_optimize():.4f} /h")

# Minimal medium
min_med = minimal_medium(model, minimize_components=True, open_exchanges=True)
print(f"Minimal medium: {len(min_med)} components")
```

### 7. Flux Sampling

Sample feasible flux distributions for variability analysis.

```python
from cobra.io import load_model
from cobra.sampling import sample

model = load_model("textbook")
samples = sample(model, n=500, method="optgp")
print(f"Samples shape: {samples.shape}")  # (500, n_reactions)
print(f"PFK flux: mean={samples['PFK'].mean():.2f}, std={samples['PFK'].std():.2f}")
```

### 8. Production Envelopes and Gapfilling

Compute phenotype phase planes and fill model gaps.

```python
from cobra.io import load_model
from cobra.flux_analysis import production_envelope

model = load_model("textbook")
envelope = production_envelope(
    model,
    reactions=model.reactions.get_by_id("EX_ac_e"),
    carbon_sources=model.reactions.get_by_id("EX_glc__D_e"),
)
print(f"Envelope: {len(envelope)} points")
print(envelope[["flux_minimum", "flux_maximum", "carbon_yield_minimum", "carbon_yield_maximum"]].head())
```

```python
# Gapfilling: restore growth after reaction removal
from cobra.io import load_model
from cobra.flux_analysis.gapfilling import gapfill

model = load_model("textbook")
universal = load_model("textbook")  # In practice, use a universal reaction DB
with model:
    model.remove_reactions([model.reactions.get_by_id("PFK")])
    print(f"Growth after removing PFK: {model.slim_optimize():.4f}")
    for rxn in gapfill(model, universal)[0]:
        print(f"  Gapfill suggests: {rxn.id}")
```

## Key Concepts

### DictList Objects

Reactions, metabolites, and genes are stored in `DictList` — ordered, indexable, and accessible by ID.

```python
rxn = model.reactions[0]                    # by index
rxn = model.reactions.get_by_id("PFK")      # by ID
matches = model.reactions.query("phospho")  # keyword search
```

### Exchange Reactions

`EX_` prefix reactions represent system boundary. Positive flux = secretion; negative = uptake. Managed via `model.medium` dict.

### Gene-Reaction Rules (GPR)

Boolean expressions linking genes to reactions: `(b0726 and b0727) or b1234`. Gene knockout propagates through GPR logic.

### Context Managers

`with model:` snapshots state and reverts all changes on exit (bounds, objective, medium, knockouts). Nesting supported.

## Common Workflows

### Workflow 1: Gene Knockout Screen

**Goal**: Identify essential, growth-reducing, and neutral genes.

```python
from cobra.io import load_model
from cobra.flux_analysis import single_gene_deletion

model = load_model("textbook")
wt_growth = model.slim_optimize()

results = single_gene_deletion(model)
results["growth_fraction"] = results["growth"] / wt_growth

essential = results[results["growth_fraction"] < 0.01]
reduced = results[(results["growth_fraction"] >= 0.01) & (results["growth_fraction"] < 0.9)]
neutral = results[results["growth_fraction"] >= 0.9]

print(f"Essential: {len(essential)}, Reduced: {len(reduced)}, Neutral: {len(neutral)}")
for idx in essential.index:
    print(f"  Essential gene: {list(idx)[0]}")
```

### Workflow 2: Media Optimization

**Goal**: Find minimal medium at different growth targets; compare aerobic vs anaerobic.

```python
from cobra.io import load_model
from cobra.medium import minimal_medium
import pandas as pd

model = load_model("textbook")
results = []
for frac in [0.1, 0.5, 0.8, 1.0]:
    with model:
        target = model.slim_optimize() * frac
        model.reactions.get_by_id("Biomass_Ecoli_core").lower_bound = target
        try:
            mm = minimal_medium(model, minimize_components=True, open_exchanges=True)
            results.append({"growth_frac": frac, "n_components": len(mm)})
        except Exception:
            results.append({"growth_frac": frac, "n_components": None})
print(pd.DataFrame(results).to_string(index=False))

for label, o2 in [("Aerobic", 1000.0), ("Anaerobic", 0.0)]:
    with model:
        medium = model.medium
        medium["EX_o2_e"] = o2
        model.medium = medium
        print(f"{label} growth: {model.slim_optimize():.4f} /h")
```

### Workflow 3: Production Strain Design

**Goal**: Design a strain with maximized target metabolite production. Combines modules 3, 5, and 8.

1. Load model and compute wild-type production envelope for target metabolite
2. Screen single gene knockouts for overproducers (filter where target secretion increases)
3. Combine top knockouts and validate with FBA under production conditions
4. Gapfill if knockouts create infeasibilities
5. Compute final production envelope and compare to wild-type

## Key Parameters

| Parameter | Module/Function | Default | Range / Options | Effect |
|-----------|----------------|---------|-----------------|--------|
| `fraction_of_optimum` | `flux_variability_analysis` | `1.0` | `0.0`-`1.0` | Fraction of max objective to maintain; lower = wider flux ranges |
| `loopless` | `flux_variability_analysis` | `False` | `True`, `False` | Eliminate thermodynamically infeasible loops; slower |
| `method` | `sample` | `"optgp"` | `"optgp"`, `"achr"` | Sampling algorithm; optgp supports parallelism |
| `n` | `sample` | required | `100`-`10000` | Number of flux samples to draw |
| `processes` | `sample`, `double_gene_deletion` | `1` | `1`-`N_cores` | Parallel worker processes |
| `minimize_components` | `minimal_medium` | `False` | `True`, `False` | True = fewest nutrients (MILP); False = minimize total flux |
| `open_exchanges` | `minimal_medium` | `False` | `True`, `False` | Allow all exchanges as nutrient candidates |
| `carbon_sources` | `production_envelope` | `None` | Reaction object | Compute carbon yield alongside flux envelope |
| `thinning` | `sample` | `100` | `1`-`1000` | Steps between kept samples; higher = less correlated |

## Best Practices

1. **Use context managers for temporary changes**: `with model:` reverts all modifications on exit.
   ```python
   with model:
       model.reactions.PFK.knock_out()
       print(model.slim_optimize())  # modified
   # model is restored here
   ```

2. **Validate with `slim_optimize()` before analysis**: Quick feasibility check before expensive operations (FVA, sampling).

3. **Check `solution.status` after optimization**: Always verify `"optimal"` before interpreting fluxes.

4. **Use loopless FVA when thermodynamic feasibility matters**: Standard FVA can include infeasible internal cycles that inflate flux ranges.

5. **Parallelize expensive operations**: Sampling and double deletions support `processes` parameter.

6. **Prefer SBML for model exchange**: Community standard supported by all COBRA tools.

7. **Use `slim_optimize()` in loops**: Skips full flux vector construction, significantly faster for screening.

8. **Validate flux samples**: Use `sampler.validate(samples)` to check stoichiometric and bound constraints.

## Common Recipes

### Recipe: Parameter Scan (Glucose Uptake Rate)

```python
from cobra.io import load_model

model = load_model("textbook")
print("Glucose_uptake | Growth_rate")
for glc_uptake in [1, 2, 5, 10, 15, 20]:
    with model:
        model.reactions.get_by_id("EX_glc__D_e").lower_bound = -glc_uptake
        growth = model.slim_optimize()
        print(f"  {glc_uptake:>13} | {growth:.4f}")
```

### Recipe: Batch Condition Analysis

```python
from cobra.io import load_model
import pandas as pd

model = load_model("textbook")
conditions = [
    {"name": "Rich aerobic", "EX_o2_e": 1000, "EX_glc__D_e": 10},
    {"name": "Anaerobic", "EX_o2_e": 0, "EX_glc__D_e": 10},
    {"name": "Low glucose", "EX_o2_e": 1000, "EX_glc__D_e": 1},
]
results = []
for c in conditions:
    with model:
        medium = model.medium
        medium["EX_o2_e"], medium["EX_glc__D_e"] = c["EX_o2_e"], c["EX_glc__D_e"]
        model.medium = medium
        results.append({"condition": c["name"], "growth": round(model.slim_optimize(), 4)})
print(pd.DataFrame(results).to_string(index=False))
```

### Recipe: Model Validation Checklist

```python
from cobra.io import load_model
from cobra.flux_analysis import find_blocked_reactions

model = load_model("textbook")

# Feasibility, mass balance, dead-ends, blocked reactions
print(f"Growth feasible: {model.slim_optimize() > 0}")
print(f"Missing formula: {sum(1 for m in model.metabolites if m.formula is None)}")
print(f"Dead-end metabolites: {sum(1 for m in model.metabolites if len(m.reactions) == 1)}")
print(f"Blocked reactions: {len(find_blocked_reactions(model))} / {len(model.reactions)}")
```

## Troubleshooting

| Problem | Cause | Solution |
|---------|-------|----------|
| `solution.status == "infeasible"` | Constraints cannot be simultaneously satisfied | Check medium has required nutrients; verify reaction bounds; use `model.medium` to restore defaults |
| `solution.status == "unbounded"` | No upper bound on fluxes | Set finite upper bounds on exchange reactions |
| Very slow optimization | Large model + default GLPK solver | Install CPLEX or Gurobi: `model.solver = "cplex"` |
| `ValueError` setting bounds | `lower_bound > upper_bound` temporarily | Set as tuple: `rxn.bounds = (new_lb, new_ub)` |
| Gene deletion returns NaN | Knockout makes model infeasible | Expected for essential genes; classify as essential |
| `IOError` reading SBML | Invalid SBML or missing namespace | Validate at sbml.org; try `cobra.io.sbml.validate_sbml_model(path)` |
| Flux samples fail validation | Numerical solver tolerance | Increase `thinning` parameter; try `method="achr"` |

## Bundled Resources

1 reference file:
- `references/api_workflows.md` — Consolidates API quick reference and advanced workflows. Covers: detailed function signatures, solver configuration (GLPK/CPLEX/Gurobi), advanced analysis (`find_blocked_reactions`, `find_essential_genes`/`find_essential_reactions`), model manipulation (adding reactions/metabolites/genes), flux sample validation, and 5 workflow examples (knockout with visualization, media design, flux space exploration, production strain design, model validation). Relocated inline: basic FBA/FVA/deletion/sampling (Core API modules 3-7). Omitted: geometric FBA internals, MIP gap configuration — consult COBRApy docs.

## Related Skills

- **escher** — interactive metabolic map visualization; use JSON models from COBRApy
- **bioservices** — retrieve models from BiGG, BioModels, or KEGG for import into COBRApy
- **networkx** — metabolic network topology analysis; export stoichiometry matrix as graph

## References

- [COBRApy documentation](https://cobrapy.readthedocs.io/) — official API reference and tutorials
- [BiGG Models](http://bigg.ucsd.edu/) — curated genome-scale metabolic model repository
- [COBRApy GitHub](https://github.com/opencobra/cobrapy) — source code and issue tracker
- Ebrahim et al. (2013) "COBRApy: COnstraints-Based Reconstruction and Analysis for Python" — [BMC Systems Biology](https://doi.org/10.1186/1752-0509-7-74)