---
name: "spikeinterface-electrophysiology"
description: "Unified Python framework for extracellular electrophysiology. Load 20+ formats (SpikeGLX, OpenEphys, NWB, Intan, Maxwell, Blackrock), preprocess, run 10+ sorters (Kilosort4, SpykingCircus2, Tridesclous, MountainSort5) via one API, compute quality metrics (SNR, ISI, firing rate), compare sorters, export NWB/Phy. For format-agnostic multi-sorter workflows. For Neuropixels-specific PSTH/decoding use neuropixels-analysis."
license: "MIT"
---

# SpikeInterface — Unified Extracellular Electrophysiology Framework

## Overview

SpikeInterface provides a common Python API to read extracellular recordings from 20+ file formats, preprocess raw voltage traces, run 10+ spike sorters, postprocess and quality-control sorted units, and export results — all without format-specific code. Its modular design lets users swap sorters, formats, and preprocessing steps without rewriting pipelines. SpikeInterface is built around lazy, chainable objects: a `Recording` holds raw data, a `Sorting` holds spike times, and a `SortingAnalyzer` ties them together for waveform and metric computation.

## When to Use

- Loading recordings from multiple acquisition systems (SpikeGLX, OpenEphys, Intan, NWB, Maxwell MEA, Blackrock) with a unified API rather than format-specific parsers
- Running the same preprocessing and sorting pipeline across experiments recorded on different hardware
- Comparing two or more spike sorters on the same recording to assess agreement and choose the best output
- Running containerized sorters (Kilosort, IronClust, MountainSort5) via Docker or Singularity without local installation
- Computing standard quality metrics (SNR, ISI violations, firing rate, presence ratio, amplitude cutoff) and applying threshold-based curation
- Validating spike-sorting accuracy against synthetic or hybrid ground-truth recordings
- Exporting sorted results to NWB for data sharing or to Phy for manual curation
- Use `neuropixels-analysis` instead for a complete Neuropixels-specific Kilosort4 workflow including PSTH computation, tuning curves, and population decoding
- For EEG, ECG, or other biosignal processing (not spike sorting), use `neurokit2` instead

## Prerequisites

- **Python packages**: `spikeinterface[full]>=0.101`, `probeinterface`, `numpy`, `matplotlib`
- **Optional sorter deps**: `kilosort` (pip), or Docker/Singularity for containerized sorters
- **Data requirements**: raw binary recording files plus probe geometry (`.prb`, `.json`, or auto-detected from format)
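- **Hardware**: a CUDA-capable GPU is strongly recommended for Kilosort4 and required for the MATLAB-based Kilosort2/2.5/3; pure-Python sorters such as SpykingCircus2, Tridesclous2, and MountainSort5 run on CPU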

```bash
pip install "spikeinterface[full]>=0.101" probeinterface
# Optional: Kilosort4 Python package
pip install kilosort
# Optional: Phy for manual curation
pip install phy
```

## Quick Start

```python
import spikeinterface.full as si
import spikeinterface.preprocessing as spre
import spikeinterface.sorters as ss
import spikeinterface.qualitymetrics as sqm

# Load, preprocess, sort, and inspect quality metrics in 10 lines
recording = si.read_openephys("/data/session_001", stream_name="Signals CH")
recording_pp = spre.bandpass_filter(
    spre.common_reference(recording, reference="global", operator="median"),
    freq_min=300, freq_max=6000,
)
sorting = ss.run_sorter("spykingcircus2", recording_pp, folder="./sc2_out")  # `folder` replaces the deprecated `output_folder`
analyzer = si.create_sorting_analyzer(sorting, recording_pp, folder="./analyzer")
analyzer.compute(["random_spikes", "waveforms", "templates", "noise_levels"])
metrics = sqm.compute_quality_metrics(analyzer, metric_names=["snr", "firing_rate", "isi_violation"])
print(metrics.describe())
```

## Core API

### Module 1: Recording I/O

SpikeInterface wraps every acquisition format behind a common `BaseRecording` interface. Once loaded, all objects expose the same methods regardless of origin format.

```python
import spikeinterface.full as si

# SpikeGLX (.bin + .meta)
recording_sglx = si.read_spikeglx("/data/session_001", stream_name="imec0.ap")

# OpenEphys (binary or classic)
recording_oe = si.read_openephys("/data/oe_session", stream_name="Signals CH")

# NWB file
recording_nwb = si.read_nwb_recording("/data/recording.nwb",
                                       electrical_series_name="ElectricalSeries")

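# Intan RHD/RHS (stream "0" is assumed to hold the amplifier channels;
# confirm with si.get_neo_streams("intan", path))
recording_intan = si.read_intan("/data/session.rhd", stream_id="0")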

# Inspect any recording with the same API
print(f"Format:       {type(recording_sglx).__name__}")
print(f"Channels:     {recording_sglx.get_num_channels()}")
print(f"Sampling rate:{recording_sglx.get_sampling_frequency()} Hz")
print(f"Duration:     {recording_sglx.get_total_duration():.1f} s")
print(f"Probe:        {recording_sglx.get_probe().name}")
```

```python
# List available streams before loading (useful when a file has multiple streams).
# get_neo_streams() returns two lists: stream names and stream ids.
stream_names, stream_ids = si.get_neo_streams("spikeglx", "/data/session_001")
print("Available streams:", stream_names)
# e.g. ['imec0.ap', 'imec0.lf', 'nidq']

# Select a time slice (lazy, no data loaded until get_traces() is called)
recording_slice = recording_sglx.frame_slice(
    start_frame=0,
    end_frame=int(60 * recording_sglx.get_sampling_frequency()),  # first 60 s
)
print(f"Sliced duration: {recording_slice.get_total_duration():.1f} s")
```
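
Picking the right stream by name is easy to get subtly wrong in scripts, so it can help to make the selection explicit. The helper below is a minimal sketch in plain Python; `pick_stream` is a hypothetical name, not part of SpikeInterface, and it assumes you already have the list of stream names.

```python
def pick_stream(stream_names, suffix=".ap"):
    """Return the single stream whose name ends with `suffix`, or raise."""
    matches = [name for name in stream_names if name.endswith(suffix)]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one '{suffix}' stream, found {matches}")
    return matches[0]


# Example stream names, as listed in the comment above
names = ["imec0.ap", "imec0.lf", "nidq"]
print(pick_stream(names))            # imec0.ap
print(pick_stream(names, ".lf"))     # imec0.lf
```

Failing loudly when zero or several streams match avoids silently sorting an LFP or auxiliary stream.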

### Module 2: Preprocessing

Preprocessing functions return new `Recording` objects wrapping the original; the chain is applied lazily when data is read. This keeps memory usage low even for multi-hour recordings.

```python
import spikeinterface.preprocessing as spre

# 1. Common median reference — removes shared noise across all channels
recording_cmr = spre.common_reference(recording_sglx,
                                       reference="global",
                                       operator="median")

# 2. Bandpass filter for action potentials (300–6000 Hz typical)
recording_filt = spre.bandpass_filter(recording_cmr,
                                       freq_min=300,
                                       freq_max=6000)

# 3. Detect and remove bad channels automatically (coherence + PSD based detection)
bad_channel_ids, channel_labels = spre.detect_bad_channels(recording_filt,
                                                           method="coherence+psd")
recording_clean = recording_filt.remove_channels(bad_channel_ids)
print(f"Removed {len(bad_channel_ids)} bad channels: {bad_channel_ids}")
print(f"Clean channels: {recording_clean.get_num_channels()}")
```
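
To make step 1 concrete, here is a minimal NumPy sketch of what a global common median reference does to a block of traces. It assumes traces shaped `(n_samples, n_channels)`; the function name `global_median_reference` is hypothetical and this is an illustration, not SpikeInterface's implementation.

```python
import numpy as np


def global_median_reference(traces: np.ndarray) -> np.ndarray:
    """Subtract the per-sample median across channels (hypothetical helper)."""
    # traces: (n_samples, n_channels); the median is taken over the channel axis
    shared = np.median(traces, axis=1, keepdims=True)
    return traces - shared


rng = np.random.default_rng(0)
n_samples, n_channels = 1000, 8
private = rng.normal(0.0, 1.0, size=(n_samples, n_channels))    # channel-specific noise
common = 20.0 * np.sin(np.linspace(0, 40, n_samples))[:, None]  # artifact shared by all channels
raw = private + common
referenced = global_median_reference(raw)

# The shared artifact dominates the raw traces and is largely removed afterwards
print(f"std before: {raw.std():.2f}, after: {referenced.std():.2f}")
```

Because the median is computed per sample, a signal present on every channel is subtracted while channel-specific activity is mostly preserved.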

```python
# Whitening — decorrelates channels; recommended before template-matching sorters
recording_white = spre.whiten(recording_clean, mode="local")

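# Phase shift correction for Neuropixels (channels are sampled with small time offsets).
# Shown on recording_clean for brevity; in a real pipeline apply it to the raw
# recording, before referencing and whitening.
recording_shifted = spre.phase_shift(recording_clean)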

# Inspect a short snippet of preprocessed data
traces = recording_white.get_traces(start_frame=0, end_frame=3000, segment_index=0)
print(f"Trace snippet shape: {traces.shape}")   # (3000, n_channels)
print(f"Trace range: [{traces.min():.2f}, {traces.max():.2f}] (whitened, arbitrary units)")
```
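
The lazy behaviour described at the start of this module can be illustrated with a toy model. The classes below (`LazySource`, `LazyStep`) are hypothetical stand-ins written for this sketch, not SpikeInterface classes; they only show that building a chain touches no data and that work happens when a chunk is requested.

```python
class LazySource:
    """Toy stand-in for a recording: produces samples only when asked."""

    def __init__(self, n_samples):
        self.n_samples = n_samples
        self.reads = 0  # counts how many times data was actually produced

    def get_traces(self, start, end):
        self.reads += 1
        return [float(i % 7) for i in range(start, end)]


class LazyStep:
    """Wraps a parent object and applies `func` to each chunk at read time."""

    def __init__(self, parent, func):
        self.parent = parent
        self.func = func

    def get_traces(self, start, end):
        return self.func(self.parent.get_traces(start, end))


def remove_mean(chunk):
    mean = sum(chunk) / len(chunk)
    return [x - mean for x in chunk]


def rectify(chunk):
    return [abs(x) for x in chunk]


source = LazySource(n_samples=1_000_000)
chain = LazyStep(LazyStep(source, remove_mean), rectify)
reads_after_build = source.reads           # building the chain reads nothing

snippet = chain.get_traces(0, 10)          # only these 10 samples are computed
print(reads_after_build, source.reads, len(snippet))  # 0 1 10
```

The per-chunk mean removal here is only a placeholder for a real filter; the point is the deferred evaluation, which is why memory use stays low until traces are requested.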

### Module 3: Spike Sorting

`ss.run_sorter()` wraps every supported sorter behind a uniform call signature. Sorter-specific parameters are passed as keyword arguments; all other pipeline steps are identical.

```python
import spikeinterface.sorters as ss
from pathlib import Path

# List every sorter SpikeInterface knows about (installed or not)
available = ss.available_sorters()
print("Available sorters:", available)

# List only the sorters that are installed in the current environment
installed = ss.installed_sorters()
print("Installed locally:", installed)

# Run SpykingCircus2 (CPU, no external deps)
sorting_sc2 = ss.run_sorter(
    "spykingcircus2",
    recording_clean,
    folder=Path("./sorter_output/sc2"),   # `folder` replaces the deprecated `output_folder`
    remove_existing_folder=True,
    verbose=True,
)
print(f"SpykingCircus2 units: {len(sorting_sc2.get_unit_ids())}")
```

```python
# Run Kilosort4 in a Docker container (no local Kilosort install needed;
# a GPU on the host is still strongly recommended)
sorting_ks4 = ss.run_sorter(
    "kilosort4",
    recording_clean,
    folder=Path("./sorter_output/ks4"),
    singularity_image=False,   # use Docker; set True for Singularity
    docker_image=True,
    remove_existing_folder=True,
    # Kilosort4-specific parameters
    nblocks=5,
    Th_learned=8,
    do_correction=True,
)
print(f"Kilosort4 units: {len(sorting_ks4.get_unit_ids())}")

# Run MountainSort5 (CPU-only)
sorting_ms5 = ss.run_sorter(
    "mountainsort5",
    recording_clean,
    folder=Path("./sorter_output/ms5"),
    scheme="2",          # sorting scheme: "1", "2" (default), or "3"
    detect_threshold=5.5,
)
print(f"MountainSort5 units: {len(sorting_ms5.get_unit_ids())}")
```

### Module 4: Postprocessing (SortingAnalyzer)

`SortingAnalyzer` is the central postprocessing object in SpikeInterface >= 0.101. It replaces the older `WaveformExtractor` and provides a unified interface for waveforms, templates, PCAs, and downstream metrics.

```python
import spikeinterface.full as si
import spikeinterface.postprocessing as spost

# Create analyzer (saves to disk; use format="memory" for in-RAM only)
analyzer = si.create_sorting_analyzer(
    sorting_sc2,
    recording_clean,
    folder="./analyzer_sc2",
    format="binary_folder",
    overwrite=True,
    sparse=True,           # sparse=True: only nearby channels per unit
)

# Compute extensions in dependency order:
#   random_spikes  subsample spike indices for waveform extraction
#   waveforms      raw waveform snippets per unit
#   templates      mean/std template per unit
#   noise_levels   per-channel noise estimate
# The snippet window is a parameter of the "waveforms" extension,
# not of create_sorting_analyzer().
analyzer.compute(
    ["random_spikes", "waveforms", "templates", "noise_levels"],
    extension_params={"waveforms": {"ms_before": 1.0, "ms_after": 2.0}},
)

# Retrieve templates
templates = analyzer.get_extension("templates").get_data(outputs="Templates")
print(f"Templates object: {templates}")
print(f"Unit 0 template shape: {templates.get_one_template_dense(0).shape}")
# (n_samples, n_channels)
```
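
The effect of `sparse=True` on storage can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope calculation, not a measurement: the unit count, the 500 spikes per unit, the 30 kHz sampling rate, and the figure of about 20 channels retained per unit are all assumptions chosen for illustration, and `waveform_bytes` is a hypothetical helper.

```python
def waveform_bytes(n_units, spikes_per_unit, n_samples, n_channels, bytes_per_value=4):
    """Size of a float32 waveform array: units x spikes x samples x channels."""
    return n_units * spikes_per_unit * n_samples * n_channels * bytes_per_value


sampling_frequency = 30_000                                   # Hz (assumed)
n_samples = int((1.0 + 2.0) * sampling_frequency / 1000)      # ms_before + ms_after

dense = waveform_bytes(n_units=200, spikes_per_unit=500,
                       n_samples=n_samples, n_channels=384)   # every channel
sparse = waveform_bytes(n_units=200, spikes_per_unit=500,
                        n_samples=n_samples, n_channels=20)   # ~20 channels per unit (assumed)

print(f"samples per snippet: {n_samples}")
print(f"dense:  {dense / 1e9:.2f} GB")
print(f"sparse: {sparse / 1e9:.2f} GB")
print(f"ratio:  {dense / sparse:.1f}x")
```

Under these assumptions the dense array is roughly 19 times larger than the sparse one; the actual ratio depends on the sparsity radius and probe layout.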

```python
import numpy as np

# Compute amplitude and PCA extensions (needed for quality metrics)
analyzer.compute([
    "spike_amplitudes",          # amplitude at peak channel per spike
    "principal_components",      # PCA projections of the sampled waveforms
    "template_similarity",       # pairwise template similarity matrix
    "correlograms",              # auto- and cross-correlograms
    "unit_locations",            # estimated unit position on the probe
])

# Access spike amplitudes for the first unit.
# get_data() returns one amplitude per spike, in the same order as the spike vector.
ext_amp = analyzer.get_extension("spike_amplitudes")
unit_ids = analyzer.unit_ids
spikes = analyzer.sorting.to_spike_vector()
amps = ext_amp.get_data()[spikes["unit_index"] == 0]
print(f"Unit {unit_ids[0]} median amplitude: {np.median(np.abs(amps)):.1f} µV")
```

### Module 5: Quality Metrics

Quality metrics summarize unit isolation quality. Metrics requiring only spike times (ISI violations, firing rate) are fast; metrics requiring waveforms (SNR, amplitude cutoff) need the `SortingAnalyzer` to be populated first.

```python
import spikeinterface.qualitymetrics as sqm

# Compute a standard panel of quality metrics
metrics = sqm.compute_quality_metrics(
    analyzer,
    metric_names=[
        "snr",                    # signal-to-noise ratio of template peak
        "isi_violation",          # refractory-period violations (adds isi_violations_ratio and isi_violations_count)
        "firing_rate",            # mean firing rate (Hz) over recording
        "presence_ratio",         # fraction of time windows with ≥1 spike
        "amplitude_cutoff",       # estimated fraction of spikes below threshold
        "nearest_neighbor",       # nearest-neighbor hit/miss rates in PCA space
        "silhouette",             # cluster separation in PCA space
    ],
)
print(metrics.head())
print(f"\nShape: {metrics.shape}")  # (n_units, n_metrics)
```
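
To show why the spike-time-only metrics are cheap, here is a minimal NumPy sketch that computes a firing rate, a simple fraction of short inter-spike intervals, and a presence ratio from spike times alone. The function names are hypothetical and the definitions are simplified; SpikeInterface's own metrics (in particular `isi_violations_ratio`) may be defined differently in detail.

```python
import numpy as np


def isi_violation_fraction(spike_times_s, refractory_ms=1.5):
    """Fraction of inter-spike intervals shorter than the refractory period."""
    isis = np.diff(np.sort(spike_times_s))
    if isis.size == 0:
        return 0.0
    return float(np.mean(isis < refractory_ms / 1000.0))


def presence_ratio(spike_times_s, duration_s, bin_s=60.0):
    """Fraction of time bins that contain at least one spike."""
    n_bins = int(np.ceil(duration_s / bin_s))
    counts, _ = np.histogram(spike_times_s, bins=n_bins, range=(0.0, n_bins * bin_s))
    return float(np.mean(counts > 0))


# Toy unit in a 600 s recording: one spike every 100 ms during the first 300 s,
# plus two contaminating spikes 1 ms after a real one
regular = np.arange(3000) * 0.1
spikes = np.concatenate([regular, [10.001, 20.001]])
duration_s = 600.0

print(f"firing rate:    {spikes.size / duration_s:.2f} Hz")
print(f"ISI violations: {isi_violation_fraction(spikes):.5f}")
print(f"presence ratio: {presence_ratio(spikes, duration_s):.2f}")
```

The toy unit fires only in the first half of the recording, so its presence ratio is 0.5 and it would fail a 0.9 threshold even though its ISI statistics look clean.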

```python
import pandas as pd

# Apply threshold-based curation (illustrative thresholds; tune them for your probe and brain region)
thresholds = {
    "snr":                   (">=", 5.0),
    "isi_violations_ratio":  ("<=", 0.1),
    "firing_rate":           (">=", 0.1),
    "presence_ratio":        (">=", 0.9),
    "amplitude_cutoff":      ("<=", 0.1),
}

keep = pd.Series(True, index=metrics.index)
for col, (op, val) in thresholds.items():
    if col not in metrics.columns:
        continue
    if op == ">=":
        keep &= metrics[col] >= val
    else:
        keep &= metrics[col] <= val

good_unit_ids = metrics[keep].index.tolist()
print(f"Total units:   {len(metrics)}")
print(f"Curated units: {len(good_unit_ids)} ({100*len(good_unit_ids)/len(metrics):.0f}%)")

# Keep only the good units in the sorting
sorting_curated = sorting_sc2.select_units(good_unit_ids)
```

### Module 6: Comparison and Export

Compare sorters against each other or against ground truth, then export results in shareable formats.

```python
import spikeinterface.comparison as sc

# Compare two sorters — matches units by spike train overlap
comparison = sc.compare_two_sorters(
    sorting_sc2,
    sorting_ks4,
    sorting1_name="SpykingCircus2",
    sorting2_name="Kilosort4",
    match_score=0.5,        # minimum overlap to count as a match
    delta_time=0.4,         # coincidence window (ms)
)

# Best-matching unit pairs in both directions (Hungarian matching).
# get_performance() is only available for ground-truth comparisons.
match_12, match_21 = comparison.get_matching()
print(match_12.head(10))

# Agreement score matrix (rows: sorting1 units, columns: sorting2 units)
agreement_matrix = comparison.agreement_scores
print(f"Agreement matrix shape: {agreement_matrix.shape}")
```
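
The agreement score behind this matching can be sketched in a few lines of plain Python. The version below counts spikes that coincide within `delta_time` and divides by the size of the union of both trains; the function names are hypothetical and the greedy matching is a simplification of what a production implementation would do.

```python
def count_matches(times1_ms, times2_ms, delta_ms=0.4):
    """Greedy one-to-one matching of two sorted spike trains within delta_ms."""
    i = j = matches = 0
    while i < len(times1_ms) and j < len(times2_ms):
        diff = times1_ms[i] - times2_ms[j]
        if abs(diff) <= delta_ms:
            matches += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1   # spike in train 1 has no partner, move on
        else:
            j += 1   # spike in train 2 has no partner, move on
    return matches


def agreement_score(times1_ms, times2_ms, delta_ms=0.4):
    """Matched spikes divided by the union of both spike trains."""
    n_match = count_matches(times1_ms, times2_ms, delta_ms)
    union = len(times1_ms) + len(times2_ms) - n_match
    return n_match / union if union else 0.0


unit_a = [10.0, 20.0, 30.0, 40.0, 50.0]   # spike times in ms
unit_b = [10.1, 20.3, 30.9, 50.2, 60.0]
n_match = count_matches(unit_a, unit_b)
score = agreement_score(unit_a, unit_b)
print(f"matches: {n_match}, agreement: {score:.2f}")  # matches: 3, agreement: 0.43
```

With `match_score=0.5`, this pair would not count as a match even though three of five spikes coincide, which is why lowering `match_score` or widening `delta_time` changes how many units are paired.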

```python
import spikeinterface.exporters as sexp

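# Export curated sorting to NWB (Neurodata Without Borders).
# spikeinterface.exporters has no NWB writer; this sketch assumes the separate
# neuroconv package (pip install neuroconv). Its function name and required
# metadata can differ between versions, so check the neuroconv documentation.
from datetime import datetime, timezone
from neuroconv.tools.spikeinterface import write_sorting_to_nwbfile

write_sorting_to_nwbfile(
    sorting_curated,
    nwbfile_path="./session_sorted.nwb",
    # placeholder start time; replace with the real session start
    metadata={"NWBFile": {"session_start_time": datetime(2024, 1, 1, tzinfo=timezone.utc)}},
    overwrite=True,
)
print("Exported to NWB: session_sorted.nwb")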

# Export to Phy for manual curation
sexp.export_to_phy(
    analyzer,
    output_folder="./phy_export",
    compute_pc_features=True,
    copy_binary=True,
    remove_if_exists=True,
)
print("Phy export ready at: ./phy_export")
print("Launch Phy with: phy template-gui phy_export/params.py")
```

## Common Workflows

### Workflow 1: Multi-Sorter Comparison on OpenEphys Data

**Goal**: Load an OpenEphys recording, preprocess, run two sorters, compare their agreement, curate the higher-yield output, and export to NWB.

```python
import spikeinterface.full as si
import spikeinterface.preprocessing as spre
import spikeinterface.sorters as ss
import spikeinterface.comparison as sc
import spikeinterface.qualitymetrics as sqm
import spikeinterface.exporters as sexp
from pathlib import Path

# --- Step 1: Load ---
data_dir = Path("/data/oe_recording")
stream_names, stream_ids = si.get_neo_streams("openephys", data_dir)
print("Streams:", stream_names)

recording = si.read_openephys(data_dir, stream_name="Signals CH")
print(f"Loaded: {recording.get_num_channels()} ch, "
      f"{recording.get_sampling_frequency()} Hz, "
      f"{recording.get_total_duration():.1f} s")

# --- Step 2: Preprocess ---
rec = spre.bandpass_filter(recording, freq_min=300, freq_max=6000)
rec = spre.common_reference(rec, reference="global", operator="median")
bad_ids, _ = spre.detect_bad_channels(rec, method="coherence+psd")
rec = rec.remove_channels(bad_ids)
print(f"Preprocessing complete. Removed channels: {bad_ids}")

# --- Step 3: Run two sorters ---
out = Path("./sorting_outputs")
sorting_sc2 = ss.run_sorter("spykingcircus2", rec,
                              folder=out / "sc2",
                              remove_existing_folder=True)
sorting_tdc = ss.run_sorter("tridesclous2", rec,
                              folder=out / "tdc",
                              remove_existing_folder=True)
print(f"SC2 units: {len(sorting_sc2.unit_ids)}, "
      f"TDC units: {len(sorting_tdc.unit_ids)}")

# --- Step 4: Compare ---
cmp = sc.compare_two_sorters(sorting_sc2, sorting_tdc,
                               sorting1_name="SC2",
                               sorting2_name="Tridesclous2",
                               match_score=0.5)
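# A two-sorter comparison has no ground truth, so summarize agreement scores instead
agreement = cmp.agreement_scores          # rows: SC2 units, columns: Tridesclous2 units
n_matched = int((agreement.max(axis=1) >= 0.5).sum())
print(f"\nSC2 units with a Tridesclous2 match: {n_matched} / {len(sorting_sc2.unit_ids)}")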

# --- Step 5: Quality metrics on SC2 (higher yield) ---
analyzer = si.create_sorting_analyzer(sorting_sc2, rec,
                                        folder="./analyzer_sc2",
                                        overwrite=True, sparse=True)
analyzer.compute(["random_spikes", "waveforms", "templates",
                  "noise_levels", "spike_amplitudes"])
metrics = sqm.compute_quality_metrics(
    analyzer,
    metric_names=["snr", "firing_rate", "isi_violation",
                  "presence_ratio", "amplitude_cutoff"],
)

keep = (metrics["snr"] >= 5) & (metrics["isi_violations_ratio"] <= 0.1) \
     & (metrics["firing_rate"] >= 0.1) & (metrics["presence_ratio"] >= 0.9)
sorting_curated = sorting_sc2.select_units(metrics[keep].index.tolist())
print(f"\nCurated: {len(sorting_curated.unit_ids)} / {len(sorting_sc2.unit_ids)} units")

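# --- Step 6: Export winner to NWB ---
# spikeinterface.exporters has no NWB writer; this assumes the separate neuroconv
# package, whose function name and required metadata can differ between versions.
from datetime import datetime, timezone
from neuroconv.tools.spikeinterface import write_sorting_to_nwbfile

write_sorting_to_nwbfile(
    sorting_curated,
    nwbfile_path="./session_sorted.nwb",
    # placeholder start time; replace with the real session start
    metadata={"NWBFile": {"session_start_time": datetime(2024, 1, 1, tzinfo=timezone.utc)}},
    overwrite=True,
)
print("Saved: session_sorted.nwb")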
```

### Workflow 2: Ground Truth Validation with Synthetic Recordings

**Goal**: Generate a synthetic recording with known spike trains, run a sorter, and measure true accuracy (recall, precision) against ground truth — for benchmarking sorters or testing preprocessing pipelines.

```python
import spikeinterface.full as si
import spikeinterface.preprocessing as spre
import spikeinterface.sorters as ss
import spikeinterface.comparison as sc
import numpy as np

# --- Step 1: Generate ground-truth synthetic recording ---
# Adds generated noise to synthetic templates placed on a simulated probe
recording_gt, sorting_gt = si.generate_ground_truth_recording(
    durations=[120.0],             # 120 s recording
    sampling_frequency=30000.0,
    num_channels=32,
    num_units=10,
    noise_kwargs={"noise_levels": 10.0, "strategy": "on_the_fly"},
    seed=42,
)
print(f"GT recording: {recording_gt.get_num_channels()} ch, "
      f"{recording_gt.get_total_duration():.0f} s")
print(f"GT units: {len(sorting_gt.unit_ids)}")
print(f"GT firing rates: "
      f"{[round(len(sorting_gt.get_unit_spike_train(u, 0))/120, 1) for u in sorting_gt.unit_ids]} Hz")

# --- Step 2: Preprocess ---
rec_pp = spre.bandpass_filter(recording_gt, freq_min=300, freq_max=6000)
rec_pp = spre.common_reference(rec_pp, reference="global", operator="median")

# --- Step 3: Sort with two sorters ---
sorting_sc2 = ss.run_sorter("spykingcircus2", rec_pp,
                              folder="./gt_sc2",
                              remove_existing_folder=True)
sorting_ms5 = ss.run_sorter("mountainsort5", rec_pp,
                              folder="./gt_ms5",
                              remove_existing_folder=True,
                              scheme="2")

# --- Step 4: Compare each sorter against ground truth ---
for name, sorting_test in [("SC2", sorting_sc2), ("MS5", sorting_ms5)]:
    cmp = sc.compare_sorter_to_ground_truth(sorting_gt, sorting_test,
                                              exhaustive_gt=True)
    perf = cmp.get_performance(method="pooled_with_average")
    print(f"\n{name} vs Ground Truth:")
    print(f"  Accuracy:  {perf['accuracy']:.3f}")
    print(f"  Recall:    {perf['recall']:.3f}")
    print(f"  Precision: {perf['precision']:.3f}")
    print(f"  Well-detected units: {cmp.get_well_detected_units(well_detected_score=0.8)}")
```
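
The performance numbers printed above are ratios of matched and unmatched spike counts. As a minimal sketch, the helper below derives them from true positives (`tp`), false negatives (`fn`), and false positives (`fp`); `performance_from_counts` is a hypothetical name and the counts in the example are made up for illustration.

```python
def performance_from_counts(tp: int, fn: int, fp: int) -> dict:
    """Ground-truth comparison ratios from matched and unmatched spike counts."""

    def ratio(num, den):
        # Avoid ZeroDivisionError for units with no spikes at all
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp, tp + fn + fp),
        "recall": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
        "false_discovery_rate": ratio(fp, tp + fp),
        "miss_rate": ratio(fn, tp + fn),
    }


# Illustrative counts: 900 GT spikes found, 100 missed, 50 spurious detections
perf_example = performance_from_counts(tp=900, fn=100, fp=50)
for name, value in perf_example.items():
    print(f"{name:>22}: {value:.3f}")
```

Accuracy penalizes both missed and spurious spikes, so it is always at most the smaller of recall and precision.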

### Workflow 3: Batch Processing Multiple Sessions

**Goal**: Apply the same preprocessing + sorting pipeline to multiple recording sessions and collect quality metrics across all sessions.

```python
import spikeinterface.full as si
import spikeinterface.preprocessing as spre
import spikeinterface.sorters as ss
import spikeinterface.qualitymetrics as sqm
import pandas as pd
from pathlib import Path

sessions = list(Path("/data/experiment").glob("session_*/"))
all_metrics = []

for session_dir in sessions:
    print(f"Processing {session_dir.name} ...")
    try:
        # get_neo_streams() returns (stream_names, stream_ids)
        stream_names, _ = si.get_neo_streams("spikeglx", session_dir)
        ap_stream = [s for s in stream_names if s.endswith(".ap")][0]
        rec = si.read_spikeglx(session_dir, stream_name=ap_stream)

        # Preprocess
        rec = spre.bandpass_filter(
            spre.common_reference(rec, reference="global", operator="median"),
            freq_min=300, freq_max=6000,
        )
        bad_ids, _ = spre.detect_bad_channels(rec)
        rec = rec.remove_channels(bad_ids)

        # Sort
        out_dir = session_dir / "sorting"
        sorting = ss.run_sorter("spykingcircus2", rec,
                                 folder=out_dir,
                                 remove_existing_folder=True)

        # Compute metrics
        analyzer = si.create_sorting_analyzer(
            sorting, rec, folder=session_dir / "analyzer", overwrite=True, sparse=True
        )
        analyzer.compute(["random_spikes", "waveforms", "templates",
                          "noise_levels", "spike_amplitudes"])
        m = sqm.compute_quality_metrics(
            analyzer, metric_names=["snr", "firing_rate", "isi_violation"]
        )
        m["session"] = session_dir.name
        all_metrics.append(m)

    except Exception as e:
        print(f"  FAILED: {e}")
        continue

# Combine across sessions
combined = pd.concat(all_metrics)
combined.to_csv("all_sessions_metrics.csv")
print(f"\nSaved metrics: {combined.shape[0]} units across {len(all_metrics)} sessions")
print(combined.groupby("session")[["snr", "firing_rate"]].median())
```

## Key Parameters

| Parameter | Module / Function | Default | Range / Options | Effect |
|-----------|-------------------|---------|-----------------|--------|
| `freq_min` / `freq_max` | `spre.bandpass_filter` | 300 / 6000 Hz | 150–500 / 3000–10000 Hz | Spike band; use 300–6000 Hz for AP activity |
| `reference` | `spre.common_reference` | `"global"` | `"global"`, `"local"`, `"single"` | Channel subset used for median reference subtraction |
| `method` | `spre.detect_bad_channels` | `"coherence+psd"` | `"coherence+psd"`, `"std"`, `"mad"`, `"neighborhood_r2"` | Algorithm for bad channel detection |
| `scheme` | `ss.run_sorter("mountainsort5")` | `"2"` | `"1"`, `"2"`, `"3"` | Sorting scheme; scheme 2 recommended for high-density probes |
| `nblocks` | `ss.run_sorter("kilosort4")` | `1` | `0–10` | Number of drift correction blocks; 0 disables drift correction, 1 is rigid, higher values are non-rigid |
| `Th_learned` | `ss.run_sorter("kilosort4")` | `8` | `6–12` | Detection threshold (× noise); lower = more units, more noise |
| `match_score` | `sc.compare_two_sorters` | `0.5` | `0.1–0.9` | Minimum spike-train overlap to declare a unit match |
| `sparse` | `si.create_sorting_analyzer` | `True` | `True`, `False` | Limit waveform extraction to channels near each unit; reduces memory |
| `ms_before` / `ms_after` | `analyzer.compute("waveforms", ...)` | `1.0` / `2.0` ms | 0.5–2.0 / 1.0–3.0 ms | Waveform snippet window relative to detected spike peak |
| `snr` threshold | `sqm.compute_quality_metrics` | — | 5–10 recommended | Amplitude / noise ratio; > 5 indicates well-isolated unit |
| `isi_violations_ratio` | `sqm.compute_quality_metrics` | — | ≤ 0.1 recommended | Contamination estimate based on ISIs shorter than the refractory period (`isi_threshold_ms`, default 1.5 ms); lower values indicate a cleaner single unit |
| `presence_ratio` | `sqm.compute_quality_metrics` | — | ≥ 0.9 recommended | Fraction of recording epochs where unit fires; < 0.9 = drifting unit |

## Best Practices

1. **Always inspect available streams before loading**: Different acquisition systems save AP data, LFP data, and auxiliary channels as separate streams. Loading the wrong stream silently yields valid-looking but incorrect data.
   ```python
   stream_names, stream_ids = si.get_neo_streams("spikeglx", data_dir)
   print(stream_names)  # e.g. ['imec0.ap', 'imec0.lf', 'nidq']
   recording = si.read_spikeglx(data_dir, stream_name="imec0.ap")
   ```

2. **Chain preprocessing lazily; do not load to memory early**: Preprocessing objects are lazy and apply transformations at read time. Calling `get_traces()` on the raw recording before preprocessing will load unfiltered data into RAM unnecessarily. Build the full chain before any data access.

3. **Use `sparse=True` when creating a SortingAnalyzer**: For high-channel-count probes (64–384 channels), dense waveform extraction is 10–50× more expensive in RAM and disk than sparse. Sparse mode extracts waveforms only on the channels nearest each unit.

4. **Run containerized sorters to avoid dependency conflicts**: Kilosort2/3 (MATLAB), IronClust, and other sorters have complex dependencies. Use `docker_image=True` in `run_sorter()` to pull the official container and run the sorter in isolation:
   ```python
   sorting = ss.run_sorter("kilosort2_5", recording_clean,
                            folder="./ks25_out",
                            docker_image=True)
   ```

5. **Compute metrics extensions in dependency order**: Extensions depend on each other. The canonical order is: `random_spikes` → `waveforms` → `templates` → `noise_levels` → `spike_amplitudes` → `principal_components`. Requesting an extension whose prerequisite has not been computed raises an error that names the missing dependency.

6. **Save the SortingAnalyzer to disk for large recordings**: In-memory analyzers (`format="memory"`) are lost when the process exits. For recordings longer than 30 minutes or with many units, always specify a `folder` path so the analyzer can be reloaded:
   ```python
   analyzer = si.load_sorting_analyzer("./analyzer_sc2")
   ```

7. **Do not compare sorters with mismatched preprocessing**: When benchmarking sorters, run all of them on the same preprocessed `recording_clean` object. Running sorters on different preprocessing chains invalidates the comparison.
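
The dependency ordering in practice 5 can be sketched with the standard-library `graphlib` module. The dependency map below is a simplified assumption written for illustration, not SpikeInterface's actual dependency table, and `compute_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter

# Assumed, simplified map: extension -> prerequisites (illustration only)
DEPENDENCIES = {
    "random_spikes": set(),
    "waveforms": {"random_spikes"},
    "templates": {"waveforms"},
    "noise_levels": set(),
    "spike_amplitudes": {"templates"},
    "principal_components": {"waveforms"},
}


def compute_order(requested, dependencies):
    """Return the requested extensions plus prerequisites in a valid order."""
    needed, stack = set(), list(requested)
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(dependencies[name])   # pull in prerequisites
    subgraph = {name: dependencies[name] for name in needed}
    return list(TopologicalSorter(subgraph).static_order())


order = compute_order(["spike_amplitudes"], DEPENDENCIES)
print(order)
# ['random_spikes', 'waveforms', 'templates', 'spike_amplitudes']
```

Passing such an ordered list to a single `analyzer.compute([...])` call keeps the prerequisites explicit instead of relying on memory of the canonical order.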

## Common Recipes

### Recipe: Load and Inspect a Multi-Stream Recording

When to use: Quickly check what streams are available in an unfamiliar recording and confirm channel counts and duration before committing to a full sort.

```python
import spikeinterface.full as si

data_dir = "/data/recording_session"

# Try SpikeGLX first; if it fails, try OpenEphys.
# get_neo_streams() returns (stream_names, stream_ids).
try:
    stream_names, _ = si.get_neo_streams("spikeglx", data_dir)
    fmt = "spikeglx"
except Exception:
    stream_names, _ = si.get_neo_streams("openephys", data_dir)
    fmt = "openephys"

print(f"Format: {fmt}")
print(f"Streams: {stream_names}")

for stream in stream_names:
    try:
        rec = si.read_spikeglx(data_dir, stream_name=stream) if fmt == "spikeglx" \
              else si.read_openephys(data_dir, stream_name=stream)
        print(f"  {stream}: {rec.get_num_channels()} ch, "
              f"{rec.get_sampling_frequency()} Hz, "
              f"{rec.get_total_duration():.1f} s")
    except Exception as e:
        print(f"  {stream}: could not load ({e})")
```

### Recipe: Export Quality Metrics Report to CSV

When to use: After running quality metrics, save a tidy CSV summarizing all units with their metrics and a pass/fail column for downstream analysis or sharing with collaborators.

```python
import spikeinterface.qualitymetrics as sqm
import pandas as pd

metrics = sqm.compute_quality_metrics(
    analyzer,
    metric_names=["snr", "firing_rate", "isi_violation",
                  "presence_ratio", "amplitude_cutoff"],
)

# Add pass/fail column based on standard thresholds
metrics["pass_qc"] = (
    (metrics["snr"] >= 5) &
    (metrics["isi_violations_ratio"] <= 0.1) &
    (metrics["firing_rate"] >= 0.1) &
    (metrics["presence_ratio"] >= 0.9) &
    (metrics["amplitude_cutoff"] <= 0.1)
)

metrics.to_csv("unit_quality_metrics.csv")
n_pass = metrics["pass_qc"].sum()
print(f"QC report saved: {len(metrics)} total units, {n_pass} pass ({100*n_pass/len(metrics):.0f}%)")
print(metrics[metrics["pass_qc"]].describe())
```

### Recipe: Probe Geometry Visualization

When to use: Verify that the probe channel map loaded correctly before sorting. Incorrect channel maps silently degrade sorting quality on high-density probes.

```python
import spikeinterface.full as si
import matplotlib.pyplot as plt
import probeinterface.plotting as pp

recording = si.read_spikeglx("/data/session_001", stream_name="imec0.ap")
probe = recording.get_probe()
print(f"Probe name: {probe.name}")
print(f"N contacts: {probe.get_contact_count()}")
print(f"Contact positions (first 5):\n{probe.contact_positions[:5]}")

fig, ax = plt.subplots(figsize=(3, 10))
pp.plot_probe(probe, ax=ax, with_contact_id=True)
ax.set_title(f"{probe.name} — channel map")
plt.tight_layout()
plt.savefig("probe_geometry.png", dpi=150)
print("Saved probe_geometry.png")
```

## Troubleshooting

| Problem | Cause | Solution |
|---------|-------|----------|
| `ValueError: stream_name not found` | Recording has multiple streams; none is specified | Run `si.get_neo_streams(format, path)` to list available streams; pass the correct one to the reader |
| Sorter output has zero units | Detection threshold too high, or preprocessing removed all signal | Verify `recording_clean.get_traces()` returns non-zero data; lower detection threshold (e.g. `Th_learned=6` for Kilosort4) |
| Error that an extension "requires ... to be computed first" | Analyzer extension depends on an uncomputed prerequisite | Follow the canonical compute order: `random_spikes` → `waveforms` → `templates` → `noise_levels` → `spike_amplitudes` |
| Docker sorter hangs at startup | Docker daemon not running or image not pulled | Run `docker ps` to confirm Docker is running; pull the image manually, e.g. `docker pull spikeinterface/kilosort4-base` (check the sorter list for the current image name) |
| `MemoryError` during waveform extraction | Dense extraction on high-channel-count probe | Use `sparse=True` in `create_sorting_analyzer`; reduce `max_spikes_per_unit` (default 500) when computing the `random_spikes` extension |
| Bad channel detection removes too many channels | Threshold too aggressive or short recording | Set `method="std"` in `detect_bad_channels` for a simpler threshold; increase the `std_mad_threshold` parameter |
| Unit comparison shows 0% agreement between sorters | Delta time window too narrow or match score too strict | Increase `delta_time` (default 0.4 ms) and lower `match_score` (try 0.3) |
| NWB export raises `TypeError` on unit properties | Sorting contains non-serializable properties from sorter | Remove problematic properties with `sorting.delete_property("property_name")` before export |
| `read_spikeglx` fails on LF stream | LFP stream uses different file suffix (`.lf.bin`) | Specify `stream_name="imec0.lf"` explicitly; confirm file exists with `ls data_dir/*.lf.bin` |

## Related Skills

- **neuropixels-analysis** — Neuropixels-specific pipeline using SpikeInterface + Kilosort4 with PSTH, tuning curves, and population decoding for rodent and primate experiments
- **neurokit2** — For biosignal processing (ECG, EEG, EDA, EMG, PPG) rather than spike sorting; use when data is not extracellular electrophysiology

## References

- [SpikeInterface documentation](https://spikeinterface.readthedocs.io/) — full API reference, tutorials, and sorter-specific guides
- [SpikeInterface GitHub](https://github.com/SpikeInterface/spikeinterface) — source code, changelogs, and issue tracker
- [Buccino et al. (2020), eLife — SpikeInterface paper](https://doi.org/10.7554/eLife.61834) — unified framework design and benchmarks across sorters
- [ProbeInterface documentation](https://probeinterface.readthedocs.io/) — probe geometry handling and channel map formats
- [SpikeInterface sorter list](https://spikeinterface.readthedocs.io/en/latest/modules/sorters.html) — supported sorters, requirements, and container images
- [NWB documentation](https://www.nwb.org/) — Neurodata Without Borders format for neurophysiology data sharing