---
name: neuropixels-analysis
description: Analyze Neuropixels extracellular recordings end-to-end with SpikeInterface. Covers loading SpikeGLX/Open Ephys/NWB data, preprocessing, drift/motion correction, Kilosort4 (and CPU) spike sorting, quality metrics, and unit curation (threshold-based, model-based UnitRefine, and AI-assisted visual review). Use when working with Neuropixels 1.0/2.0 recordings, spike sorting, or extracellular electrophysiology analysis.
license: MIT license
metadata:
  version: "2.0"
  skill-author: K-Dense Inc.
---

# Neuropixels Data Analysis

## Overview

Toolkit for analyzing Neuropixels high-density neural recordings using current best
practices from [SpikeInterface](https://spikeinterface.readthedocs.io/), the Allen
Institute, and the International Brain Laboratory (IBL). It covers the full workflow from
raw data to publication-ready curated units.

All examples use the real SpikeInterface API (`spikeinterface.full as si`) plus the
companion curation module (`spikeinterface.curation as sc`). The skill ships runnable
scripts in `scripts/` and a copy-and-edit template in `assets/` that implement this
workflow directly on top of SpikeInterface — there is no separate package to install
beyond the dependencies listed under [Installation](#installation).

## When to Use This Skill

This skill should be used when:
- Working with Neuropixels recordings (`.ap.bin`, `.lf.bin`, `.meta` files)
- Loading data from SpikeGLX, Open Ephys, or NWB formats
- Preprocessing neural recordings (filtering, common reference, bad-channel detection)
- Detecting and correcting motion/drift
- Running spike sorting (Kilosort4, SpykingCircus2, Mountainsort5, Tridesclous2)
- Computing quality metrics (SNR, ISI violations, presence ratio, amplitude cutoff)
- Curating units (threshold-based, model-based, or AI-assisted)
- Creating visualizations and exporting to Phy or NWB

## Supported Hardware & Formats

| Probe | Electrodes | Channels | Notes |
|-------|-----------|----------|-------|
| Neuropixels 1.0 | 960 | 384 | Use `phase_shift` for ADC correction |
| Neuropixels 2.0 (single) | 1280 | 384 | Denser geometry |
| Neuropixels 2.0 (4-shank) | 5120 | 384 | Multi-region recording |

| Format | Extension | Reader |
|--------|-----------|--------|
| SpikeGLX | `.ap.bin`, `.lf.bin`, `.meta` | `si.read_spikeglx()` |
| Open Ephys | `.continuous`, `.oebin` | `si.read_openephys()` |
| NWB | `.nwb` | `si.read_nwb()` |

## Quick Start

### Import and configure parallel processing

```python
import spikeinterface.full as si

# Global job kwargs are reused by all parallelizable steps
si.set_global_job_kwargs(n_jobs=-1, chunk_duration="1s", progress_bar=True)
```

### Loading data

```python
# Inspect available streams first
stream_names, stream_ids = si.get_neo_streams("spikeglx", "/path/to/run_g0/")
print(stream_names)  # e.g. ['imec0.ap', 'imec0.lf', 'nidq']

# SpikeGLX (most common) — select the AP stream by name
recording = si.read_spikeglx("/path/to/run_g0/", stream_name="imec0.ap", load_sync_channel=False)

# Open Ephys
recording = si.read_openephys("/path/to/Record_Node_101/")

# For quick iteration, slice the first 60 s
fs = recording.get_sampling_frequency()
recording_sub = recording.frame_slice(0, int(60 * fs))
```

### Full pipeline (bundled script)

The repository ships an end-to-end pipeline built on SpikeInterface:

```bash
python scripts/neuropixels_pipeline.py /path/to/spikeglx/data output/ --sorter kilosort4 --curation allen
```

It performs load → preprocess → drift check → optional motion correction → sorting →
postprocessing → quality metrics → curation → export. Read the steps below to run them
interactively or customize the pipeline.

## Standard Analysis Workflow

### 1. Preprocessing

Recommended chain, following the SpikeInterface Neuropixels how-to (IBL-style destriping
with channel removal + common reference):

```python
rec = si.highpass_filter(recording, freq_min=400.0)
bad_channel_ids, channel_labels = si.detect_bad_channels(rec)
rec = rec.remove_channels(bad_channel_ids)
rec = si.phase_shift(rec)  # ADC phase correction (Neuropixels 1.0)
rec = si.common_reference(rec, operator="median", reference="global")
```

Save the preprocessed recording (Kilosort needs a binary file, and it speeds up reuse):

```python
rec = rec.save(folder="preprocessed/", format="binary")
```

### 2. Check and correct drift

Always inspect drift before sorting:

```python
from spikeinterface.sortingcomponents.peak_detection import detect_peaks
from spikeinterface.sortingcomponents.peak_localization import localize_peaks

noise_levels = si.get_noise_levels(rec, return_in_uV=False)
peaks = detect_peaks(rec, method="locally_exclusive", noise_levels=noise_levels,
                     detect_threshold=5, radius_um=50.0)
peak_locations = localize_peaks(rec, peaks, method="center_of_mass")

# Visualize the drift raster
si.plot_drift_raster_map(peaks=peaks, peak_locations=peak_locations,
                         recording=rec, clim=(-50, 50))
```

Apply correction if needed (presets: `rigid_fast`, `kilosort_like`,
`nonrigid_accurate`, `nonrigid_fast_and_accurate`, `dredge`, `dredge_fast`):

```python
rec_corrected = si.correct_motion(rec, preset="nonrigid_fast_and_accurate", folder="motion/")
```

### 3. Spike sorting

```python
# Kilosort4 (recommended, requires a CUDA GPU)
sorting = si.run_sorter("kilosort4", rec_corrected, folder="ks4_output")

# CPU alternatives (internally developed, no external install)
sorting = si.run_sorter("spykingcircus2", rec_corrected, folder="sc2_output")
sorting = si.run_sorter("tridesclous2", rec_corrected, folder="tdc2_output")
sorting = si.run_sorter("mountainsort5", rec_corrected, folder="ms5_output")

# External sorters can run in containers without local install
sorting = si.run_sorter("kilosort2_5", rec_corrected, folder="ks25_output", docker_image=True)

print(si.installed_sorters())
```

> Note: `run_sorter` uses the `folder=` argument. The older `output_folder=` is deprecated.

### 4. Postprocessing

```python
analyzer = si.create_sorting_analyzer(sorting, rec_corrected, sparse=True,
                                      format="binary_folder", folder="analyzer/")

analyzer.compute("random_spikes", method="uniform", max_spikes_per_unit=500)
analyzer.compute("waveforms", ms_before=1.0, ms_after=2.0)
analyzer.compute("templates", operators=["average", "std"])
analyzer.compute("noise_levels")
analyzer.compute("spike_amplitudes")
analyzer.compute("correlograms", window_ms=50.0, bin_ms=1.0)
analyzer.compute("unit_locations", method="monopolar_triangulation")
analyzer.compute("template_similarity")

metric_names = ["firing_rate", "presence_ratio", "snr", "isi_violation", "amplitude_cutoff"]
analyzer.compute("quality_metrics", metric_names=metric_names)
metrics = analyzer.get_extension("quality_metrics").get_data()
```

### 5. Curation by metric thresholds

```python
# Allen-style query (note: column is isi_violations_ratio)
query = "(amplitude_cutoff < 0.1) & (isi_violations_ratio < 0.5) & (presence_ratio > 0.9)"
good_unit_ids = metrics.query(query).index.values
```

For reusable, multi-threshold logic with `allen` / `ibl` / `strict` presets, use the
bundled `scripts/compute_metrics.py`. See
[references/AUTOMATED_CURATION.md](references/AUTOMATED_CURATION.md) for details and the
Bombcell / UnitMatch tools.

### 6. Model-based curation (UnitRefine)

SpikeInterface can apply pretrained machine-learning classifiers from Hugging Face via the
`spikeinterface.curation` module. The UnitRefine models were trained on real Neuropixels
data (V1, SC, ALM):

```python
import spikeinterface.curation as sc

# 1) noise vs neural
noise_labels = sc.model_based_label_units(
    sorting_analyzer=analyzer,
    repo_id="SpikeInterface/UnitRefine_noise_neural_classifier",
    trust_model=True,
)
neural = analyzer.remove_units(noise_labels[noise_labels["prediction"] == "noise"].index)

# 2) single-unit (sua) vs multi-unit (mua) on the surviving units
sua_mua_labels = sc.model_based_label_units(
    sorting_analyzer=neural,
    repo_id="SpikeInterface/UnitRefine_sua_mua_classifier",
    trust_model=True,
)
```

Each call returns a DataFrame with `prediction` and `probability` (confidence) per unit.
`trust_model=True` (or an explicit `trusted=[...]` list) is required to load the `.skops`
model — only load models from sources you trust. Models trained on other brain
areas/datasets may not transfer; validate against a manually labelled subset.

### 7. AI-assisted curation (for uncertain units)

When running inside an agent such as Cursor or Claude Code, the agent can directly inspect
waveform/correlogram plots and give an expert read — no API setup required. Generate plots
and ask the agent to assess isolation quality.

For programmatic vision-model access, **read API keys from the environment — never hardcode
credentials in analysis scripts** (they leak into version control and logs):

```python
import os
from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])  # set this in your shell, not in code
```

See [references/AI_CURATION.md](references/AI_CURATION.md) for the full pattern (rendering a
unit summary image, building the prompt, and parsing the response).

### 8. Export results

```python
# Keep only good units, then export
analyzer_clean = analyzer.select_units(good_unit_ids, folder="analyzer_clean/", format="binary_folder")

# Phy for manual review
si.export_to_phy(analyzer_clean, output_folder="phy_export/",
                 compute_pc_features=True, compute_amplitudes=True)

# Figures report
si.export_report(analyzer_clean, "report/", format="png")

# NWB
from spikeinterface.exporters import export_to_nwb
export_to_nwb(analyzer_clean, "output.nwb")

# Metrics table
metrics.to_csv("quality_metrics.csv")
```

## Common Pitfalls and Best Practices

1. **Always check drift** before spike sorting — drift > ~10 μm meaningfully degrades quality.
2. **Use `phase_shift`** for Neuropixels 1.0 to correct ADC sampling offsets.
3. **Save the preprocessed recording** with `rec.save(folder=...)` to avoid recomputation (Kilosort also needs a binary file).
4. **Use a GPU** for Kilosort4 — it is far faster than CPU sorters.
5. **Review uncertain units** — automated/model-based curation is a starting point, not a verdict.
6. **Combine approaches** — thresholds for clear cases, model/AI for borderline units.
7. **Document thresholds and model repo IDs** for reproducibility.
8. **Export to Phy** for critical experiments — human oversight is valuable.

## Key Parameters to Adjust

### Preprocessing
- `freq_min`: highpass cutoff (300–400 Hz typical)
- `detect_bad_channels`: returns `(bad_channel_ids, channel_labels)`

### Motion Correction
- `preset`: `nonrigid_fast_and_accurate` (balanced), `nonrigid_accurate` (severe drift), `dredge` (state of the art)

### Spike Sorting (Kilosort4)
- `batch_size`: samples per batch (60000 default)
- `nblocks`: drift blocks (increase for long, drifty recordings)
- `Th_universal` / `Th_learned`: detection thresholds (lower = more spikes)

### Quality Metrics
- `snr`: signal-to-noise cutoff (3–5 typical)
- `isi_violations_ratio`: refractory violations (0.01–0.5)
- `presence_ratio`: recording coverage (0.5–0.95)

## Bundled Resources

### scripts/explore_recording.py
Quick inspection of a recording (streams, channels, duration, bad channels):
```bash
python scripts/explore_recording.py /path/to/data
```

### scripts/preprocess_recording.py
Automated preprocessing:
```bash
python scripts/preprocess_recording.py /path/to/data --output preprocessed/
```

### scripts/run_sorting.py
Run spike sorting:
```bash
python scripts/run_sorting.py preprocessed/ --sorter kilosort4 --output sorting/
```

### scripts/compute_metrics.py
Compute quality metrics and apply curation:
```bash
python scripts/compute_metrics.py sorting/ preprocessed/ --output metrics/ --curation allen
```

### scripts/export_to_phy.py
Export to Phy for manual curation:
```bash
python scripts/export_to_phy.py metrics/analyzer --output phy_export/
```

### scripts/neuropixels_pipeline.py
Complete end-to-end pipeline (see [Quick Start](#full-pipeline-bundled-script)).

### assets/analysis_template.py
Complete, editable analysis template. Copy and customize:
```bash
cp assets/analysis_template.py my_analysis.py
# Edit the PARAMETERS section, then run
python my_analysis.py
```

## Detailed Reference Guides

| Topic | Reference |
|-------|-----------|
| Full workflow | [references/standard_workflow.md](references/standard_workflow.md) |
| API reference (SpikeInterface) | [references/api_reference.md](references/api_reference.md) |
| Plotting guide | [references/plotting_guide.md](references/plotting_guide.md) |
| Preprocessing | [references/PREPROCESSING.md](references/PREPROCESSING.md) |
| Spike sorting | [references/SPIKE_SORTING.md](references/SPIKE_SORTING.md) |
| Motion correction | [references/MOTION_CORRECTION.md](references/MOTION_CORRECTION.md) |
| Quality metrics | [references/QUALITY_METRICS.md](references/QUALITY_METRICS.md) |
| Automated & model-based curation | [references/AUTOMATED_CURATION.md](references/AUTOMATED_CURATION.md) |
| AI-assisted curation | [references/AI_CURATION.md](references/AI_CURATION.md) |
| Waveform analysis | [references/ANALYSIS.md](references/ANALYSIS.md) |

## Installation

Requires Python ≥ 3.10. Using [uv](https://docs.astral.sh/uv/) is recommended.

```bash
# Core packages (SpikeInterface bundles the curation/model tooling)
uv pip install "spikeinterface[full]" probeinterface neo

# Spike sorters
uv pip install kilosort          # Kilosort4 (CUDA GPU required)
uv pip install spykingcircus     # SpykingCircus (legacy; SpykingCircus2 ships with SpikeInterface)
uv pip install mountainsort5     # Mountainsort5 (CPU)

# Model-based curation (UnitRefine) downloads from Hugging Face
uv pip install "huggingface_hub" skops

# Optional: AI-assisted visual curation
uv pip install anthropic

# Optional: IBL tools and Bombcell
uv pip install ibl-neuropixel ibllib bombcell
```

For reproducible environments, pin versions (current as of 2026-06: `spikeinterface==0.104.3`,
`kilosort==4.1.7`, `probeinterface==0.3.2`, `neo==0.14.4`). Unpinned installs are fine for
quick experimentation but should be pinned in production pipelines.

## Project Structure

```
project/
├── raw_data/
│   └── recording_g0/
│       └── recording_g0_imec0/
│           ├── recording_g0_t0.imec0.ap.bin
│           └── recording_g0_t0.imec0.ap.meta
├── preprocessed/           # Saved preprocessed recording
├── motion/                 # Motion estimation results
├── sorting_output/         # Spike sorter output
├── analyzer/               # SortingAnalyzer (waveforms, metrics)
├── phy_export/             # For manual curation
├── ai_curation/            # AI analysis reports
└── results/
    ├── quality_metrics.csv
    ├── curation_labels.json
    └── output.nwb
```

## Additional Resources

- **SpikeInterface Docs**: https://spikeinterface.readthedocs.io/
- **Neuropixels Tutorial**: https://spikeinterface.readthedocs.io/en/stable/how_to/analyze_neuropixels.html
- **Model-based Curation Tutorial**: https://spikeinterface.readthedocs.io/en/stable/tutorials/curation/plot_1_automated_curation.html
- **UnitRefine Models (Hugging Face)**: https://huggingface.co/SpikeInterface
- **Kilosort4 GitHub**: https://github.com/MouseLand/Kilosort
- **IBL Neuropixel Tools**: https://github.com/int-brain-lab/ibl-neuropixel
- **Allen Institute ecephys**: https://github.com/AllenInstitute/ecephys_spike_sorting
- **Bombcell (Automated QC)**: https://github.com/Julie-Fabre/bombcell
- **Awesome Neuropixels**: https://github.com/Julie-Fabre/awesome_neuropixels