---
name: numpy-best-practices
description: Best practices for NumPy array programming, numerical computing, and performance optimization in Python
---

# NumPy Best Practices

Expert guidelines for NumPy development, focusing on array programming, numerical computing, and performance optimization.

## Code Style and Structure

- Write concise, technical Python code with accurate NumPy examples
- Prefer vectorized operations over explicit loops for performance
- Use descriptive variable names reflecting data content (e.g., `weights`, `gradients`, `input_array`)
- Follow PEP 8 style guidelines for Python code
- Use functional programming patterns when appropriate

## Array Creation and Manipulation

- Use appropriate array creation functions: `np.array()`, `np.zeros()`, `np.ones()`, `np.empty()`, `np.arange()`, `np.linspace()`
- Prefer `np.zeros()` or `np.empty()` for pre-allocation when array size is known
- Use `np.concatenate()`, `np.vstack()`, `np.hstack()` for combining arrays
- Leverage broadcasting for operations on arrays with different shapes

## Indexing and Slicing

- Use advanced indexing with boolean arrays for conditional selection
- Prefer views over copies when possible to save memory
- Use `np.where()` for conditional element selection
- Understand the difference between fancy indexing (creates copy) and basic slicing (creates view)

## Data Types

- Specify appropriate data types explicitly using `dtype` parameter
- Use `np.float32` for memory-efficient computations when full precision is not needed
- Be aware of integer overflow with fixed-size integer types
- Use `np.asarray()` for type conversion without unnecessary copies

## Performance Optimization

### Vectorization

- Always prefer vectorized operations over Python loops
- Use NumPy universal functions (ufuncs) for element-wise operations
- Leverage `np.einsum()` for complex tensor operations
- Use `np.dot()` or `@` operator for matrix multiplication

### Memory Management

- Use `np.ndarray.flags` to check memory layout (C-contiguous vs Fortran-contiguous)
- Prefer in-place operations with `out` parameter when possible
- Use memory-mapped arrays (`np.memmap`) for large datasets
- Be mindful of array copies vs views

### Computation Efficiency

- Use `np.sum()`, `np.mean()`, `np.std()` with `axis` parameter for aggregations
- Leverage `np.cumsum()`, `np.cumprod()` for cumulative operations
- Use `np.searchsorted()` for efficient sorted array operations

## Error Handling and Validation

- Validate input shapes and data types before computations
- Use assertions for dimension checking with informative messages
- Handle NaN and Inf values appropriately with `np.isnan()`, `np.isinf()`
- Use `np.errstate()` context manager for controlling floating-point error handling

## Random Number Generation

- Use `np.random.default_rng()` for modern random number generation
- Set seeds for reproducibility: `rng = np.random.default_rng(seed=42)`
- Prefer the new Generator API over legacy `np.random` functions
- Use appropriate distributions: `rng.normal()`, `rng.uniform()`, `rng.choice()`

## Linear Algebra

- Use `np.linalg` for linear algebra operations
- Leverage `np.linalg.solve()` instead of computing inverse for linear systems
- Use `np.linalg.eig()`, `np.linalg.svd()` for decompositions
- Check matrix condition with `np.linalg.cond()` before inversion

## Testing and Documentation

- Write unit tests using `pytest` with `np.testing` assertions
- Use `np.testing.assert_array_equal()` for exact comparisons
- Use `np.testing.assert_array_almost_equal()` for floating-point comparisons
- Include comprehensive docstrings following NumPy docstring format

## Key Conventions

- Import as `import numpy as np`
- Use `snake_case` for variables and functions
- Document array shapes in docstrings
- Profile code with `%timeit` to identify bottlenecks