# LLM Utils Module

## Purpose

The utils module provides utility functions for Ollama server management, model discovery, and connection handling.

## Components

### Ollama Utilities (`ollama.py`)

Functions for Ollama server and model management:

#### Model Discovery

- **`is_ollama_running()`** - Check whether the Ollama server is running
- **`get_available_models()`** - List all available Ollama models
- **`select_best_model()`** - Select the best model based on the preference list below
- **`get_model_info()`** - Get detailed information about a model

#### Server Management

- **`start_ollama_server()`** - Start the Ollama server if it is not running
- **`check_ollama_health()`** - Health check

#### Model Operations

- **`preload_model()`** - Preload a model to reduce first-query latency
- **`ensure_model_available()`** - Ensure a model is available, installing it if needed

## Usage Examples

Sketches for the remaining helpers (server management, model installation, preference-based selection, configuration, and retries) appear under Additional Usage Sketches at the end of this README.

### Checking Ollama Status

```python
from infrastructure.llm.utils.ollama import is_ollama_running

if is_ollama_running():
    print("Ollama is ready")
else:
    print("Ollama is not running")
```

### Model Discovery

```python
from infrastructure.llm.utils.ollama import (
    get_available_models,
    select_best_model,
)

# List all models
models = get_available_models()
print(f"Available models: {models}")

# Select the best model
best_model = select_best_model()
print(f"Best model: {best_model}")
```

### Model Preloading

```python
from infrastructure.llm.utils.ollama import preload_model

success, error = preload_model("llama3.2:3b", timeout=60.0)
if success:
    print("Model preloaded successfully")
else:
    print(f"Preload failed: {error}")
```

## Model Preferences

The module uses a default preference list for model selection:

1. `llama3-gradient:latest` - Large context (256K), reliable
2. `llama3.1:latest` - Good balance
3. `llama2:latest` - Widely available
4. `gemma2:2b` - Fast, small
5. `gemma3:4b` - Medium size
6. `mistral:latest` - Alternative
7. `codellama:latest` - Code-focused

## Configuration

Environment variables:

- `OLLAMA_HOST` - Ollama server URL (default: `http://localhost:11434`)
- `OLLAMA_TIMEOUT` - Connection timeout in seconds (default: 2.0)

## Error Handling

Functions return detailed error information:

- Connection errors with helpful messages
- Model availability checks
- Timeout handling with retries

## See Also

- [`README.md`](README.md) - Quick reference
- [`../../llm/AGENTS.md`](../../llm/AGENTS.md) - LLM module overview
- [`ollama.py`](ollama.py) - Implementation
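
## Additional Usage Sketches

The helpers below do not have examples in the sections above. These sketches are illustrative only: the exact signatures and return types are assumptions, so check [`ollama.py`](ollama.py) for the actual interfaces.

### Starting and Health-Checking the Server

A minimal sketch of server management, assuming `start_ollama_server()` and `check_ollama_health()` return truthy values on success:

```python
from infrastructure.llm.utils.ollama import (
    check_ollama_health,
    is_ollama_running,
    start_ollama_server,
)

# Start the server only when it is not already running.
if not is_ollama_running():
    # Assumed to return a truthy value on success; the actual return type may differ.
    started = start_ollama_server()
    print(f"Server started: {started}")

# Assumed to report whether the server responds normally.
if check_ollama_health():
    print("Ollama is healthy")
```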
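
### Ensuring a Model Is Installed

A sketch of `ensure_model_available()`, assuming it returns a `(success, error)` pair like `preload_model()`; that pairing is an assumption, not a documented contract:

```python
from infrastructure.llm.utils.ollama import ensure_model_available, preload_model

# Assumed (success, error) return shape, mirroring preload_model(); check ollama.py.
success, error = ensure_model_available("llama3.1:latest")
if not success:
    print(f"Could not install model: {error}")
else:
    # Warm the model so the first real query does not pay the load cost.
    preload_model("llama3.1:latest", timeout=60.0)
```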
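
### Preference-Based Selection

`select_best_model()` already encapsulates the preference logic; the sketch below only illustrates the idea of walking the preference list and taking the first installed model. It assumes `get_available_models()` returns a list of model name strings, and the `pick_first_available()` helper is hypothetical, not part of the module:

```python
from infrastructure.llm.utils.ollama import get_available_models

# Illustrative preference order; the module's actual list lives in ollama.py.
PREFERRED_MODELS = [
    "llama3-gradient:latest",
    "llama3.1:latest",
    "llama2:latest",
    "gemma2:2b",
    "gemma3:4b",
    "mistral:latest",
    "codellama:latest",
]


def pick_first_available(preferences: list[str]) -> str | None:
    """Return the first preferred model that is actually installed."""
    installed = set(get_available_models())  # assumed to be a list of name strings
    for name in preferences:
        if name in installed:
            return name
    return None


print(pick_first_available(PREFERRED_MODELS))
```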
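
### Reading the Configuration

A standard-library sketch of how `OLLAMA_HOST` and `OLLAMA_TIMEOUT` can drive a liveness probe against Ollama's public `/api/tags` endpoint; the module's own implementation may resolve these settings differently:

```python
import os
import urllib.request

# Defaults mirror the values documented above.
host = os.environ.get("OLLAMA_HOST", "http://localhost:11434")
timeout = float(os.environ.get("OLLAMA_TIMEOUT", "2.0"))


def ollama_reachable() -> bool:
    """Probe the Ollama /api/tags endpoint to see whether the server answers."""
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=timeout) as response:
            return response.status == 200
    except OSError:  # URLError and timeouts are both OSError subclasses
        return False


print("reachable" if ollama_reachable() else "not reachable")
```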
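
### Caller-Side Retries

The module handles timeouts with retries internally; the sketch below is a caller-side pattern for retrying a preload, not a description of that internal behaviour. The `preload_with_retries()` helper is hypothetical:

```python
import time

from infrastructure.llm.utils.ollama import preload_model


def preload_with_retries(model: str, attempts: int = 3, delay: float = 2.0) -> bool:
    """Retry preloading a model a few times before giving up."""
    for attempt in range(1, attempts + 1):
        success, error = preload_model(model, timeout=60.0)
        if success:
            return True
        print(f"Attempt {attempt} failed: {error}")
        time.sleep(delay)
    return False


preload_with_retries("llama3.2:3b")
```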