---
name: gemini-video
description: Invoke Google Gemini for video understanding and analysis using the Python google-genai SDK. Supports gemini-3-pro-preview, gemini-2.5-pro, and gemini-2.5-flash for video analysis, transcription, and content extraction.
---

# Gemini Video Skill

Invoke Google Gemini models for video understanding, analysis, transcription, and content extraction using the Python `google-genai` SDK.

## Available Models

| Model ID | Description | Best For |
|----------|-------------|----------|
| `gemini-3-pro-preview` | Best multimodal understanding | Complex video analysis, detailed descriptions |
| `gemini-2.5-pro` | Advanced reasoning | Deep video analysis with reasoning |
| `gemini-2.5-flash` | Fast processing | Quick video summaries, high throughput |

## Configuration

**API Key**: Use environment variable `GEMINI_API_KEY`
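
Every example below reads the key from the environment, so a missing variable only shows up later as an authentication error. A minimal sketch of an early check, where the helper name `require_api_key` is hypothetical and not part of the SDK:

```python
import os


def require_api_key(var_name: str = "GEMINI_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        # Fail before any upload is attempted
        raise RuntimeError(f"{var_name} is not set; export it before running the examples")
    return key
```

The returned value can be passed as `genai.Client(api_key=require_api_key())`.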

## Usage

### Video Analysis (Local File)

For local video files, use the File API to upload first:

```bash
python -c "
import os
import time

from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

# Upload video file
video_file = client.files.upload(file='VIDEO_PATH')
print(f'Uploaded file: {video_file.name}')

# Wait for processing
while video_file.state.name == 'PROCESSING':
    print('Processing video...')
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

if video_file.state.name == 'FAILED':
    raise ValueError('Video processing failed')

# Analyze video
response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='Describe what happens in this video'),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"
```

### Video Analysis (From URL)

```bash
python -c "
import os

from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

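# file_uri must be a URL the API itself can fetch. The Gemini docs describe
# public YouTube URLs; other hosts may be rejected, so upload the file if this fails.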
video_url = 'VIDEO_URL_HERE'

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='Analyze this video and provide a detailed summary'),
            types.Part(file_data=types.FileData(file_uri=video_url, mime_type='video/mp4'))
        ])
    ]
)
print(response.text)
"
```

### Video Transcription

```bash
python -c "
import os
import time

from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

# Upload video
video_file = client.files.upload(file='VIDEO_PATH')
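while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

# Stop early if the upload could not be processed
if video_file.state.name == 'FAILED':
    raise ValueError('Video processing failed')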

# Transcribe
response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='Transcribe all spoken words in this video. Include timestamps if possible.'),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"
```

## Workflow

When this skill is invoked:

1. **Determine the task type**:
   - **Video Summary**: Generate overview of video content
   - **Transcription**: Extract spoken words
   - **Visual Analysis**: Describe visual elements, scenes, actions
   - **Content Extraction**: Pull specific information from video
   - **Q&A**: Answer questions about video content

2. **Prepare the video**:
   - Local file → Upload via File API
   - Remote URL → Use directly (if publicly accessible)
   - Wait for processing if needed

3. **Select the appropriate model**:
   - Complex analysis → `gemini-3-pro-preview`
   - Quick summaries → `gemini-2.5-flash`

4. **Execute and return results**
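
Steps 2 and 3 can be sketched as a small planning function. This is only an illustration under stated assumptions: `plan_request` and the task labels in `QUICK_TASKS` are hypothetical names, and only the model IDs come from the table above.

```python
from urllib.parse import urlparse

# Model IDs are taken from the "Available Models" table
COMPLEX_MODEL = "gemini-3-pro-preview"
FAST_MODEL = "gemini-2.5-flash"

# Hypothetical task labels that count as quick work
QUICK_TASKS = {"summary"}


def plan_request(source: str, task: str) -> dict:
    """Decide whether the source needs a File API upload and which model to use."""
    is_url = urlparse(source).scheme in ("http", "https")
    return {
        "needs_upload": not is_url,  # local paths go through the File API
        "model": FAST_MODEL if task in QUICK_TASKS else COMPLEX_MODEL,
    }
```

For example, `plan_request('meeting_recording.mp4', 'transcription')` marks the file for upload and selects `gemini-3-pro-preview`.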

## Example Invocations

### Summarize Meeting Recording
```bash
python -c "
import os
import time

from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

video_file = client.files.upload(file='meeting_recording.mp4')
while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='''Summarize this meeting recording:
1. List all participants mentioned
2. Key discussion points
3. Action items and decisions made
4. Any deadlines mentioned'''),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"
```

### Analyze Tutorial Video
```bash
python -c "
import os
import time

from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

video_file = client.files.upload(file='tutorial.mp4')
while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='''Analyze this tutorial video and create:
1. A step-by-step guide based on the content
2. Key concepts explained
3. Any tips or best practices mentioned
4. Prerequisites needed to follow along'''),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"
```

### Extract Code from Coding Video
```bash
python -c "
import os
import time

from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

video_file = client.files.upload(file='coding_session.mp4')
while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='''Extract all code shown in this video:
1. Identify the programming language
2. Capture the complete code snippets
3. Note any explanations given for the code
4. List any libraries or dependencies mentioned'''),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"
```

### Timestamp-Based Analysis
```bash
python -c "
import os
import time

from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

video_file = client.files.upload(file='presentation.mp4')
while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='''Create a timestamped outline of this video:
Format: [MM:SS] - Topic/Event
Include major topic changes, key points, and notable moments.'''),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"
```

### Q&A About Video
```bash
python -c "
import os
import time

from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

video_file = client.files.upload(file='lecture.mp4')
while video_file.state.name == 'PROCESSING':
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='YOUR_QUESTION_ABOUT_VIDEO_HERE'),
            types.Part(file_data=types.FileData(file_uri=video_file.uri, mime_type=video_file.mime_type))
        ])
    ]
)
print(response.text)
"
```

## File Management

### List Uploaded Files
```bash
python -c "
import os

from google import genai

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))

for f in client.files.list():
    print(f'{f.name}: {f.state.name} ({f.mime_type})')
"
```

### Delete Uploaded File
```bash
python -c "
import os

from google import genai

client = genai.Client(api_key=os.environ.get('GEMINI_API_KEY'))
client.files.delete(name='files/FILE_ID_HERE')
print('File deleted')
"
```

## Supported Video Formats

- MP4 (`video/mp4`)
- MOV (`video/quicktime`)
- AVI (`video/x-msvideo`)
- FLV (`video/x-flv`)
- MKV (`video/x-matroska`)
- WebM (`video/webm`)
- WMV (`video/x-ms-wmv`)
- 3GPP (`video/3gpp`)
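
When a MIME type has to be given explicitly, as in the URL example, a lookup table avoids typos. The sketch below mirrors the list above. The file extensions are assumptions based on common usage, and `video_mime_type` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Optional

# MIME types copied from the list above; the extensions are assumed
VIDEO_MIME_TYPES = {
    ".mp4": "video/mp4",
    ".mov": "video/quicktime",
    ".avi": "video/x-msvideo",
    ".flv": "video/x-flv",
    ".mkv": "video/x-matroska",
    ".webm": "video/webm",
    ".wmv": "video/x-ms-wmv",
    ".3gp": "video/3gpp",
}


def video_mime_type(path: str) -> Optional[str]:
    """Return the MIME type for a video path, or None if the extension is unknown."""
    return VIDEO_MIME_TYPES.get(Path(path).suffix.lower())
```

A `None` result is a cue to convert the file or check the format before uploading.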

## Video Limitations

- **Maximum file size**: Typically 2 GB per uploaded file; check the current API limits
- **Maximum duration**: Varies by model (typically up to 1 hour)
- **Processing time**: Longer videos take more time to process
- **Quota**: Video input consumes far more tokens than text, so long videos use up quota quickly
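
Checking the file size locally is cheaper than discovering the limit after a long upload. A minimal sketch, assuming the 2 GB figure quoted above still applies. `MAX_UPLOAD_BYTES` and `check_upload_size` are hypothetical names, and the limit should be verified against the current API documentation.

```python
import os

# Assumed limit, taken from the "typically 2GB" note above
MAX_UPLOAD_BYTES = 2 * 1024**3


def check_upload_size(path: str, limit: int = MAX_UPLOAD_BYTES) -> int:
    """Return the file size in bytes, or raise ValueError if it exceeds the limit."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(f"{path} is {size} bytes, which exceeds the {limit}-byte limit")
    return size
```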

## Error Handling

Common errors and solutions:
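- **PROCESSING state stuck**: Video may be too large or corrupted; stop polling after a timeout and try re-encoding or re-uploading
- **FAILED state**: Unsupported format or processing error; convert to a supported format such as MP4 and upload again
- **File not found**: The file was never uploaded or has already been deleted (uploads expire after 48 hours); upload it again before analysis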
- **Rate limiting**: Implement retry with exponential backoff
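
The retry advice can be sketched as a generic wrapper. This is an illustration, not SDK functionality: `retry_with_backoff` is a hypothetical helper, and `retry_on` defaults to `Exception` only because the exception raised on rate limiting is not specified here. Narrow it to the actual error type in real use.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Run call(), retrying with exponentially growing, jittered delays."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the last error
            # Delay doubles each attempt: base, 2*base, 4*base, ... capped at max_delay
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 50% jitter so parallel callers do not retry in lockstep
            sleep(delay + random.uniform(0, delay / 2))
```

A request can then be wrapped as `retry_with_backoff(lambda: client.models.generate_content(...))`, with the same arguments as in the examples above.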

## Notes

- Local videos should be uploaded via the File API before analysis; the examples in this skill do not send video as inline data, and only remote URLs skip the upload step
- Processing time depends on video length and complexity
- Uploaded files are automatically deleted after 48 hours
- For very long videos, consider chunking or asking specific timestamp questions
- Gemini 3 Pro provides the most detailed video analysis
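
For the long-video note, one option is to keep a single upload and ask about one time window per request. The sketch below only builds the prompts. The helper names and the 10-minute default window are assumptions, and the video duration has to be known from elsewhere.

```python
def chunk_ranges(duration_s: int, chunk_s: int = 600) -> list:
    """Split [0, duration_s) into consecutive (start, end) windows in seconds."""
    return [(start, min(start + chunk_s, duration_s))
            for start in range(0, duration_s, chunk_s)]


def mmss(seconds: int) -> str:
    """Format seconds as MM:SS, matching the timestamp style used above."""
    return f"{seconds // 60:02d}:{seconds % 60:02d}"


def chunk_prompts(duration_s: int, chunk_s: int = 600) -> list:
    """Build one timestamp-scoped prompt per window."""
    return [f"Summarize only the segment from {mmss(a)} to {mmss(b)} of this video."
            for a, b in chunk_ranges(duration_s, chunk_s)]
```

Each prompt can be sent as the text part alongside the same `file_data` part, so the video is uploaded only once.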

## Tools to Use

- **Bash**: Execute Python commands
- **Read**: Load local video file paths
- **Write**: Save transcriptions or analysis to files
- **Glob**: Find video files in directories