---
name: adk-rag-agent
description: Build RAG (Retrieval-Augmented Generation) agents with Google ADK and Vertex AI RAG Engine. Use when implementing document Q&A, knowledge base search, or citation-backed responses. Covers VertexAiRagRetrieval tool, corpus setup, and citation formatting.
---

# Google ADK RAG Agent

Build agents that answer questions from document corpora using Vertex AI RAG Engine.

## Requirements

- Vertex AI backend (not Gemini API)
- Google Cloud project with Vertex AI enabled
- RAG corpus created in Vertex AI

## Environment Variables

```bash
GOOGLE_GENAI_USE_VERTEXAI=1
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=us-central1
RAG_CORPUS=projects/{PROJECT_ID}/locations/{LOCATION}/ragCorpora/{CORPUS_ID}
```

## Core Implementation

```python
from google.adk import Agent
from google.adk.tools import VertexAiRagRetrieval

# Configure RAG retrieval tool
rag_tool = VertexAiRagRetrieval(
    name="retrieve_docs",
    description="Retrieve relevant documentation for the question",
    rag_corpus=os.environ["RAG_CORPUS"],
    similarity_top_k=10,
    vector_distance_threshold=0.6,
)

# Create agent with RAG tool
agent = Agent(
    name="rag_agent",
    model="gemini-2.0-flash-001",
    instruction=INSTRUCTION_PROMPT,
    tools=[rag_tool],
)
```

## Instruction Prompt Pattern

```python
INSTRUCTION_PROMPT = """
You are an AI assistant with access to a specialized document corpus.

RETRIEVAL:
- Use retrieve_docs for specific knowledge questions
- Skip retrieval for casual conversation
- Ask clarifying questions when intent is unclear

SCOPE:
- Only answer questions related to the corpus
- Say "I don't have information about that" for out-of-scope queries

CITATIONS:
- Always cite sources at the end of responses
- Format: [Title](url) or [Document Section](url)
- Consolidate multiple citations from the same source
"""
```

## Corpus Setup

Create corpus via Vertex AI Console or SDK:

```python
from vertexai.preview import rag

# Create corpus
corpus = rag.create_corpus(display_name="my-corpus")

# Import documents (PDF, TXT, HTML)
rag.import_files(
    corpus_name=corpus.name,
    paths=["gs://bucket/doc.pdf"],  # or local files
    chunk_size=512,
    chunk_overlap=100,
)
```

## Key Parameters

| Parameter | Description | Default |
|-----------|-------------|---------|
| `similarity_top_k` | Max chunks to retrieve | 10 |
| `vector_distance_threshold` | Min similarity (0-1, lower=stricter) | 0.6 |
| `chunk_size` | Tokens per chunk at import | 512 |
| `chunk_overlap` | Overlap between chunks | 100 |

## Citation Best Practices

1. Single source → single citation at end
2. Multiple sources → list all citations
3. Same document, multiple chunks → consolidate into one citation
4. Never expose internal chunk IDs to users

## References

- [Corpus setup details](references/corpus-setup.md)
- [Sample repo](https://github.com/google/adk-samples/tree/main/python/agents/RAG)