

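# Multi-Document Agents (V1)

In this guide, you learn how to set up a multi-document agent over the LlamaIndex documentation.

This is an extension of V0 multi-document agents with the following additional features:
- Reranking during document (tool) retrieval
- A query planning tool that the agent can use to plan

We do this with the following architecture:

- Set up a "document agent" over each document: each document agent can do QA/summarization within its document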
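- Set up a top-level agent over this set of document agents. It performs tool retrieval and then does chain-of-thought (CoT) reasoning over the retrieved tools to answer a question.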
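
The two-level architecture described above can be sketched in plain Python. This is only a toy illustration under stated assumptions: `DocAgent`, `retrieve_agents`, and `top_level_query` are hypothetical names, not LlamaIndex classes, and word overlap stands in for the embedding-based tool retrieval used later in this notebook.

```python
from dataclasses import dataclass


@dataclass
class DocAgent:
    """Toy stand-in for a per-document agent (hypothetical, not a LlamaIndex class)."""

    name: str
    summary: str
    text: str

    def query(self, question):
        # A real document agent would choose between vector search and
        # summarization; this stub just returns its text, tagged with its name.
        return f"[{self.name}] {self.text}"


def words(s):
    """Lowercased set of whitespace-separated words."""
    return set(s.lower().split())


def retrieve_agents(question, agents, top_k=2):
    """Tool retrieval: rank document agents by word overlap with their summary."""
    ranked = sorted(
        agents,
        key=lambda a: len(words(question) & words(a.summary)),
        reverse=True,
    )
    return ranked[:top_k]


def top_level_query(question, agents, top_k=2):
    """Top-level agent: retrieve relevant document agents, then consult each one."""
    return [a.query(question) for a in retrieve_agents(question, agents, top_k)]


agents = [
    DocAgent("agents_index", "overview of agents", "Agents pick tools step by step."),
    DocAgent("tools_index", "overview of tools", "Tools wrap query engines or functions."),
    DocAgent("install_index", "installation guide", "Install with pip."),
]

answers = top_level_query("what agents are available", agents, top_k=1)
print(answers)
```

The point of the sketch is that the top-level agent never sees every document agent at once; it only sees the few that retrieval returns for the current question.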


If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.


In [None]:
%pip install llama-index-core
%pip install llama-index-agent-openai
%pip install llama-index-readers-file
%pip install llama-index-postprocessor-cohere-rerank
%pip install llama-index-llms-openai
%pip install llama-index-embeddings-openai
%pip install unstructured[html]

In [None]:
%load_ext autoreload
%autoreload 2

## Setup and Download Data

In this section, we load the LlamaIndex documentation.


In [None]:
domain = "docs.llamaindex.ai"
docs_url = "https://docs.llamaindex.ai/en/latest/"
!wget -e robots=off --recursive --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --domains {domain} --no-parent {docs_url}

In [None]:
from llama_index.readers.file import UnstructuredReader

reader = UnstructuredReader()

In [None]:
from pathlib import Path

all_files_gen = Path("./docs.llamaindex.ai/").rglob("*")
all_files = [f.resolve() for f in all_files_gen]

In [None]:
all_html_files = [f for f in all_files if f.suffix.lower() == ".html"]

In [None]:
len(all_html_files)

1219

In [None]:
from llama_index.core import Document

# TODO: set this to a higher value if you want more docs
doc_limit = 100

docs = []
for idx, f in enumerate(all_html_files):
    if idx > doc_limit:
        break
    print(f"Idx {idx}/{len(all_html_files)}")
    loaded_docs = reader.load_data(file=f, split_documents=True)
    # Hardcoded index. Everything before this is the table of contents
    # shared by all pages.
    start_idx = 72
    loaded_doc = Document(
        text="\n\n".join([d.get_content() for d in loaded_docs[start_idx:]]),
        metadata={"path": str(f)},
    )
    print(loaded_doc.metadata["path"])
    docs.append(loaded_doc)

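## Define Global LLM + Embeddings

In this section, we define a global LLM (large language model) and a global embedding model through `Settings`. The LLM generates answers, and the embedding model maps text to dense vectors for retrieval. Components created later in this notebook use these settings by default.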


In [None]:
import os

os.environ["OPENAI_API_KEY"] = "sk-..."

import nest_asyncio

nest_asyncio.apply()

In [None]:
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import Settings

llm = OpenAI(model="gpt-3.5-turbo")
Settings.llm = llm
Settings.embed_model = OpenAIEmbedding(
 model="text-embedding-3-small", embed_batch_size=256
)

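## Building Multi-Document Agents

In this section we show you how to construct the multi-document agent. We first build a document agent for each document, and then define the top-level parent agent with an object index.


### Build Document Agent for each Document

In this section we define "document agents" for each document.

We define both a vector index (for semantic search) and a summary index (for summarization) for each document. The two query engines are then converted into tools that are passed to an OpenAI function calling agent.

This document agent can dynamically choose to perform semantic search or summarization within a given document.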

We create a separate document agent for each document.


In [None]:
from llama_index.agent.openai import OpenAIAgent
from llama_index.core import (
    load_index_from_storage,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.core import SummaryIndex
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.node_parser import SentenceSplitter
import os
from tqdm.notebook import tqdm
import pickle


async def build_agent_per_doc(nodes, file_base):
    print(file_base)

    vi_out_path = f"./data/llamaindex_docs/{file_base}"
    summary_out_path = f"./data/llamaindex_docs/{file_base}_summary.pkl"
    if not os.path.exists(vi_out_path):
        Path("./data/llamaindex_docs/").mkdir(parents=True, exist_ok=True)
        # build vector index
        vector_index = VectorStoreIndex(nodes)
        vector_index.storage_context.persist(persist_dir=vi_out_path)
    else:
        # reuse the index persisted by an earlier run
        vector_index = load_index_from_storage(
            StorageContext.from_defaults(persist_dir=vi_out_path),
        )

    # build summary index
    summary_index = SummaryIndex(nodes)

    # define query engines
    vector_query_engine = vector_index.as_query_engine(llm=llm)
    summary_query_engine = summary_index.as_query_engine(
        response_mode="tree_summarize", llm=llm
    )

    # extract a summary (cached on disk after the first run)
    if not os.path.exists(summary_out_path):
        Path(summary_out_path).parent.mkdir(parents=True, exist_ok=True)
        summary = str(
            await summary_query_engine.aquery(
                "Extract a concise 1-2 line summary of this document"
            )
        )
        with open(summary_out_path, "wb") as f:
            pickle.dump(summary, f)
    else:
        with open(summary_out_path, "rb") as f:
            summary = pickle.load(f)

    # define tools
    query_engine_tools = [
        QueryEngineTool(
            query_engine=vector_query_engine,
            metadata=ToolMetadata(
                name=f"vector_tool_{file_base}",
                description="Useful for questions related to specific facts",
            ),
        ),
        QueryEngineTool(
            query_engine=summary_query_engine,
            metadata=ToolMetadata(
                name=f"summary_tool_{file_base}",
                description="Useful for summarization questions",
            ),
        ),
    ]

    # build agent
    function_llm = OpenAI(model="gpt-4")
    agent = OpenAIAgent.from_tools(
        query_engine_tools,
        llm=function_llm,
        verbose=True,
        system_prompt=f"""\
You are a specialized agent designed to answer queries about the `{file_base}.html` part of the LlamaIndex docs.
You must ALWAYS use at least one of the tools provided when answering a question; do NOT rely on prior knowledge.\
""",
    )

    return agent, summary


async def build_agents(docs):
    node_parser = SentenceSplitter()

    # Build agents dictionary
    agents_dict = {}
    extra_info_dict = {}

    # # this is for the baseline
    # all_nodes = []

    for idx, doc in enumerate(tqdm(docs)):
        nodes = node_parser.get_nodes_from_documents([doc])
        # all_nodes.extend(nodes)

        # ID will be base + parent
        file_path = Path(doc.metadata["path"])
        file_base = str(file_path.parent.stem) + "_" + str(file_path.stem)
        agent, summary = await build_agent_per_doc(nodes, file_base)

        agents_dict[file_base] = agent
        extra_info_dict[file_base] = {"summary": summary, "nodes": nodes}

    return agents_dict, extra_info_dict
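
The cell above avoids repeated LLM calls by persisting each document summary to disk and reloading it on later runs. A minimal sketch of that compute-once, cache-with-pickle pattern follows. The helper `cached` and the function `expensive_summary` are hypothetical names introduced only for illustration, and a temporary directory replaces `./data/llamaindex_docs/`.

```python
import pickle
import tempfile
from pathlib import Path


def cached(path, compute):
    """Return the pickled value at `path`, computing and saving it on first use."""
    path = Path(path)
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    value = compute()
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(value, f)
    return value


calls = []


def expensive_summary():
    # Stand-in for the LLM summarization call; records each real invocation.
    calls.append(1)
    return "A concise summary."


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "llamaindex_docs" / "latest_index_summary.pkl"
    first = cached(out, expensive_summary)   # computes and writes the file
    second = cached(out, expensive_summary)  # reads the file, no recomputation

print(first == second, len(calls))
```

One consequence of this design is that a cached summary is never refreshed when the underlying document changes; deleting the `.pkl` file forces regeneration.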

In [None]:
agents_dict, extra_info_dict = await build_agents(docs)
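
Each agent is keyed by a `file_base` ID built from the parent directory's stem and the file's stem. The sketch below reproduces that scheme with `pathlib` on made-up paths (the paths and the helper name `make_file_base` are assumptions for illustration). It also shows a limitation worth knowing: two pages with the same file name in same-named directories would map to the same ID.

```python
from pathlib import PurePosixPath


def make_file_base(path):
    """Mirror the ID scheme above: '<parent directory stem>_<file stem>'."""
    p = PurePosixPath(path)
    return f"{p.parent.stem}_{p.stem}"


# Hypothetical paths, chosen only to illustrate the scheme.
root_id = make_file_base("docs.llamaindex.ai/en/latest/index.html")
agents_id = make_file_base("docs.llamaindex.ai/en/latest/agents/index.html")

# Same directory name and file name under different parents: the IDs collide.
a = make_file_base("a/guides/index.html")
b = make_file_base("b/guides/index.html")

print(root_id, agents_id, a == b)
```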

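### Build Retriever-Enabled OpenAI Agent

We build a top-level agent that can orchestrate across the different document agents to answer any user query.

This `RetrieverOpenAIAgent` performs tool retrieval before tool use (unlike a default agent, which tries to put all tools in the prompt).

**Improvements from V0**: We make the following improvements compared to the "base" version in V0.

- Adding in reranking: we use the Cohere reranker to better filter the candidate set of documents.
- Adding in a query planning tool: we add an explicit query planning tool that is dynamically created based on the set of retrieved tools.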
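
The retrieval flow behind these two improvements can be sketched without any external services: retrieve a broad candidate set, rerank it down to a smaller set, then append one extra tool built from whatever was retrieved. This is a toy illustration under stated assumptions. Word overlap stands in for vector similarity, Jaccard similarity stands in for a reranker (it is not how Cohere's reranker scores documents), and all names and tool descriptions are hypothetical.

```python
def tokens(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())


def overlap(query, text):
    """Cheap first-stage score: number of shared words."""
    return len(tokens(query) & tokens(text))


def jaccard(query, text):
    """Second-stage score: shared words divided by all distinct words."""
    q, t = tokens(query), tokens(text)
    return len(q & t) / len(q | t) if q | t else 0.0


def retrieve_tools(query, tools, similarity_top_k=3, top_n=2):
    """Retrieve candidates, rerank them, then append a comparison tool.

    `tools` maps a tool name to its description.
    """
    # Stage 1: broad candidate retrieval with the cheap score.
    candidates = sorted(tools, key=lambda n: overlap(query, tools[n]), reverse=True)
    candidates = candidates[:similarity_top_k]
    # Stage 2: rerank the candidates and keep only the best top_n.
    reranked = sorted(candidates, key=lambda n: jaccard(query, tools[n]), reverse=True)
    reranked = reranked[:top_n]
    # Finally, add one tool that would be built over the retrieved tools.
    return reranked + ["compare_tool"]


tools = {
    "tool_agents": "agents overview and agents types",
    "tool_tools": "tools",
    "tool_install": "how to install and set up the package with pip and conda tools",
    "tool_eval": "evaluation metrics",
}

result = retrieve_tools("compare agents and tools", tools)
print(result)
```

In this example the first stage ranks the long installation page above the short tools page, and the second stage reverses that order, which is the kind of correction reranking is meant to provide. The returned list always has `top_n + 1` entries.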


In [None]:
# define a tool for each document agent
all_tools = []
for file_base, agent in agents_dict.items():
    summary = extra_info_dict[file_base]["summary"]
    doc_tool = QueryEngineTool(
        query_engine=agent,
        metadata=ToolMetadata(
            name=f"tool_{file_base}",
            description=summary,
        ),
    )
    all_tools.append(doc_tool)

In [None]:
print(all_tools[0].metadata)

ToolMetadata(description='This document provides examples and documentation for an agent on the llama index platform.', name='tool_latest_index', fn_schema=)


In [None]:
# define an "object" index and retriever over these tools
from llama_index.core import VectorStoreIndex
from llama_index.core.objects import (
    ObjectIndex,
    ObjectRetriever,
)
from llama_index.postprocessor.cohere_rerank import CohereRerank
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.schema import QueryBundle
from llama_index.llms.openai import OpenAI


llm = OpenAI(model="gpt-4-0613")

obj_index = ObjectIndex.from_objects(
    all_tools,
    index_cls=VectorStoreIndex,
)
vector_node_retriever = obj_index.as_node_retriever(
    similarity_top_k=10,
)


# define a custom object retriever that adds in a query planning tool
class CustomObjectRetriever(ObjectRetriever):
    def __init__(
        self,
        retriever,
        object_node_mapping,
        node_postprocessors=None,
        llm=None,
    ):
        self._retriever = retriever
        self._object_node_mapping = object_node_mapping
        self._llm = llm or OpenAI(model="gpt-4-0613")
        self._node_postprocessors = node_postprocessors or []

    def retrieve(self, query_bundle):
        if isinstance(query_bundle, str):
            query_bundle = QueryBundle(query_str=query_bundle)

        # retrieve candidate tool nodes, then apply postprocessors (reranking)
        nodes = self._retriever.retrieve(query_bundle)
        for processor in self._node_postprocessors:
            nodes = processor.postprocess_nodes(
                nodes, query_bundle=query_bundle
            )
        tools = [self._object_node_mapping.from_node(n.node) for n in nodes]

        # build a sub-question engine over the retrieved tools
        sub_question_engine = SubQuestionQueryEngine.from_defaults(
            query_engine_tools=tools, llm=self._llm
        )
        sub_question_description = f"""\
Useful for any queries that involve comparing multiple documents. ALWAYS use this tool for comparison queries - make sure to call this \
tool with the original query. Do NOT use the other tools for any queries involving multiple documents.
"""
        sub_question_tool = QueryEngineTool(
            query_engine=sub_question_engine,
            metadata=ToolMetadata(
                name="compare_tool", description=sub_question_description
            ),
        )

        return tools + [sub_question_tool]

In [None]:
# wrap it with ObjectRetriever to return objects
custom_obj_retriever = CustomObjectRetriever(
    vector_node_retriever,
    obj_index.object_node_mapping,
    node_postprocessors=[CohereRerank(top_n=5)],
    llm=llm,
)

In [None]:
tmps = custom_obj_retriever.retrieve("hello")

# should be 5 + 1 -- 5 from reranker, 1 from subquestion
print(len(tmps))

6


In [None]:
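from llama_index.agent.openai import OpenAIAgent
from llama_index.core.agent import ReActAgent

top_agent = OpenAIAgent.from_tools(
    tool_retriever=custom_obj_retriever,
    system_prompt=""" \
You are an agent designed to answer queries about the documentation.
Please always use the tools provided to answer a question. Do not rely on prior knowledge.\

""",
    llm=llm,
    verbose=True,
)

# top_agent = ReActAgent.from_tools(
#     tool_retriever=custom_obj_retriever,
#     system_prompt=""" \
# You are an agent designed to answer queries about the documentation.
# Please always use the tools provided to answer a question. Do not rely on prior knowledge.\

# """,
#     llm=llm,
#     verbose=True,
# )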

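### Define Baseline Vector Store Index

As a point of comparison, we define a "naive" RAG pipeline that dumps all docs into a single vector index collection.

We set the top_k = 4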
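
To make the baseline concrete, here is a toy sketch of flat top-k retrieval: every chunk from every document sits in one pool, and the query simply takes the `k` most similar chunks. Bag-of-words counts and cosine similarity stand in for the OpenAI embeddings and vector store used in this notebook; the chunk texts and function names are made up for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (a real pipeline uses a model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two count vectors; 0.0 if either is empty."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k_chunks(query, chunks, k=4):
    """Return the k chunks most similar to the query, regardless of source document."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [
    "agents choose tools",
    "tools wrap query engines",
    "install with pip",
    "agents can plan queries",
    "embeddings map text to vectors",
    "agents and tools work together",
]

hits = top_k_chunks("agents tools", chunks, k=4)
print(hits)
```

Unlike the multi-document agent, this baseline has no notion of which document a chunk came from and no step for decomposing a comparison question, so it can only answer from whatever four chunks happen to score highest.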


In [None]:
all_nodes = [
 n for extra_info in extra_info_dict.values() for n in extra_info["nodes"]
]

In [None]:
base_index = VectorStoreIndex(all_nodes)
base_query_engine = base_index.as_query_engine(similarity_top_k=4)

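## Running Example Queries

Let's run some example queries, ranging from QA / summaries over a single document to QA / summarization over multiple documents.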


In [None]:
response = top_agent.query(
 "What types of agents are available in LlamaIndex?",
)

Added user message to memory: What types of agents are available in LlamaIndex?
=== Calling Function ===
Calling function: tool_agents_index with args: {"input":"types of agents"}
Added user message to memory: types of agents
=== Calling Function ===
Calling function: vector_tool_agents_index with args: {
 "input": "types of agents"
}
Got output: The types of agents mentioned in the provided context are ReActAgent, Native OpenAIAgent, OpenAIAgent with Query Engine Tools, OpenAIAgent Query Planning, OpenAI Assistant, OpenAI Assistant Cookbook, Forced Function Calling, Parallel Function Calling, and Context Retrieval.

Got output: The types of agents mentioned in the `agents_index.html` part of the LlamaIndex docs are:

1. ReActAgent
2. Native OpenAIAgent
3. OpenAIAgent with Query Engine Tools
4. OpenAIAgent Query Planning
5. OpenAI Assistant
6. OpenAI Assistant Cookbook
7. Forced Function Calling
8. Parallel Function Calling
9. Context Retrieval



In [None]:
print(response)

The types of agents available in LlamaIndex include ReActAgent, Native OpenAIAgent, OpenAIAgent with Query Engine Tools, OpenAIAgent Query Planning, OpenAI Assistant, OpenAI Assistant Cookbook, Forced Function Calling, Parallel Function Calling, and Context Retrieval.


In [None]:
# baseline
response = base_query_engine.query(
    "What types of agents are available in LlamaIndex?",
)
print(str(response))

The types of agents available in LlamaIndex are ReActAgent, Native OpenAIAgent, and OpenAIAgent.


In [None]:
response = top_agent.query(
 "Compare the content in the agents page vs. tools page."
)

Added user message to memory: Compare the content in the agents page vs. tools page.
=== Calling Function ===
Calling function: compare_tool with args: {"input":"agents vs tools"}
Generated 2 sub questions.
[tool_understanding_index] Q: What are the functionalities of agents in the Llama Index platform?
Added user message to memory: What are the functionalities of agents in the Llama Index platform?
[tool_understanding_index] Q: How do agents differ from tools in the Llama Index platform?
Added user message to memory: How do agents differ from tools in the Llama Index platform?
=== Calling Function ===
Calling function: vector_tool_understanding_index with args: {
 "input": "difference between agents and tools"
}
=== Calling Function ===
Calling function: vector_tool_understanding_index with args: {
 "input": "functionalities of agents"
}
Got output: Agents are typically individuals or entities that act on behalf of others, making dec

In [None]:
print(response)

The comparison between the content in the agents page and the tools page highlights the difference in their roles and functionalities. Agents on the Llama Index platform are responsible for decision-making and interacting with users, while tools are instruments used to perform specific functions or tasks, controlled by agents to assist in providing responses.


In [None]:
response = top_agent.query(
 "Can you compare the compact and tree_summarize response synthesizer response modes at a very high-level?"
)

Added user message to memory: Can you compare the compact and tree_summarize response synthesizer response modes at a very high-level?
=== Calling Function ===
Calling function: compare_tool with args: {"input":"Compare the compact and tree_summarize response synthesizer response modes at a very high-level."}
Generated 4 sub questions.
[tool_querying_index] Q: What are the key differences between the compact and tree_summarize response synthesizer response modes?
Added user message to memory: What are the key differences between the compact and tree_summarize response synthesizer response modes?
[tool_querying_index] Q: How does the compact response synthesizer response mode optimize query logic and response quality?
Added user message to memory: How does the compact response synthesizer response mode optimize query logic and response quality?
[tool_querying_index] Q: How does the tree_summarize response synthesi

In [None]:
print(str(response))

The "compact" response synthesizer mode provides concise and direct responses, while the "tree_summarize" mode offers detailed and structured responses in a tree-like format. The compact mode is suitable for brief answers, while the tree_summarize mode presents information hierarchically for a comprehensive understanding of the query topic.
