{"cells": [{"cell_type": "markdown", "id": "91c0b9fd-213a-4da8-b84b-c766b424716c", "metadata": {}, "source": ["# GPT Builder Demo\n", "\n", "Inspired by the GPTs interface presented at OpenAI Dev Day 2023. Build an agent using natural language.\n", "\n", "Here, you can build your own agent... with another agent!\n"]}, {"cell_type": "code", "execution_count": null, "id": "3e112b8c", "metadata": {}, "outputs": [], "source": ["%pip install llama-index-agent-openai\n", "%pip install llama-index-embeddings-openai\n", "%pip install llama-index-llms-openai"]}, {"cell_type": "code", "execution_count": null, "id": "93ff34e9", "metadata": {}, "outputs": [], "source": ["import os\n", "\n", "os.environ[\"OPENAI_API_KEY\"] = \"sk-...\""]}, {"cell_type": "code", "execution_count": null, "id": "0b6d7505-d582-465e-b86c-eaf2cf8c28f8", "metadata": {}, "outputs": [], "source": ["from llama_index.embeddings.openai import OpenAIEmbedding\n", "from llama_index.llms.openai import OpenAI\n", "from llama_index.core import Settings\n", "\n", "llm = OpenAI(model=\"gpt-4\")\n", "Settings.llm = llm\n", "Settings.embed_model = OpenAIEmbedding(model=\"text-embedding-3-small\")"]}, {"cell_type": "markdown", "id": "839cb488-912e-4a34-88a4-98c751798fcc", "metadata": {}, "source": ["## Define Candidate Tools\n", "\n", "We define a set of candidate tools, along with a tool retriever that selects among them.\n", "\n", "In this setup, each tool is a query engine over a different Wikipedia page.\n"]}, {"cell_type": "code", "execution_count": null, "id": "bc91fd57-a681-4c18-991c-6f011d180dea", "metadata": {}, "outputs": [], "source": ["from llama_index.core import SimpleDirectoryReader"]}, {"cell_type": "code", "execution_count": null, "id": "50797099-bff6-40f8-b245-62a80b07e7db", "metadata": {}, "outputs": [], "source": ["wiki_titles = [\"Toronto\", \"Seattle\", \"Chicago\", \"Boston\", \"Houston\"]"]}, {"cell_type": "code", "execution_count": null, "id": "c5aba2fc-9bde-44e7-8a69-8a25ffa8de73", "metadata": {}, "outputs": [], "source": ["from pathlib import Path\n", "\n", "import requests\n", "\n", "for title in wiki_titles:\n", "    response = requests.get(\n", "        \"https://en.wikipedia.org/w/api.php\",\n", "        
params={\n", "            \"action\": \"query\",\n", "            \"format\": \"json\",\n", "            \"titles\": title,\n", "            \"prop\": \"extracts\",\n", "            # 'exintro': True,\n", "            \"explaintext\": True,\n", "        },\n", "    ).json()\n", "    page = next(iter(response[\"query\"][\"pages\"].values()))\n", "    wiki_text = page[\"extract\"]\n", "\n", "    data_path = Path(\"data\")\n", "    data_path.mkdir(exist_ok=True)\n", "\n", "    with open(data_path / f\"{title}.txt\", \"w\", encoding=\"utf-8\") as fp:\n", "        fp.write(wiki_text)"]}, {"cell_type": "markdown", "id": "1642f27f-457a-4cd7-b543-9f81a04a42da", "metadata": {}, "source": ["### Load Data\n", "\n", "Here we load the Wikipedia pages for the different cities.\n"]}, {"cell_type": "code", "execution_count": null, "id": "a034ebf2-4a31-488b-bfbf-777dbc768426", "metadata": {}, "outputs": [], "source": ["# Load all wiki documents\n", "city_docs = {}\n", "for wiki_title in wiki_titles:\n", "    city_docs[wiki_title] = SimpleDirectoryReader(\n", "        input_files=[f\"data/{wiki_title}.txt\"]\n", "    ).load_data()"]}, {"cell_type": "markdown", "id": "96ccd68b-45fe-43aa-a209-b2fd5d2aa75d", "metadata": {}, "source": ["### Build Query Tool for Each Document\n"]}, {"cell_type": "code", "execution_count": null, "id": "cc9e8634-18e3-4762-8e9e-792a5ce8e934", "metadata": {}, "outputs": [], "source": ["from llama_index.core import VectorStoreIndex\n", "from llama_index.agent.openai import OpenAIAgent\n", "from llama_index.core.tools import QueryEngineTool, ToolMetadata\n", "\n", "# Build a dictionary mapping each city name to its tool\n", "tool_dict = {}\n", "\n", "for wiki_title in wiki_titles:\n", "    # Build a vector index over the city's documents\n", "    vector_index = VectorStoreIndex.from_documents(\n", "        city_docs[wiki_title],\n", "    )\n", "    # Define a query engine over the index\n", "    vector_query_engine = vector_index.as_query_engine(llm=llm)\n", "\n", "    # Wrap the query engine as a tool\n", "    vector_tool = QueryEngineTool(\n", "        query_engine=vector_query_engine,\n", "        metadata=ToolMetadata(\n", "            name=wiki_title,\n", "            description=f\"Useful for questions related to {wiki_title}\",\n", "        ),\n", "    )\n", "    tool_dict[wiki_title] = vector_tool"]}, {"cell_type": "markdown", "id": "9d2a3aeb-11ee-4dd2-aabb-d148213e234a", "metadata": {}, "source": ["### Define Tool Retriever\n"]}, {"cell_type": "code", "execution_count": null, "id": "70d41c03-4110-4990-990b-7a3d706c0c84", "metadata": {}, "outputs": [], "source": ["# Define an \"object\" index and retriever over these tools\n", "from llama_index.core import VectorStoreIndex\n", "from llama_index.core.objects import ObjectIndex\n", "\n", "tool_index = ObjectIndex.from_objects(\n", "    list(tool_dict.values()),\n", "    index_cls=VectorStoreIndex,\n", ")\n", "tool_retriever = tool_index.as_retriever(similarity_top_k=1)"]}, {"cell_type": "markdown", "id": "063cc9a7-d74c-4e08-9110-a63e11841d7f", "metadata": {}, "source": ["## Define Meta-Tools for GPT Builder\n", "\n", "The builder agent constructs a new agent by calling three meta-tools: one generates a system prompt for the task, one retrieves the relevant candidate tools, and one creates the final agent from the system prompt and the selected tools.\n"]}, {"cell_type": "code", "execution_count": null, "id": "4d76c29b-4b24-4d47-bd1e-027af9427f6c", "metadata": {}, "outputs": [], "source": ["from llama_index.core.llms import ChatMessage\n", "from llama_index.core import ChatPromptTemplate\n", "from typing import List\n", "\n", "GEN_SYS_PROMPT_STR = \"\"\"\\\n", "Task information is given below.\n", "\n", "Given the task, please generate a system prompt for an OpenAI-powered bot to solve this task:\n", "{task} \\\n", "\"\"\"\n", "\n", "gen_sys_prompt_messages = [\n", "    ChatMessage(\n", "        role=\"system\",\n", "        content=\"You are helping to build a system prompt for another bot.\",\n", "    ),\n", "    ChatMessage(role=\"user\", content=GEN_SYS_PROMPT_STR),\n", "]\n", "\n", "GEN_SYS_PROMPT_TMPL = ChatPromptTemplate(gen_sys_prompt_messages)\n", "\n", "\n", "agent_cache = {}\n", "\n", "\n", "def create_system_prompt(task: str):\n", "    \"\"\"Create a system prompt for another agent given an input task.\"\"\"\n", "    llm = OpenAI(model=\"gpt-4\")\n", "    fmt_messages = GEN_SYS_PROMPT_TMPL.format_messages(task=task)\n", "    response = llm.chat(fmt_messages)\n", "    return response.message.content\n", "\n", "\n", "def get_tools(task: str):\n", "    \"\"\"Get the set of relevant tools to use given an input task.\"\"\"\n", "    subset_tools = tool_retriever.retrieve(task)\n", "    return [t.metadata.name for t in 
subset_tools]\n", "\n", "\n", "def create_agent(system_prompt: str, tool_names: List[str]):\n", "    \"\"\"Create an agent given a system prompt and an input set of tools.\"\"\"\n", "    llm = OpenAI(model=\"gpt-4\")\n", "    try:\n", "        # Look up the tools by name\n", "        input_tools = [tool_dict[tn] for tn in tool_names]\n", "\n", "        # Pass the generated system prompt through to the new agent\n", "        agent = OpenAIAgent.from_tools(\n", "            input_tools, llm=llm, system_prompt=system_prompt, verbose=True\n", "        )\n", "        agent_cache[\"agent\"] = agent\n", "        return_msg = \"Agent created successfully.\"\n", "    except Exception as e:\n", "        return_msg = f\"An error occurred when building an agent. Here is the error: {repr(e)}\"\n", "    return return_msg"]}, {"cell_type": "code", "execution_count": null, "id": "02ebe043-73f1-4f09-88e1-011ceb4ed05d", "metadata": {}, "outputs": [], "source": ["from llama_index.core.tools import FunctionTool\n", "\n", "system_prompt_tool = FunctionTool.from_defaults(fn=create_system_prompt)\n", "get_tools_tool = FunctionTool.from_defaults(fn=get_tools)\n", "create_agent_tool = FunctionTool.from_defaults(fn=create_agent)"]}, {"cell_type": "code", "execution_count": null, "id": "c42842b3-438e-484e-9598-ad7dc7e2de09", "metadata": {}, "outputs": [], "source": ["GPT_BUILDER_SYS_STR = \"\"\"\\\n", "You are helping to construct an agent given a user-specified task. You should generally use the tools in this order to build the agent.\n", "\n", "1) Create system prompt tool: to create the system prompt for the agent.\n", "2) Get tools tool: to get the candidate set of tools to use.\n", "3) Create agent tool: to create the final agent.\n", "\"\"\"\n", "\n", "prefix_msgs = [ChatMessage(role=\"system\", content=GPT_BUILDER_SYS_STR)]\n", "\n", "\n", "builder_agent = OpenAIAgent.from_tools(\n", "    tools=[system_prompt_tool, get_tools_tool, create_agent_tool],\n", "    prefix_messages=prefix_msgs,\n", "    verbose=True,\n", ")"]}, {"cell_type": "code", "execution_count": null, "id": "7e135c1a-fcd5-40bc-b92a-5c1ad6ad9a50", "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["Added user message to memory: Build an agent that can tell me about Toronto.\n", "=== Calling Function ===\n", "Calling function: create_system_prompt with args: {\n", "  \"task\": \"tell me about Toronto\"\n", "}\n", "Got output: \"Generate a brief summary about Toronto, including its history, culture, landmarks, and 
notable features.\"\n", "========================\n", "\n", "=== Calling Function ===\n", "Calling function: get_tools with args: {\n", "  \"task\": \"tell me about Toronto\"\n", "}\n", "Got output: ['Toronto']\n", "========================\n", "\n", "=== Calling Function ===\n", "Calling function: create_agent with args: {\n", "  \"system_prompt\": \"Generate a brief summary about Toronto, including its history, culture, landmarks, and notable features.\",\n", "  \"tool_names\": [\"Toronto\"]\n", "}\n", "Got output: Agent created successfully.\n", "========================\n", "\n"]}, {"data": {"text/plain": ["Response(response='The agent has been successfully created. It can now provide information about Toronto, including its history, culture, landmarks, and notable features.', source_nodes=[], metadata=None)"]}, "execution_count": null, "metadata": {}, "output_type": "execute_result"}], "source": ["builder_agent.query(\"Build an agent that can tell me about Toronto.\")"]}, {"cell_type": "code", "execution_count": null, "id": "ee65b244-a6f0-447f-a88d-b7cbdfe8a74a", "metadata": {}, "outputs": [], "source": ["city_agent = agent_cache[\"agent\"]"]}, {"cell_type": "code", "execution_count": null, "id": "1c66eea2-21a9-4e3d-a3a2-f4219476903e", "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["Added user message to memory: Tell me about the parks in Toronto\n", "Toronto is known for its beautiful and diverse parks. Here are a few of the most popular ones:\n", "\n", "1. **High Park**: This is Toronto's largest public park featuring many hiking trails, sports facilities, a beautiful lakefront, convenient parking, easy public transit access, a dog park, a zoo, and playgrounds for children. It's also known for its spring cherry blossoms.\n", "\n", "2. **Toronto Islands**: A group of small islands located just off the shore of the city's downtown district, offering stunning views of the city skyline. 
The islands provide a great escape from the city with their car-free environment, picnic spots, swimming beaches, and Centreville Amusement Park.\n", "\n", "3. **Trinity Bellwoods Park**: A popular park in the downtown area, it's a great place for picnics, sports, dog-walking, or just relaxing. It also has a community recreation centre with a pool and gym.\n", "\n", "4. **Rouge National Urban Park**: Located in the city's east end, this is Canada's first national urban park. It offers hiking, swimming, camping, and a chance to learn about the area's cultural and agricultural heritage.\n", "\n", "5. **Riverdale Farm**: This 7.5-acre farm in the heart of Toronto provides an opportunity to experience farm life and interact with a variety of farm animals.\n", "\n", "6. **Evergreen Brick Works**: A former industrial site that has been transformed into an eco-friendly community center with a park, farmers market, and cultural events.\n", "\n", "7. **Scarborough Bluffs Park**: Offers a unique natural environment with stunning views of Lake Ontario from atop the bluffs.\n", "\n", "8. **Edwards Gardens**: A beautiful botanical garden located in North York, perfect for a peaceful walk surrounded by nature.\n", "\n", "9. **Sunnybrook Park**: A large public park that offers many recreational activities including horseback riding, sports fields, and picnic areas.\n", "\n", "10. **Cherry Beach**: Located on the waterfront, this park offers a sandy beach, picnic areas, and a dog off-leash area. 
It's a great spot for swimming, sunbathing, and barbecuing.\n", "\n", "These parks offer a variety of experiences, from urban amenities to natural beauty, making Toronto a great city for outdoor enthusiasts.\n"]}], "source": ["response = city_agent.query(\"Tell me about the parks in Toronto\")\n", "print(str(response))"]}], "metadata": {"kernelspec": {"display_name": ".venv", "language": "python", "name": "python3"}, "language_info": {"codemirror_mode": {"name": "ipython", "version": 3}, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3"}}, "nbformat": 4, "nbformat_minor": 5}