{"cells": [{"cell_type": "markdown", "id": "f717d3d4-942b-4d86-9435-fc44b3ac6d39", "metadata": {}, "source": ["# Hugging Face LLMs\n", "\n", "There are many ways to interface with LLMs from [Hugging Face](https://huggingface.co/).\n", "Hugging Face itself provides several Python packages to enable access,\n", "which LlamaIndex wraps into `LLM` entities:\n", "\n", "- The [`transformers`](https://github.com/huggingface/transformers) package:\n", "  use `llama_index.llms.HuggingFaceLLM`\n", "- The [Hugging Face Inference API](https://huggingface.co/inference-api),\n", "  [wrapped by `huggingface_hub[inference]`](https://github.com/huggingface/huggingface_hub):\n", "  use `llama_index.llms.HuggingFaceInferenceAPI`\n", "\n", "There are many possible permutations of these two, so this notebook only details a few.\n", "Let's use Hugging Face's [Text Generation task](https://huggingface.co/tasks/text-generation) as our example.\n"]}, {"cell_type": "markdown", "id": "90cf0f2e-8d8d-4e42-81bf-866c759221e1", "metadata": {}, "source": ["In the below line, we install the packages necessary for this demo:\n", "\n", "- `transformers[torch]` is needed for `HuggingFaceLLM`\n", "- `huggingface_hub[inference]` is needed for `HuggingFaceInferenceAPI`\n", "- The quotes are needed for Z shell (`zsh`)\n"]}, {"cell_type": "code", "execution_count": null, "id": "f413f179", "metadata": {}, "outputs": [], "source": ["%pip install llama-index-llms-huggingface"]}, {"cell_type": "code", "execution_count": null, "id": "3b04b4a5-6fce-4188-a538-9a5ce2fa56f6", "metadata": {}, "outputs": [], "source": ["!pip install \"transformers[torch]\" \"huggingface_hub[inference]\""]}, {"cell_type": "markdown", "id": "3dac8f9f-7136-43f7-9e9f-de679e74d66e", "metadata": {}, "source": ["Now that we're set up, let's play around:\n"]}, {"attachments": {}, "cell_type": "markdown", "id": "2c577674", "metadata": {}, "source": ["If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.\n"]}, {"cell_type": "code", "execution_count": null, "id": "86028752", "metadata": {}, "outputs": [], "source": ["!pip install llama-index"]}, {"cell_type": "code", "execution_count": null, "id": "0465029c-fe69-454a-9561-55f7a382b2e2", "metadata": {}, "outputs": [], 
"source": ["import os\n", "from typing import Optional\n", "\n", "from llama_index.llms.huggingface import (\n", "    HuggingFaceInferenceAPI,\n", "    HuggingFaceLLM,\n", ")\n", "\n", "# SEE: https://huggingface.co/docs/hub/security-tokens\n", "# We just need a token with read permissions for this demo\n", "HF_TOKEN: Optional[str] = os.getenv(\"HUGGING_FACE_TOKEN\")\n", "# NOTE: the None default will fall back on Hugging Face's token storage\n", "# when this token gets used within HuggingFaceInferenceAPI\n"]}, {"cell_type": "code", "execution_count": null, "id": "a27feba3-d027-4d10-b1af-1e130e764a67", "metadata": {}, "outputs": [], "source": ["# This uses https://huggingface.co/HuggingFaceH4/zephyr-7b-alpha,\n", "# downloaded (if first invocation) to the local Hugging Face model cache,\n", "# and it actually runs the model on your local machine's hardware\n", "locally_run = HuggingFaceLLM(model_name=\"HuggingFaceH4/zephyr-7b-alpha\")\n", "\n", "# This will use the same model, but run remotely on Hugging Face's servers,\n", "# accessed via the Hugging Face Inference API\n", "# Note that using your token will not charge you money:\n", "# the Inference API is free, it just has rate limits\n", "remotely_run = HuggingFaceInferenceAPI(\n", "    model_name=\"HuggingFaceH4/zephyr-7b-alpha\", token=HF_TOKEN\n", ")\n", "\n", "# Or you can skip providing a token, using the Hugging Face Inference API anonymously\n", "remotely_run_anon = HuggingFaceInferenceAPI(\n", "    model_name=\"HuggingFaceH4/zephyr-7b-alpha\"\n", ")\n", "\n", "# If you don't provide a model_name to the HuggingFaceInferenceAPI,\n", "# Hugging Face's recommended model gets used (thanks to huggingface_hub)\n", "remotely_run_recommended = HuggingFaceInferenceAPI(token=HF_TOKEN)\n"]}, {"cell_type": "markdown", "id": "b801bef7-2593-49e2-a550-721e6b796486", "metadata": {}, "source": ["Underlying a completion with `HuggingFaceInferenceAPI` is Hugging Face's [Text Generation task](https://huggingface.co/tasks/text-generation).\n"]}, {"cell_type": "code", "execution_count": null, "id": "631269c9-38ca-49d2-a7f0-f88e21adef6e", "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": [" beyond!\n", "The Infinity Wall Clock is a unique and stylish way to keep track of time. The clock is made of a durable, high-quality plastic and features a bright LED display. The Infinity Wall Clock is powered by batteries and can be mounted on any wall. 
It is a great addition to any home or office.\n"]}], "source": ["completion_response = remotely_run_recommended.complete(\"To infinity, and\")\n", "print(completion_response)"]}, {"cell_type": "markdown", "id": "dda1be10", "metadata": {}, "source": ["If you are modifying the LLM, you should also change the global tokenizer to match!\n"]}, {"cell_type": "code", "execution_count": null, "id": "12e0f3c0", "metadata": {}, "outputs": [], "source": ["from llama_index.core import set_global_tokenizer\n", "from transformers import AutoTokenizer\n", "\n", "set_global_tokenizer(\n", "    AutoTokenizer.from_pretrained(\"HuggingFaceH4/zephyr-7b-alpha\").encode\n", ")"]}, {"cell_type": "markdown", "id": "3fa723d6-4308-4d94-9609-8c51ce8184c3", "metadata": {}, "source": ["If you're curious, other Hugging Face Inference API tasks wrapped are:\n", "\n", "- `llama_index.llms.HuggingFaceInferenceAPI.chat`: [Conversational task](https://huggingface.co/tasks/conversational)\n", "- `llama_index.embeddings.HuggingFaceInferenceAPIEmbedding`: [Feature Extraction task](https://huggingface.co/tasks/feature-extraction)\n", "\n", "And yes, Hugging Face embedding models are supported with:\n", "\n", "- `transformers[torch]`: wrapped by `HuggingFaceEmbedding`\n", "- `huggingface_hub[inference]`: wrapped by `HuggingFaceInferenceAPIEmbedding`\n", "\n", "Both of the above are subclasses of `llama_index.embeddings.base.BaseEmbedding`.\n"]}, {"cell_type": "markdown", "id": "92c09b9f", "metadata": {}, "source": ["### Using Hugging Face `text-generation-inference`\n"]}, {"cell_type": "markdown", "id": "752520ec", "metadata": {}, "source": ["The new `TextGenerationInference` class allows you to interface with endpoints running [`text-generation-inference` (TGI)](https://huggingface.co/docs/text-generation-inference/index). In addition to blazingly fast inference, it supports `tool` usage starting from version `2.0.1`.\n"]}, {"cell_type": "markdown", "id": "055ddcb1", "metadata": {}, "source": ["To initialize an instance of `TextGenerationInference`, you need to provide the endpoint URL, either of a self-hosted instance of TGI or of a public Inference Endpoint created on Hugging Face. In the case of a private Inference Endpoint, you must also provide your HF token, either as an initialization argument or as an environment variable.\n"]}, {"cell_type": "code", "execution_count": null, "id": "c02f350f", "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": [" beyond! 
This phrase is a reference to the famous line from the movie \"Toy Story\" when Buzz Lightyear, a toy astronaut, exclaims \"To infinity and beyond!\" as he soars through space. It has since become a catchphrase for reaching for the stars and striving for greatness. However, if you meant to ask a mathematical question, \"To infinity\" refers to a very large, infinite number, and \"and beyond\" could be interpreted as continuing infinitely in a certain direction. For example, \"2 to the power of infinity\" would represent a very large, infinite number.\n"]}], "source": ["from llama_index.llms.huggingface import TextGenerationInference\n", "\n", "# The URL of your TGI endpoint (self-hosted, or an Inference Endpoint on Hugging Face)\n", "URL = \"your_tgi_endpoint\"\n", "model = TextGenerationInference(\n", "    model_url=URL, token=False\n", ")  # set token to False in case of a public endpoint\n", "\n", "completion_response = model.complete(\"To infinity, and\")\n", "print(completion_response)"]}, {"cell_type": "markdown", "id": "e9270b99", "metadata": {}, "source": ["To use tools with `TextGenerationInference`, you may use an already existing tool or define your own:\n"]}, {"cell_type": "code", "execution_count": null, "id": "90a041cc", "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["{'tool_calls': [{'id': 0, 'type': 'function', 'function': {'description': None, 'name': 'get_current_weather_n_days', 'arguments': {'format': 'celsius', 'location': 'Paris, Ile-de-France', 'num_days': 7}}}]}\n"]}], "source": ["from typing import Literal\n", "from llama_index.core.bridge.pydantic import BaseModel, Field\n", "from llama_index.core.tools import FunctionTool\n", "from llama_index.core.base.llms.types import (\n", "    ChatMessage,\n", "    MessageRole,\n", ")\n", "\n", "\n", "def get_current_weather(location: str, format: str):\n", "    \"\"\"Get the current weather\n", "\n", "    Args:\n", "        location (str): The city and state, e.g. San Francisco, CA\n", "        format (str): The temperature unit to use ('celsius' or 'fahrenheit'). Infer this from the user's location.\n", "    \"\"\"\n", "    ...\n", "\n", "\n", "class WeatherArgs(BaseModel):\n", "    location: str = Field(\n", "        description=\"The city and region, e.g. Paris, Ile-de-France\"\n", "    )\n", "    format: Literal[\"fahrenheit\", \"celsius\"] = Field(\n", "        description=\"The temperature unit to use ('fahrenheit' or 'celsius'). Infer this from the location.\",\n", "    )\n", "\n", "\n", "weather_tool = FunctionTool.from_defaults(\n", "    fn=get_current_weather,\n", "    name=\"get_current_weather\",\n", "    description=\"Get the current weather\",\n", "    fn_schema=WeatherArgs,\n", ")\n", "\n", "\n", "def get_current_weather_n_days(location: str, format: str, num_days: int):\n", "    \"\"\"Get the weather forecast for the next N days\n", "\n", "    Args:\n", "        location (str): The city and state, e.g. San Francisco, CA\n", "        format (str): The temperature unit to use ('celsius' or 'fahrenheit'). Infer this from the user's location.\n", "        num_days (int): The number of days for the forecast.\n", "    \"\"\"\n", "    ...\n", "\n", "\n", "class ForecastArgs(BaseModel):\n", "    location: str = Field(\n", "        description=\"The city and region, e.g. Paris, Ile-de-France\"\n", "    )\n", "    format: Literal[\"fahrenheit\", \"celsius\"] = Field(\n", "        description=\"The temperature unit to use ('fahrenheit' or 'celsius'). Infer this from the location.\",\n", "    )\n", "    num_days: int = Field(\n", "        description=\"The duration of the forecast, in days.\",\n", "    )\n", "\n", "\n", "forecast_tool = FunctionTool.from_defaults(\n", "    fn=get_current_weather_n_days,\n", "    name=\"get_current_weather_n_days\",\n", "    description=\"Get the weather forecast for the next N days\",\n", "    fn_schema=ForecastArgs,\n", ")\n", "\n", "usr_msg = ChatMessage(\n", "    role=MessageRole.USER,\n", "    content=\"What's the weather like in Paris over the next week?\",\n", ")\n", "\n", "response = model.chat_with_tools(\n", "    user_msg=usr_msg,\n", "    tools=[\n", "        weather_tool,\n", "        forecast_tool,\n", "    ],\n", "    tool_choice=\"get_current_weather_n_days\",\n", ")\n", "\n", "print(response.message.additional_kwargs)"]}], "metadata": {"kernelspec": {"display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3"}, "language_info": {"codemirror_mode": {"name": "ipython", "version": 3}, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3"}}, "nbformat": 4, "nbformat_minor": 5}