{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction to MLflow and Transformers\n", "\n", "Welcome to our tutorial on leveraging the power of **Transformers** with **MLflow**. This guide is designed for beginners, focusing on machine learning workflows and model management.\n" ] }, { "cell_type": "raw", "metadata": {}, "source": [ "Download this Notebook" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What Will You Learn?\n", "\n", "In this tutorial, you will learn how to:\n", "\n", "- Set up a simple text generation pipeline using the Transformers library.\n", "- Log the model and its parameters using MLflow.\n", "- Infer the input and output signature of the model automatically.\n", "- Simulate serving the model using MLflow and make predictions with it.\n", " \n", "#### Introduction to Transformers\n", "Transformers are a type of deep learning model that have revolutionized natural language processing (NLP). Developed by [🤗 Hugging Face](https://huggingface.co/docs/transformers/index), the Transformers library offers a variety of state-of-the-art pre-trained models for NLP tasks.\n", "\n", "#### Why Combine MLflow with Transformers?\n", "Integrating MLflow with Transformers offers numerous benefits:\n", "\n", "- **Experiment Tracking**: Log and compare model parameters and metrics.\n", "- **Model Management**: Track different model versions and their performance.\n", "- **Reproducibility**: Document all aspects needed to reproduce predictions.\n", "- **Deployment**: Simplify deploying Transformers models to production.\n", "\n", "Now, let's dive into the world of MLflow and Transformers!" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "env: TOKENIZERS_PARALLELISM=false\n" ] } ], "source": [ "# Disable tokenizers warnings when constructing pipelines\n", "%env TOKENIZERS_PARALLELISM=false\n", "\n", "import warnings\n", "\n", "# Disable a few less-than-useful UserWarnings from setuptools and pydantic\n", "warnings.filterwarnings(\"ignore\", category=UserWarning)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Imports and Pipeline configuration\n", "\n", "In this first section, we are setting up our environment and configuring aspects of the transformers pipeline that we'll be using to generate a text response from the LLM. \n", "\n", "#### Setting up our Pipeline\n", "\n", "- **Import**: We import the necessary libraries: transformers for building our NLP model and mlflow for model tracking and management.\n", "- **Task Definition**: We then define the task for our pipeline, which in this case is `text2text-generation`` This task involves generating new text based on the input text.\n", "- **Pipeline Declaration**: Next, we create a generation_pipeline using the `pipeline` function from the Transformers library. 
"- **Input Example**: For the purposes of generating a signature later on, as well as providing a visual indicator of the expected input data for our model when loading it as a `pyfunc`, we next set up an `input_example` that contains sample prompts.\n", "- **Inference Parameters**: Finally, we define parameters that will be used to control the behavior of the model during inference, such as the maximum length of the generated text and whether sampling is used during generation.\n", "\n", "#### Understanding Pipelines\n", "Pipelines are a high-level abstraction provided by the Transformers library that simplifies the process of using models for inference. They encapsulate the complexity of the underlying code, offering a straightforward API for a variety of tasks, such as text classification, question answering, and, in our case, text generation.\n", "\n", "##### The `pipeline()` function\n", "The `pipeline()` function is a versatile tool that can be used to create a pipeline for any supported task. When you specify a task, the function returns a pipeline object tailored for that task, constructing the required calls to sub-components (a tokenizer, encoder, generative model, etc.) in the order needed to fulfill the specified task. This abstraction dramatically simplifies the code required to use these models and their respective components.\n", "\n", "##### Task-Specific Pipelines\n", "In addition to the general `pipeline()` function, there are task-specific pipelines for different domains like audio, computer vision, and natural language processing. These specialized pipelines are optimized for their respective tasks and can provide additional convenience and functionality.\n", "\n", "##### Benefits of Using Pipelines\n", "Using pipelines has several advantages:\n", "\n", "- **Simplicity**: You can perform complex tasks with a minimal amount of code.\n", "- **Flexibility**: You can specify different models and configurations to customize the pipeline for your needs.\n", "- **Efficiency**: Pipelines handle batching and dataset iteration internally, which can lead to performance improvements.\n", "\n", "Due to this utility and its simple, high-level API, MLflow's `transformers` implementation uses the `pipeline` abstraction by default (although it can support component-only mode as well)."
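] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To make the abstraction concrete before we build our own pipeline, the optional cell below sketches the `pipeline()` API in isolation: given only a task name, it selects a default model for that task. This illustration is separate from the model used in this tutorial and is safe to skip." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optional illustration (safe to skip): pipeline() builds a complete inference\n", "# pipeline from just a task name, selecting a default model when none is given.\n", "import transformers\n", "\n", "sentiment = transformers.pipeline(task=\"text-classification\")\n", "print(sentiment(\"MLflow makes managing transformers models much easier!\"))" ] },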
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import transformers\n", "\n", "import mlflow\n", "\n", "# Define the task that we want to use (required for proper pipeline construction)\n", "task = \"text2text-generation\"\n", "\n", "# Define the pipeline, using the task and a model instance that is applicable for our task.\n", "generation_pipeline = transformers.pipeline(\n", " task=task,\n", " model=\"declare-lab/flan-alpaca-large\",\n", ")\n", "\n", "# Define a simple input example that will be recorded with the model in MLflow, giving\n", "# users of the model an indication of the expected input format.\n", "input_example = [\"prompt 1\", \"prompt 2\", \"prompt 3\"]\n", "\n", "# Define the parameters (and their defaults) for optional overrides at inference time.\n", "parameters = {\"max_length\": 512, \"do_sample\": True, \"temperature\": 0.4}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Introduction to Model Signatures in MLflow\n", "\n", "In the realm of machine learning, model signatures play a crucial role in ensuring that models receive and produce the expected data types and structures. MLflow includes a feature for defining model signatures, helping to standardize and enforce correct model usage.\n", "\n", "#### Quick Learning Points\n", "\n", "- **Model Signature Purpose**: Ensures consistent data types and structures for model inputs and outputs.\n", "- **Visibility and Validation**: Visible in MLflow's UI and validated by MLflow's deployment tools.\n", "- **Signature Types**: Column-based for tabular data, tensor-based for tensor inputs/outputs, and with params for models requiring additional inference parameters.\n", "\n", "#### Understanding Model Signatures\n", "A model signature in MLflow describes the schema for inputs, outputs, and parameters of an ML model. It is a blueprint that details the expected data types and shapes, facilitating a clear interface for model usage. Signatures are particularly useful as they are:\n", "\n", "- Displayed in MLflow's UI for easy reference.\n", "- Employed by MLflow's deployment tools to validate inputs during inference.\n", "- Stored in a standardized JSON format alongside the model's metadata.\n", "\n", "#### The Role of Signatures in Code\n", "In the following section, we are using MLflow to infer the signature of a machine learning model. This involves specifying an input example, generating a model output example, and defining any additional inference parameters. The resulting signature is used to validate future inputs and document the expected data formats.\n", "\n", "#### Types of Model Signatures\n", "Model signatures can be:\n", "\n", "- **Column-based**: Suitable for models that operate on tabular data, with each column having a specified data type and optional name.\n", "- **Tensor-based**: Designed for models that take tensors as inputs and outputs, with each tensor having a specified data type, shape, and optional name.\n", "- **With Params**: Some models require additional parameters for inference, which can also be included in the signature.\n", "\n", "For the transformers flavor, all input types are of the Column-based type (referred to within MLflow as `ColSpec` types).\n", "\n", "#### Signature Enforcement\n", "MLflow enforces the signature at the time of model inference, ensuring that the provided input and parameters match the expected schema. 
{ "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "inputs: \n", " [string]\n", "outputs: \n", " [string]\n", "params: \n", " ['max_length': long (default: 512), 'do_sample': boolean (default: True), 'temperature': double (default: 0.4)]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Generate the signature for the model that will be used for inference validation\n", "# and type checking (as well as validation of parameters submitted during inference).\n", "signature = mlflow.models.infer_signature(\n", "    input_example,\n", "    mlflow.transformers.generate_signature_output(generation_pipeline, input_example),\n", "    parameters,\n", ")\n", "\n", "# Visualize the signature\n", "signature" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating an Experiment\n", "\n", "We create a new MLflow Experiment so that the run we use to log our model gets its own contextually relevant entry rather than being logged to the default experiment." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# If you are running this tutorial in local mode, leave the next line commented out.\n", "# Otherwise, uncomment the following line and set your tracking URI to your local or remote tracking server.\n", "\n", "# mlflow.set_tracking_uri(\"http://127.0.0.1:8080\")\n", "\n", "mlflow.set_experiment(\"Transformers Introduction\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Logging the Transformers Model with MLflow\n", "\n", "We log our model with MLflow to manage its lifecycle efficiently and keep track of its versions and configurations.\n", "\n", "Logging a model in MLflow is a crucial step in model lifecycle management, enabling efficient tracking, versioning, and management. The process involves registering the model along with its essential metadata within the MLflow tracking system.\n", "\n", "Utilizing the [mlflow.transformers.log_model](https://www.mlflow.org/docs/latest/python_api/mlflow.transformers.html#mlflow.transformers.log_model) function, specifically tailored for models and components from the `transformers` library, simplifies this task. This function is adept at handling various aspects of the models from this library, including their complex pipelines and configurations.\n", "\n", "When logging the model, crucial metadata such as the model's signature, which was previously established, is included. This metadata plays a significant role in the subsequent stages of the model's lifecycle, from tracking its evolution to facilitating its deployment in different environments. The signature, in particular, ensures the model's compatibility and consistent performance across various platforms, thereby enhancing its utility and reliability in practical applications.\n",
"\n", "By logging our model in this way, we ensure that it is not only well-documented but also primed for future use, whether for further development, comparative analysis, or deployment.\n", "\n", "#### A Tip for Saving Storage Cost\n", "\n", "When you call [mlflow.transformers.log_model](https://www.mlflow.org/docs/latest/python_api/mlflow.transformers.html#mlflow.transformers.log_model), MLflow saves a full copy of the Transformers model weights. However, this can consume significant storage space (about 3GB for the `flan-alpaca-large` model) and may be redundant when you don't modify the weights, because they are exactly the same as the ones you can download from the Hugging Face model hub.\n", "\n", "To avoid this unnecessary copy, you can use the 'reference-only' save mode introduced in MLflow 2.11.0 by setting `save_pretrained=False` when logging or saving the Transformers model. This tells MLflow not to save a copy of the base model weights, but only a reference to the Hugging Face Hub repository and version, which is more storage-efficient and faster. For more details about this feature, please refer to [Storage-Efficient Model Logging](https://www.mlflow.org/docs/latest/llms/transformers/guide/index.html#storage-efficient-model-logging-with-save-pretrained-option)." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "with mlflow.start_run():\n", "    model_info = mlflow.transformers.log_model(\n", "        transformers_model=generation_pipeline,\n", "        artifact_path=\"text_generator\",\n", "        input_example=input_example,\n", "        signature=signature,\n", "        # Transformers pipelines do not take a Pandas DataFrame as input, so the\n", "        # internal input type conversion of the example should be skipped.\n", "        example_no_conversion=True,\n", "        # Uncomment the following line to save the model in 'reference-only' mode:\n", "        # save_pretrained=False,\n", "    )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Loading the Text Generation Model\n", "\n", "We initialize our text generation model using MLflow's pyfunc module for seamless model loading and interaction.\n", "\n", "The `pyfunc` module in MLflow serves as a generic wrapper for Python functions. It facilitates the loading of machine learning models as standard Python functions, which is especially advantageous for models logged or registered via MLflow, streamlining interaction with the model regardless of its training or serialization specifics.\n", "\n", "Utilizing [mlflow.pyfunc.load_model](https://www.mlflow.org/docs/latest/python_api/mlflow.pyfunc.html#mlflow.pyfunc.load_model), our previously logged text generation model is loaded using its unique model URI, a reference to the stored model artifacts. MLflow efficiently handles the model's deserialization, along with any associated dependencies, preparing it for immediate use.\n", "\n", "Once the model, referred to as `sentence_generator`, is loaded, it operates as a conventional Python function, allowing for the generation of text based on given prompts. The model encapsulates the complete process of inference, eliminating the need for manual input preprocessing or output postprocessing. This encapsulation not only simplifies model interaction but also ensures the model's adaptability for deployment across various platforms." ] },
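{ "cell_type": "markdown", "metadata": {}, "source": [ "For reference, the end-to-end load-and-predict pattern is sketched below; the prompt and the `temperature` override are illustrative, and `params` passed at inference time are validated against the signature's params schema defined earlier. The next cell performs the actual load." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Sketch of the load-and-predict pattern (prompt and override value are\n", "# illustrative): params declared in the signature can be overridden per call.\n", "sentence_generator = mlflow.pyfunc.load_model(model_info.model_uri)\n", "\n", "print(\n", "    sentence_generator.predict(\n", "        [\"Tell me a story about a magical forest.\"],\n", "        params={\"temperature\": 0.7},\n", "    )\n", ")" ] },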
{ "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "60ad7546d18a44789207819b8f0889ad", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Loading checkpoint shards: 0%| | 0/7 [00:00