{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating a Custom Model: \"Add N\" Model\n", "Our first example is simple yet illustrative. We'll create a model that adds a specified numeric value, `n`, to every column of a pandas DataFrame input. This demonstrates the full workflow: defining a custom model, saving it, loading it back, and running predictions." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Step 1: Define the Model Class\n", "We begin by defining a Python class for our model. The class must inherit from `mlflow.pyfunc.PythonModel` and implement its `predict` method." ] },
{ "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import mlflow.pyfunc\n", "\n", "\n", "class AddN(mlflow.pyfunc.PythonModel):\n", "    \"\"\"\n", "    A custom model that adds a specified value `n` to all columns of the input DataFrame.\n", "\n", "    Attributes:\n", "    -----------\n", "    n : int\n", "        The value to add to input columns.\n", "    \"\"\"\n", "\n", "    def __init__(self, n):\n", "        \"\"\"\n", "        Constructor method. Initializes the model with the specified value `n`.\n", "\n", "        Parameters:\n", "        -----------\n", "        n : int\n", "            The value to add to input columns.\n", "        \"\"\"\n", "        self.n = n\n", "\n", "    def predict(self, context, model_input, params=None):\n", "        \"\"\"\n", "        Prediction method for the custom model.\n", "\n", "        Parameters:\n", "        -----------\n", "        context : Any\n", "            Ignored in this example. It's a placeholder for additional data or utility methods.\n", "\n", "        model_input : pd.DataFrame\n", "            The input DataFrame to which `n` should be added.\n", "\n", "        params : dict, optional\n", "            Additional prediction parameters. Ignored in this example.\n", "\n", "        Returns:\n", "        --------\n", "        pd.DataFrame\n", "            The input DataFrame with `n` added to all columns.\n", "        \"\"\"\n", "        return model_input.apply(lambda column: column + self.n)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "#### Step 2: Save the Model\n", "Now that the model class is defined, we can instantiate it and save it with MLflow." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/benjamin.wilson/miniconda3/envs/mlflow-dev-env/lib/python3.8/site-packages/_distutils_hack/__init__.py:30: UserWarning: Setuptools is replacing distutils.\n", "  warnings.warn(\"Setuptools is replacing distutils.\")\n" ] } ], "source": [ "# Define the path to save the model\n", "model_path = \"/tmp/add_n_model\"\n", "\n", "# Create an instance of the model with `n=5`\n", "add5_model = AddN(n=5)\n", "\n", "# Save the model using MLflow\n", "mlflow.pyfunc.save_model(path=model_path, python_model=add5_model)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "#### Step 3: Load the Model\n", "With the model saved, we can load it back with MLflow and use it for predictions." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Load the saved model\n", "loaded_model = mlflow.pyfunc.load_model(model_path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Step 4: Evaluate the Model\n", "Let's now use the loaded model to run predictions on a sample input and verify the result. With `n=5`, each value in the output should be 5 greater than the corresponding input value." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "\n", "# Define a sample input DataFrame\n", "model_input = pd.DataFrame([range(10)])\n", "\n", "# Use the loaded model to make predictions\n", "model_output = loaded_model.predict(model_input)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>0</th>\n", "      <th>1</th>\n", "      <th>2</th>\n", "      <th>3</th>\n", "      <th>4</th>\n", "      <th>5</th>\n", "      <th>6</th>\n", "      <th>7</th>\n", "      <th>8</th>\n", "      <th>9</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr>\n", "      <th>0</th>\n", "      <td>5</td>\n", "      <td>6</td>\n", "      <td>7</td>\n", "      <td>8</td>\n", "      <td>9</td>\n", "      <td>10</td>\n", "      <td>11</td>\n", "      <td>12</td>\n", "      <td>13</td>\n", "      <td>14</td>\n", "    </tr>\n", "  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ "   0  1  2  3  4   5   6   7   8   9\n", "0  5  6  7  8  9  10  11  12  13  14" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model_output" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Conclusion\n", "This simple example walks through the full lifecycle of a custom pyfunc model: define a `PythonModel` subclass, save it, load it back, and call `predict`. Because a custom pyfunc model wraps arbitrary Python code and its dependencies behind the same `predict` interface, the pattern carries over to a wide range of use cases, whether you're working with a niche machine learning framework, need custom preprocessing steps, or want to integrate your own prediction logic." ] } ], "metadata": { "kernelspec": { "display_name": "mlflow-dev-env", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.13" } }, "nbformat": 4, "nbformat_minor": 2 }