{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Creating a Custom Model: \"Add N\" Model\n",
    "Our first example is simple yet illustrative. We'll create a model that adds a specified numeric value, n, to all columns of a Pandas DataFrame input. This will demonstrate the process of defining a custom model, saving it, loading it back, and performing predictions."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Step 1: Define the Model Class\n",
    "We begin by defining a Python class for our model. This class should inherit from mlflow.pyfunc.PythonModel and implement the necessary methods."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import mlflow.pyfunc\n",
    "\n",
    "\n",
    "class AddN(mlflow.pyfunc.PythonModel):\n",
    "    \"\"\"\n",
    "    A custom model that adds a specified value `n` to all columns of the input DataFrame.\n",
    "\n",
    "    Attributes:\n",
    "    -----------\n",
    "    n : int\n",
    "        The value to add to input columns.\n",
    "    \"\"\"\n",
    "\n",
    "    def __init__(self, n):\n",
    "        \"\"\"\n",
    "        Constructor method. Initializes the model with the specified value `n`.\n",
    "\n",
    "        Parameters:\n",
    "        -----------\n",
    "        n : int\n",
    "            The value to add to input columns.\n",
    "        \"\"\"\n",
    "        self.n = n\n",
    "\n",
    "    def predict(self, context, model_input, params=None):\n",
    "        \"\"\"\n",
    "        Prediction method for the custom model.\n",
    "\n",
    "        Parameters:\n",
    "        -----------\n",
    "        context : Any\n",
    "            Ignored in this example. It's a placeholder for additional data or utility methods.\n",
    "\n",
    "        model_input : pd.DataFrame\n",
    "            The input DataFrame to which `n` should be added.\n",
    "\n",
    "        params : dict, optional\n",
    "            Additional prediction parameters. Ignored in this example.\n",
    "\n",
    "        Returns:\n",
    "        --------\n",
    "        pd.DataFrame\n",
    "            The input DataFrame with `n` added to all columns.\n",
    "        \"\"\"\n",
    "        return model_input.apply(lambda column: column + self.n)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Step 2: Save the Model\n",
    "Now that our model class is defined, we can instantiate it and save it using MLflow."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/benjamin.wilson/miniconda3/envs/mlflow-dev-env/lib/python3.8/site-packages/_distutils_hack/__init__.py:30: UserWarning: Setuptools is replacing distutils.\n",
      "  warnings.warn(\"Setuptools is replacing distutils.\")\n"
     ]
    }
   ],
   "source": [
    "# Define the path to save the model\n",
    "model_path = \"/tmp/add_n_model\"\n",
    "\n",
    "# Create an instance of the model with `n=5`\n",
    "add5_model = AddN(n=5)\n",
    "\n",
    "# Save the model using MLflow\n",
    "mlflow.pyfunc.save_model(path=model_path, python_model=add5_model)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Step 3: Load the Model\n",
    "With our model saved, we can load it back using MLflow and then use it for predictions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load the saved model\n",
    "loaded_model = mlflow.pyfunc.load_model(model_path)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Step 4: Evaluate the Model\n",
    "Let's now use our loaded model to perform predictions on a sample input and verify its correctness."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Define a sample input DataFrame\n",
    "model_input = pd.DataFrame([range(10)])\n",
    "\n",
    "# Use the loaded model to make predictions\n",
    "model_output = loaded_model.predict(model_input)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>6</th>\n",
       "      <th>7</th>\n",
       "      <th>8</th>\n",
       "      <th>9</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>5</td>\n",
       "      <td>6</td>\n",
       "      <td>7</td>\n",
       "      <td>8</td>\n",
       "      <td>9</td>\n",
       "      <td>10</td>\n",
       "      <td>11</td>\n",
       "      <td>12</td>\n",
       "      <td>13</td>\n",
       "      <td>14</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   0  1  2  3  4   5   6   7   8   9\n",
       "0  5  6  7  8  9  10  11  12  13  14"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model_output"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Conclusion\n",
    "This simple example demonstrates the power and flexibility of MLflow's custom pyfunc. By encapsulating arbitrary Python code and its dependencies, custom pyfunc models ensure a consistent and unified interface for a wide range of use cases. Whether you're working with a niche machine learning framework, need custom preprocessing steps, or want to integrate unique prediction logic, pyfunc is the tool for the job."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "mlflow-dev-env",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}