{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# ArbitraryNumberImputer\n", "\n", "\n", "The `ArbitraryNumberImputer` replaces missing values (NA) with an arbitrary number. It works with numerical variables only, and the user must specify the number to use.\n", "\n", "**For this demonstration, we use the Ames House Prices dataset produced by Professor Dean De Cock:**\n", "\n", "[Dean De Cock (2011) Ames, Iowa: Alternative to the Boston Housing\n", "Data as an End of Semester Regression Project, Journal of Statistics Education, Vol. 19, No. 3](http://jse.amstat.org/v19n3/decock.pdf)\n", "\n", "The version of the dataset used in this notebook can be obtained from [Kaggle](https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Version" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'1.2.0'" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Make sure you are using this\n", "# Feature-engine version.\n", "\n", "import feature_engine\n", "\n", "feature_engine.__version__" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "\n", "from sklearn.model_selection import train_test_split\n", "\n", "from feature_engine.imputation import ArbitraryNumberImputer" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load data" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>Id</th><th>MSSubClass</th><th>MSZoning</th><th>LotFrontage</th><th>LotArea</th><th>Street</th><th>Alley</th><th>LotShape</th><th>LandContour</th><th>Utilities</th><th>...</th><th>PoolArea</th><th>PoolQC</th><th>Fence</th><th>MiscFeature</th><th>MiscVal</th><th>MoSold</th><th>YrSold</th><th>SaleType</th><th>SaleCondition</th><th>SalePrice</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>1</td><td>60</td><td>RL</td><td>65.0</td><td>8450</td><td>Pave</td><td>NaN</td><td>Reg</td><td>Lvl</td><td>AllPub</td><td>...</td><td>0</td><td>NaN</td><td>NaN</td><td>NaN</td><td>0</td><td>2</td><td>2008</td><td>WD</td><td>Normal</td><td>208500</td></tr>\n",
"<tr><th>1</th><td>2</td><td>20</td><td>RL</td><td>80.0</td><td>9600</td><td>Pave</td><td>NaN</td><td>Reg</td><td>Lvl</td><td>AllPub</td><td>...</td><td>0</td><td>NaN</td><td>NaN</td><td>NaN</td><td>0</td><td>5</td><td>2007</td><td>WD</td><td>Normal</td><td>181500</td></tr>\n",
"<tr><th>2</th><td>3</td><td>60</td><td>RL</td><td>68.0</td><td>11250</td><td>Pave</td><td>NaN</td><td>IR1</td><td>Lvl</td><td>AllPub</td><td>...</td><td>0</td><td>NaN</td><td>NaN</td><td>NaN</td><td>0</td><td>9</td><td>2008</td><td>WD</td><td>Normal</td><td>223500</td></tr>\n",
"<tr><th>3</th><td>4</td><td>70</td><td>RL</td><td>60.0</td><td>9550</td><td>Pave</td><td>NaN</td><td>IR1</td><td>Lvl</td><td>AllPub</td><td>...</td><td>0</td><td>NaN</td><td>NaN</td><td>NaN</td><td>0</td><td>2</td><td>2006</td><td>WD</td><td>Abnorml</td><td>140000</td></tr>\n",
"<tr><th>4</th><td>5</td><td>60</td><td>RL</td><td>84.0</td><td>14260</td><td>Pave</td><td>NaN</td><td>IR1</td><td>Lvl</td><td>AllPub</td><td>...</td><td>0</td><td>NaN</td><td>NaN</td><td>NaN</td><td>0</td><td>12</td><td>2008</td><td>WD</td><td>Normal</td><td>250000</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>5 rows × 81 columns</p>
\n", "
" ], "text/plain": [ " Id MSSubClass MSZoning LotFrontage LotArea Street Alley LotShape \\\n", "0 1 60 RL 65.0 8450 Pave NaN Reg \n", "1 2 20 RL 80.0 9600 Pave NaN Reg \n", "2 3 60 RL 68.0 11250 Pave NaN IR1 \n", "3 4 70 RL 60.0 9550 Pave NaN IR1 \n", "4 5 60 RL 84.0 14260 Pave NaN IR1 \n", "\n", " LandContour Utilities ... PoolArea PoolQC Fence MiscFeature MiscVal MoSold \\\n", "0 Lvl AllPub ... 0 NaN NaN NaN 0 2 \n", "1 Lvl AllPub ... 0 NaN NaN NaN 0 5 \n", "2 Lvl AllPub ... 0 NaN NaN NaN 0 9 \n", "3 Lvl AllPub ... 0 NaN NaN NaN 0 2 \n", "4 Lvl AllPub ... 0 NaN NaN NaN 0 12 \n", "\n", " YrSold SaleType SaleCondition SalePrice \n", "0 2008 WD Normal 208500 \n", "1 2007 WD Normal 181500 \n", "2 2008 WD Normal 223500 \n", "3 2006 WD Abnorml 140000 \n", "4 2008 WD Normal 250000 \n", "\n", "[5 rows x 81 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Download the data from Kaggle and store it\n", "# in the same folder as this notebook.\n", "\n", "data = pd.read_csv('houseprice.csv')\n", "\n", "data.head()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "((1022, 79), (438, 79))" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Separate the data into train and test sets.\n", "\n", "X_train, X_test, y_train, y_test = train_test_split(\n", "    data.drop(['Id', 'SalePrice'], axis=1),\n", "    data['SalePrice'],\n", "    test_size=0.3,\n", "    random_state=0,\n", ")\n", "\n", "X_train.shape, X_test.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Impute variables with the same number\n", "\n", "We will impute 2 numerical variables with the number -999."
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "LotFrontage 0.184932\n", "MasVnrArea 0.004892\n", "dtype: float64" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Check the fraction of missing data\n", "\n", "X_train[['LotFrontage', 'MasVnrArea']].isnull().mean()" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "ArbitraryNumberImputer(arbitrary_number=-999,\n", " variables=['LotFrontage', 'MasVnrArea'])" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Let's create an instance of the imputer where we impute\n", "# 2 variables with the same arbitrary number.\n", "\n", "imputer = ArbitraryNumberImputer(\n", "    arbitrary_number=-999,\n", "    variables=['LotFrontage', 'MasVnrArea'],\n", ")\n", "\n", "imputer.fit(X_train)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-999" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The number to use in the imputation\n", "# is stored as a parameter.\n", "\n", "imputer.arbitrary_number" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'LotFrontage': -999, 'MasVnrArea': -999}" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The imputer will use the same value to impute\n", "# all indicated variables.\n", "\n", "imputer.imputer_dict_" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "LotFrontage -999.0\n", "MasVnrArea -999.0\n", "dtype: float64" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Impute variables\n", "\n", "train_t = imputer.transform(X_train)\n", "test_t = imputer.transform(X_test)\n", "\n", "# Sanity check: the min
value is the one used for \n", "# the imputation\n", "\n", "train_t[['LotFrontage', 'MasVnrArea']].min()" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZYAAAD4CAYAAADPccAIAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/Il7ecAAAACXBIWXMAAAsTAAALEwEAmpwYAAAyWElEQVR4nO3de3zU1Z3/8dc7kwQiCiJGRcEGlaooghIV66W2tkirFV2xorbi1mr9VdZLt1asLlqtW22tularq5WK1BUtWkzVXe/VuvVCULyAUgLV5aLc5SK3zMzn98f3O8kwmUkmyUxC8v08H495ZOZ8z/fMObnMJ+fyPV+ZGc4551yhlHR2BZxzznUvHlicc84VlAcW55xzBeWBxTnnXEF5YHHOOVdQpZ1dgc606667WlVVVWdXwznnupRZs2atNLPKXMcjHViqqqqora3t7Go451yXIunj5o77UJhzzrmC8sDinHOuoDywOOecK6hIz7E457qG+vp6Fi9ezObNmzu7KpHSs2dPBgwYQFlZWavO88DinNvuLV68mJ122omqqiokdXZ1IsHMWLVqFYsXL2bQoEGtOteHwpxz273NmzfTr18/DyodSBL9+vVrUy+xqIFF0mhJ8yTVSZqY5XgPSY+Ex9+QVBWmf13SLEnvhV+/mnbOiDC9TtIdCn/TJO0i6TlJ88OvfYvZNudcx/Kg0vHa+j0vWmCRFAPuAr4BDAHOkjQkI9v5wBoz2w+4Dbg5TF8JfMvMhgLjgalp59wNXAAMDh+jw/SJwAtmNhh4IXztnOtAyaTxX2/8H2s31nd2VVwnKmaP5QigzswWmtlWYBowJiPPGGBK+Hw6cIIkmdnbZrY0TJ8DVIS9m/5AbzN73YIbyTwInJqlrClp6c65DvL2ojX89E/vcfMzH3Z2VQpuxx13zDvvAw88wNKlSxteH3/88ey///4MHz6c4cOHM3369HbXZ8aMGcydO7fd5RRDMQPLXsCitNeLw7SsecwsDqwF+mXkOR14y8y2hPkX5yhzdzP7JHz+KbB7tkpJulBSraTaFStWtK5FzrlmrdywFYC/f7q+k2vSuTIDC8BDDz3E7NmzmT17NmPHjt3mWCKRaPV7RDWwtJukgwiGx37QmvPC3kzWW2Oa2b1mVm1m1ZWVObe6cc61web64ANyzcatnVyTjjF79mxGjhzJIYccwmmnncaaNWuYPn06tbW1nHPOOQwfPpxNmzZlPbeqqoorr7ySww47jD/+8Y88/PDDDB06lIMPPpgrr7yyId+OO+7I1VdfzbBhwxg5ciTLli3jb3/7GzU1NVxxxRUMHz6cBQsWcN9993H44YczbNgwTj/9dDZu3AjAggULGDlyJEOHDuWaa67Zpuf1q1/9isMPP5xDDjmEa6+9tmDfl2IuN14CDEx7PSBMy5ZnsaRSoA+wCkDSAOBPwLlmtiAt/4AcZS6T1N/MPgmHzJYXsjHOuZalAsvaTfGivcfP/jyHuUvXFbTMIXv25tpvHdTq884991x+85vf8OUvf5lJkybxs5/9jNtvv50777yTW265herq6oa855xzDhUVFQC88MILAPTr14+33nqLpUuXMnLkSGbNmkXfvn0ZNWoUM2bM4NRTT+Xzzz9n5MiR3HjjjfzkJz/hvvvu45prruGUU07h5JNPbuj97LzzzlxwwQ
UAXHPNNdx///38y7/8C5deeimXXnopZ511Fvfcc09DfZ599lnmz5/Pm2++iZlxyimn8Morr3Dccce1+fuYUswey0xgsKRBksqBcUBNRp4agsl5gLHAi2ZmknYGngImmtn/pjKHQ13rJI0MV4OdCzyRpazxaenOuQ6yaWsQWNZt6v6T92vXruWzzz7jy1/+MgDjx4/nlVdeyZk/fSisX79gxP/MM88EYObMmRx//PFUVlZSWlrKOeec01BWeXk5J598MgAjRozgo48+ylr++++/z7HHHsvQoUN56KGHmDNnDgCvvfYaZ5xxBgBnn312Q/5nn32WZ599lkMPPZTDDjuMDz/8kPnz57fjO9KoaD0WM4tLmgA8A8SAyWY2R9L1QK2Z1QD3A1Ml1QGrCYIPwARgP2CSpElh2igzWw78EHgAqAD+O3wA3AQ8Kul84GPg28Vqm3Muu031SQC2JpLEE0lKY4X/37UtPYvtVa9evVrMU1ZW1rDsNxaLEY9n7w2ed955zJgxg2HDhvHAAw/wl7/8pdlyzYyrrrqKH/ygVTMNeSnqHIuZPW1mXzSzfc3sxjBtUhhUMLPNZnaGme1nZkeY2cIw/edm1svMhqc9lofHas3s4LDMCeF8Cma2ysxOMLPBZvY1M1tdzLY555pKDYVB40R+d9WnTx/69u3LX//6VwCmTp3a0HvZaaedWL8+/wUMRxxxBC+//DIrV64kkUjw8MMPN5SVS+Z7rF+/nv79+1NfX89DDz3UkD5y5Egee+wxAKZNm9aQfuKJJzJ58mQ2bNgAwJIlS1i+vDAzCL6li3OuYNIDy5LPNrJHn56dWJvC2rhxIwMGNE7x/uhHP2LKlClcdNFFbNy4kX322Yff//73QNB7uOiii6ioqOC1115rsez+/ftz00038ZWvfAUz46STTmLMmMyrM7Y1btw4LrjgAu644w6mT5/ODTfcwJFHHkllZSVHHnlkQ9C5/fbb+c53vsONN97I6NGj6dOnDwCjRo3igw8+4KijjgKCRQJ/+MMf2G233dr0/Umn8B/+SKqurja/0ZdzhTPpifd58LXgHlDXnHQg3z92n4KU+8EHH3DggQcWpKyo2bhxIxUVFUhi2rRpPPzwwzzxRP5T0Nm+95JmmVl1jlO8x+KcK5xNWxPs0bsnZaXirf9b09nVccCsWbOYMGECZsbOO+/M5MmTi/6eHliccwVTn0jSo6yEPftUsHzdls6ujgOOPfZY3nnnnQ59z+36AknnXNcSTxqxEtG3V1lkLpJ0TXlgcc4VTCJplJaIPhXlRb1I0m3fPLA45wom6LGUUFEW22aFmIsWDyzOuYJJ9Vh6lpV4YIkwDyzOuYJJzbFUlMWIJ436RLKzq1Qwvm1+/nxVmHOuYBLJJLES0bMsBgQXTJYVYVuX7d0DDzzAwQcfzJ577tmQ9tBDD22zKWW6RCJBLBZr1XvMmDGDk08+mSFDMu+f2Pmi9xN3zhVNPGFhYAk+WjbXd58eSza+bX523mNxzhVMImmUl5bQI63HUnCXXQazZxe2zOHD4fbbW32ab5ufnfdYnHMFk5pj6VnMwLKd8G3zc/Mei3OuYJIWrAorD+dVthZj8r4NPYvtlW+b75xzLQjmWEooi6nhdXfl2+bnVtTAImm0pHmS6iRNzHK8h6RHwuNvSKoK0/tJeknSBkl3puXfSdLstMdKSbeHx86TtCLt2PeL2TbnXFOp61hSN/jqTsuNU9vmpx633norU6ZM4YorruCQQw5h9uzZTJoU3JcwtW1+c5P36dK3zR82bBgjRozIa9v8X/3qVxx66KEsWLCgYdv8o48+mgMOOKAh3+23386tt97KIYccQl1d3Tbb5p999tkcddRRDB06lLFjx7YqGDanaNvmS4oBfwe+DiwmuFXxWWY2Ny3PD4FDzOwiSeOA08zsTEm9gEOBg4GDzWxCjveYBVxuZq9IOg+ozpU3G98237nCOuHXf+GA/r0558i9Ofu+N3j4gpEctW
+/dpfr2+a3XXfbNv8IoC51V0hJ04AxQPoVPWOA68Ln04E7JcnMPgdelbRfrsIlfRHYDfhrEerunGuDVI8lde1KPNl9eixdVXfbNn8vYFHa68XAkbnymFlc0lqgH7Ayj/LHAY/Ytl2u0yUdR9BTutzMFmWeJOlC4EKAvffeO8+mOOfykVoVVlrS/edYugrfNr91xgEPp73+M1BlZocAzwFTsp1kZveaWbWZVVdWVnZANZ2LjsweSyHnWKJ8t9vO0tbveTEDyxJgYNrrAWFa1jySSoE+wKqWCpY0DCg1s1mpNDNbZWapOwv9DhjR9qo759oitbtxaWpVWLIwwaBnz56sWrXKg0sHMjNWrVpFz549W31uMYfCZgKDJQ0iCCDjgLMz8tQA44HXgLHAi5bfb85ZbNtbQVJ/M/skfHkK8EE76u6ca4OGVWElhe2xDBgwgMWLF7NixYqClOfy07NnTwYMGNDq84oWWMI5kwnAM0AMmGxmcyRdD9SaWQ1wPzBVUh2wmiD4ACDpI6A3UC7pVGBU2oqybwPfzHjLSySdAsTDss4rVtucc9nFE8EmlOUNQ2GF6WGUlZUxaNCggpTliq+oV96b2dPA0xlpk9KebwbOyHFuVTPl7pMl7SrgqrbW1TnXfo3XsaQm731VWBR15cl759x2Jp40YrHGwFJfoDkW17V4YHHOFUzDqrBwjsV7LNHkgcU5VxBm1nRVmF/HEkkeWJxzBZEa9drmOha/8j6SPLA45woitX2LX3nvPLA45woiEXZZYiUiViKk7rW7scufBxbnXEGkrrIvLRFSMIFfqOtYXNfigcU5VxCJRGOPBaA0Jl8VFlEeWJxzBZHeY0l9LdReYa5r8cDinCuIpKV6LMHHSlmsxOdYIsoDi3OuIJr0WGJqmNB30eKBxTlXEE3mWHzyPrI8sDjnCiJ1HUvqqvtYiUj4BZKR5IHFOVcQ6dexQLgqzIfCIskDi3OuILKuCvOhsEjywOKcK4jGHkvwsVJaUuI9logqamCRNFrSPEl1kiZmOd5D0iPh8TckVYXp/SS9JGmDpDszzvlLWObs8LFbc2U55zpGtlVhcZ9jiaSiBRZJMeAu4BvAEOAsSUMysp0PrDGz/YDbgJvD9M3AvwE/zlH8OWY2PHwsb6Es51wHSKRtQglBgPHlxtFUzB7LEUCdmS00s63ANGBMRp4xwJTw+XTgBEkys8/N7FWCAJOvrGW1vfrOudZIzaeUbrPc2HssUVTMwLIXsCjt9eIwLWseM4sDa4F+eZT9+3AY7N/SgkdeZUm6UFKtpNoVK1a0pj3OuWZkWxXmPZZo6oqT9+eY2VDg2PDx3dacbGb3mlm1mVVXVlYWpYLORVHDHEvadSx+gWQ0FTOwLAEGpr0eEKZlzSOpFOgDrGquUDNbEn5dD/wXwZBbm8pyzhVO5qqwsliJ91giqpiBZSYwWNIgSeXAOKAmI08NMD58PhZ40cxy/iZKKpW0a/i8DDgZeL8tZTnnCitzVVjQY/E5ligqLVbBZhaXNAF4BogBk81sjqTrgVozqwHuB6ZKqgNWEwQfACR9BPQGyiWdCowCPgaeCYNKDHgeuC88JWdZzrniy1wVVuZzLJFVtMACYGZPA09npE1Ke74ZOCPHuVU5ih2RI3/Ospxzxde0x+JDYVHVFSfvnXPboSarwkpEvV8gGUkeWJxzBRFvsm2+GrbSd9HigcU5VxDZrmOp96GwSPLA4pwriMY5lsZNKH2OJZo8sDjnCiJh2/ZYfLlxdHlgcc4VRCIMIqW+3DjyPLA45woiNRRWkrbc2G/0FU0eWJxzBZHqnZTFGnssfj+WaPLA4pwriHiy6RxL0iDpw2GR44HFOVcQiYxVYWWx4Kvfnjh6PLA45wqiYY4lvENSqufiE/jR44HFOVcQyaQRKxGpe++lVof5ti7R44HFOVcQ8TCwpKQCi2/rEj0eWJxzBZFIJhuCCUAsnGPxHkv0eGBxzhVEZo+lzOdYIssDi3OuIB
IZgSX13C+SjJ6iBhZJoyXNk1QnaWKW4z0kPRIef0NSVZjeT9JLkjZIujMt/w6SnpL0oaQ5km5KO3aepBWSZoeP7xezbc65bSWSts1QmC83jq6iBRZJMeAu4BvAEOAsSUMysp0PrDGz/YDbgJvD9M3AvwE/zlL0LWZ2AHAocLSkb6Qde8TMhoeP3xWwOc65FuTusfgcS9QUs8dyBFBnZgvNbCswDRiTkWcMMCV8Ph04QZLM7HMze5UgwDQws41m9lL4fCvwFjCgiG1wzuUpnrSGiyOhcWsX77FETzEDy17AorTXi8O0rHnMLA6sBfrlU7iknYFvAS+kJZ8u6V1J0yUNzHHehZJqJdWuWLEir4Y451rWtMdS0pDuoqVLTt5LKgUeBu4ws4Vh8p+BKjM7BHiOxp7QNszsXjOrNrPqysrKjqmwcxHQ5DqWsMfi92SJnmIGliVAeq9hQJiWNU8YLPoAq/Io+15gvpndnkows1VmtiV8+TtgRNuq7Zxri0Qymf0CSe+xRE4xA8tMYLCkQZLKgXFATUaeGmB8+Hws8KKZNftbKOnnBAHosoz0/mkvTwE+aHvVnXOtlbkqLDXfUu/LjSOntFgFm1lc0gTgGSAGTDazOZKuB2rNrAa4H5gqqQ5YTRB8AJD0EdAbKJd0KjAKWAdcDXwIvBXuSXRnuALsEkmnAPGwrPOK1TbnXFOZcyypoTDvsURP0QILgJk9DTydkTYp7flm4Iwc51blKFbZEs3sKuCqNlXUOddu8SY9Ft+EMqq65OS9c277k0haw22JoXEozDehjB4PLM65gogntu2xNFwg6T2WyMkrsEh6XNJJkjwQOeeySljGJpR+gWRk5RsofgucDcyXdJOk/YtYJ+dcF5TIuPLe7yAZXXkFFjN73szOAQ4DPgKel/Q3Sf8sqayYFXTOdQ1Nts2P+XLjqMp7aEtSP4IlvN8H3gb+gyDQPFeUmjnnupTMCyQbeyw+xxI1eS03lvQnYH9gKvAtM/skPPSIpNpiVc4513XEE7m2dPEeS9Tkex3LfeE1KQ0k9TCzLWZWXYR6Oee6mKRlv/Le51iiJ9+hsJ9nSXutkBVxznVtvgmlS2m2xyJpD4Kt7SskHUrjVe+9gR2KXDfnXBfSdK8wXxUWVS0NhZ1IMGE/ALg1LX098NMi1ck51wXFE9mvvPfrWKKn2cBiZlOAKZJON7PHOqhOzrkuKFePJe6T95HT0lDYd8zsD0CVpB9lHjezW7Oc5pyLoODK+8Zp25ISUSLf0iWKWhoK6xV+3bHYFXHOdW2ZPRYIhsN8KCx6WhoK+8/w6886pjrOua4qntj2AkkILpL0yfvoyXcTyl9K6i2pTNILklZI+k4e542WNE9SnaSJWY73kPRIePwNSVVhej9JL0naIOnOjHNGSHovPOcOhXf7krSLpOckzQ+/9s3rO+CcK4jMG31BsOTYlxtHT77XsYwys3XAyQR7he0HXNHcCZJiwF3AN4AhwFmShmRkOx9YY2b7AbcBN4fpm4F/A36cpei7gQuAweFjdJg+EXjBzAYDL4SvnXMdJPNGXxBM4HuPJXryDSypIbOTgD+a2do8zjkCqDOzhWa2FZgGjMnIMwaYEj6fDpwgSWb2uZm9ShBgGoT3te9tZq+bmQEPAqdmKWtKWrpzrgMkLVuPpcS3dImgfAPLk5I+BEYAL0iqJONDP4u9gEVprxeHaVnzmFkcWAv0a6HMxTnK3D1tD7NPgd2zFSDpQkm1kmpXrFjRQhOcc/nK3WPxobCoyXfb/InAl4BqM6sHPqdp72O7EfZmsv6bZGb3mlm1mVVXVlZ2cM2c656SScOMbZYbQzDH4texRE++m1ACHEBwPUv6OQ82k38JMDDt9YAwLVuexWG5fYBVLZQ5IEeZyyT1N7NPwiGz5c2U45wroNSS4ljGv6q+3Dia8l0VNhW4BTgGODx8tLSr8UxgsKRBksqBcUBNRp4aYHz4fCzwYtjbyCoc6lonaWS4Guxc4I
ksZY1PS3fOFVmiIbBk9FhK5BdIRlC+PZZqYEhzH/qZzCwuaQLwDBADJpvZHEnXA7VmVgPcD0yVVAesJgg+AEj6iGCzy3JJpxKsTJsL/BB4AKgA/jt8ANwEPCrpfOBj4Nv51tU51z6J8KMhc44lVuJDYVGUb2B5H9gD+KSljOnCe7g8nZE2Ke35ZuCMHOdW5UivBQ7Okr4KOKE19XPOFUYikeqxbBtYymIlvtw4gvINLLsCcyW9CWxJJZrZKUWplXOuS0kNd6XuwZISKxH1HlgiJ9/Acl0xK+Gc69pSvZISZfZYfLlxFOUVWMzsZUlfAAab2fOSdiCYN3HOuYaVX9nmWPwCyejJd1XYBQRXxv9nmLQXMKNIdXLOdTGNq8Ka7m7scyzRk++V9xcDRwPrAMxsPrBbsSrlnOtaUsEjc44luEDSh8KiJt/AsiXc7wuA8GJG/zfEOQekXyCZ7ToW/6iImnwDy8uSfgpUSPo68Efgz8WrlnOuK2kYClOWG335HEvk5BtYJgIrgPeAHxBcm3JNsSrlnOtaUsuNm9zoK+ZX3kdRvqvCkpJmADPMzLcEds5tI9UrKcuYYynzobBIarbHosB1klYC84B54d0jJzV3nnMuWhovkNz2IyXmQ2GR1NJQ2OUEq8EON7NdzGwX4EjgaEmXF712zrkuoT5XjyXmd5CMopYCy3eBs8zsH6kEM1sIfIdgZ2HnnGu4r31Zkx6Lz7FEUUuBpczMVmYmhvMsZcWpknOuq0kNd2VeeV8W8/uxRFFLgWVrG4855yKk2R6Lz7FETkurwoZJWpclXUDPItTHOdcFNc6x+I2+XAuBxcx8o0nnXItybZvv97yPpnwvkGwTSaMlzZNUJ2liluM9JD0SHn9DUlXasavC9HmSTgzT9pc0O+2xTtJl4bHrJC1JO/bNYrbNOdeoocdSkmW5cdJoxc1nXTeQ7/1YWk1SDLgL+DqwGJgpqSa8vXDK+cAaM9tP0jjgZuBMSUMIblN8ELAn8LykL5rZPGB4WvlLgD+llXebmd1SrDY557JrmGMpbXqBJARbvmT2Zlz3VcweyxFAnZktDDewnAaMycgzBpgSPp8OnCBJYfo0M9sSLnWuC8tLdwKwwMw+LloLnHN5Se1gXJrZYwmDia8Mi5ZiBpa9gEVprxeHaVnzmFkcWAv0y/PcccDDGWkTJL0rabKkvtkqJelCSbWSales8N1pnCuEnBdIhoHGL5KMlqLOsRSLpHLgFIJdllPuBvYlGCr7BPh1tnPN7F4zqzaz6srKymJX1blISE3eZ1tuDPgEfsQUM7AsAQamvR4QpmXNE97jpQ+wKo9zvwG8ZWbLUglmtszMEmaWBO6j6dCZc65IUj2WzHmUsoahMF9yHCXFDCwzgcGSBoU9jHFATUaeGmB8+Hws8KIFy0dqgHHhqrFBwGDgzbTzziJjGExS/7SXpwHvF6wlzrlmNUzeZ1kVBj7HEjVFWxVmZnFJE4BngBgw2czmSLoeqDWzGuB+YKqkOmA1QfAhzPcoMBeIAxebWQJAUi+ClWY/yHjLX0oaTnBny4+yHHfOFUk8YcRKREnmPe998j6SihZYAMzsaYKbgqWnTUp7vhk4I8e5NwI3Zkn/nGCCPzP9u+2tr3OubeoTySb7hEHj3mF+3/to6ZKT98657Ut9wppM3EPjZH69B5ZI8cDinGu3eDKZ9QLI8tLgI2ZL3ANLlHhgcc61W30imbXHkgosWz2wRIoHFudcu9UnrGH7lnQ9Yh5YosgDi3Ou3eKJZJP73UNaj8XnWCLFA4tzrt3qE9k3mfShsGjywOKca7ct8SQ9SpvevqnMh8IiyQOLc67dtsQT9Cj1oTAX8MDinGu3oMeSJbB4jyWSPLA459ptSzxJj7KmQ2E9vMcSSR5YnHPttqW+haEw77FEigcW51y7bc01FOaBJZI8sDjn2i3XqjCfY4kmDyzOuXbbEk/Qo6zpx0lprIQS+RxL1Hhgcc6125b67ENhEFzL4j
2WaClqYJE0WtI8SXWSJmY53kPSI+HxNyRVpR27KkyfJ+nEtPSPJL0nabak2rT0XSQ9J2l++LVvMdvmnGuUaygMgnkW77FES9ECi6QYcBfB/emHAGdJGpKR7XxgjZntB9wG3ByeO4TgbpIHAaOB34blpXzFzIabWXVa2kTgBTMbDLwQvnbOFVkyaWxN5O6x9Cj1HkvUFLPHcgRQZ2YLzWwrMA0Yk5FnDDAlfD4dOEGSwvRpZrbFzP4B1IXlNSe9rCnAqe1vgnOuJaneSLY5Fggm8D2wREsxA8tewKK014vDtKx5zCwOrCW47XBz5xrwrKRZki5My7O7mX0SPv8U2L0QjXDONW9LfRA0yrPsbgw+FBZFRb3nfZEcY2ZLJO0GPCfpQzN7JT2DmZkky3ZyGIwuBNh7772LX1vnurmN9XEAevXI/nFS7kNhkVPMHssSYGDa6wFhWtY8kkqBPsCq5s41s9TX5cCfaBwiWyapf1hWf2B5tkqZ2b1mVm1m1ZWVlW1unHMu8PmWBAA7lDczee+BJVKKGVhmAoMlDZJUTjAZX5ORpwYYHz4fC7xoZhamjwtXjQ0CBgNvSuolaScASb2AUcD7WcoaDzxRpHY559Js3Br2WMqz91jKYj4UFjVFGwozs7ikCcAzQAyYbGZzJF0P1JpZDXA/MFVSHbCaIPgQ5nsUmAvEgYvNLCFpd+BPwfw+pcB/mdn/hG95E/CopPOBj4FvF6ttzrlGG7eGPZYeOXossRK2eI8lUoo6x2JmTwNPZ6RNSnu+GTgjx7k3AjdmpC0EhuXIvwo4oZ1Vds61Uks9lvLSEjZsiXdklVwn8yvvnXPtkppj6ZWjx+LXsUSPBxbnXLukeiw7NNNj8cASLR5YnHPt0tBjyRVYfPI+cjywOOfaJdVjqWhmuXHqIkoXDR5YnHPtsmFLgrKYGm7qlamiLMbmeKKDa+U6kwcW51y7rN9cT++eZTmPV5SXNixJdtHggcU51y6fbaqnzw7NBJayGFvjSRLJrLssuW7IA4tzrl3WbqynT0VzPZbgY2ZTvfdaosIDi3OuXdZuqmfnZgNLsFpskw+HRYYHFudcu3y2aWvzPZayYLXYZu+xRIYHFudcu3zWwlBYatdjn8CPDg8szrk227Q1wfrNcXbr3TNnnlSPxedYosMDi3OuzZau3QTAnjunBRaz4BGqaOix+EaUUeGBxTnXZp98thmA/r17wmOPwejRsMsuUF4O++4Ll11G70+D+/v5HEt0dMVbEzvnthNLP9tE/3UrGP7d0+C1V+ELX4Bx46BvX5gzB+6+mwPuuYeLRp7FxrOGd3Z1XQfxwOKca7Otr7/BEw/+iB6xBNx3H3zve1CSNhCyaBGbL76EiX9+gEU/XgFPPR70Zly3VtShMEmjJc2TVCdpYpbjPSQ9Eh5/Q1JV2rGrwvR5kk4M0wZKeknSXElzJF2alv86SUskzQ4f3yxm25yLvLfe4vQr/5l4eQ/0t7/B97+/bVABGDiQjQ89zL8f/88MfP4pOPNM2Lq1c+rrOkzRAoukGHAX8A1gCHCWpCEZ2c4H1pjZfsBtwM3huUMIblN8EDAa+G1YXhz4VzMbAowELs4o8zYzGx4+trlzpXOugOrq4MQT2VDRixt/cjccdFDOrDv0KOXeI0/n1UsmwYwZcPHF20zuu+6nmD2WI4A6M1toZluBacCYjDxjgCnh8+nACQpuaD8GmGZmW8zsH0AdcISZfWJmbwGY2XrgA2CvIrbBOZdpwwY47TRIJrn0e7+ktOoLzWbvWRqsCnvz5HPg6qvhd7+D//iPjqip6yTFDCx7AYvSXi+maRBoyGNmcWAt0C+fc8Nhs0OBN9KSJ0h6V9JkSX2zVUrShZJqJdWuWLGi1Y1yLtLMgiGvuXNh2jTm7rRHszsbA5SUiJ16lrJuUz1cf30QlP71X+GVVzqo0q6jdcnlxpJ2BB4DLjOzdWHy3cC+wHDgE+DX2c41s3vNrNrMqisrKzuius51H/
ffD488AjfeiH3ta6zfHGenni2vAepTURYElpISePBB2GcfOOccWLOmAyrtOloxA8sSYGDa6wFhWtY8kkqBPsCq5s6VVEYQVB4ys8dTGcxsmZklzCwJ3EcwFOecK5SFC+Hyy+GrX4Wf/IRN9QkSSWOnFnosADvvUMZnm+qDFzvuCA8/DMuWwQUX+HxLN1TMwDITGCxpkKRygsn4mow8NcD48PlY4EUzszB9XLhqbBAwGHgznH+5H/jAzG5NL0hS/7SXpwHvF7xFzkVVIgHjxwc9jt//HkpKWL85uJK+d0V+PZa1qcACUF0NN94YXFR5//3FqrXrJEULLOGcyQTgGYJJ9kfNbI6k6yWdEma7H+gnqQ74ETAxPHcO8CgwF/gf4GIzSwBHA98FvpplWfEvJb0n6V3gK8DlxWqbc5Fz663w6qvwm9/A3nsDwZ0jgbx6LE0CCwTzLCecAJddFqwyc91GUS+QDJf8Pp2RNint+WbgjBzn3gjcmJH2KqAc+b/b3vo657J491245hr4p3+C7zb+ma3dFPRY8ptjKeezjRmBpaQEHngAhg6F73wnCFylfs12d9AlJ++dcx1ky5YgmPTtC/fcA2r8vy7VY+mdR2DZeYcy1m7aimXOpwwYEJT7xhvw7/9e0Kq7zuOBxTmX23XXBT2W++6DjFWUDXMseQyF7dG7J/UJY9XnWa66P/PMYIXY9dfDm28Wotauk3lgcc5l9+qrcPPNwXUr3/pWk8OpwJLPHMsefYJt9VO7ITdx552w557BkNjnn7e9zm674IHFOdfU+vVw7rlQVRVM3GexrmHyvuWhsD37VACN929pYuedg+tb6urgxz9uS43ddsQDi3Ouqcsvh48/hqlTYaedsmZZv7meWIkabj3cnP7hjcA+XZujxwJw/PFBULnnHnjyybbU2m0nPLA457Y1Y0ZwbcmVV8LRR+fMlrrqXsq6UHMbu+xQTo/SEhat3th8xhtugGHD4PzzYfnyVlbcbS88sDjnGi1YAOedByNGBBP3zch3OxcI9gvbt3JH6lZsaD5jjx7whz/A2rV+VX4X5oHFORfYtAnGjg2uL5k+vcUbcq3bVJ/XirCUwbvvSN3yFgILwMEHwy9+ATU1cNtteZfvth8eWJxzQc/gwgth9uxgXqWqqsVT1m1uXWDZr3JHFq/ZxMat8ZYzX3opnH46XHEFPO23VupqPLA45+Daa4MhqBtugJNOyuuU1gyFQdBjAViwPI/lxCUlMGVKMN8ybhy8917e7+M6nwcW56LurruCgHL++cGNuPK0blM9vSta0WPZLVhdNm/Z+vxO6NULnngiWJX2ta/BvHl5v5frXL4xjyusNWuCq6dnz4Y5c+DTT2HlymCopbQUdtst2MbjoIPg8MNh+HCoqOjsWkfXb34Dl1wCp5wCd9+9zZYtLVm3Od6qobBBu/Zixx6lvP1/axg7YkB+Jw0cCC+8AF/+crBh5XPPwYEH5v2ernN4YHHtt2xZsET1scfgpZcgHo6hDxgAe+0VXFFdUgL19fDJJ8G+UPfeG+Tp0SO4fuGkk+Dkk2HQoM5qRbQkk8Hw189/DqeeGty8qyz/IJFIGhu2tG4oLFYiDt17Z2Z93Mqbex1wADz/PHz968Hy5xkz4LjjWleG61AeWFzbLFoEjz8eBJNXXw16JIMHBxe4jRoV9ET6Zr07dGDpUpg5E15+GZ56Kviv+ZJLgjH1f/qn4HHQQa36D9rladWqYJuWGTOC4a/f/rbFFWCZUlvg77xD/sEIYOQ+/fjVM/NYvGYjA/rukP+JQ4fCa6/BN78ZDIv94hfBRZwlPpq/XTKzyD5GjBhhrhXmzze76Sazww83C0KJ2cEHm117rdm775olk+0r+9e/NjvmGDMpKHu//cx+8hOz1183SyQK1ozISibNHn/cbI89zEpLzW67rc0/szlL1toXrnzSnnp3aavOW7T6cxs08Uk79/43bPGaja1/49WrzU47Lfj9+MpXzN5/v/VluH
YDaq2Zz1YP9y63RCL4L/GnPw2uLRg8GCZODELKL34RTKa+915wId3Qoe3rXey3H/zoR/DXvwa9mXvuCe6LfuutMHJkMNY+YQK8+GLjUJvLTzIZzE0ce2zQE9xtN6itDW6w1caf2bL1wdYsu/fu0arzBvTdgatPGsJrC1dx6l3/y8oNW1r3xn37Br3ke++Ft98Oerjf/z7Mndu6clxxNRd12vsARgPzgDpgYpbjPYBHwuNvAFVpx64K0+cBJ7ZUJjAoLKMuLLO8pfp5jyXDxo1B7+DXvw7+K6ysDP4zjMWC/w5vu83so486tk6rV5tNnRrUp6IiqM8uu5iNH2/2u9+ZvfOOWX19x9apK0gkzGprza6/3mzffYPvW//+ZvfcY7Z1a7uL/8PrH9kXrnzSFq3+vE3nv7f4M9vvp0/ZZdPeto9Xfm6Pzvw/W7Z2U+sKWbnS7JJLzHr2DNp3zDFmt98e9H7b03t2LaKFHouCPIUnKQb8Hfg6sBiYCZxlZnPT8vwQOMTMLpI0DjjNzM6UNAR4GDgC2BN4HvhieFrWMiU9CjxuZtMk3QO8Y2Z3N1fH6upqq62tLWCrt0NmsHkzbNgQbEe+YUOwSmvp0mAifelSmD8/+I9v4cLGLTT22QeOOQZGjw4ezc2XdJSNG+GZZ4K5nSefhM8+C9IrKoIJ3n33DR5VVcF/5bvuGtxDpG/fIE9FRTBB3dXnbcyCG3Bt3Bj8TJctC36OS5bA3/8O77wTPFavDvIfdxz84AdBb6Vnz4JU4arH3+XJdz/hnUmjKClp2/fz1mfncceLjbck7t2zlP8461C+sv9urSpnzrsL+PSWOzn89WfoPf+DILGyEo48EvbfP+gN77039OvX+OjVq3v8LnQSSbPMrDrn8SIGlqOA68zsxPD1VQBm9ou0PM+EeV6TVAp8ClQCE9PzpvKFpzUpE7gJWAHsYWbxzPfOpc2BZfJkuOWW1CxDkJbv89bkbe95iUTw4ZNM5m5LRUXwYTxkSLCMc+hQ+NKXoH//1n9fOlIyGWyxXlsbPD78MNjn6h//CFaf5VJSEny4VlQEy59LSoIPl5KSbR/paSnZfh7pX4txLD1PPB78PDduzL2HVkVFMGw5bFiwRHfUqCDIZjHqtpdJJA0zSJqRDL82vg7SLO1YMsy/fkuckw/pz51nH5a9HnnYXJ/g3Mlv0rtnGd87uoobnvqADz5Zx8BdKiiPlWBh8xvqkmx8nkgG9YonrWEhAcDwTcs5dun7HPzRHL64+O/suXIJPeJZbi4GJCXisVK2lpZTX1pOPBajzw7llJfGgp9/6pH6fcj16KquvTa4yVobtBRYirkqbC9gUdrrxcCRufKEAWEt0C9Mfz3j3L3C59nK7Ad8ZmbxLPm3IelC4EKAvffeu3UtStl11+CPNyiw8Zcr3+cddZ4EO+4Y/HeW+tqrV1D/PfcMgkfv3l3zj6OkBL74xeBx9tmN6YlE8B/8ihVBz2zFiuDamk2bgsfmzY3PE4kgQCWT4SdYctuHWZAn/fuT7eeR/rUYx1JfYzHYYYemj912a1zWvfvuQb48DN49uGCxRKJEwVcp83Xj8xKBwjw7lMc470vtWxresyzGoz84quH14//vS/z+b/9g3qfriScMBLGMusRKmj4f2HcHvn34QGo/Ws3T733KP+LDWBAGXiWT9Fmzgp3XLGfH9WvpteEzem1YS/nWzZTW11Ma30ppfT1l9VuIJeIc1L835T1LG/9BSz1Svw+Zj66siKMQkVtubGb3AvdC0GNpUyGnnBI83PYnFgs+YPfcs7Nrst27qx29jWKoKI/xw+P3a/P5Jxy4OyccuHsBa+TaqpirwpYAA9NeDwjTsuYJh8L6AKuaOTdX+ipg57CMXO/lnHOuAxQzsMwEBksaJKkcGAfUZOSpAcaHz8cCL4YrDmqAcZJ6SBoEDAbezFVmeM5LYRmEZT5RxLY555zLoWhDYeGcyQTgGSAGTDazOZKuJ1iqVgPcD0yVVA
esJggUhPkeBeYCceBiM0sAZCszfMsrgWmSfg68HZbtnHOugxVtVVhXEInlxs45V2AtrQrzK++dc84VlAcW55xzBeWBxTnnXEF5YHHOOVdQkZ68l7QC+LiTq7ErsLKT69ARvJ3di7eze2ltO79gZpW5DkY6sGwPJNU2t7qiu/B2di/ezu6l0O30oTDnnHMF5YHFOedcQXlg6Xz3dnYFOoi3s3vxdnYvBW2nz7E455wrKO+xOOecKygPLM455wrKA0uRSTpD0hxJSUnVaelVkjZJmh0+7kk7NkLSe5LqJN0hBbcQlLSLpOckzQ+/bgc3og/kamd47KqwLfMknZiWPjpMq5M0MS19kKQ3wvRHwlskbHckXSdpSdrP8Jtpx1rV5q6kO7QhnaSPwr+32ZJqw7Ssf2sK3BG2/V1J29fd0tJImixpuaT309Ja3S5J48P88yWNz/ZeTZiZP4r4AA4E9gf+AlSnpVcB7+c4501gJCDgv4FvhOm/BCaGzycCN3d2+/Jo5xDgHaAHMAhYQHDLg1j4fB+gPMwzJDznUWBc+Pwe4P91dvtytPk64MdZ0lvd5q7y6A5tyNKmj4BdM9Ky/q0B3wz/JhX+jb7R2fVvpl3HAYelf860tl3ALsDC8Gvf8Hnflt7beyxFZmYfmNm8fPNL6g/0NrPXLfjJPgicGh4eA0wJn09JS+90zbRzDDDNzLaY2T+AOuCI8FFnZgvNbCswDRgT9s6+CkwPz9+u2pmnVrW5E+vZFt2hDfnI9bc2BnjQAq8T3Lm2fyfUr0Vm9grBfa7StbZdJwLPmdlqM1sDPAeMbum9PbB0rkGS3pb0sqRjw7S9gMVpeRaHaQC7m9kn4fNPga5wg++9gEVpr1PtyZXeD/jMzOIZ6durCeHQweS0ocnWtrkr6Q5tyGTAs5JmSbowTMv1t9bV29/adrWpvUW7g2SUSHoe2CPLoavNLNctkj8B9jazVZJGADMkHZTve5qZSerQteJtbGeX1lybgbuBGwg+mG4Afg18r+Nq5wrkGDNbImk34DlJH6Yf7Iy/tY5QzHZ5YCkAM/taG87ZAmwJn8+StAD4IrAEGJCWdUCYBrBMUn8z+yTspi5vX81bXedWt5Og7gPTXqe3J1v6KoJueGnYa0nP3+HybbOk+4Anw5etbXNX0lzbuiQzWxJ+XS7pTwTDfbn+1rp6+1vbriXA8Rnpf2npTXworJNIqpQUC5/vAwwGFobd1HWSRobzDecCqd5ADZBalTE+LX17VgOMk9RD0iCCdr4JzAQGhyvAyoFxQE04r/QSMDY8f7ttZ8bY+mlAavVNq9rckXUugO7QhgaSeknaKfUcGEXwc8z1t1YDnBuuohoJrE0bWuoKWtuuZ4BRkvqGQ72jwrTmdfbKhe7+IPjAWUzQO1kGPBOmnw7MAWYDbwHfSjunmuCXewFwJ407JPQDXgDmA88Du3R2+1pqZ3js6rAt8whXuIXp3wT+Hh67Oi19H4IP4jrgj0CPzm5fjjZPBd4D3g3/MPu3tc1d6dEd2pDWln0IVra9E/49Xh2mZ/1bI1g1dVfY9vdIWwG5vT2AhwmG3OvDv83z29IuguHduvDxz/m8t2/p4pxzrqB8KMw551xBeWBxzjlXUB5YnHPOFZQHFueccwXlgcU551xBeWBxzjlXUB5YnHPOFdT/B1x1JMHMLZyyAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# The distribution of the variable\n", "# changed with the transformation.\n", "\n", "fig = plt.figure()\n", "ax = fig.add_subplot(111)\n", "X_train['LotFrontage'].plot(kind='kde', ax=ax)\n", "train_t['LotFrontage'].plot(kind='kde', ax=ax, color='red')\n", "lines, labels = ax.get_legend_handles_labels()\n", "ax.legend(lines, labels, loc='best')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Impute variables with different numbers\n", "\n", "We can also impute different variables with different values. In this case, we need to start the transformer with a dictionary of variable to value pairs." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "ArbitraryNumberImputer(imputer_dict={'LotFrontage': -678, 'MasVnrArea': -789})" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Impute different variables with different values\n", "\n", "imputer = ArbitraryNumberImputer(\n", " imputer_dict={\"LotFrontage\": -678, \"MasVnrArea\": -789}\n", ")\n", "\n", "imputer.fit(X_train)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'LotFrontage': -678, 'MasVnrArea': -789}" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# In this case, the imputer_dict_ matches the \n", "# entered dictionary.\n", "\n", "imputer.imputer_dict_" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "LotFrontage -678.0\n", "MasVnrArea -789.0\n", "dtype: float64" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Now we impute the missing data\n", "\n", "train_t = imputer.transform(X_train)\n", "test_t = imputer.transform(X_test)\n", "\n", "# Sanity check: check minimum values\n", 
"\n", "train_t[['LotFrontage', 'MasVnrArea']].min()" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZUAAAD4CAYAAAAkRnsLAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/Il7ecAAAACXBIWXMAAAsTAAALEwEAmpwYAAA2TUlEQVR4nO3deZhU1Zn48e/b1Sur2DSIoDYIkhAFlFZxEjXGiJgg6KgJaqJOHI0TmVEzLhANUaOJRqOMUWM0EtE4oJJIOokZ92y/GLUxuAASGkTZRGh2ml6q6v39cU41t4tbSzddvdDv53nq6Vvnnnvq1C2ot85yzxVVxRhjjGkLeR1dAWOMMfsPCyrGGGPajAUVY4wxbcaCijHGmDZjQcUYY0ybye/oCnSk/v37a3l5eUdXwxhjupSFCxduUtWysH3dOqiUl5dTVVXV0dUwxpguRUQ+TLXPur+MMca0GQsqxhhj2owFFWOMMW2mW4+pGGO6hsbGRtasWUNdXV1HV6VbKS4uZsiQIRQUFGR9jAUVY0ynt2bNGnr37k15eTki0tHV6RZUlZqaGtasWcPQoUOzPs66v4wxnV5dXR2lpaUWUNqRiFBaWtri1mFOg4qITBSRZSJSLSLTQ/YXichTfv/rIlLu008TkYUi8q7/+4XAMeN8erWI3Cf+X5mIHCgiL4rIcv+3Xy7fmzGmfVlAaX+tOec5CyoiEgEeAM4ARgHni8iopGyXAltUdThwL3CnT98EnKmqRwEXA08EjvkpcBkwwj8m+vTpwMuqOgJ42T83xrSR9dt283/vfdzR1TCdXC5bKscB1aq6UlUbgHnAlKQ8U4A5fns+cKqIiKr+Q1XX+fTFQIlv1QwC+qjq39XdCOZx4KyQsuYE0o0xbeD6+e9wxS8X8sn27jlY3qtXr6zzPvbYY6xbt67p+ec//3lGjhzJ2LFjGTt2LPPnz9/n+ixYsIAlS5bsczltLZdBZTCwOvB8jU8LzaOqUWAbUJqU5xzgLVWt9/nXpChzoKqu99sfAwPDKiUil4tIlYhUbdy4sWXvyJhu7K/VmwBYsn57B9ek80sOKgBPPvkkixYtYtGiRZx77rnN9sVisRa/RncMKvtMRD6D6xL7ZkuO862Y0FtaqurDqlqhqhVlZaFL1xhjQpT2LARg1aZdHVyTzmPRokWMHz+e0aNHc/bZZ7Nlyxbmz59PVVUVF154IWPHjmX37t2hx5aXl3PDDTdwzDHH8MwzzzB37lyOOuoojjzySG644YamfL169eLGG29kzJgxjB8/ng0bNvC3v/2NyspKrrvuOsaOHcuKFSt45JFHOPbYYxkzZgznnHMOtbW1AKxYsYLx48dz1FFHcdNNNzVrcd11110ce+yxjB49mu9973ttck5yOaV4LXBI4PkQnxaWZ42I5AN9gRoAERkCPAtcpKorAvmHpChzg4gMUtX1vpvsk7Z8M8Z0d/l57jfoBx0cVG757WKWrGvb1tKog/vwvTM/0+LjLrroIn7yk59w8sknM3PmTG655RZmzZrF/fffz913301FRUVT3gsvvJCSkhIAXn75ZQBKS0t56623WLduHePHj2fhwoX069ePCRMmsGDBAs466yx27drF+PHjuf3227n++ut55JFHuOmmm5g8eTKTJk1qavUccMABXHbZZQDcdNNNPProo/znf/4nV111FVdddRXnn38+Dz30UFN9XnjhBZYvX84bb7yBqjJ58mT
+/Oc/c9JJJ7X6PEJuWypvAiNEZKiIFAJTgcqkPJW4gXiAc4FXVFVF5ADg98B0Vf1/icy+e2u7iIz3s74uAn4TUtbFgXRjTBvYtrsRgA9qaju4Jp3Dtm3b2Lp1KyeffDIAF198MX/+859T5g92f5WWul7+r371qwC8+eabfP7zn6esrIz8/HwuvPDCprIKCwuZNGkSAOPGjWPVqlWh5b/33nuceOKJHHXUUTz55JMsXrwYgNdee43zzjsPgAsuuKAp/wsvvMALL7zA0UcfzTHHHMP777/P8uXL9+GMODlrqahqVESmAc8DEWC2qi4WkVuBKlWtBB4FnhCRamAzLvAATAOGAzNFZKZPm6CqnwDfAh4DSoA/+AfAHcDTInIp8CHwlVy9N2O6G1Vld6Pr9/9g084OrUtrWhSdVc+ePTPmKSgoaJraG4lEiEajofkuueQSFixYwJgxY3jsscf44x//mLZcVWXGjBl885stGl3IKKdjKqr6nKoeoaqHq+rtPm2mDyioap2qnqeqw1X1OFVd6dNvU9Weqjo28PjE76tS1SN9mdP8+AmqWqOqp6rqCFX9oqpuzuV7M6Y7aYy5IcpInrB2y27qoy0fWN7f9O3bl379+vGXv/wFgCeeeKKp1dK7d2927NiRdVnHHXccf/rTn9i0aROxWIy5c+c2lZVK8mvs2LGDQYMG0djYyJNPPtmUPn78eH71q18BMG/evKb0008/ndmzZ7Nzp/uRsHbtWj75ZN9HDWyZFmNMRtF4HIARA3rx/sc7+LCmliMG9u7gWrWv2tpahgzZM6T77W9/mzlz5nDFFVdQW1vLsGHD+MUvfgG4VsMVV1xBSUkJr732WsayBw0axB133MEpp5yCqvLlL3+ZKVOSr8BoburUqVx22WXcd999zJ8/n+9///scf/zxlJWVcfzxxzcFnFmzZvG1r32N22+/nYkTJ9K3b18AJkyYwNKlSznhhBMANyHgl7/8JQMGDGjV+UkQ/0O/W6qoqFC7SZcxmW3b3ciYW17gkn8p57G/reK600dy5SnD2+31ly5dyqc//el2e739SW1tLSUlJYgI8+bNY+7cufzmN9kPOYedexFZqKoVYfmtpWKMySgacy2Vof178ulBfXj9g81ceUoHV8pkZeHChUybNg1V5YADDmD27Nk5fT0LKsaYjBJjKgWRPMpLe7Ds4+zHC0zHOvHEE3n77bfb7fU69cWPxpjOodG3VPIjwsEHlPBxN12qxWRmQcUYk1E0nmipCH1LCqhtiDUFGmOCLKgYYzJKBJCCSB59il2v+Y668OslTPdmQcUYk1FT91deHn1K3K1lt/sr7I0JsqBijMkoGtvT/dWn2AeVuu4VVGzp++zY7C9jTEbB7q9InlsypK7RxlRSeeyxxzjyyCM5+OCDm9KefPLJZgtMBsViMSKRSIteY8GCBUyaNIlRo5LvfdixrKVijMkoMaU4PyIUF7ivjcRaYN2ZLX2/N2upGGMySizTUhDJoyjf/aKu66igcvXVsGhR25Y5dizMmtXiw2zp+71ZS8UYk1FiTCU/Tygu6OCg0knY0vfhrKVijMmoITCmUlLogkp9R42ptKJF0VnZ0vfGmG4pGlimpTjffW3UdfPl723p+3A5DSoiMlFElolItYhMD9lfJCJP+f2vi0i5Ty8VkVdFZKeI3B/I31tEFgUem0Rklt93iYhsDOz791y+N2O6k8SYihuo757dX4ml7xOPe+65hzlz5nDdddcxevRoFi1axMyZ7p6CiaXv0w3UBwWXvh8zZgzjxo3Laun7u+66i6OPPpoVK1Y0LX3/2c9+lk996lNN+WbNmsU999zD6NGjqa6ubrb0/QUXXMAJJ5zAUUcdxbnnntuiQJiSqubkgbvb4wpgGFAIvA2MSsrzLeAhvz0VeMpv9wQ+B1wB3J/mNRYCJ/ntS9LlDXuMGzdOjTGZPf3mR3rYDb/Tj2p2aTQW18Nu+J3OevGf7fb6S5YsabfX2t/s2rV
L4/G4qqrOnTtXJ0+e3KLjw8497u69od+ruRxTOQ6oVn83RxGZB0wBglfrTAFu9tvzgftFRFR1F/BXEUl5wwYROQIYAPwlB3U3xgQk1v7KjwiRPKEwktftu7+6iv1p6fvBwOrA8zXA8anyqLun/TagFNiURfmJlk3wLmPniMhJwD+Ba1R1dfJBInI5cDnAoYcemuVbMaZ7iwaWaQEoKsjrdt1fXZUtfZ+9qcDcwPPfAuWqOhp4EZgTdpCqPqyqFapaUVZW1g7VNKbra/AD9YWRvKa/7b1KcfPfj6Y9tOac5zKorAUOCTwf4tNC84hIPtAXqMlUsIiMAfJVdWEiTVVrVLXeP/05MK71VTfGBEUD91MBNwusMdp+X/LFxcXU1NRYYGlHqkpNTQ3FxcUtOi6X3V9vAiNEZCgueEwFLkjKUwlcDLwGnAu8otn9qzmf5q0URGSQqq73TycDS/eh7saYgOCYCkBBvrRrS2XIkCGsWbOGjRs3tttrGhfMhwwZ0qJjchZU/BjJNOB53Eyw2aq6WERuxc0cqAQeBZ4QkWpgMy7wACAiq4A+QKGInAVMUNXEIP9XgC8lveR/ichkIOrLuiRX782Y7qZpQUk/plIQyWu6ILI9FBQUMHTo0HZ7PdN6Ob2iXlWfA55LSpsZ2K4DzktxbHmacoeFpM0AZrS2rsaY1BpjcSJ5Qp5fobgjxlRM19CVB+qNMe0kGlPyfUAB11JJXGVvTJAFFWNMRo0xpSCy5+uiICLt2v1lug4LKsaYjBpjcQoizVsq1v1lwlhQMcZkFI3HyQ+0VArz85pu3GVMkAUVY0xGjTGlIM9aKiYzCyrGmIwaY3EK8pPGVKIWVMzeLKgYYzIKm/1lLRUTxoKKMSYjN1AfGFOJ2JiKCWdBxRiTUTSuTUu0gLVUTGoWVIwxGSW3VNp77S/TdVhQMcZk1BiLN637Be6+KjZQb8JYUDHGZBSNNe/+sutUTCoWVIwxGe3V/RWx7i8TzoKKMSYjt/ZX0oKScSUet9aKac6CijEmo2g83nR/eqCp1dIYt9aKac6CijEmo8aYNruiPnGvehtXMclyGlREZKKILBORahGZHrK/SESe8vtfF5Fyn14qIq+KyE4RuT/pmD/6Mhf5x4B0ZRlj9p2b/RXs/nLbjTYDzCTJWVARkQjwAHAGMAo4X0RGJWW7FNiiqsOBe4E7fXod8F3g2hTFX6iqY/3jkwxlGWP2UfLsr0Srxe6pYpLlsqVyHFCtqitVtQGYB0xJyjMFmOO35wOnioio6i5V/SsuuGQrtKzWV98Yk5C89H3TmIoFFZMkl0FlMLA68HyNTwvNo6pRYBtQmkXZv/BdX98NBI6syhKRy0WkSkSqNm7c2JL3Y0y31RCNN42jQKD7y8ZUTJKuOFB/oaoeBZzoH19vycGq+rCqVqhqRVlZWU4qaMz+Jhrfe5VigKi1VEySXAaVtcAhgedDfFpoHhHJB/oCNekKVdW1/u8O4H9x3WytKssYkx03prJ395eNqZhkuQwqbwIjRGSoiBQCU4HKpDyVwMV++1zgFVVN2Z4WkXwR6e+3C4BJwHutKcsYkx1VpSEWpzASMvvLur9MkvxcFayqURGZBjwPRIDZqrpYRG4FqlS1EngUeEJEqoHNuMADgIisAvoAhSJyFjAB+BB43geUCPAS8Ig/JGVZxpjWi/mr5sNaKtb9ZZLlLKgAqOpzwHNJaTMD23XAeSmOLU9R7LgU+VOWZYxpvWhTUNnTUklcXW/dXyZZVxyoN8a0o0TgCM7+Ksy37i8TzoKKMSatqA8cNvvLZMOCijEmrUTgCI6pJLq/7OJHk8yCijEmrUY/pmLdXyYbFlSMMWklFo0MG6i3lopJZkHFGJNWNL5391diQUkLKiaZBRVjTFqJLq7Qpe+t+8sksaBijEkr0Rppdo966/4yKVhQMcaklWiNhN1PxYKKSWZBxRiTVjSspWLdXyYFCyrGmLSaxlSs+8tkwYKKMSatxvjeU4r
z8oRInjRdbW9MggUVY0xa0abZX82/LvLzxFoqZi8WVIwxaTWNqeRLs/TCSJ6tUmz2YkHFGJNWInDkJ7VUCvLzrPvL7MWCijEmrabur0jzlop1f5kwOQ0qIjJRRJaJSLWITA/ZXyQiT/n9r4tIuU8vFZFXRWSniNwfyN9DRH4vIu+LyGIRuSOw7xIR2Sgii/zj33P53ozpLsKWaQE3G8y6v0yynAUVEYkADwBnAKOA80VkVFK2S4EtqjocuBe406fXAd8Frg0p+m5V/RRwNPBZETkjsO8pVR3rHz9vw7djTLfVkKKlUmjdXyZELlsqxwHVqrpSVRuAecCUpDxTgDl+ez5wqoiIqu5S1b/igksTVa1V1Vf9dgPwFjAkh+/BmG6vaaDeZn+ZLOQyqAwGVgeer/FpoXlUNQpsA0qzKVxEDgDOBF4OJJ8jIu+IyHwROSTFcZeLSJWIVG3cuDGrN2JMdxYNWaYFXPeXXVFvknXJgXoRyQfmAvep6kqf/FugXFVHAy+ypwXUjKo+rKoVqlpRVlbWPhU2pgtrCFmmxT23lorZWy6Dylog2FoY4tNC8/hA0ReoyaLsh4HlqjorkaCqNapa75/+HBjXumobY4KiIcu0JJ5bUDHJchlU3gRGiMhQESkEpgKVSXkqgYv99rnAK6qatj0tIrfhgs/VSemDAk8nA0tbX3VjTEI0HkcEInl7d3/ZQL1Jlp+rglU1KiLTgOeBCDBbVReLyK1AlapWAo8CT4hINbAZF3gAEJFVQB+gUETOAiYA24EbgfeBt0QE4H4/0+u/RGQyEPVlXZKr92ZMd9IY071aKeDGWHY3xjqgRqYzy1lQAVDV54DnktJmBrbrgPNSHFueolgJS1TVGcCMVlXUGJNSYyze7K6PCYWRvKZrWIxJ6JID9caY9hONxfe68BH8mErUur9McxZUjDFpNcZ1rwsfwXV/2UC9SZZVUBGRX4vIl0XEgpAx3UxjNB46plIYyWu614oxCdkGiQeBC4DlInKHiIzMYZ2MMZ1INK57XfgIvqVi3V8mSVZBRVVfUtULgWOAVcBLIvI3Efk3ESnIZQWNMR3LDdSnGFOx7i+TJOvuLBEpxU3T/XfgH8D/4ILMizmpmTGmU2iMhXd/WVAxYbKaUiwizwIjgSeAM1V1vd/1lIhU5apyxpiOF42Fd3+5ZVqs+8s0l+11Ko/4a06aiEiRqtarakUO6mWM6SQa45pySrFdp2KSZdv9dVtI2mttWRFjTOcUjcUpDG2puFWKM6ysZLqZtC0VETkItzx9iYgczZ6r2fsAPXJcN2NMJ9AYi+91f3rYc9OuxphSmB+60IXphjJ1f52OG5wfAtwTSN8BfCdHdTLGdCKNMaW4ILylAm7ByUK7jtp4aYOKqs4B5ojIOar6q3aqkzGmE4nGw2d/JcZZGqMKhe1dK9NZZer++pqq/hIoF5FvJ+9X1XtCDjPG7Ecao+HLtCTGWRpsWrEJyNT91dP/7ZXrihhjOqfGeOoFJQGbAWaaydT99TP/95b2qY4xprOJxjR06ftm3V/GeNkuKPkjEekjIgUi8rKIbBSRr2Vx3EQRWSYi1SIyPWR/kYg85fe/LiLlPr1URF4VkZ0icn/SMeNE5F1/zH3i79QlIgeKyIsistz/7ZfVGTDGpNWYcul7P/vLWiomINspGxNUdTswCbf213DgunQHiEgEeAA4AxgFnC8io5KyXQpsUdXhwL3AnT69DvgucG1I0T8FLgNG+MdEnz4deFlVRwAv++fGmH3kpgyn7v6ypVpMULZBJdFN9mXgGVXdlsUxxwHVqrpSVRuAecCUpDxTgDl+ez5wqoiIqu5S1b/igksTfx/6Pqr6d38v+8eBs0LKmhNIN8bsg8ZYnMI0YyrW/WWCsg0qvxOR94FxwMsiUkbSF36IwcDqwPM1Pi00j6pGgW1AaYYy16Qoc2BgTbKPgYFhBYjI5SJSJSJVGzduzPAWjDFuQcnwtb/Aur9Mc9kufT8d+Be
gQlUbgV3s3eroNHwrJvTnk6o+rKoVqlpRVlbWzjUzputJPaaSaKlYUDF7ZLugJMCncNerBI95PE3+tcAhgedDfFpYnjW+3L5ATYYyh6Qoc4OIDFLV9b6b7JM05RhjsqCqNMY05dL3gK1UbJrJdvbXE8DdwOeAY/0j0+rEbwIjRGSoiBQCU4HKpDyVwMV++1zgFU2zOp3v3touIuP9rK+LgN+ElHVxIN0Y00rRuPvvGL6gpHV/mb1l21KpAEal+8JPpqpREZkGPA9EgNmqulhEbgWqVLUSeBR4QkSqgc24wAOAiKzCLVxZKCJn4WagLQG+BTwGlAB/8A+AO4CnReRS4EPgK9nW1RgTLjGzK21Lxbq/TEC2QeU94CBgfaaMQf4eLM8lpc0MbNcB56U4tjxFehVwZEh6DXBqS+pnjEkvMbMrXVBJtGaMgeyDSn9giYi8AdQnElV1ck5qZYzpFBqaWiphV9Qnlr63lorZI9ugcnMuK2GM6ZzSdX8lrl1psO4vE5BVUFHVP4nIYcAIVX1JRHrgxkmMMfuxaMy6v0zLZDv76zLcFe8/80mDgQU5qpMxppNo6v4KWabFur9MmGyvqL8S+CywHUBVlwMDclUpY0znkAgYqe5RD9b9ZZrLNqjU+/W7APAXKlqb15j9XDZjKtb9ZYKyDSp/EpHvACUichrwDPDb3FXLGNMZJIJK2DItTd1f1lIxAdkGlenARuBd4Ju4a09uylWljDGdQ0PTdSohU4rzElfUW0vF7JHt7K+4iCwAFqiqLe1rTDeRuFVw2NL3IkJBRGxMxTSTtqUizs0isglYBizzd32cme44Y8z+Id2YCrhgY7O/TFCm7q9rcLO+jlXVA1X1QOB44LMick3Oa2eM6VANaZZpASgqiFhLxTSTKah8HThfVT9IJKjqSuBruBWCjTH7scY0y7QAFOXnUR+NtWeVTCeXKagUqOqm5EQ/rlKQmyoZYzqLjN1f+XnUW0vFBGQKKg2t3GeM2Q80LdMSckU9+JZKowUVs0em2V9jRGR7SLoAxTmojzGmE0m3SjFAUX7Eur9MM2mDiqraopHGdGN7lmlJ3VJpsNlfJiDbix9bRUQmisgyEakWkekh+4tE5Cm//3URKQ/sm+HTl4nI6T5tpIgsCjy2i8jVft/NIrI2sO9LuXxvxnQH6a6oBygqsO4v01y291NpMRGJAA8ApwFrgDdFpNLfEjjhUmCLqg4XkanAncBXRWQU7tbCnwEOBl4SkSNUdRkwNlD+WuDZQHn3qurduXpPxnQ3jbHUV9SDa8Fs3x1tzyqZTi6XLZXjgGpVXekXo5wHTEnKMwWY47fnA6eKiPj0eapa76czV/vygk4FVqjqhzl7B8Z0c4lrUAryUnV/2ZiKaS6XQWUwsDrwfI1PC82jqlFgG1Ca5bFTgblJadNE5B0RmS0i/cIqJSKXi0iViFRt3GgrzhiTTkMsTmEkj7y8FAP1BTal2DSX0zGVXBGRQmAybrXkhJ8Ch+O6x9YDPw47VlUfVtUKVa0oKyvLdVWN6dLqG+MUpZhODH6g3oKKCchlUFkLHBJ4PsSnhebx92jpC9RkcewZwFuquiGRoKobVDWmqnHgEfbuLjPGtFB9NEZRQbqgErGWimkml0HlTWCEiAz1LYupQGVSnkrgYr99LvCKqqpPn+pnhw0FRgBvBI47n6SuLxEZFHh6NvBem70TY7qp+micovzUVxYU5udR32hjKmaPnM3+UtWoiEwDngciwGxVXSwitwJVqloJPAo8ISLVwGZc4MHnexpYAkSBK1U1BiAiPXEzyr6Z9JI/EpGxuDtSrgrZb4xpobrGWMbuL2upmKCcBRUAVX0Od0OvYNrMwHYdcF6KY28Hbg9J34UbzE9O//q+1tcY01x9NE5h2qASIRpXYnElkmIw33QvXXKg3hjTPuqjcYoKUnd/JcZbbLDeJFhQMcakVJ+h+yuxfItdq2ISLKgYY1Kqj8YpzqKlYuMqJsGCijEmJTf
7K/2YCmDrf5kmFlSMMSnVRzPP/gJoiFn3l3EsqBhjUnJX1Kfp/vJBpc5aKsazoGKMSSnTFfWJ6cY2pmISLKgYY1LKvPaXH1Ox2V/Gs6BijEkp0zItdp2KSWZBxRgTKh5XGmKZVykG6/4ye1hQMcaEStx7Pv0qxRZUTHMWVIwxoWob3DhJz8LUSwTuuU7FxlSMY0HFGBNqV72793yPwsxTiq2lYhIsqBhjQu1qcEGlZ1EWLRULKsazoGKMCbWr3nVppWupFBcmLn607i/j5DSoiMhEEVkmItUiMj1kf5GIPOX3vy4i5YF9M3z6MhE5PZC+SkTeFZFFIlIVSD9QRF4UkeX+b79cvjdj9ne1WbRUCiN55AnsbrCgYpycBRURiQAP4O4nPwo4X0RGJWW7FNiiqsOBe4E7/bGjcHeB/AwwEXjQl5dwiqqOVdWKQNp04GVVHQG87J8bY1opm5aKiNCjML9pUN+YXLZUjgOqVXWlqjYA84ApSXmmAHP89nzgVBERnz5PVetV9QOg2peXTrCsOcBZ+/4WjOm+mloqaWZ/ARQXRNht3V/Gy2VQGQysDjxf49NC86hqFNiGu1VwumMVeEFEForI5YE8A1V1vd/+GBjYFm/CmO6qafZXUeqWCriWzG4fgIzJ6T3qc+RzqrpWRAYAL4rI+6r652AGVVUR0bCDfSC6HODQQw/NfW2N6aJ2+S6tXmnGVABKrKViAnLZUlkLHBJ4PsSnheYRkXygL1CT7lhVTfz9BHiWPd1iG0RkkC9rEPBJWKVU9WFVrVDVirKysla/OWP2d7X1UUSgOM3aXwAlhREbUzFNchlU3gRGiMhQESnEDbxXJuWpBC722+cCr6iq+vSpfnbYUGAE8IaI9BSR3gAi0hOYALwXUtbFwG9y9L6M6RZ2NcToURAhL0/S5ispiNiUYtMkZ91fqhoVkWnA80AEmK2qi0XkVqBKVSuBR4EnRKQa2IwLPPh8TwNLgChwparGRGQg8Kwbyycf+F9V/T//kncAT4vIpcCHwFdy9d6M6Q521DWmnU6c0KMwwsfbG9uhRqYryOmYiqo+BzyXlDYzsF0HnJfi2NuB25PSVgJjUuSvAU7dxyobY7xtuxs5oEdBxnzFhTamYvawK+qNMaG27W7kgJLC5okNDbBrV7OkHgURu/jRNLGgYowJtbW2kT4lvqXyxhtw2mlQXAy9esExx8CCBYAbqLeWikmwoGKMCbV9dyN9Swpg9mw44QRYvBiuvx5uuQXq6uDss+GaayjJz7PZX6ZJV7xOxRjTDrbtbuSkvz8H934HJkyAZ56BPn3czhkz4NprYdYsTlu3k58NPYtYXIlkmClm9n8WVIwxe2mMxSn/aBlf/t+b4ZRToLISior2ZCgogFmzIBql4sEHmTT5QHY3np7xQkmz/xN3WUj3VFFRoVVVVZkzGtPNbNq8g81HHMngSCM933sbUl0o3NjIJ8eMp/if79P4j0WUjhrRvhU1HUJEFiYt6NvExlSMMXv70Y84ouYj3v3unakDCkBBAW/edh8F8RglN1zXfvUznZYFFWNMcx99xIH33sVvP3UifOlLmfMPHcZP/uWr9Pjdb+D553NfP9OpWVAxxjR3110Qj/ODU75BWe+ijNlLCvP4+bFnUzf0cLjmGojZTLDuzIKKMWaPDRvg5z/nnxPPZn2fsqyCSo/CfBryC1h1zXdg6VKYO7cdKmo6Kwsqxpg97r0XGhp4dfIlFObn0TuL2Vy9i12eVSefDmPGwM03Q6OtBdZdWVAxxjhbtsCDD8JXvsLCwjLKS3vgF29Nq0+xu+p+e0Mcvv99WLECHn8817U1nZQFFWOM88ADsGMHzJhB9Sc7GDGgd1aHJYLKjrooTJoEFRVwxx02ttJNWVAxxsDOne5ixjPPpO7Tn+HDzbUMH9Arq0N7+e6v7bsbQQRuuAGqq+HXv85hhU1nZUHFGAOPPAI1NfCd77Bi405UYcTA7IJKJE/oVZTvWirg1gQbMQL
uvBO68cXV3ZUFFWO6u/p6uPtutxzL+PFUf7ITgCMGZtf9BW6wfnudH5yPROC662DhQnjllVzU2HRiOQ0qIjJRRJaJSLWITA/ZXyQiT/n9r4tIeWDfDJ++TERO92mHiMirIrJERBaLyFWB/DeLyFoRWeQfWVy1ZYxhzhxYtw6+8x0AVmzcRZ7AYaU9si6iT3EBO+oCM76+/nU46CDXWjHdSs6CiohEgAeAM4BRwPkiMiop26XAFlUdDtwL3OmPHYW7tfBngInAg768KPDfqjoKGA9cmVTmvao61j+a3XHSGBMiGnVf/McdB6e6G6eu3lzLoL4lFOVHsi6md3E+23dH9yQUF8PVV8OLL8Jbb7VxpU1nlsuWynFAtaquVNUGYB4wJSnPFGCO354PnCpuDuMUYJ6q1qvqB0A1cJyqrlfVtwBUdQewFBicw/dgzP7t6adh5UrXSvHTh1dvruWQA0taVEyfkgJ21Cddm3LFFW6pfGutdCu5DCqDgdWB52vYOwA05VHVKLANKM3mWN9VdjTweiB5moi8IyKzRaRfWKVE5HIRqRKRqo0bN7b4TRmz34jH4Yc/hM98Bs48syl59ZZaDumXfdcXhLRUAPr2hf/4D5g/380GM91ClxyoF5FewK+Aq1V1u0/+KXA4MBZYD/w47FhVfVhVK1S1oizd6qvG7O9++1t47z13w60891WgqtTsbKB/FsuzBJX2LKJmZ/3eO666yt175a672qLGpgvIZVBZCxwSeD7Ep4XmEZF8oC9Qk+5YESnABZQnVbVpIryqblDVmKrGgUdw3W/GmDCq8IMfwLBh8NWvNiXvrI8SjSsH9ihsUXFlvYvY1RCjtiGptTJoEFxyCTz2GKxfv+/1Np1eLoPKm8AIERkqIoW4gffKpDyVwMV++1zgFXV3DasEpvrZYUOBEcAbfrzlUWCpqt4TLEhEBgWeng281+bvyJj9xSuvwBtvuAsV8/es77VllxsXOaBHQYuKSyw8uWlHw947r7vOTQiYNavV1TVdR86Cih8jmQY8jxtQf1pVF4vIrSIy2Wd7FCgVkWrg28B0f+xi4GlgCfB/wJWqGgM+C3wd+ELI1OEfici7IvIOcApwTa7emzFd3u23u1bExRc3S95S64JCvxa2VPr3cvk37qzbe+fhh8NXvgI//Sls3dqq6pquI6c3lPbTep9LSpsZ2K4Dzktx7O3A7UlpfwVCV7hT1a/va32N6Rb+8hd49VW3InFR87GTpqDSs3UtlY07QsZVwLWI5s1zC1b662HM/qlLDtQbY/bBrbfCwIFw+eV77dpa67q/WtpSGdC7GIAN21MElbFjYeJE1wW2a1eLyjZdiwUVY7qTv/0NXnrJjXP02Hva8OZdre/+6lEYYVVNmoAxcyZs3Aj/8z8tKtt0LRZUjOlObrkFysrchYkhttY2IOIuZmwJEeGw0p6s2pQmqJxwglsa/0c/cvduMfslCyrGdBcvvQQvvODGN3r2DM2ypbaRviUFRPIy35wr2dD+PVhVU5s+0+23w7Ztdt3KfsyCijHdQSwG114L5eUwbVrKbFtqG1p8jUpCeWlPVm+upTEWT51p9Gg4/3zXBbY2+bI1sz+woGJMd/DLX8Lbb7tlWYpSXy2/pbahxdeoJIw8qDfRuPLPDTvSZ7zttj1Bzux3LKgYs7/bvBmuv96tRBy4ej7Mll2NHNizdS2V0UMOAODdNdvSZxw2zC0NM2+e3W9lP2RBxZj93XXXubs6/uxnTSsRp+JaKq3t/upB7+J83lmbIaiAC3LDhsG3vgW7d7fq9UznlNOLH0031dgIr73mfoUuXuz6zhsa3BTW8nJ3zcKpp7r+9QxfcmYfvfwyzJ7tBufHjs2YfUttQ6tbKiLC6CF9eWfN1syZS0rgoYdgwgRXt/vua9Vrms7HWiqm7Sxd6lalHTgQTj4Zvv99WLTIzTQ66CCX59VX4b//233BjRzp+vg3bOjIWu+/Nm6Eiy5y94v/3vcyZt/dEKO
uMd7ia1SCxh5yAEvX72BnfTRz5tNOczfy+slP4Dm7p97+woKK2XfvvANnnw2jRrn1nU4/HX71K3ctwvLl7u5/v/sd/PnPsHq1e/z853DwwW7JjvJyF4zWrOnod7L/iMXcLX1ratyNuEoy33Rrc9O6X60bqAc4YVh/YnHlzVWbszvghz+Eo45ywc/uubJfsKBiWm/pUjfwO2aMa4HcfLMLDHPnwr/+q7vrX5ghQ+DSS+GPf4Rly9wU0wcfdAsPXnmlTTXdV6rwX/8Fzz/vupWy6PYC2JK4mr6V3V8A4w7rR2Ekj9dW1GR3QHEx/PrXrs6TJrlJBaZLs6BiWu6DD9zqtkce6botbrzRpX3vezBgQMvKOuII1+e/fLm778bDD7vgcs011i3WGqpuOZQHH3QD9CHre6VS44NKa8dUAEoKI4w7rB8vLdmAu4tFFoYPh2efdf+GvvhF17oyXZYFFZO99evdhXMjR7oulWuucV8Et90G/ULv3py98nI3OynRcrnvPjc76IYbYNOmNqn+fi8Wc92It90G//ZvcMcdLTp8/VY3C2tQ3+J9qsa/HjOYlZt2Zd9aATjpJPjNb2DJErdtXWFdlgUVk9mKFS6YHH64++L/xjfcf/q774b+/dv2tYYNg1/8wnWtnX22W85j6FC46SZbLyqdtWvdr/yf/AS+/W03ZpXXsv/e67bVIQID++xbUJk0+mAO7lvMdfPf4dVln2TfYpk4Ef7wB/j4Yzj2WHcdS7bHmk7DgooJ19AACxbAWWe52UMPP+zGT5YudVNBBw/O7esfcYS7Cvy99+CMM9yaUYcc4sZi/vY3+7JJ2LXLBd5PfcrdyXHOHPjxj1scUADWbK5lYO9iCiL79rVQUhjhwa+NA+DffvEmF81+I/3SLUGnnAJVVe7f3Pnnu8/+9df3qT6mnalqzh7ARGAZUA1MD9lfBDzl978OlAf2zfDpy4DTM5UJDPVlVPsyCzPVb9y4cWoC1q1Tffxx1a99TbW0VBVUBwxQnTHD7etIixapfuMbqj17unqVl6teeaXq73+vunVrx9atvcXjqlVVqtddp9qvnzsfkyapVlfvU7Gn3/snvWT2621USdX6xpj+7E/VetgNv9O7/u99jcfj+sr7G/RXC1frJ9vr0h8cjaree6/qgQe69/e5z6k+9JDq2rVtVj/TekCVpvheFc3RLz4RiQD/BE4D1uDuWX++qi4J5PkWMFpVrxCRqcDZqvpVERkFzAWOAw4GXgKO8IeFlikiTwO/VtV5IvIQ8Laq/jRdHSsqKrSqqqoN33UnpequWt6xwz22bnWztFavho8+cv3YixbBunUuf1mZuyjtggvc3/xOdI3sjh3wzDOuFfXyy1DrV8UdPhyOOcb9LS93j9JSN9bTr5+bidaKX+8dQhXq6mD7drei74YNsGoVrFwJb73lfrlv2ACRCEye7K77+exn9+klN+2s5/gfvMyVpwzn26cdkfmAFrj2mbf59VtrGNq/Jys2uqXxexRGuPKU4fz7iUMpyo+kPnjHDtfl+uij8P77Lm34cKiogJEjiR1+OO80FPH6jgj9hw3h5KMPo6zsALe+mV1YmzMislBVK0L35TConADcrKqn++czAFT1h4E8z/s8r4lIPvAxUMaee9X/MJjPH7ZXmcAdwEbgIFWNJr92Kq0OKrNnu/EE9xvKpSVvh6V1xP543H3xxmLh76Ww0A28H320m3r6+c+7KcJd4Qu4rg7++lfX7VNV5QLjRx+lfq8FBe79Fhbu2c7Pd18+iS+g4N9s0uLxPec+8djXtNpatypBmJEj4fjjeaN8DHfkD2drjz6gEFdF8X+bimueFlcAJe73Jf6qwu7GGAo8f/VJDB/Qa18+lb3srI9yzVOLWLVpF5edNIxRg/rwk1eW8/ziDfQuyqe0V2HTUvuqNNU5rko8njjNcQ5fV824lYs4+oN3Gb5hJQdv3UBemu+vhvwCGvILieVFiOflAULcf4ZxEVTyUL8d3Kc
IGghIxfl5TbdLzqi1gawjjps5M+NacKkPTx1UcvkTdDCwOvB8DXB8qjw+GGwDSn3635OOTXTih5VZCmxV1WhI/mZE5HLgcoBDDz20Ze8ooX9/N53WFdiyL6KO2N+zJ/Tu7X6tJ/4OHgyHHupaJV31F11xsRuc/uIX96RFo27Q+qOP3DUPW7a4x9atbpyosdH9TTyi0czBOl1aXt6e8y6y9/PWpPXo4T6jvn3d37IyN1nhsMOaVhjevmQDg/6xlkECeSIIkCduqRTB/xWfhvjfCOLz7DlmTz7h8yPL2jygAPQqyueRi5p///zs6xX8ZflGXli8gW27G4nFFQT/PmTPe5HA8yMG8MkX/oUXRHhJoKC+ntKa9RzbM8a4nlE2fbCO91d+zPatO8lvqCfSUE9hQz158RiiiqgL4tL0iCdt05QnqE9xAWUH9c78Rlv7A72jjtvXGZspdKJ+jfahqg8DD4NrqbSqkMmT3cN0Pvn57sv3sMM6uiY59cVRA/niqIEdXY19cuKIMk4cUdZm5R3kH6Zj5bKPYy1wSOD5EJ8Wmsd3f/UFatIcmyq9BjjAl5HqtYwxxuRYLoPKm8AIERkqIoXAVKAyKU8lcLHfPhd4xc8sqASmikiRiAwFRgBvpCrTH/OqLwNf5m9y+N6MMcaEyFn3lx8jmQY8D0SA2aq6WERuxU1HqwQeBZ4QkWpgMy5I4PM9DSwBosCVqhoDCCvTv+QNwDwRuQ34hy/bGGNMO8rZ7K+uoNtMKTbGmDaUbvZXF5g3aowxpquwoGKMMabNWFAxxhjTZiyoGGOMaTPdeqBeRDYCHwaS+gOd8eYdVq+W66x166z1gs5bN6tXy+W6boepauiVq906qCQTkapUMxo6ktWr5Tpr3TprvaDz1s3q1XIdWTfr/jLGGNNmLKgYY4xpMxZUmnu4oyuQgtWr5Tpr3TprvaDz1s3q1XIdVjcbUzHGGNNmrKVijDGmzVhQMcYY02a6TVARkfNEZLGIxEWkIpB+mogsFJF3/d8vBPb9UUSWicgi/xjg04tE5CkRqRaR10WkvK3r5ffN8K+xTEROD6RP9GnVIjI9kD7U16fa16+wtfUKqedTgfOwSkQW+fRyEdkd2PdQ4Jhx/rxWi8h9Im1/i0kRuVlE1gZe/0uBfS06fzmo210i8r6IvCMiz4rIAT69Q89ZSD3b5XykeO1DRORVEVni/x9c5dNb/LnmqH6r/OexSESqfNqBIvKiiCz3f/v5dPGfWbX/zI/JUZ1GBs7LIhHZLiJXd5Zz5u9Rvf8/gE8DI4E/AhWB9KOBg/32kcDawL5meQPp3wIe8ttTgadyUK9RwNtAETAUWIFb7j/it4cBhT7PKH/M08BUv/0Q8B85Opc/Bmb67XLgvRT53gDG4+4S+wfgjBzU5Wbg2pD0Fp+/HNRtApDvt+8E7uwM5yzp9drtfKR4/UHAMX67N/BP/9m16HPNYf1WAf2T0n4ETPfb0wOf65f8Zyb+M3y9Hc5fBPgYOKyznLNu01JR1aWquiwk/R+qus4/XQyUiEhRhuKmAHP89nzg1Nb+okxVL/8a81S1XlU/AKqB4/yjWlVXqmoDMA+Y4l//C74++Pqd1Zo6peNf5yvA3Az5BgF9VPXv6v5lP56L+qTRovOXiwqo6guqGvVP/467I2lKHXTO2u18hFHV9ar6lt/eASwFBqc5JNXn2p6C//+D/8+mAI+r83fc3WgH5bgupwIrVPXDNHna9Zx1m6CSpXOAt1S1PpD2C9+U/G4gcAwGVoO7GRmwDSht47o0vYa3xqelSi8Ftga+xBLpbe1EYIOqLg+kDRWRf4jIn0TkxED914TUMxem+e6G2YmuCFp+/nLtG7hfsQkdfc4SOup87EVcN/LRwOs+qSWfa64o8IK4rvHLfdpAVV3vtz8GBnZQ3cD1lAR/4HX4OcvZnR87goi8BBwUsutGVU17e2ER+Qyui2JCIPlCVV0rIr2BXwFfx/16bLd6tacs63k+zf8
RrwcOVdUaERkHLPDnsl3qBfwU+D7uP//3cV1z32jL129t3RLnTERuxN3B9Em/L+fnrKsRkV64/2NXq+p2EenQzzXgc/47YADwooi8H9ypqioiHXJdhrgx08nADJ/UKc7ZfhVUVPWLrTlORIYAzwIXqeqKQHlr/d8dIvK/uCbj48Ba4BBgjYjkA32BmjauV+I1Eob4NFKk1+Ca2/m+tRLMn5VM9fTv9V+BcYFj6oF6v71QRFYAR/jXDnb3tLg+2dYrUL9HgN/5py09f62SxTm7BJgEnOq7tNrlnLVAuvPULkSkABdQnlTVXwOo6obA/mw/1zYX+A74RESexX0HbBCRQaq63ndvfdIRdQPOwPWsbPB17BTnrNt3f4mbkfN73MDb/wuk54tIf79dgPtieM/vrgQu9tvnAq8kvjDaUCUwVdxMs6HACNwg7pvACHEzvQpxzd9K//qv+vrg69fWraAvAu+ralMXjYiUiUjEbw/z9Vzpuwe2i8h43214UQ7qkxiHSDib5p9R1uevrevl6zYRuB6YrKq1gfQOPWdJ2u18hPHv81FgqareE0hv6eeai7r19L0UiEhPXC/GezT//x/8f1YJXCTOeGBboJssF5r1GnSGcwZ0q9lfZ+P6EuuBDcDzPv0mYBewKPAYAPQEFgLv4Abw/wc/YwIoBp7BDXi9AQxr63r5fTfiZmosIzALCDfL5J9+342B9GG+PtW+fkVtfA4fA65ISjvHn59FwFvAmYF9Fbh/2CuA+/ErOLRxnZ4A3vWfUyUwqLXnLwd1q8b1ZSf+XSVmDHboOQupZ7ucjxSv/Tlcd807gfP0pdZ8rjmo2zDcrKm3/ed1o08vBV4GlgMvAQf6dAEe8HV7l5CZo21Yt5643om+gbQOP2eqasu0GGOMaTvdvvvLGGNM27GgYowxps1YUDHGGNNmLKgYY4xpMxZUjDHGtBkLKsYYY9qMBRVjjDFt5v8DB6LALhwtNDUAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# The distribution of the variable changed\n", "# after the transformation.\n", "\n", "fig = plt.figure()\n", "ax = fig.add_subplot(111)\n", "X_train['LotFrontage'].plot(kind='kde', ax=ax)\n", "train_t['LotFrontage'].plot(kind='kde', ax=ax, color='red')\n", "lines, labels = ax.get_legend_handles_labels()\n", "ax.legend(lines, labels, loc='best')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Automatically select all variables\n", "\n", "We can impute all numerical variables with the same value automatically with this transformer. We need to leave the parameter `variables` to None." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "ArbitraryNumberImputer(arbitrary_number=-1)" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Let's create an instance of the imputer where we impute\n", "# 2 variables with the same arbitraty number.\n", "\n", "imputer = ArbitraryNumberImputer(\n", " arbitrary_number=-1,\n", ")\n", "\n", "imputer.fit(X_train)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['MSSubClass',\n", " 'LotFrontage',\n", " 'LotArea',\n", " 'OverallQual',\n", " 'OverallCond',\n", " 'YearBuilt',\n", " 'YearRemodAdd',\n", " 'MasVnrArea',\n", " 'BsmtFinSF1',\n", " 'BsmtFinSF2',\n", " 'BsmtUnfSF',\n", " 'TotalBsmtSF',\n", " '1stFlrSF',\n", " '2ndFlrSF',\n", " 'LowQualFinSF',\n", " 'GrLivArea',\n", " 'BsmtFullBath',\n", " 'BsmtHalfBath',\n", " 'FullBath',\n", " 'HalfBath',\n", " 'BedroomAbvGr',\n", " 'KitchenAbvGr',\n", " 'TotRmsAbvGrd',\n", " 'Fireplaces',\n", " 'GarageYrBlt',\n", " 'GarageCars',\n", " 'GarageArea',\n", " 'WoodDeckSF',\n", " 'OpenPorchSF',\n", " 'EnclosedPorch',\n", " '3SsnPorch',\n", " 'ScreenPorch',\n", " 'PoolArea',\n", " 'MiscVal',\n", " 'MoSold',\n", " 'YrSold']" ] 
}, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The imputer finds all numerical variables\n", "# automatically.\n", "\n", "imputer.variables_" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'MSSubClass': -1,\n", " 'LotFrontage': -1,\n", " 'LotArea': -1,\n", " 'OverallQual': -1,\n", " 'OverallCond': -1,\n", " 'YearBuilt': -1,\n", " 'YearRemodAdd': -1,\n", " 'MasVnrArea': -1,\n", " 'BsmtFinSF1': -1,\n", " 'BsmtFinSF2': -1,\n", " 'BsmtUnfSF': -1,\n", " 'TotalBsmtSF': -1,\n", " '1stFlrSF': -1,\n", " '2ndFlrSF': -1,\n", " 'LowQualFinSF': -1,\n", " 'GrLivArea': -1,\n", " 'BsmtFullBath': -1,\n", " 'BsmtHalfBath': -1,\n", " 'FullBath': -1,\n", " 'HalfBath': -1,\n", " 'BedroomAbvGr': -1,\n", " 'KitchenAbvGr': -1,\n", " 'TotRmsAbvGrd': -1,\n", " 'Fireplaces': -1,\n", " 'GarageYrBlt': -1,\n", " 'GarageCars': -1,\n", " 'GarageArea': -1,\n", " 'WoodDeckSF': -1,\n", " 'OpenPorchSF': -1,\n", " 'EnclosedPorch': -1,\n", " '3SsnPorch': -1,\n", " 'ScreenPorch': -1,\n", " 'PoolArea': -1,\n", " 'MiscVal': -1,\n", " 'MoSold': -1,\n", " 'YrSold': -1}" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The imputation value for each variable is stored in a dictionary.\n", "\n", "imputer.imputer_dict_" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "# Now we impute the missing data.\n", "\n", "train_t = imputer.transform(X_train)\n", "test_t = imputer.transform(X_test)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Sanity check:\n", "\n", "# No numerical variable with NA is left in the\n", "# transformed data.\n", "\n", "[v for v in train_t.columns if train_t[v].dtypes !=\n", " 'O' and train_t[v].isnull().sum() > 0]" ] }, { "cell_type": "code", 
"execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['MSSubClass',\n", " 'MSZoning',\n", " 'LotFrontage',\n", " 'LotArea',\n", " 'Street',\n", " 'Alley',\n", " 'LotShape',\n", " 'LandContour',\n", " 'Utilities',\n", " 'LotConfig',\n", " 'LandSlope',\n", " 'Neighborhood',\n", " 'Condition1',\n", " 'Condition2',\n", " 'BldgType',\n", " 'HouseStyle',\n", " 'OverallQual',\n", " 'OverallCond',\n", " 'YearBuilt',\n", " 'YearRemodAdd',\n", " 'RoofStyle',\n", " 'RoofMatl',\n", " 'Exterior1st',\n", " 'Exterior2nd',\n", " 'MasVnrType',\n", " 'MasVnrArea',\n", " 'ExterQual',\n", " 'ExterCond',\n", " 'Foundation',\n", " 'BsmtQual',\n", " 'BsmtCond',\n", " 'BsmtExposure',\n", " 'BsmtFinType1',\n", " 'BsmtFinSF1',\n", " 'BsmtFinType2',\n", " 'BsmtFinSF2',\n", " 'BsmtUnfSF',\n", " 'TotalBsmtSF',\n", " 'Heating',\n", " 'HeatingQC',\n", " 'CentralAir',\n", " 'Electrical',\n", " '1stFlrSF',\n", " '2ndFlrSF',\n", " 'LowQualFinSF',\n", " 'GrLivArea',\n", " 'BsmtFullBath',\n", " 'BsmtHalfBath',\n", " 'FullBath',\n", " 'HalfBath',\n", " 'BedroomAbvGr',\n", " 'KitchenAbvGr',\n", " 'KitchenQual',\n", " 'TotRmsAbvGrd',\n", " 'Functional',\n", " 'Fireplaces',\n", " 'FireplaceQu',\n", " 'GarageType',\n", " 'GarageYrBlt',\n", " 'GarageFinish',\n", " 'GarageCars',\n", " 'GarageArea',\n", " 'GarageQual',\n", " 'GarageCond',\n", " 'PavedDrive',\n", " 'WoodDeckSF',\n", " 'OpenPorchSF',\n", " 'EnclosedPorch',\n", " '3SsnPorch',\n", " 'ScreenPorch',\n", " 'PoolArea',\n", " 'PoolQC',\n", " 'Fence',\n", " 'MiscFeature',\n", " 'MiscVal',\n", " 'MoSold',\n", " 'YrSold',\n", " 'SaleType',\n", " 'SaleCondition']" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# New: we can get the name of the features in the final output\n", "imputer.get_feature_names_out()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "fenotebook", "language": 
"python", "name": "fenotebook" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.2" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": true } }, "nbformat": 4, "nbformat_minor": 4 }