{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Tabular data handling"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This module defines the main class to handle tabular data in the fastai library: [`TabularDataBunch`](/tabular.data.html#TabularDataBunch). As always, there is also a helper function to quickly get your data.\n",
    "\n",
    "To allow you to easily create a [`Learner`](/basic_train.html#Learner) for your data, it provides [`tabular_learner`](/tabular.data.html#tabular_learner)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hide_input": true
   },
   "outputs": [],
   "source": [
    "from fastai.gen_doc.nbdoc import *\n",
    "from fastai.tabular import * \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hide_input": true
   },
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<h2 id=\"TabularDataBunch\" class=\"doc_header\"><code>class</code> <code>TabularDataBunch</code><a href=\"https://github.com/fastai/fastai/blob/master/fastai/tabular/data.py#L85\" class=\"source_link\" style=\"float:right\">[source]</a><a class=\"source_link\" data-toggle=\"collapse\" data-target=\"#TabularDataBunch-pytest\" style=\"float:right; padding-right:10px\">[test]</a></h2>\n",
       "\n",
       "> <code>TabularDataBunch</code>(**`train_dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), **`valid_dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), **`fix_dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader)=***`None`***, **`test_dl`**:`Optional`\\[[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader)\\]=***`None`***, **`device`**:[`device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device)=***`None`***, **`dl_tfms`**:`Optional`\\[`Collection`\\[`Callable`\\]\\]=***`None`***, **`path`**:`PathOrStr`=***`'.'`***, **`collate_fn`**:`Callable`=***`'data_collate'`***, **`no_check`**:`bool`=***`False`***) :: [`DataBunch`](/basic_data.html#DataBunch)\n",
       "\n",
       "<div class=\"collapse\" id=\"TabularDataBunch-pytest\"><div class=\"card card-body pytest_card\"><a type=\"button\" data-toggle=\"collapse\" data-target=\"#TabularDataBunch-pytest\" class=\"close\" aria-label=\"Close\"><span aria-hidden=\"true\">&times;</span></a><p>No tests found for <code>TabularDataBunch</code>. To contribute a test please refer to <a href=\"/dev/test.html\">this guide</a> and <a href=\"https://forums.fast.ai/t/improving-expanding-functional-tests/32929\">this discussion</a>.</p></div></div>\n",
       "\n",
       "Create a [`DataBunch`](/basic_data.html#DataBunch) suitable for tabular data.  "
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_doc(TabularDataBunch)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The best way to quickly get your data in a [`DataBunch`](/basic_data.html#DataBunch) suitable for tabular data is to organize it in two (or three) dataframes. One for training, one for validation, and if you have it, one for testing. Here we are interested in a subsample of the [adult dataset](https://archive.ics.uci.edu/ml/datasets/adult)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capital-gain</th>\n",
       "      <th>capital-loss</th>\n",
       "      <th>hours-per-week</th>\n",
       "      <th>native-country</th>\n",
       "      <th>salary</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>49</td>\n",
       "      <td>Private</td>\n",
       "      <td>101320</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>12.0</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>1902</td>\n",
       "      <td>40</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;=50k</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>44</td>\n",
       "      <td>Private</td>\n",
       "      <td>236746</td>\n",
       "      <td>Masters</td>\n",
       "      <td>14.0</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>10520</td>\n",
       "      <td>0</td>\n",
       "      <td>45</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;=50k</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>38</td>\n",
       "      <td>Private</td>\n",
       "      <td>96185</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>32</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;50k</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>38</td>\n",
       "      <td>Self-emp-inc</td>\n",
       "      <td>112847</td>\n",
       "      <td>Prof-school</td>\n",
       "      <td>15.0</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>40</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;=50k</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>42</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>82297</td>\n",
       "      <td>7th-8th</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Wife</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;50k</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   age          workclass  fnlwgt     education  education-num  \\\n",
       "0   49            Private  101320    Assoc-acdm           12.0   \n",
       "1   44            Private  236746       Masters           14.0   \n",
       "2   38            Private   96185       HS-grad            NaN   \n",
       "3   38       Self-emp-inc  112847   Prof-school           15.0   \n",
       "4   42   Self-emp-not-inc   82297       7th-8th            NaN   \n",
       "\n",
       "        marital-status        occupation    relationship                 race  \\\n",
       "0   Married-civ-spouse               NaN            Wife                White   \n",
       "1             Divorced   Exec-managerial   Not-in-family                White   \n",
       "2             Divorced               NaN       Unmarried                Black   \n",
       "3   Married-civ-spouse    Prof-specialty         Husband   Asian-Pac-Islander   \n",
       "4   Married-civ-spouse     Other-service            Wife                Black   \n",
       "\n",
       "       sex  capital-gain  capital-loss  hours-per-week  native-country salary  \n",
       "0   Female             0          1902              40   United-States  >=50k  \n",
       "1     Male         10520             0              45   United-States  >=50k  \n",
       "2   Female             0             0              32   United-States   <50k  \n",
       "3     Male             0             0              40   United-States  >=50k  \n",
       "4   Female             0             0              50   United-States   <50k  "
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "path = untar_data(URLs.ADULT_SAMPLE)\n",
    "df = pd.read_csv(path/'adult.csv')\n",
    "valid_idx = range(len(df)-2000, len(df))\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cat_names = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race', 'sex', 'native-country']\n",
    "dep_var = 'salary'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The initialization of [`TabularDataBunch`](/tabular.data.html#TabularDataBunch) is the same as [`DataBunch`](/basic_data.html#DataBunch) so you really want to use the factory method instead."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hide_input": true
   },
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<h4 id=\"TabularDataBunch.from_df\" class=\"doc_header\"><code>from_df</code><a href=\"https://github.com/fastai/fastai/blob/master/fastai/tabular/data.py#L87\" class=\"source_link\" style=\"float:right\">[source]</a><a class=\"source_link\" data-toggle=\"collapse\" data-target=\"#TabularDataBunch-from_df-pytest\" style=\"float:right; padding-right:10px\">[test]</a></h4>\n",
       "\n",
       "> <code>from_df</code>(**`path`**, **`df`**:`DataFrame`, **`dep_var`**:`str`, **`valid_idx`**:`Collection`\\[`int`\\], **`procs`**:`Optional`\\[`Collection`\\[[`TabularProc`](/tabular.transform.html#TabularProc)\\]\\]=***`None`***, **`cat_names`**:`OptStrList`=***`None`***, **`cont_names`**:`OptStrList`=***`None`***, **`classes`**:`Collection`\\[`T_co`\\]=***`None`***, **`test_df`**=***`None`***, **`bs`**:`int`=***`64`***, **`val_bs`**:`int`=***`None`***, **`num_workers`**:`int`=***`4`***, **`dl_tfms`**:`Optional`\\[`Collection`\\[`Callable`\\]\\]=***`None`***, **`device`**:[`device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device)=***`None`***, **`collate_fn`**:`Callable`=***`'data_collate'`***, **`no_check`**:`bool`=***`False`***) → [`DataBunch`](/basic_data.html#DataBunch)\n",
       "\n",
       "<div class=\"collapse\" id=\"TabularDataBunch-from_df-pytest\"><div class=\"card card-body pytest_card\"><a type=\"button\" data-toggle=\"collapse\" data-target=\"#TabularDataBunch-from_df-pytest\" class=\"close\" aria-label=\"Close\"><span aria-hidden=\"true\">&times;</span></a><p>Tests found for <code>from_df</code>:</p><p>Some other tests where <code>from_df</code> is used:</p><ul><li><code>pytest -sv tests/test_tabular_data.py::test_from_df</code> <a href=\"https://github.com/fastai/fastai/blob/master/tests/test_tabular_data.py#L5\" class=\"source_link\" style=\"float:right\">[source]</a></li></ul><p>To run tests please refer to this <a href=\"/dev/test.html#quick-guide\">guide</a>.</p></div></div>\n",
       "\n",
       "Create a [`DataBunch`](/basic_data.html#DataBunch) from `df` and `valid_idx` with `dep_var`. `kwargs` are passed to [`DataBunch.create`](/basic_data.html#DataBunch.create).  "
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_doc(TabularDataBunch.from_df)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Optionally, use `test_df` for the test set. The dependent variable is `dep_var`, while the categorical and continuous variables are in the `cat_names` columns and `cont_names` columns respectively. If `cont_names` is None then we assume all variables that aren't dependent or categorical are continuous. The [`TabularProcessor`](/tabular.data.html#TabularProcessor) in `procs` are applied to the dataframes as preprocessing, then the categories are replaced by their codes+1 (leaving 0 for `nan`) and the continuous variables are normalized. \n",
    "\n",
    "Note that the [`TabularProcessor`](/tabular.data.html#TabularProcessor) should be passed as `Callable`: the actual initialization with `cat_names` and `cont_names` is done during the preprocessing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "procs = [FillMissing, Categorify, Normalize]\n",
    "data = TabularDataBunch.from_df(path, df, dep_var, valid_idx=valid_idx, procs=procs, cat_names=cat_names)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " You can then easily create a [`Learner`](/basic_train.html#Learner) for this data with [`tabular_learner`](/tabular.data.html#tabular_learner)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hide_input": true
   },
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<h4 id=\"tabular_learner\" class=\"doc_header\"><code>tabular_learner</code><a href=\"https://github.com/fastai/fastai/blob/master/fastai/tabular/data.py#L171\" class=\"source_link\" style=\"float:right\">[source]</a><a class=\"source_link\" data-toggle=\"collapse\" data-target=\"#tabular_learner-pytest\" style=\"float:right; padding-right:10px\">[test]</a></h4>\n",
       "\n",
       "> <code>tabular_learner</code>(**`data`**:[`DataBunch`](/basic_data.html#DataBunch), **`layers`**:`Collection`\\[`int`\\], **`emb_szs`**:`Dict`\\[`str`, `int`\\]=***`None`***, **`metrics`**=***`None`***, **`ps`**:`Collection`\\[`float`\\]=***`None`***, **`emb_drop`**:`float`=***`0.0`***, **`y_range`**:`OptRange`=***`None`***, **`use_bn`**:`bool`=***`True`***, **\\*\\*`learn_kwargs`**)\n",
       "\n",
       "<div class=\"collapse\" id=\"tabular_learner-pytest\"><div class=\"card card-body pytest_card\"><a type=\"button\" data-toggle=\"collapse\" data-target=\"#tabular_learner-pytest\" class=\"close\" aria-label=\"Close\"><span aria-hidden=\"true\">&times;</span></a><p>No tests found for <code>tabular_learner</code>. To contribute a test please refer to <a href=\"/dev/test.html\">this guide</a> and <a href=\"https://forums.fast.ai/t/improving-expanding-functional-tests/32929\">this discussion</a>.</p></div></div>\n",
       "\n",
       "Get a [`Learner`](/basic_train.html#Learner) using `data`, with `metrics`, including a [`TabularModel`](/tabular.models.html#TabularModel) created using the remaining params.  "
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_doc(tabular_learner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`emb_szs` is a `dict` mapping categorical column names to embedding sizes; you only need to pass sizes for columns where you want to override the default behaviour of the model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hide_input": true
   },
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<h2 id=\"TabularList\" class=\"doc_header\"><code>class</code> <code>TabularList</code><a href=\"https://github.com/fastai/fastai/blob/master/fastai/tabular/data.py#L104\" class=\"source_link\" style=\"float:right\">[source]</a><a class=\"source_link\" data-toggle=\"collapse\" data-target=\"#TabularList-pytest\" style=\"float:right; padding-right:10px\">[test]</a></h2>\n",
       "\n",
       "> <code>TabularList</code>(**`items`**:`Iterator`\\[`T_co`\\], **`cat_names`**:`OptStrList`=***`None`***, **`cont_names`**:`OptStrList`=***`None`***, **`procs`**=***`None`***, **\\*\\*`kwargs`**) → `TabularList` :: [`ItemList`](/data_block.html#ItemList)\n",
       "\n",
       "<div class=\"collapse\" id=\"TabularList-pytest\"><div class=\"card card-body pytest_card\"><a type=\"button\" data-toggle=\"collapse\" data-target=\"#TabularList-pytest\" class=\"close\" aria-label=\"Close\"><span aria-hidden=\"true\">&times;</span></a><p>Tests found for <code>TabularList</code>:</p><p>Some other tests where <code>TabularList</code> is used:</p><ul><li><code>pytest -sv tests/test_tabular_data.py::test_from_df</code> <a href=\"https://github.com/fastai/fastai/blob/master/tests/test_tabular_data.py#L5\" class=\"source_link\" style=\"float:right\">[source]</a></li></ul><p>To run tests please refer to this <a href=\"/dev/test.html#quick-guide\">guide</a>.</p></div></div>\n",
       "\n",
       "Basic [`ItemList`](/data_block.html#ItemList) for tabular data.  "
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_doc(TabularList)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Basic class to create a list of inputs in `items` for tabular data. `cat_names` and `cont_names` are the names of the categorical and the continuous variables respectively. `processor` will be applied to the inputs or one will be created from the transforms in `procs`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hide_input": true
   },
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<h4 id=\"TabularList.from_df\" class=\"doc_header\"><code>from_df</code><a href=\"https://github.com/fastai/fastai/blob/master/fastai/tabular/data.py#L119\" class=\"source_link\" style=\"float:right\">[source]</a><a class=\"source_link\" data-toggle=\"collapse\" data-target=\"#TabularList-from_df-pytest\" style=\"float:right; padding-right:10px\">[test]</a></h4>\n",
       "\n",
       "> <code>from_df</code>(**`df`**:`DataFrame`, **`cat_names`**:`OptStrList`=***`None`***, **`cont_names`**:`OptStrList`=***`None`***, **`procs`**=***`None`***, **\\*\\*`kwargs`**) → `ItemList`\n",
       "\n",
       "<div class=\"collapse\" id=\"TabularList-from_df-pytest\"><div class=\"card card-body pytest_card\"><a type=\"button\" data-toggle=\"collapse\" data-target=\"#TabularList-from_df-pytest\" class=\"close\" aria-label=\"Close\"><span aria-hidden=\"true\">&times;</span></a><p>Tests found for <code>from_df</code>:</p><ul><li><code>pytest -sv tests/test_tabular_data.py::test_from_df</code> <a href=\"https://github.com/fastai/fastai/blob/master/tests/test_tabular_data.py#L5\" class=\"source_link\" style=\"float:right\">[source]</a></li></ul><p>To run tests please refer to this <a href=\"/dev/test.html#quick-guide\">guide</a>.</p></div></div>\n",
       "\n",
       "Get the list of inputs in the `col` of `path/csv_name`.  "
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_doc(TabularList.from_df)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hide_input": true
   },
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<h4 id=\"TabularList.get_emb_szs\" class=\"doc_header\"><code>get_emb_szs</code><a href=\"https://github.com/fastai/fastai/blob/master/fastai/tabular/data.py#L130\" class=\"source_link\" style=\"float:right\">[source]</a><a class=\"source_link\" data-toggle=\"collapse\" data-target=\"#TabularList-get_emb_szs-pytest\" style=\"float:right; padding-right:10px\">[test]</a></h4>\n",
       "\n",
       "> <code>get_emb_szs</code>(**`sz_dict`**=***`None`***)\n",
       "\n",
       "<div class=\"collapse\" id=\"TabularList-get_emb_szs-pytest\"><div class=\"card card-body pytest_card\"><a type=\"button\" data-toggle=\"collapse\" data-target=\"#TabularList-get_emb_szs-pytest\" class=\"close\" aria-label=\"Close\"><span aria-hidden=\"true\">&times;</span></a><p>No tests found for <code>get_emb_szs</code>. To contribute a test please refer to <a href=\"/dev/test.html\">this guide</a> and <a href=\"https://forums.fast.ai/t/improving-expanding-functional-tests/32929\">this discussion</a>.</p></div></div>\n",
       "\n",
       "Return the default embedding sizes suitable for this data or takes the ones in `sz_dict`.  "
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_doc(TabularList.get_emb_szs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hide_input": true
   },
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<h4 id=\"TabularList.show_xys\" class=\"doc_header\"><code>show_xys</code><a href=\"https://github.com/fastai/fastai/blob/master/fastai/tabular/data.py#L137\" class=\"source_link\" style=\"float:right\">[source]</a><a class=\"source_link\" data-toggle=\"collapse\" data-target=\"#TabularList-show_xys-pytest\" style=\"float:right; padding-right:10px\">[test]</a></h4>\n",
       "\n",
       "> <code>show_xys</code>(**`xs`**, **`ys`**)\n",
       "\n",
       "<div class=\"collapse\" id=\"TabularList-show_xys-pytest\"><div class=\"card card-body pytest_card\"><a type=\"button\" data-toggle=\"collapse\" data-target=\"#TabularList-show_xys-pytest\" class=\"close\" aria-label=\"Close\"><span aria-hidden=\"true\">&times;</span></a><p>No tests found for <code>show_xys</code>. To contribute a test please refer to <a href=\"/dev/test.html\">this guide</a> and <a href=\"https://forums.fast.ai/t/improving-expanding-functional-tests/32929\">this discussion</a>.</p></div></div>\n",
       "\n",
       "Show the `xs` (inputs) and `ys` (targets).  "
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_doc(TabularList.show_xys)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hide_input": true
   },
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<h4 id=\"TabularList.show_xyzs\" class=\"doc_header\"><code>show_xyzs</code><a href=\"https://github.com/fastai/fastai/blob/master/fastai/tabular/data.py#L154\" class=\"source_link\" style=\"float:right\">[source]</a><a class=\"source_link\" data-toggle=\"collapse\" data-target=\"#TabularList-show_xyzs-pytest\" style=\"float:right; padding-right:10px\">[test]</a></h4>\n",
       "\n",
       "> <code>show_xyzs</code>(**`xs`**, **`ys`**, **`zs`**)\n",
       "\n",
       "<div class=\"collapse\" id=\"TabularList-show_xyzs-pytest\"><div class=\"card card-body pytest_card\"><a type=\"button\" data-toggle=\"collapse\" data-target=\"#TabularList-show_xyzs-pytest\" class=\"close\" aria-label=\"Close\"><span aria-hidden=\"true\">&times;</span></a><p>No tests found for <code>show_xyzs</code>. To contribute a test please refer to <a href=\"/dev/test.html\">this guide</a> and <a href=\"https://forums.fast.ai/t/improving-expanding-functional-tests/32929\">this discussion</a>.</p></div></div>\n",
       "\n",
       "Show `xs` (inputs), `ys` (targets) and `zs` (predictions).  "
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_doc(TabularList.show_xyzs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hide_input": true
   },
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<h2 id=\"TabularLine\" class=\"doc_header\"><code>class</code> <code>TabularLine</code><a href=\"https://github.com/fastai/fastai/blob/master/fastai/tabular/data.py#L24\" class=\"source_link\" style=\"float:right\">[source]</a><a class=\"source_link\" data-toggle=\"collapse\" data-target=\"#TabularLine-pytest\" style=\"float:right; padding-right:10px\">[test]</a></h2>\n",
       "\n",
       "> <code>TabularLine</code>(**`cats`**, **`conts`**, **`classes`**, **`names`**) :: [`ItemBase`](/core.html#ItemBase)\n",
       "\n",
       "<div class=\"collapse\" id=\"TabularLine-pytest\"><div class=\"card card-body pytest_card\"><a type=\"button\" data-toggle=\"collapse\" data-target=\"#TabularLine-pytest\" class=\"close\" aria-label=\"Close\"><span aria-hidden=\"true\">&times;</span></a><p>No tests found for <code>TabularLine</code>. To contribute a test please refer to <a href=\"/dev/test.html\">this guide</a> and <a href=\"https://forums.fast.ai/t/improving-expanding-functional-tests/32929\">this discussion</a>.</p></div></div>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_doc(TabularLine, doc_string=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An object that will contain the encoded `cats`, the continuous variables `conts`, the `classes` and the `names` of the columns. This is the basic input for a dataset dealing with tabular data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hide_input": true
   },
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<h2 id=\"TabularProcessor\" class=\"doc_header\"><code>class</code> <code>TabularProcessor</code><a href=\"https://github.com/fastai/fastai/blob/master/fastai/tabular/data.py#L38\" class=\"source_link\" style=\"float:right\">[source]</a><a class=\"source_link\" data-toggle=\"collapse\" data-target=\"#TabularProcessor-pytest\" style=\"float:right; padding-right:10px\">[test]</a></h2>\n",
       "\n",
       "> <code>TabularProcessor</code>(**`ds`**:[`ItemBase`](/core.html#ItemBase)=***`None`***, **`procs`**=***`None`***) :: [`PreProcessor`](/data_block.html#PreProcessor)\n",
       "\n",
       "<div class=\"collapse\" id=\"TabularProcessor-pytest\"><div class=\"card card-body pytest_card\"><a type=\"button\" data-toggle=\"collapse\" data-target=\"#TabularProcessor-pytest\" class=\"close\" aria-label=\"Close\"><span aria-hidden=\"true\">&times;</span></a><p>No tests found for <code>TabularProcessor</code>. To contribute a test please refer to <a href=\"/dev/test.html\">this guide</a> and <a href=\"https://forums.fast.ai/t/improving-expanding-functional-tests/32929\">this discussion</a>.</p></div></div>\n",
       "\n",
       "Regroup the `procs` in one [`PreProcessor`](/data_block.html#PreProcessor).  "
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_doc(TabularProcessor)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create a [`PreProcessor`](/data_block.html#PreProcessor) from `procs`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Undocumented Methods - Methods moved below this line will intentionally be hidden"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<h4 id=\"TabularProcessor.process_one\" class=\"doc_header\"><code>process_one</code><a href=\"https://github.com/fastai/fastai/blob/master/fastai/tabular/data.py#L44\" class=\"source_link\" style=\"float:right\">[source]</a><a class=\"source_link\" data-toggle=\"collapse\" data-target=\"#TabularProcessor-process_one-pytest\" style=\"float:right; padding-right:10px\">[test]</a></h4>\n",
       "\n",
       "> <code>process_one</code>(**`item`**)\n",
       "\n",
       "<div class=\"collapse\" id=\"TabularProcessor-process_one-pytest\"><div class=\"card card-body pytest_card\"><a type=\"button\" data-toggle=\"collapse\" data-target=\"#TabularProcessor-process_one-pytest\" class=\"close\" aria-label=\"Close\"><span aria-hidden=\"true\">&times;</span></a><p>No tests found for <code>process_one</code>. To contribute a test please refer to <a href=\"/dev/test.html\">this guide</a> and <a href=\"https://forums.fast.ai/t/improving-expanding-functional-tests/32929\">this discussion</a>.</p></div></div>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_doc(TabularProcessor.process_one)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<h4 id=\"ItemList.new\" class=\"doc_header\"><code>new</code><a href=\"https://github.com/fastai/fastai/blob/master/fastai/data_block.py#L101\" class=\"source_link\" style=\"float:right\">[source]</a><a class=\"source_link\" data-toggle=\"collapse\" data-target=\"#ItemList-new-pytest\" style=\"float:right; padding-right:10px\">[test]</a></h4>\n",
       "\n",
       "> <code>new</code>(**`items`**:`Iterator`\\[`T_co`\\], **`processor`**:`Union`\\[[`PreProcessor`](/data_block.html#PreProcessor), `Collection`\\[[`PreProcessor`](/data_block.html#PreProcessor)\\]\\]=***`None`***, **\\*\\*`kwargs`**) → `ItemList`\n",
       "\n",
       "<div class=\"collapse\" id=\"ItemList-new-pytest\"><div class=\"card card-body pytest_card\"><a type=\"button\" data-toggle=\"collapse\" data-target=\"#ItemList-new-pytest\" class=\"close\" aria-label=\"Close\"><span aria-hidden=\"true\">&times;</span></a><p>No tests found for <code>new</code>. To contribute a test please refer to <a href=\"/dev/test.html\">this guide</a> and <a href=\"https://forums.fast.ai/t/improving-expanding-functional-tests/32929\">this discussion</a>.</p></div></div>\n",
       "\n",
       "Create a new [`ItemList`](/data_block.html#ItemList) from `items`, keeping the same attributes.  "
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_doc(TabularList.new)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<h4 id=\"TabularList.get\" class=\"doc_header\"><code>get</code><a href=\"https://github.com/fastai/fastai/blob/master/fastai/tabular/data.py#L124\" class=\"source_link\" style=\"float:right\">[source]</a><a class=\"source_link\" data-toggle=\"collapse\" data-target=\"#TabularList-get-pytest\" style=\"float:right; padding-right:10px\">[test]</a></h4>\n",
       "\n",
       "> <code>get</code>(**`o`**)\n",
       "\n",
       "<div class=\"collapse\" id=\"TabularList-get-pytest\"><div class=\"card card-body pytest_card\"><a type=\"button\" data-toggle=\"collapse\" data-target=\"#TabularList-get-pytest\" class=\"close\" aria-label=\"Close\"><span aria-hidden=\"true\">&times;</span></a><p>No tests found for <code>get</code>. To contribute a test please refer to <a href=\"/dev/test.html\">this guide</a> and <a href=\"https://forums.fast.ai/t/improving-expanding-functional-tests/32929\">this discussion</a>.</p></div></div>\n",
       "\n",
       "Subclass if you want to customize how to create item `i` from `self.items`.  "
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_doc(TabularList.get)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<h4 id=\"TabularProcessor.process\" class=\"doc_header\"><code>process</code><a href=\"https://github.com/fastai/fastai/blob/master/fastai/tabular/data.py#L57\" class=\"source_link\" style=\"float:right\">[source]</a><a class=\"source_link\" data-toggle=\"collapse\" data-target=\"#TabularProcessor-process-pytest\" style=\"float:right; padding-right:10px\">[test]</a></h4>\n",
       "\n",
       "> <code>process</code>(**`ds`**)\n",
       "\n",
       "<div class=\"collapse\" id=\"TabularProcessor-process-pytest\"><div class=\"card card-body pytest_card\"><a type=\"button\" data-toggle=\"collapse\" data-target=\"#TabularProcessor-process-pytest\" class=\"close\" aria-label=\"Close\"><span aria-hidden=\"true\">&times;</span></a><p>No tests found for <code>process</code>. To contribute a test please refer to <a href=\"/dev/test.html\">this guide</a> and <a href=\"https://forums.fast.ai/t/improving-expanding-functional-tests/32929\">this discussion</a>.</p></div></div>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_doc(TabularProcessor.process)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<h4 id=\"TabularList.reconstruct\" class=\"doc_header\"><code>reconstruct</code><a href=\"https://github.com/fastai/fastai/blob/master/fastai/tabular/data.py#L134\" class=\"source_link\" style=\"float:right\">[source]</a><a class=\"source_link\" data-toggle=\"collapse\" data-target=\"#TabularList-reconstruct-pytest\" style=\"float:right; padding-right:10px\">[test]</a></h4>\n",
       "\n",
       "> <code>reconstruct</code>(**`t`**:`Tensor`)\n",
       "\n",
       "<div class=\"collapse\" id=\"TabularList-reconstruct-pytest\"><div class=\"card card-body pytest_card\"><a type=\"button\" data-toggle=\"collapse\" data-target=\"#TabularList-reconstruct-pytest\" class=\"close\" aria-label=\"Close\"><span aria-hidden=\"true\">&times;</span></a><p>No tests found for <code>reconstruct</code>. To contribute a test please refer to <a href=\"/dev/test.html\">this guide</a> and <a href=\"https://forums.fast.ai/t/improving-expanding-functional-tests/32929\">this discussion</a>.</p></div></div>\n",
       "\n",
       "Reconstruct one of the underlying item for its data `t`.  "
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_doc(TabularList.reconstruct)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## New Methods - Please document or move to the undocumented section"
   ]
  }
 ],
 "metadata": {
  "jekyll": {
   "keywords": "fastai",
   "summary": "Base class to deal with tabular data and get a DataBunch",
   "title": "tabular.data"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}