{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Tabular data handling" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This module defines the main class to handle tabular data in the fastai library: [`TabularDataBunch`](/tabular.data.html#TabularDataBunch). As always, there is also a helper function to quickly get your data.\n", "\n", "To allow you to easily create a [`Learner`](/basic_train.html#Learner) for your data, it provides [`tabular_learner`](/tabular.data.html#tabular_learner)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [], "source": [ "from fastai.gen_doc.nbdoc import *\n", "from fastai.tabular import * \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "
class TabularDataBunch(**`train_dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), **`valid_dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), **`fix_dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader)=***`None`***, **`test_dl`**:`Optional`\\[[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader)\\]=***`None`***, **`device`**:[`device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device)=***`None`***, **`dl_tfms`**:`Optional`\\[`Collection`\\[`Callable`\\]\\]=***`None`***, **`path`**:`PathOrStr`=***`'.'`***, **`collate_fn`**:`Callable`=***`'data_collate'`***, **`no_check`**:`bool`=***`False`***) :: [`DataBunch`](/basic_data.html#DataBunch)\n",
"\n",
"
|  | age | workclass | fnlwgt | education | education-num | marital-status | occupation | relationship | race | sex | capital-gain | capital-loss | hours-per-week | native-country | salary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 49 | Private | 101320 | Assoc-acdm | 12.0 | Married-civ-spouse | NaN | Wife | White | Female | 0 | 1902 | 40 | United-States | >=50k |
| 1 | 44 | Private | 236746 | Masters | 14.0 | Divorced | Exec-managerial | Not-in-family | White | Male | 10520 | 0 | 45 | United-States | >=50k |
| 2 | 38 | Private | 96185 | HS-grad | NaN | Divorced | NaN | Unmarried | Black | Female | 0 | 0 | 32 | United-States | <50k |
| 3 | 38 | Self-emp-inc | 112847 | Prof-school | 15.0 | Married-civ-spouse | Prof-specialty | Husband | Asian-Pac-Islander | Male | 0 | 0 | 40 | United-States | >=50k |
| 4 | 42 | Self-emp-not-inc | 82297 | 7th-8th | NaN | Married-civ-spouse | Other-service | Wife | Black | Female | 0 | 0 | 50 | United-States | <50k |
from_df(**`path`**, **`df`**:`DataFrame`, **`dep_var`**:`str`, **`valid_idx`**:`Collection`\\[`int`\\], **`procs`**:`Optional`\\[`Collection`\\[[`TabularProc`](/tabular.transform.html#TabularProc)\\]\\]=***`None`***, **`cat_names`**:`OptStrList`=***`None`***, **`cont_names`**:`OptStrList`=***`None`***, **`classes`**:`Collection`\\[`T_co`\\]=***`None`***, **`test_df`**=***`None`***, **`bs`**:`int`=***`64`***, **`val_bs`**:`int`=***`None`***, **`num_workers`**:`int`=***`4`***, **`dl_tfms`**:`Optional`\\[`Collection`\\[`Callable`\\]\\]=***`None`***, **`device`**:[`device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device)=***`None`***, **`collate_fn`**:`Callable`=***`'data_collate'`***, **`no_check`**:`bool`=***`False`***) → [`DataBunch`](/basic_data.html#DataBunch)\n",
"\n",
"
tabular_learner(**`data`**:[`DataBunch`](/basic_data.html#DataBunch), **`layers`**:`Collection`\\[`int`\\], **`emb_szs`**:`Dict`\\[`str`, `int`\\]=***`None`***, **`metrics`**=***`None`***, **`ps`**:`Collection`\\[`float`\\]=***`None`***, **`emb_drop`**:`float`=***`0.0`***, **`y_range`**:`OptRange`=***`None`***, **`use_bn`**:`bool`=***`True`***, **\\*\\*`learn_kwargs`**)\n",
"\n",
"
class TabularList(**`items`**:`Iterator`\\[`T_co`\\], **`cat_names`**:`OptStrList`=***`None`***, **`cont_names`**:`OptStrList`=***`None`***, **`procs`**=***`None`***, **\\*\\*`kwargs`**) → `TabularList` :: [`ItemList`](/data_block.html#ItemList)\n",
"\n",
"
from_df(**`df`**:`DataFrame`, **`cat_names`**:`OptStrList`=***`None`***, **`cont_names`**:`OptStrList`=***`None`***, **`procs`**=***`None`***, **\\*\\*`kwargs`**) → `ItemList`\n",
"\n",
"
get_emb_szs(**`sz_dict`**=***`None`***)\n",
"\n",
"
show_xys(**`xs`**, **`ys`**)\n",
"\n",
"
show_xyzs(**`xs`**, **`ys`**, **`zs`**)\n",
"\n",
"
class TabularLine(**`cats`**, **`conts`**, **`classes`**, **`names`**) :: [`ItemBase`](/core.html#ItemBase)\n",
"\n",
"
class TabularProcessor(**`ds`**:[`ItemBase`](/core.html#ItemBase)=***`None`***, **`procs`**=***`None`***) :: [`PreProcessor`](/data_block.html#PreProcessor)\n",
"\n",
"
process_one(**`item`**)\n",
"\n",
"
new(**`items`**:`Iterator`\\[`T_co`\\], **`processor`**:`Union`\\[[`PreProcessor`](/data_block.html#PreProcessor), `Collection`\\[[`PreProcessor`](/data_block.html#PreProcessor)\\]\\]=***`None`***, **\\*\\*`kwargs`**) → `ItemList`\n",
"\n",
"
get(**`o`**)\n",
"\n",
"
process(**`ds`**)\n",
"\n",
"
reconstruct(**`t`**:`Tensor`)\n",
"\n",
"