{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Tabular data handling" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This module defines the main class to handle tabular data in the fastai library: [`TabularDataBunch`](/tabular.data.html#TabularDataBunch). As always, there is also a helper function to quickly get your data.\n", "\n", "To allow you to easily create a [`Learner`](/basic_train.html#Learner) for your data, it provides [`tabular_learner`](/tabular.learner.html#tabular_learner)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [], "source": [ "from fastai.gen_doc.nbdoc import *\n", "from fastai.tabular import * \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "
class
TabularDataBunch
TabularDataBunch
(**`train_dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), **`valid_dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), **`fix_dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader)=***`None`***, **`test_dl`**:`Optional`\\[[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader)\\]=***`None`***, **`device`**:[`device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device)=***`None`***, **`dl_tfms`**:`Optional`\\[`Collection`\\[`Callable`\\]\\]=***`None`***, **`path`**:`PathOrStr`=***`'.'`***, **`collate_fn`**:`Callable`=***`'data_collate'`***, **`no_check`**:`bool`=***`False`***) :: [`DataBunch`](/basic_data.html#DataBunch)\n",
"\n",
"No tests found for TabularDataBunch
. To contribute a test please refer to this guide and this discussion.
\n", "
| | age | workclass | fnlwgt | education | education-num | marital-status | occupation | relationship | race | sex | capital-gain | capital-loss | hours-per-week | native-country | salary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 49 | Private | 101320 | Assoc-acdm | 12.0 | Married-civ-spouse | NaN | Wife | White | Female | 0 | 1902 | 40 | United-States | >=50k |
| 1 | 44 | Private | 236746 | Masters | 14.0 | Divorced | Exec-managerial | Not-in-family | White | Male | 10520 | 0 | 45 | United-States | >=50k |
| 2 | 38 | Private | 96185 | HS-grad | NaN | Divorced | NaN | Unmarried | Black | Female | 0 | 0 | 32 | United-States | <50k |
| 3 | 38 | Self-emp-inc | 112847 | Prof-school | 15.0 | Married-civ-spouse | Prof-specialty | Husband | Asian-Pac-Islander | Male | 0 | 0 | 40 | United-States | >=50k |
| 4 | 42 | Self-emp-not-inc | 82297 | 7th-8th | NaN | Married-civ-spouse | Other-service | Wife | Black | Female | 0 | 0 | 50 | United-States | <50k |
\n", "
from_df
from_df
(**`path`**, **`df`**:`DataFrame`, **`dep_var`**:`str`, **`valid_idx`**:`Collection`\\[`int`\\], **`procs`**:`Optional`\\[`Collection`\\[[`TabularProc`](/tabular.transform.html#TabularProc)\\]\\]=***`None`***, **`cat_names`**:`OptStrList`=***`None`***, **`cont_names`**:`OptStrList`=***`None`***, **`classes`**:`Collection`\\[`T_co`\\]=***`None`***, **`test_df`**=***`None`***, **`bs`**:`int`=***`64`***, **`val_bs`**:`int`=***`None`***, **`num_workers`**:`int`=***`16`***, **`dl_tfms`**:`Optional`\\[`Collection`\\[`Callable`\\]\\]=***`None`***, **`device`**:[`device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device)=***`None`***, **`collate_fn`**:`Callable`=***`'data_collate'`***, **`no_check`**:`bool`=***`False`***) → [`DataBunch`](/basic_data.html#DataBunch)\n",
"\n",
"tabular_learner
tabular_learner
(**`data`**:[`DataBunch`](/basic_data.html#DataBunch), **`layers`**:`Collection`\\[`int`\\], **`emb_szs`**:`Dict`\\[`str`, `int`\\]=***`None`***, **`metrics`**=***`None`***, **`ps`**:`Collection`\\[`float`\\]=***`None`***, **`emb_drop`**:`float`=***`0.0`***, **`y_range`**:`OptRange`=***`None`***, **`use_bn`**:`bool`=***`True`***, **\\*\\*`learn_kwargs`**)\n",
"\n",
"No tests found for tabular_learner
. To contribute a test please refer to this guide and this discussion.
class
TabularList
TabularList
(**`items`**:`Iterator`\\[`T_co`\\], **`cat_names`**:`OptStrList`=***`None`***, **`cont_names`**:`OptStrList`=***`None`***, **`procs`**=***`None`***, **\\*\\*`kwargs`**) → `TabularList` :: [`ItemList`](/data_block.html#ItemList)\n",
"\n",
"from_df
from_df
(**`df`**:`DataFrame`, **`cat_names`**:`OptStrList`=***`None`***, **`cont_names`**:`OptStrList`=***`None`***, **`procs`**=***`None`***, **\\*\\*`kwargs`**) → `ItemList`\n",
"\n",
"get_emb_szs
get_emb_szs
(**`sz_dict`**=***`None`***)\n",
"\n",
"No tests found for get_emb_szs
. To contribute a test please refer to this guide and this discussion.
show_xys
show_xys
(**`xs`**, **`ys`**)\n",
"\n",
"No tests found for show_xys
. To contribute a test please refer to this guide and this discussion.
show_xyzs
show_xyzs
(**`xs`**, **`ys`**, **`zs`**)\n",
"\n",
"No tests found for show_xyzs
. To contribute a test please refer to this guide and this discussion.
class
TabularLine
TabularLine
(**`cats`**, **`conts`**, **`classes`**, **`names`**) :: [`ItemBase`](/core.html#ItemBase)\n",
"\n",
"No tests found for TabularLine
. To contribute a test please refer to this guide and this discussion.
class
TabularProcessor
TabularProcessor
(**`ds`**:[`ItemBase`](/core.html#ItemBase)=***`None`***, **`procs`**=***`None`***) :: [`PreProcessor`](/data_block.html#PreProcessor)\n",
"\n",
"No tests found for TabularProcessor
. To contribute a test please refer to this guide and this discussion.
process_one
process_one
(**`item`**)\n",
"\n",
"No tests found for process_one
. To contribute a test please refer to this guide and this discussion.
new
new
(**`items`**:`Iterator`\\[`T_co`\\], **`processor`**:`Union`\\[[`PreProcessor`](/data_block.html#PreProcessor), `Collection`\\[[`PreProcessor`](/data_block.html#PreProcessor)\\]\\]=***`None`***, **\\*\\*`kwargs`**) → `ItemList`\n",
"\n",
"No tests found for new
. To contribute a test please refer to this guide and this discussion.
get
get
(**`o`**)\n",
"\n",
"No tests found for get
. To contribute a test please refer to this guide and this discussion.
process
process
(**`ds`**)\n",
"\n",
"No tests found for process
. To contribute a test please refer to this guide and this discussion.
reconstruct
reconstruct
(**`t`**:`Tensor`)\n",
"\n",
"No tests found for reconstruct
. To contribute a test please refer to this guide and this discussion.