{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Creating a [`DataFrame`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html#pandas.DataFrame)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There are several ways to create a [`DataFrame`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html) from existing objects.\n",
"\n",
"Let's explore!"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# Setup\n",
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## From a 2-dimensional object"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If your data is already arranged in rows and columns, like a list of lists, you can pass it straight to the constructor. Row labels and column headings are generated automatically as integer ranges starting at 0."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" 0 1 2\n",
"0 Craig Dennis 42.42\n",
"1 Treasure Porth 25.00"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"test_users_list = [\n",
" ['Craig', 'Dennis', 42.42],\n",
" ['Treasure', 'Porth', 25.00]\n",
"]\n",
"\n",
"pd.DataFrame(test_users_list)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Notice how both the row labels and the column headings are autogenerated. You can set them yourself with the `index` and `columns` keyword arguments."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" first_name last_name balance\n",
"craigsdennis Craig Dennis 42.42\n",
"treasure Treasure Porth 25.00"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pd.DataFrame(test_users_list, index=['craigsdennis', 'treasure'],\n",
" columns=['first_name', 'last_name', 'balance'])"
]
},
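{
"cell_type": "markdown",
"metadata": {},
"source": [
"The constructor is not limited to lists of lists. As a minimal sketch, assuming NumPy is installed, the same call works with a 2-dimensional array. The `grid` array and the column names `a`, `b`, and `c` are made up for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch: `grid` is a hypothetical 2 x 3 array, not data from this notebook\n",
"import numpy as np\n",
"\n",
"grid = np.array([\n",
"    [1, 2, 3],\n",
"    [4, 5, 6]\n",
"])\n",
"\n",
"# Each inner row becomes a DataFrame row; `pd` comes from the Setup cell\n",
"pd.DataFrame(grid, columns=['a', 'b', 'c'])"
]
},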
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## From a dictionary"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The constructor also accepts a dictionary that maps each column name to a list of values. Much like a `Series`, if you do not specify the index, it is autogenerated as an integer range."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" first_name last_name balance\n",
"0 Craig Dennis 42.42\n",
"1 Treasure Porth 25.00"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# By default, the dictionary is expected to map each column name to an ordered list of values\n",
"test_user_data = {\n",
" 'first_name': ['Craig', 'Treasure'],\n",
" 'last_name': ['Dennis', 'Porth'],\n",
" 'balance': [42.42, 25.00]\n",
"}\n",
"\n",
"pd.DataFrame(test_user_data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As before, you can set the row labels by supplying the `index` keyword argument."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" first_name last_name balance\n",
"craigsdennis Craig Dennis 42.42\n",
"treasure Treasure Porth 25.00"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pd.DataFrame(test_user_data, index=['craigsdennis', 'treasure'])"
]
},
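{
"cell_type": "markdown",
"metadata": {},
"source": [
"Data often arrives the other way around, as one dictionary per row. As a hedged sketch, the constructor also accepts a list of such dictionaries and uses their keys as the column names. The `test_user_records` name is hypothetical and simply restates the users from above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch: one dictionary per row, keys become column names\n",
"# Relies on `pd` from the Setup cell\n",
"test_user_records = [\n",
"    {'first_name': 'Craig', 'last_name': 'Dennis', 'balance': 42.42},\n",
"    {'first_name': 'Treasure', 'last_name': 'Porth', 'balance': 25.00}\n",
"]\n",
"\n",
"pd.DataFrame(test_user_records, index=['craigsdennis', 'treasure'])"
]
},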
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### [`DataFrame.from_dict`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.from_dict.html#pandas.DataFrame.from_dict) adds more options"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### The `orient` keyword\n",
"The `orient` keyword lets you specify whether the keys of your dictionary become the row labels (`index`) or the column titles (`columns`). In the example below, the keys of the nested dictionaries define the columns. You could also pass a list to the `columns` keyword argument when the values are lists rather than dictionaries."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" first_name last_name balance\n",
"craigsdennis Craig Dennis 42.42\n",
"treasure Treasure Porth 25.00"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"by_username = {\n",
" 'craigsdennis': {\n",
" 'first_name': 'Craig',\n",
" 'last_name': 'Dennis',\n",
" 'balance': 42.42\n",
" },\n",
" 'treasure': {\n",
" 'first_name': 'Treasure',\n",
" 'last_name': 'Porth',\n",
" 'balance': 25.00\n",
" }\n",
"}\n",
"\n",
"pd.DataFrame.from_dict(by_username, orient='index')"
]
}
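,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When each value is a plain list instead of a nested dictionary, there are no inner keys to name the columns. A minimal sketch of that case, assuming a pandas version in which `from_dict` accepts `columns` together with `orient='index'`. The `by_username_lists` name is hypothetical."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch: keys are row labels, values are rows without column names\n",
"# Relies on `pd` from the Setup cell\n",
"by_username_lists = {\n",
"    'craigsdennis': ['Craig', 'Dennis', 42.42],\n",
"    'treasure': ['Treasure', 'Porth', 25.00]\n",
"}\n",
"\n",
"# `columns` supplies the headings that the nested dictionaries provided before\n",
"pd.DataFrame.from_dict(by_username_lists, orient='index',\n",
"                       columns=['first_name', 'last_name', 'balance'])"
]
}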
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}