{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Creating a Series\n", "There are a couple of ways to create a new [`Series`](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.html) from scratch.\n", "\n", "Let's explore." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Just like how NumPy is almost always abbreviated as np...\n", "import numpy as np\n", "# pandas is usually shortened to pd\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating from a dictionary\n", "\n", "Let's use this sample data. In our example, `test_balance_data` is a standard Python dictionary: each key is a username, and each value is that user's current account balance." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "test_balance_data = {\n", "    'pasan': 20.00,\n", "    'treasure': 20.18,\n", "    'ashley': 1.05,\n", "    'craig': 42.42,\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `Series` constructor accepts any dict-like object." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "balances = pd.Series(test_balance_data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that the labels have been set from `test_balance_data.keys()` and the values from `test_balance_data.values()`." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pasan 20.00\n", "treasure 20.18\n", "ashley 1.05\n", "craig 42.42\n", "dtype: float64" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "balances" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating from an iterable\n", "\n", "You can pass an iterable, such as a list, as the first argument." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ 
"unlabeled_balances = pd.Series([20.00, 20.18, 1.05, 42.42])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*NOTE*: When labels are not present they're defaulted to incremental integers starting at 0" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 20.00\n", "1 20.18\n", "2 1.05\n", "3 42.42\n", "dtype: float64" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "unlabeled_balances" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also provide the `index` argument which requires an iterable the same size as your data." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "labeled_balances = pd.Series(\n", " [20.00, 20.18, 1.05, 42.42],\n", " index=['pasan', 'treasure', 'ashley', 'craig']\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note, the order of the labels is guaranteed. " ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pasan 20.00\n", "treasure 20.18\n", "ashley 1.05\n", "craig 42.42\n", "dtype: float64" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "labeled_balances" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One thing to remember is that a NumPy array is also iterable. In fact, you'll find NumPy and Pandas get along really well together." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 20.00\n", "1 20.18\n", "2 1.05\n", "3 42.42\n", "dtype: float64" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ndbalances = np.array([20.00, 20.18, 1.05, 42.42])\n", "pd.Series(ndbalances)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating from a scalar and an index\n", "\n", "If you pass in a scalar, that value will be broadcast to every label specified in the `index` argument." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "guil 20.0\n", "jay 20.0\n", "james 20.0\n", "ben 20.0\n", "nick 20.0\n", "dtype: float64" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.Series(20.00, index=[\"guil\", \"jay\", \"james\", \"ben\", \"nick\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Learn More\n", "* [Introduction to Data Structures - Series (pandas documentation)](https://pandas.pydata.org/pandas-docs/stable/dsintro.html#series)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.0" } }, "nbformat": 4, "nbformat_minor": 2 }