{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "___\n", "\n", " \n", "___\n", "# Series" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first main data type we will learn about in pandas is the Series. Let's import pandas and explore the Series object.\n", "\n", "A Series is very similar to a NumPy array (in fact it is built on top of the NumPy array object). What differentiates a Series from a NumPy array is that a Series can have axis labels, meaning it can be indexed by a label instead of just by a numeric position. It also doesn't need to hold numeric data; it can hold any arbitrary Python object.\n", "\n", "Let's explore this concept through some examples:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating a Series\n", "\n", "You can convert a list, NumPy array, or dictionary to a Series:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "labels = ['a','b','c']\n", "my_list = [10,20,30]\n", "arr = np.array([10,20,30])\n", "d = {'a':10,'b':20,'c':30}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Using Lists**" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 10\n", "1 20\n", "2 30\n", "dtype: int64" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.Series(data=my_list)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "a 10\n", "b 20\n", "c 30\n", "dtype: int64" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.Series(data=my_list,index=labels)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { 
"text/plain": [ "a 10\n", "b 20\n", "c 30\n", "dtype: int64" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.Series(my_list,labels)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**NumPy Arrays**" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 10\n", "1 20\n", "2 30\n", "dtype: int64" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.Series(arr)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "a 10\n", "b 20\n", "c 30\n", "dtype: int64" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.Series(arr,labels)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Dictionary**" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "a 10\n", "b 20\n", "c 30\n", "dtype: int64" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.Series(d)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data in a Series\n", "\n", "A pandas Series can hold a variety of object types:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 a\n", "1 b\n", "2 c\n", "dtype: object" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.Series(data=labels)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 <built-in function sum>\n", "1 <built-in function print>\n", "2 <built-in function len>\n", "dtype: object" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Even functions (although you are unlikely to need this)\n", "pd.Series([sum,print,len])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using an Index\n", "\n", "The key to using a Series is 
understanding its index. Pandas uses these index names or numbers to allow fast lookups of information (the index works like a hash table or dictionary).\n", "\n", "Let's see some examples of how to grab information from a Series. Let us create two Series, ser1 and ser2:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "ser1 = pd.Series([1,2,3,4],index=['USA', 'Germany','USSR', 'Japan'])" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "USA 1\n", "Germany 2\n", "USSR 3\n", "Japan 4\n", "dtype: int64" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ser1" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": true }, "outputs": [], "source": [ "ser2 = pd.Series([1,2,5,4],index=['USA', 'Germany','Italy', 'Japan'])" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "USA 1\n", "Germany 2\n", "Italy 5\n", "Japan 4\n", "dtype: int64" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ser2" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ser1['USA']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Operations between Series are also aligned on the index. A label that appears in only one of the two Series produces NaN, and the result is converted to float to accommodate the missing values:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Germany 4.0\n", "Italy NaN\n", "Japan 8.0\n", "USA 2.0\n", "USSR NaN\n", "dtype: float64" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ser1 + ser2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's stop here for now and move on to DataFrames, which will expand on the concept of Series!\n", "# Great Job!" 
] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 1 }