\n",
"\n",
"From [Rob Pike](http://doc.cat-v.org/bell_labs/pikestylehttp://en.wikipedia.org/wiki/Rob_Pike)'s\n",
"[Notes on Programming in C](http://doc.cat-v.org/bell_labs/pikestyle):\n",
"\n",
"> Rule 5. Data dominates. If you've chosen the right data structures and\n",
" organized things well, the algorithms will almost always be\n",
" self-evident. Data structures, not algorithms, are central to programming.\n",
"\n",
"Pandas is built on a hierarchy of a few powerful data structures. Each of these\n",
"structures is composed of, and designed to interoperate with, the simpler\n",
"structures.\n",
"\n",
"- `Index` (1-Dimensional immutable ordered hash table)\n",
"- `Series` (1-Dimensional Labelled Array)\n",
"- `DataFrame` (2-Dimensional Labelled Array)\n",
"- `Panel` (3-Dimensional Labelled Array)"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Tell IPython to display mapltplotlib plots inline.\n",
"%matplotlib inline\n",
"\n",
"# Set default font attributes.\n",
"import matplotlib\n",
"font = {'family' : 'normal',\n",
" 'weight' : 'bold',\n",
" 'size' : 13}\n",
"matplotlib.rc('font', **font)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 1
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"randn = np.random.randn\n",
"\n",
"pd.set_option('display.mpl_style', 'default')\n",
"pd.set_option('display.max_rows', 15)\n",
"\n",
"# Make a default figure size for later use.\n",
"DEFAULT_FIGSIZE = (12, 6)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 2
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# `Series`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Basics"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"s = pd.Series([3,5,7,2])\n",
"s"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 3,
"text": [
"0 3\n",
"1 5\n",
"2 7\n",
"3 2\n",
"dtype: int64"
]
}
],
"prompt_number": 3
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# An important concept to understand when working with a `Series` is that it's\n",
"# actually composed of two pieces: an index array, and a data array.\n",
"\n",
"print \"The index is {0}.\".format(s.index)\n",
"print \"The values are {0}.\".format(s.values)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"The index is Int64Index([0, 1, 2, 3], dtype='int64').\n",
"The values are [3 5 7 2].\n"
]
}
],
"prompt_number": 4
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# You can explicitly pass your own labels to use as an index. If you don't\n",
"# Pandas will construct a default index with integer labels.\n",
"pd.Series(np.random.randn(4), index=['a', 'b', 'c', 'd'])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 5,
"text": [
"a 1.185723\n",
"b -0.180358\n",
"c 0.762084\n",
"d 1.277645\n",
"dtype: float64"
]
}
],
"prompt_number": 5
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# You can also construct a Series from a dictionary.\n",
"# The keys are used as the index, and the values are used as the Series' values\n",
"pd.Series(\n",
" {\n",
" 'a': 1, \n",
" 'b': 2,\n",
" 'c': 3,\n",
" }\n",
")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 6,
"text": [
"a 1\n",
"b 2\n",
"c 3\n",
"dtype: int64"
]
}
],
"prompt_number": 6
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# You get performance (and code clarity!) benefits if your Series'\n",
"# labels/values are homogenously-typed, but mixed-type arrays are supported.\n",
"pd.Series(\n",
" [1, 2.6, 'a', {'a': 'b'}], \n",
" index=[1, 'a', 2, 2.5],\n",
")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 7,
"text": [
"1 1\n",
"a 2.6\n",
"2 a\n",
"2.5 {u'a': u'b'}\n",
"dtype: object"
]
}
],
"prompt_number": 7
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Slicing `Series` with `__getitem__` (aka `[]`)\n",
"Pandas objects support a wide range of selection and filtering methods. An important idea to keep in mind is the following:\n",
"\n",
"**If you have an N-dimensional object**: \n",
"- Indexing with a scalar returns a value of dimension **N-1**.\n",
"- Indexing with a slice filters the object, but maintains the original dimension."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"s = pd.Series(range(10), index=list('ABCDEFGHIJ'))\n",
"s\n"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 8,
"text": [
"A 0\n",
"B 1\n",
"C 2\n",
"D 3\n",
"E 4\n",
"F 5\n",
"G 6\n",
"H 7\n",
"I 8\n",
"J 9\n",
"dtype: int64"
]
}
],
"prompt_number": 8
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Lookups by key work as you'd expect.\n",
"s['E']"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 9,
"text": [
"4"
]
}
],
"prompt_number": 9
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# We can look up multiple values at a time by passing a list of keys.\n",
"# The resulting value is a new `Series`.\n",
"s[['E', 'I', 'B']]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 10,
"text": [
"E 4\n",
"I 8\n",
"B 1\n",
"dtype: int64"
]
}
],
"prompt_number": 10
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Because the Index is ordered, we can use Python's slicing syntax.\n",
"s['E':]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 11,
"text": [
"E 4\n",
"F 5\n",
"G 6\n",
"H 7\n",
"I 8\n",
"J 9\n",
"dtype: int64"
]
}
],
"prompt_number": 11
},
{
"cell_type": "code",
"collapsed": false,
"input": [
" # Label-based slicing is inclusive of both endpoints.\n",
"s[:'I']"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 12,
"text": [
"A 0\n",
"B 1\n",
"C 2\n",
"D 3\n",
"E 4\n",
"F 5\n",
"G 6\n",
"H 7\n",
"I 8\n",
"dtype: int64"
]
}
],
"prompt_number": 12
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"s['E':'I']"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 13,
"text": [
"E 4\n",
"F 5\n",
"G 6\n",
"H 7\n",
"I 8\n",
"dtype: int64"
]
}
],
"prompt_number": 13
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Step arguments work just like Python lists.\n",
"s['E':'I':2]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 14,
"text": [
"E 4\n",
"G 6\n",
"I 8\n",
"dtype: int64"
]
}
],
"prompt_number": 14
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# If you don't know the label you want, but you do know the position, you can\n",
"# use `iloc`.\n",
"print \"The first entry is: %d\" % s.iloc[0]\n",
"print \"The last entry is: %d\" % s.iloc[-1]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"The first entry is: 0\n",
"The last entry is: 9\n"
]
}
],
"prompt_number": 15
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Slicing works with `iloc` as well.\n",
"\n",
"# Note that, unlike with label-based slicing, integer-based slices are\n",
"# right-open intervals, i.e. doing s.iloc[X:Y] gives you elements with indices\n",
"# in [X, Y). This is the same as the semantics for list slicing.\n",
"s.iloc[5:]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 16,
"text": [
"F 5\n",
"G 6\n",
"H 7\n",
"I 8\n",
"J 9\n",
"dtype: int64"
]
}
],
"prompt_number": 16
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print s.iloc[:5]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"A 0\n",
"B 1\n",
"C 2\n",
"D 3\n",
"E 4\n",
"dtype: int64\n"
]
}
],
"prompt_number": 17
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"s.iloc[-3:]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 18,
"text": [
"H 7\n",
"I 8\n",
"J 9\n",
"dtype: int64"
]
}
],
"prompt_number": 18
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Numerical Operations"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Create two Series objects containing 100 samples each of sine and cosine.\n",
"sine = pd.Series(np.sin(np.linspace(0, 3.14 * 2, 100)), name='sine')\n",
"cosine = pd.Series(np.cos(np.linspace(0, 3.14 * 2, 100)), name='cosine')"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 19
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"sine"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 20,
"text": [
"0 0.000000\n",
"1 0.063392\n",
"2 0.126529\n",
"3 0.189156\n",
"4 0.251023\n",
"...\n",
"94 -0.314905\n",
"95 -0.254105\n",
"96 -0.192283\n",
"97 -0.129688\n",
"98 -0.066570\n",
"99 -0.003185\n",
"Name: sine, Length: 100, dtype: float64"
]
}
],
"prompt_number": 20
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"cosine"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 21,
"text": [
"0 1.000000\n",
"1 0.997989\n",
"2 0.991963\n",
"3 0.981947\n",
"4 0.967981\n",
"...\n",
"94 0.949123\n",
"95 0.967177\n",
"96 0.981339\n",
"97 0.991555\n",
"98 0.997782\n",
"99 0.999995\n",
"Name: cosine, Length: 100, dtype: float64"
]
}
],
"prompt_number": 21
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Multiplying two Series objects produces a new Series by multiplying values that have the same keys.\n",
"product = cosine * sine\n",
"product"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 22,
"text": [
"0 0.000000\n",
"1 0.063264\n",
"2 0.125512\n",
"3 0.185742\n",
"4 0.242986\n",
"...\n",
"94 -0.298884\n",
"95 -0.245765\n",
"96 -0.188695\n",
"97 -0.128592\n",
"98 -0.066423\n",
"99 -0.003185\n",
"Length: 100, dtype: float64"
]
}
],
"prompt_number": 22
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Adding or multiplying a Series by a scalar applies that operation to each value in the Series.\n",
"cosine_plus_one = cosine + 1\n",
"cosine_plus_one"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 23,
"text": [
"0 2.000000\n",
"1 1.997989\n",
"2 1.991963\n",
"3 1.981947\n",
"4 1.967981\n",
"...\n",
"94 1.949123\n",
"95 1.967177\n",
"96 1.981339\n",
"97 1.991555\n",
"98 1.997782\n",
"99 1.999995\n",
"Name: cosine, Length: 100, dtype: float64"
]
}
],
"prompt_number": 23
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Other binary operators work as you'd expect. \n",
"\n",
"# Note how much cleaner and clearer this is\n",
"# compared to looping over two containers and \n",
"# performing multiple operations on elements \n",
"# from each.\n",
"identity = (sine ** 2) + (cosine ** 2)\n",
"identity"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 24,
"text": [
"0 1\n",
"1 1\n",
"2 1\n",
"3 1\n",
"4 1\n",
"...\n",
"94 1\n",
"95 1\n",
"96 1\n",
"97 1\n",
"98 1\n",
"99 1\n",
"Length: 100, dtype: float64"
]
}
],
"prompt_number": 24
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"####All of the `pandas` data structures have `plot` methods that provide a user-friendly interface to `matplotlib`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Plot our sines values.\n",
"trigplot = sine.plot(\n",
" ylim=(-1.2, 1.2),\n",
" legend=True,\n",
" figsize=DEFAULT_FIGSIZE,\n",
" linewidth=3,\n",
" label='sine',\n",
")\n",
"# Add our other Series' to the same plot.\n",
"cosine.plot(ax=trigplot, legend=True, linewidth=3)\n",
"product.plot(ax=trigplot, legend=True, linewidth=3, label='product')\n",
"identity.plot(ax=trigplot, legend=True, linewidth=3, label='identity')"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 25,
"text": [
| | s1 | s2 | s1 > s2 | s1 == s2 | s1 < s2 | s1 == 3 |
|---|---|---|---|---|---|---|
| A | 1 | 2 | False | False | True | False |
| B | 2 | 2 | False | True | False | False |
| C | 3 | 2 | True | False | False | True |
| D | 4 | 2 | True | False | False | False |
| E | 3 | 2 | True | False | False | True |
| F | 2 | 2 | False | True | False | False |
| G | 1 | 2 | False | False | True | False |

| Date | Open | High | Low | Close | Volume | Adj_Ratio |
|---|---|---|---|---|---|---|
| 2011-01-03 | 117.923577 | 118.751861 | 116.983613 | 118.24 | 138725200 | 0.930657 |
| 2011-01-04 | 118.495717 | 118.532941 | 117.434811 | 118.17 | 137409700 | 0.930619 |
| 2011-01-05 | 117.803496 | 118.864453 | 117.691816 | 118.79 | 133975300 | 0.930664 |
| 2011-01-06 | 118.839206 | 118.969502 | 118.206340 | 118.56 | 122519000 | 0.930685 |
| 2011-01-07 | 118.710864 | 118.906295 | 117.398679 | 118.32 | 156034600 | 0.930628 |
| 2011-01-10 | 117.797752 | 118.337511 | 117.444117 | 118.17 | 122401700 | 0.930619 |
| 2011-01-11 | 118.599306 | 118.878495 | 118.143298 | 118.59 | 110287000 | 0.930629 |
| ... | ... | ... | ... | ... | ... | ... |
| 2013-12-20 | 179.037954 | 180.326069 | 178.919052 | 179.90 | 197087000 | 0.990857 |
| 2013-12-23 | 180.780732 | 180.968994 | 180.404209 | 180.86 | 85598000 | 0.990851 |
| 2013-12-24 | 180.873560 | 181.339270 | 180.863652 | 181.26 | 45368800 | 0.990871 |
| 2013-12-26 | 181.664751 | 182.279086 | 181.644934 | 182.18 | 63365000 | 0.990863 |
| 2013-12-27 | 182.417716 | 182.496984 | 181.981736 | 182.17 | 61814000 | 0.990862 |
| 2013-12-30 | 182.189543 | 182.338172 | 181.902193 | 182.14 | 56857000 | 0.990861 |
| 2013-12-31 | 182.385673 | 183.000000 | 182.246954 | 183.00 | 86119900 | 0.990850 |

754 rows × 6 columns

| Date | Close | Volume |
|---|---|---|
| 2011-01-03 | 118.24 | 138725200 |
| 2011-01-04 | 118.17 | 137409700 |
| 2011-01-05 | 118.79 | 133975300 |
| 2011-01-06 | 118.56 | 122519000 |
| 2011-01-07 | 118.32 | 156034600 |
| 2011-01-10 | 118.17 | 122401700 |
| 2011-01-11 | 118.59 | 110287000 |
| ... | ... | ... |
| 2013-12-20 | 179.90 | 197087000 |
| 2013-12-23 | 180.86 | 85598000 |
| 2013-12-24 | 181.26 | 45368800 |
| 2013-12-26 | 182.18 | 63365000 |
| 2013-12-27 | 182.17 | 61814000 |
| 2013-12-30 | 182.14 | 56857000 |
| 2013-12-31 | 183.00 | 86119900 |

754 rows × 2 columns

| Date | Open | High | Low | Close | Volume | Adj_Ratio |
|---|---|---|---|---|---|---|
| 2013-02-01 | 146.316970 | 147.064823 | 146.064448 | 146.89 | 131173000 | 0.971238 |
| 2013-02-04 | 145.997571 | 146.920254 | 145.133163 | 145.24 | 159073600 | 0.971245 |
| 2013-02-05 | 146.030113 | 147.127645 | 145.971836 | 146.71 | 113912400 | 0.971268 |
| 2013-02-06 | 146.188418 | 146.907122 | 146.081583 | 146.81 | 138762800 | 0.971223 |
| 2013-02-07 | 146.862813 | 146.998788 | 145.551624 | 146.62 | 162490000 | 0.971251 |
| 2013-02-08 | 146.876659 | 147.527415 | 146.876659 | 147.44 | 103133700 | 0.971278 |
| 2013-02-11 | 147.380862 | 147.536265 | 147.040917 | 147.41 | 73775000 | 0.971272 |
| ... | ... | ... | ... | ... | ... | ... |
| 2013-02-20 | 148.738262 | 148.786825 | 146.912299 | 146.99 | 160574800 | 0.971257 |
| 2013-02-21 | 146.614456 | 147.061214 | 145.623817 | 146.09 | 183257000 | 0.971214 |
| 2013-02-22 | 146.801290 | 147.520000 | 146.160279 | 147.52 | 106356600 | 0.971229 |
| 2013-02-25 | 148.245729 | 148.469122 | 144.720000 | 144.72 | 245824800 | 0.971275 |
| 2013-02-26 | 145.418619 | 145.884829 | 144.457061 | 145.71 | 186596200 | 0.971270 |
| 2013-02-27 | 145.578109 | 147.947918 | 145.451849 | 147.54 | 150781900 | 0.971233 |
| 2013-02-28 | 147.531660 | 148.473765 | 147.055752 | 147.25 | 126866000 | 0.971242 |

19 rows × 6 columns

| Date | Open | High | Low |
|---|---|---|---|
| 2013-02-01 | 146.316970 | 147.064823 | 146.064448 |
| 2013-02-04 | 145.997571 | 146.920254 | 145.133163 |
| 2013-02-05 | 146.030113 | 147.127645 | 145.971836 |
| 2013-02-06 | 146.188418 | 146.907122 | 146.081583 |
| 2013-02-07 | 146.862813 | 146.998788 | 145.551624 |
| 2013-02-08 | 146.876659 | 147.527415 | 146.876659 |
| 2013-02-11 | 147.380862 | 147.536265 | 147.040917 |
| ... | ... | ... | ... |
| 2013-02-20 | 148.738262 | 148.786825 | 146.912299 |
| 2013-02-21 | 146.614456 | 147.061214 | 145.623817 |
| 2013-02-22 | 146.801290 | 147.520000 | 146.160279 |
| 2013-02-25 | 148.245729 | 148.469122 | 144.720000 |
| 2013-02-26 | 145.418619 | 145.884829 | 144.457061 |
| 2013-02-27 | 145.578109 | 147.947918 | 145.451849 |
| 2013-02-28 | 147.531660 | 148.473765 | 147.055752 |

19 rows × 3 columns

| Date | Open | Low |
|---|---|---|
| 2013-12-03 | 177.327241 | 176.568422 |
| 2013-12-04 | 176.509114 | 175.769963 |
| 2013-12-05 | 176.813197 | 176.182461 |
| 2013-12-06 | 178.053910 | 177.541439 |
| 2013-12-09 | 178.838985 | 178.533480 |
| 2013-12-10 | 178.356666 | 178.021594 |
| 2013-12-11 | 178.199567 | 175.913188 |
| 2013-12-12 | 176.052613 | 175.185359 |
| 2013-12-13 | 175.914351 | 175.194925 |
| 2013-12-16 | 176.353917 | 176.304642 |

| Date | Open | High | Low | Close | Volume | Adj_Ratio |
|---|---|---|---|---|---|---|
| 2011-01-03 | 117.923577 | 118.751861 | 116.983613 | 118.24 | 138725200 | 0.930657 |
| 2011-01-05 | 117.803496 | 118.864453 | 117.691816 | 118.79 | 133975300 | 0.930664 |
| 2011-01-10 | 117.797752 | 118.337511 | 117.444117 | 118.17 | 122401700 | 0.930619 |
| 2011-01-12 | 119.315668 | 119.790288 | 118.617698 | 119.66 | 107929200 | 0.930627 |
| 2011-01-14 | 119.297005 | 120.357919 | 119.213248 | 120.33 | 117677900 | 0.930626 |
| 2011-01-18 | 120.223573 | 120.651680 | 120.083973 | 120.54 | 114401300 | 0.930667 |
| 2011-01-20 | 119.088320 | 119.497814 | 118.315865 | 119.20 | 175745700 | 0.930668 |
| ... | ... | ... | ... | ... | ... | ... |
| 2013-12-18 | 176.330239 | 179.099566 | 174.753398 | 179.07 | 234906000 | 0.985526 |
| 2013-12-19 | 178.554492 | 179.066957 | 178.091303 | 178.86 | 136531200 | 0.985509 |
| 2013-12-20 | 179.037954 | 180.326069 | 178.919052 | 179.90 | 197087000 | 0.990857 |
| 2013-12-23 | 180.780732 | 180.968994 | 180.404209 | 180.86 | 85598000 | 0.990851 |
| 2013-12-24 | 180.873560 | 181.339270 | 180.863652 | 181.26 | 45368800 | 0.990871 |
| 2013-12-26 | 181.664751 | 182.279086 | 181.644934 | 182.18 | 63365000 | 0.990863 |
| 2013-12-31 | 182.385673 | 183.000000 | 182.246954 | 183.00 | 86119900 | 0.990850 |

423 rows × 6 columns

| Date | Open | High |
|---|---|---|
| 2013-12-03 | 177.327241 | 177.770707 |
| 2013-12-04 | 176.509114 | 177.869150 |
| 2013-12-05 | 176.813197 | 177.138421 |
| 2013-12-06 | 178.053910 | 178.487538 |
| 2013-12-09 | 178.838985 | 179.036085 |
| 2013-12-10 | 178.356666 | 178.731158 |
| 2013-12-11 | 178.199567 | 178.229132 |
| 2013-12-12 | 176.052613 | 176.269427 |
| 2013-12-13 | 175.914351 | 176.072033 |
| 2013-12-16 | 176.353917 | 177.201441 |

| | backA_2Day | backB_5Day | backD_50Day | backE_100Day | backF_200Day | backG_300Day | forward_30Day |
|---|---|---|---|---|---|---|---|
| backA_2Day | 1.000000 | 0.595697 | 0.188921 | 0.071506 | 0.162341 | 0.103689 | -0.069359 |
| backB_5Day | 0.595697 | 1.000000 | 0.303945 | 0.106169 | 0.209141 | 0.133395 | -0.132127 |
| backD_50Day | 0.188921 | 0.303945 | 1.000000 | 0.443526 | 0.299287 | 0.324473 | -0.364206 |
| backE_100Day | 0.071506 | 0.106169 | 0.443526 | 1.000000 | 0.116012 | 0.288164 | -0.533907 |
| backF_200Day | 0.162341 | 0.209141 | 0.299287 | 0.116012 | 1.000000 | 0.276626 | -0.016570 |
| backG_300Day | 0.103689 | 0.133395 | 0.324473 | 0.288164 | 0.276626 | 1.000000 | 0.135996 |
| forward_30Day | -0.069359 | -0.132127 | -0.364206 | -0.533907 | -0.016570 | 0.135996 | 1.000000 |

| Date | closes | log_returns | prev_closes | raw_returns |
|---|---|---|---|---|
| 2010-01-04 | 24.84 | NaN | NaN | NaN |
| 2010-01-05 | 24.54 | -0.012151 | 24.84 | -0.012077 |
| 2010-01-06 | 24.53 | -0.000408 | 24.54 | -0.000407 |
| 2010-01-07 | 24.47 | -0.002449 | 24.53 | -0.002446 |
| 2010-01-08 | 24.02 | -0.018561 | 24.47 | -0.018390 |
| 2010-01-11 | 24.50 | 0.019786 | 24.02 | 0.019983 |
| 2010-01-12 | 24.77 | 0.010960 | 24.50 | 0.011020 |
| ... | ... | ... | ... | ... |
| 2014-09-10 | 41.86 | 0.005269 | 41.64 | 0.005283 |
| 2014-09-11 | 41.95 | 0.002148 | 41.86 | 0.002150 |
| 2014-09-12 | 41.46 | -0.011749 | 41.95 | -0.011681 |
| 2014-09-15 | 41.50 | 0.000964 | 41.46 | 0.000965 |
| 2014-09-16 | 41.64 | 0.003368 | 41.50 | 0.003373 |
| 2014-09-17 | 41.61 | -0.000721 | 41.64 | -0.000720 |
| 2014-09-18 | 41.79 | 0.004317 | 41.61 | 0.004326 |

1186 rows × 4 columns

| Date | KO | PEP | KO_lr | PEP_lr | KO_rr | PEP_rr |
|---|---|---|---|---|---|---|
| 2010-01-05 | 24.54 | 53.80 | -0.012151 | 0.011967 | -0.012077 | 0.012039 |
| 2010-01-06 | 24.53 | 53.26 | -0.000408 | -0.010088 | -0.000407 | -0.010037 |
| 2010-01-07 | 24.47 | 52.93 | -0.002449 | -0.006215 | -0.002446 | -0.006196 |
| 2010-01-08 | 24.02 | 52.75 | -0.018561 | -0.003407 | -0.018390 | -0.003401 |
| 2010-01-11 | 24.50 | 52.69 | 0.019786 | -0.001138 | 0.019983 | -0.001137 |
| 2010-01-12 | 24.77 | 53.43 | 0.010960 | 0.013947 | 0.011020 | 0.014044 |
| 2010-01-13 | 24.84 | 53.86 | 0.002822 | 0.008016 | 0.002826 | 0.008048 |
| ... | ... | ... | ... | ... | ... | ... |
| 2014-09-10 | 41.86 | 91.79 | 0.005269 | 0.004039 | 0.005283 | 0.004047 |
| 2014-09-11 | 41.95 | 91.65 | 0.002148 | -0.001526 | 0.002150 | -0.001525 |
| 2014-09-12 | 41.46 | 90.87 | -0.011749 | -0.008547 | -0.011681 | -0.008511 |
| 2014-09-15 | 41.50 | 91.20 | 0.000964 | 0.003625 | 0.000965 | 0.003632 |
| 2014-09-16 | 41.64 | 92.57 | 0.003368 | 0.014910 | 0.003373 | 0.015022 |
| 2014-09-17 | 41.61 | 92.85 | -0.000721 | 0.003020 | -0.000720 | 0.003025 |
| 2014-09-18 | 41.79 | 93.37 | 0.004317 | 0.005585 | 0.004326 | 0.005600 |

1185 rows × 6 columns
\n", "