{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Time Series Functionality in daru\n", "\n", "This notebook describes the time series functionality of daru. We'll go through some examples of creating and interacting with a time series and also see the functionality that is offered by the specialized index that deals with time series data, called DateTimeIndex. A few functions that are particularly useful for analyzing time-based data will also be demoed.\n", "\n", "At the end we'll see how a time series can be visualized using the excellent GNU plot gem." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "data": { "application/javascript": [ "if(window['d3'] === undefined ||\n", " window['Nyaplot'] === undefined){\n", " var path = {\"d3\":\"https://cdnjs.cloudflare.com/ajax/libs/d3/3.5.5/d3.min\",\"downloadable\":\"https://cdn.rawgit.com/domitry/d3-downloadable/master/d3-downloadable\"};\n", "\n", "\n", "\n", " var shim = {\"d3\":{\"exports\":\"d3\"},\"downloadable\":{\"exports\":\"downloadable\"}};\n", "\n", " require.config({paths: path, shim:shim});\n", "\n", "\n", "require(['d3'], function(d3){window['d3']=d3;console.log('finished loading d3');require(['downloadable'], function(downloadable){window['downloadable']=downloadable;console.log('finished loading downloadable');\n", "\n", "\tvar script = d3.select(\"head\")\n", "\t .append(\"script\")\n", "\t .attr(\"src\", \"https://cdn.rawgit.com/domitry/Nyaplotjs/master/release/nyaplot.js\")\n", "\t .attr(\"async\", true);\n", "\n", "\tscript[0][0].onload = script[0][0].onreadystatechange = function(){\n", "\n", "\n", "\t var event = document.createEvent(\"HTMLEvents\");\n", "\t event.initEvent(\"load_nyaplot\",false,false);\n", "\t window.dispatchEvent(event);\n", "\t console.log('Finished loading Nyaplotjs');\n", "\n", "\t};\n", "\n", "\n", "});});\n", "}\n" ], "text/plain": [ "\"if(window['d3'] === undefined ||\\n window['Nyaplot'] === undefined){\\n var path = {\\\"d3\\\":\\\"https://cdnjs.cloudflare.com/ajax/libs/d3/3.5.5/d3.min\\\",\\\"downloadable\\\":\\\"https://cdn.rawgit.com/domitry/d3-downloadable/master/d3-downloadable\\\"};\\n\\n\\n\\n var shim = {\\\"d3\\\":{\\\"exports\\\":\\\"d3\\\"},\\\"downloadable\\\":{\\\"exports\\\":\\\"downloadable\\\"}};\\n\\n require.config({paths: path, shim:shim});\\n\\n\\nrequire(['d3'], function(d3){window['d3']=d3;console.log('finished loading d3');require(['downloadable'], function(downloadable){window['downloadable']=downloadable;console.log('finished loading downloadable');\\n\\n\\tvar script = d3.select(\\\"head\\\")\\n\\t .append(\\\"script\\\")\\n\\t .attr(\\\"src\\\", \\\"https://cdn.rawgit.com/domitry/Nyaplotjs/master/release/nyaplot.js\\\")\\n\\t .attr(\\\"async\\\", true);\\n\\n\\tscript[0][0].onload = script[0][0].onreadystatechange = function(){\\n\\n\\n\\t var event = document.createEvent(\\\"HTMLEvents\\\");\\n\\t event.initEvent(\\\"load_nyaplot\\\",false,false);\\n\\t window.dispatchEvent(event);\\n\\t console.log('Finished loading Nyaplotjs');\\n\\n\\t};\\n\\n\\n});});\\n}\\n\"" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "true" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "require 'daru'\n", "require 'awesome_print'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For a Daru::Vector or DataFrame to qualify as timeseries, it must be indexed using the Daru::DateTimeIndex class. A DateTimeIndex class can be created by using the .date_range function or by using the class constructor directly." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating DateTimeIndex with .date_range\n", "\n", "The DateTimeIndex.date_range function accepts the following options as parameters:\n", "* :start - A DateTime object or date-like string that defines the start of the date range.\n", "* :end - A DateTime object or date-like string that defines the end of the date range.\n", "* :freq - The interval between each date in the index. This can either be a string specifying the frequency (i.e. one of the frequency aliases) or a Daru::Offset object.\n", "* :periods - The number of periods that should go into this index. Takes precedence over `:end`.\n", "\n", "If you specify :start and :end options as strings, they can be complete or partial dates and daru will intelligently infer the date from the string directly. However, note that the date-like string must be in the format `YYYY-MM-DD HH:MM:SS`. Currently the precision of DateTimeIndex is upto seconds only, though this will improve in the future." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "#" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# In the code below we will create a DateTimeIndex starting from 2012-4-4 to 2012-4-19\n", "# with a daily frequency. The 'D' supplied to the :freq argument specifies that frequency\n", "# has to be daily. It can be any of the string offset alaises amongst those supported. See\n", "# the section below for a complete overview of date offsets.\n", "\n", "index = Daru::DateTimeIndex.date_range(:start => '2012-4-4', :end => '2012-4-19', :freq => 'D')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you see above `.date_range` has created a DateTimeIndex with 16 dates (or periods) with a daily frequency between each date.\n", "\n", "Converting this index to an Array shows that this is true:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[\n", " \u001b[1;37m[ 0] \u001b[0m#,\n", " \u001b[1;37m[ 1] \u001b[0m#,\n", " \u001b[1;37m[ 2] \u001b[0m#,\n", " \u001b[1;37m[ 3] \u001b[0m#,\n", " \u001b[1;37m[ 4] \u001b[0m#,\n", " \u001b[1;37m[ 5] \u001b[0m#,\n", " \u001b[1;37m[ 6] \u001b[0m#,\n", " \u001b[1;37m[ 7] \u001b[0m#,\n", " \u001b[1;37m[ 8] \u001b[0m#,\n", " \u001b[1;37m[ 9] \u001b[0m#,\n", " \u001b[1;37m[10] \u001b[0m#,\n", " \u001b[1;37m[11] \u001b[0m#,\n", " \u001b[1;37m[12] \u001b[0m#,\n", " \u001b[1;37m[13] \u001b[0m#,\n", " \u001b[1;37m[14] \u001b[0m#,\n", " \u001b[1;37m[15] \u001b[0m#\n", "]\n" ] } ], "source": [ "ap index.to_a\n", "nil" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Specifying a number before the date alias in the `:freq` option will set the frequency to a multiple of the offset. Using this you can create date ranges with frequency in multiples of whatever you want." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[\n", " \u001b[1;37m[0] \u001b[0m#,\n", " \u001b[1;37m[1] \u001b[0m#,\n", " \u001b[1;37m[2] \u001b[0m#,\n", " \u001b[1;37m[3] \u001b[0m#,\n", " \u001b[1;37m[4] \u001b[0m#\n", "]\n" ] } ], "source": [ "# The following code will create a range between 2014-5-1 00:00:00 and 2014,5,2 00:00:00,\n", "# with a difference of 6 hours between each date.\n", "\n", "index = Daru::DateTimeIndex.date_range(:start => DateTime.new(2014,5,1), :end => DateTime.new(2014,5,2), :freq => '6H')\n", "ap index.to_a; nil" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The freqeuncy strings that you just saw are translated under the hood into objects of type `Daru::Offsets`. These offsets determine the distance with which the dates are shifted. See this blog post for a detailed coverage of date offsets and thier string aliases.\n", "\n", "`:freq` can also accept a `Daru::DateOffset` object or any of the objects under the namespace of `Daru::Offsets`. For example, to create a date range that has a frequency of 6 seconds:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "#" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "offset = Daru::Offsets::Second.new(6)\n", "index = Daru::DateTimeIndex.date_range(:start => '2012-5-6', :end => '2012-5-6 20:00:00', :freq => offset)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another way to specify the range of the date index is to use the `:periods` option. This option will decide exactly how many index objects will go into DateTimeIndex, and will take precedence over whatever date is specified in the `:end` option.\n", "\n", "So to create an index of 50 periods starting from the date '2012-5-2' with a frequency of one month end between each:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "#" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "index = Daru::DateTimeIndex.date_range(:start => '2012-5-2', :periods => 50, :freq => 'ME')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can ask for the frequency of an index with the `#frequency` method." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "\"ME\"" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "index.frequency" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating a DateTimeIndex with the DateTimeIndex constructor.\n", "\n", "The DateTimeIndex constructor allows you to create DateTimeIndex even if the dates are not separated by a particular frequency." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "#" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "index = Daru::DateTimeIndex.new(\n", " [DateTime.new(2012,4,5), DateTime.new(2012,4,6), DateTime.new(2012,4,7), DateTime.new(2012,4,8)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The constructor also accepts an optional `:freq` option that allows you to either pass a frequency string alias or an offset object. If you want daru to infer the frequency of your data by itself, pass it the `:infer` option and it will try to figure out the frequency of the data by itself (if a frequency cannot be inferred it will be set to `nil`)." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "#" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "index = Daru::DateTimeIndex.new([\n", " DateTime.new(2012,4,5), DateTime.new(2012,4,6), DateTime.new(2012,4,7), DateTime.new(2012,4,8),\n", " DateTime.new(2012,4,9), DateTime.new(2012,4,10), DateTime.new(2012,4,11), DateTime.new(2012,4,12)\n", " ], freq: :infer)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## DateTimeIndex methods\n", "\n", "The DateTimeIndex offers a host of methods for manipulating and knowing more about the data contained in the index. Let us consider a sample DateTimeIndex and demonstrate:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "#" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "index = Daru::DateTimeIndex.date_range(:start => '2012', :periods => 10, :freq => 'YEAR')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can get a Ruby Array of all the years that each of the indexes belongs to with the `#year` method:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "[2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "index.year" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similarly you can query for `#month`, `#day`, `#hour`, `#min` or `#sec` using the respective methods." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To move all the data points of a DateTimeIndex to the future, the `#shift` method can be used, or to move all of them to the past, use the `#lag` method.\n", "\n", "Passing an offset to #shift will offset each data point by the offset value:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "#" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "index.shift(Daru::Offsets::Hour.new(3))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Passing a positive integer into #shift will offset each data point by the same offset that it was created with:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "#" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "index.shift(4) # Shift by 4 years" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`#lag` works in a similar manner:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "#" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "index.lag(Daru::DateOffset.new(days: 4))" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "#" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "index.lag(2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using DateTimeIndex with Vector and DataFrame\n", "\n", "When used with `Daru::Vector` or `Daru::DataFrame`, DateTimeIndex functions exactly like any other index. You can query individual dates, slices, etc. and retrieve the relevant data by specifying the date either completely or partially.\n", "\n", "One of the salient features of indexing time-based data with the DateTimeIndex is that it lets you retrieve data of a given time period by specifying just a partial data. We'll see how exactly this can be done with some examples:\n", "\n", "For starters lets create a basic Daru::Vector that is indexed on DateTimeIndex." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
Daru::Vector:36599420 size: 10
nil
2012-03-04T00:00:00+00:001
2012-03-04T01:00:00+00:002
2012-03-04T02:00:00+00:003
2012-03-04T03:00:00+00:004
2012-03-04T04:00:00+00:005
2012-03-04T05:00:00+00:001
2012-03-04T06:00:00+00:002
2012-03-04T07:00:00+00:003
2012-03-04T08:00:00+00:004
2012-03-04T09:00:00+00:005
" ], "text/plain": [ "\n", "#\n", " nil\n", "2012-03-04T00:00:00+ 1\n", "2012-03-04T01:00:00+ 2\n", "2012-03-04T02:00:00+ 3\n", "2012-03-04T03:00:00+ 4\n", "2012-03-04T04:00:00+ 5\n", "2012-03-04T05:00:00+ 1\n", "2012-03-04T06:00:00+ 2\n", "2012-03-04T07:00:00+ 3\n", "2012-03-04T08:00:00+ 4\n", "2012-03-04T09:00:00+ 5\n" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "index = Daru::DateTimeIndex.date_range(:start => '2012-3-4', :periods => 50000, :freq => 'H')\n", "vector = Daru::Vector.new([1,2,3,4,5]*10000, index: index)\n", "vector.head" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can retrieve data by specifying the date completely or partially. Specifying it partially will retrive all the data that falls under that time period. For example, to retreive all the data that falls under April 2012 (thats '2012-4'):" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
Daru::Vector:37106300 size: 720
nil
2012-04-01T00:00:00+00:003
2012-04-01T01:00:00+00:004
2012-04-01T02:00:00+00:005
2012-04-01T03:00:00+00:001
2012-04-01T04:00:00+00:002
2012-04-01T05:00:00+00:003
2012-04-01T06:00:00+00:004
2012-04-01T07:00:00+00:005
2012-04-01T08:00:00+00:001
2012-04-01T09:00:00+00:002
2012-04-01T10:00:00+00:003
2012-04-01T11:00:00+00:004
2012-04-01T12:00:00+00:005
2012-04-01T13:00:00+00:001
2012-04-01T14:00:00+00:002
2012-04-01T15:00:00+00:003
2012-04-01T16:00:00+00:004
2012-04-01T17:00:00+00:005
2012-04-01T18:00:00+00:001
2012-04-01T19:00:00+00:002
2012-04-01T20:00:00+00:003
2012-04-01T21:00:00+00:004
2012-04-01T22:00:00+00:005
2012-04-01T23:00:00+00:001
2012-04-02T00:00:00+00:002
2012-04-02T01:00:00+00:003
2012-04-02T02:00:00+00:004
2012-04-02T03:00:00+00:005
2012-04-02T04:00:00+00:001
2012-04-02T05:00:00+00:002
2012-04-02T06:00:00+00:003
2012-04-02T07:00:00+00:004
......
2012-04-30T23:00:00+00:002
" ], "text/plain": [ "\n", "#\n", " nil\n", "2012-04-01T00:00:00+ 3\n", "2012-04-01T01:00:00+ 4\n", "2012-04-01T02:00:00+ 5\n", "2012-04-01T03:00:00+ 1\n", "2012-04-01T04:00:00+ 2\n", "2012-04-01T05:00:00+ 3\n", "2012-04-01T06:00:00+ 4\n", "2012-04-01T07:00:00+ 5\n", "2012-04-01T08:00:00+ 1\n", "2012-04-01T09:00:00+ 2\n", "2012-04-01T10:00:00+ 3\n", "2012-04-01T11:00:00+ 4\n", "2012-04-01T12:00:00+ 5\n", "2012-04-01T13:00:00+ 1\n", "2012-04-01T14:00:00+ 2\n", "2012-04-01T15:00:00+ 3\n", "2012-04-01T16:00:00+ 4\n", " ... ...\n" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "vector['2012-4']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see only the data with an index on April 2012 was retreived.\n", "\n", "Now, say you want all the data under the year 2013. You can just specify the year as a string:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
Daru::Vector:38535340 size: 8760
nil
2013-01-01T00:00:00+00:003
2013-01-01T01:00:00+00:004
2013-01-01T02:00:00+00:005
2013-01-01T03:00:00+00:001
2013-01-01T04:00:00+00:002
2013-01-01T05:00:00+00:003
2013-01-01T06:00:00+00:004
2013-01-01T07:00:00+00:005
2013-01-01T08:00:00+00:001
2013-01-01T09:00:00+00:002
2013-01-01T10:00:00+00:003
2013-01-01T11:00:00+00:004
2013-01-01T12:00:00+00:005
2013-01-01T13:00:00+00:001
2013-01-01T14:00:00+00:002
2013-01-01T15:00:00+00:003
2013-01-01T16:00:00+00:004
2013-01-01T17:00:00+00:005
2013-01-01T18:00:00+00:001
2013-01-01T19:00:00+00:002
2013-01-01T20:00:00+00:003
2013-01-01T21:00:00+00:004
2013-01-01T22:00:00+00:005
2013-01-01T23:00:00+00:001
2013-01-02T00:00:00+00:002
2013-01-02T01:00:00+00:003
2013-01-02T02:00:00+00:004
2013-01-02T03:00:00+00:005
2013-01-02T04:00:00+00:001
2013-01-02T05:00:00+00:002
2013-01-02T06:00:00+00:003
2013-01-02T07:00:00+00:004
......
2013-12-31T23:00:00+00:002
" ], "text/plain": [ "\n", "#\n", " nil\n", "2013-01-01T00:00:00+ 3\n", "2013-01-01T01:00:00+ 4\n", "2013-01-01T02:00:00+ 5\n", "2013-01-01T03:00:00+ 1\n", "2013-01-01T04:00:00+ 2\n", "2013-01-01T05:00:00+ 3\n", "2013-01-01T06:00:00+ 4\n", "2013-01-01T07:00:00+ 5\n", "2013-01-01T08:00:00+ 1\n", "2013-01-01T09:00:00+ 2\n", "2013-01-01T10:00:00+ 3\n", "2013-01-01T11:00:00+ 4\n", "2013-01-01T12:00:00+ 5\n", "2013-01-01T13:00:00+ 1\n", "2013-01-01T14:00:00+ 2\n", "2013-01-01T15:00:00+ 3\n", "2013-01-01T16:00:00+ 4\n", " ... ...\n" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "vector['2013']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Passing a string to `#[]` evaluates it to the greatest possible accuracy and then retrieves the relevant data. Now say you want the data that happens to be on 4th February 2013. Just specify this as a string:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
Daru::Vector:40565680 size: 24
nil
2013-02-04T00:00:00+00:004
2013-02-04T01:00:00+00:005
2013-02-04T02:00:00+00:001
2013-02-04T03:00:00+00:002
2013-02-04T04:00:00+00:003
2013-02-04T05:00:00+00:004
2013-02-04T06:00:00+00:005
2013-02-04T07:00:00+00:001
2013-02-04T08:00:00+00:002
2013-02-04T09:00:00+00:003
2013-02-04T10:00:00+00:004
2013-02-04T11:00:00+00:005
2013-02-04T12:00:00+00:001
2013-02-04T13:00:00+00:002
2013-02-04T14:00:00+00:003
2013-02-04T15:00:00+00:004
2013-02-04T16:00:00+00:005
2013-02-04T17:00:00+00:001
2013-02-04T18:00:00+00:002
2013-02-04T19:00:00+00:003
2013-02-04T20:00:00+00:004
2013-02-04T21:00:00+00:005
2013-02-04T22:00:00+00:001
2013-02-04T23:00:00+00:002
" ], "text/plain": [ "\n", "#\n", " nil\n", "2013-02-04T00:00:00+ 4\n", "2013-02-04T01:00:00+ 5\n", "2013-02-04T02:00:00+ 1\n", "2013-02-04T03:00:00+ 2\n", "2013-02-04T04:00:00+ 3\n", "2013-02-04T05:00:00+ 4\n", "2013-02-04T06:00:00+ 5\n", "2013-02-04T07:00:00+ 1\n", "2013-02-04T08:00:00+ 2\n", "2013-02-04T09:00:00+ 3\n", "2013-02-04T10:00:00+ 4\n", "2013-02-04T11:00:00+ 5\n", "2013-02-04T12:00:00+ 1\n", "2013-02-04T13:00:00+ 2\n", "2013-02-04T14:00:00+ 3\n", "2013-02-04T15:00:00+ 4\n", "2013-02-04T16:00:00+ 5\n", " ... ...\n" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "vector['2013-2-4']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Passing accuracy upto minutes will return precisely that data point, because the highest accuracy of the index is minutes." ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "vector['2013-2-4 22']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For specifying dates precisely, it is even possible to pass a DateTime object into `#[]`:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "vector[DateTime.new(2012,5,1)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "DateTimeIndex can be used with DataFrame the way it was used with Vector. We can index both rows and columns of a DataFrame using a DateTimeIndex:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
Daru::DataFrame:40773500 rows: 50 cols: 3
abc
2012-04-05T00:00:00+00:001afoo
2012-04-06T00:00:00+00:002bbar
2012-04-07T00:00:00+00:003cbaz
2012-04-08T00:00:00+00:004drazz
2012-04-09T00:00:00+00:005ejazz
2012-04-10T00:00:00+00:001afoo
2012-04-11T00:00:00+00:002bbar
2012-04-12T00:00:00+00:003cbaz
2012-04-13T00:00:00+00:004drazz
2012-04-14T00:00:00+00:005ejazz
2012-04-15T00:00:00+00:001afoo
2012-04-16T00:00:00+00:002bbar
2012-04-17T00:00:00+00:003cbaz
2012-04-18T00:00:00+00:004drazz
2012-04-19T00:00:00+00:005ejazz
2012-04-20T00:00:00+00:001afoo
2012-04-21T00:00:00+00:002bbar
2012-04-22T00:00:00+00:003cbaz
2012-04-23T00:00:00+00:004drazz
2012-04-24T00:00:00+00:005ejazz
2012-04-25T00:00:00+00:001afoo
2012-04-26T00:00:00+00:002bbar
2012-04-27T00:00:00+00:003cbaz
2012-04-28T00:00:00+00:004drazz
2012-04-29T00:00:00+00:005ejazz
2012-04-30T00:00:00+00:001afoo
2012-05-01T00:00:00+00:002bbar
2012-05-02T00:00:00+00:003cbaz
2012-05-03T00:00:00+00:004drazz
2012-05-04T00:00:00+00:005ejazz
2012-05-05T00:00:00+00:001afoo
2012-05-06T00:00:00+00:002bbar
............
2012-05-24T00:00:00+00:005ejazz
" ], "text/plain": [ "\n", "#\n", " a b c \n", "2012-04-05 1 a foo \n", "2012-04-06 2 b bar \n", "2012-04-07 3 c baz \n", "2012-04-08 4 d razz \n", "2012-04-09 5 e jazz \n", "2012-04-10 1 a foo \n", "2012-04-11 2 b bar \n", "2012-04-12 3 c baz \n", "2012-04-13 4 d razz \n", "2012-04-14 5 e jazz \n", "2012-04-15 1 a foo \n", "2012-04-16 2 b bar \n", "2012-04-17 3 c baz \n", "2012-04-18 4 d razz \n", "2012-04-19 5 e jazz \n", " ... ... ... ... \n" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "index = Daru::DateTimeIndex.date_range(:start => '2012-4-5', :periods => 50, :freq => 'D')\n", "df = Daru::DataFrame.new({\n", " a: [1,2,3,4,5]*10,\n", " b: ['a','b','c','d','e']*10,\n", " c: ['foo', 'bar','baz','razz','jazz']*10\n", " }, index: index)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Rows can be retreived using a syntax similar to that of Daru::Vector:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
Daru::DataFrame:40777240 rows: 24 cols: 3
abc
2012-05-01T00:00:00+00:002bbar
2012-05-02T00:00:00+00:003cbaz
2012-05-03T00:00:00+00:004drazz
2012-05-04T00:00:00+00:005ejazz
2012-05-05T00:00:00+00:001afoo
2012-05-06T00:00:00+00:002bbar
2012-05-07T00:00:00+00:003cbaz
2012-05-08T00:00:00+00:004drazz
2012-05-09T00:00:00+00:005ejazz
2012-05-10T00:00:00+00:001afoo
2012-05-11T00:00:00+00:002bbar
2012-05-12T00:00:00+00:003cbaz
2012-05-13T00:00:00+00:004drazz
2012-05-14T00:00:00+00:005ejazz
2012-05-15T00:00:00+00:001afoo
2012-05-16T00:00:00+00:002bbar
2012-05-17T00:00:00+00:003cbaz
2012-05-18T00:00:00+00:004drazz
2012-05-19T00:00:00+00:005ejazz
2012-05-20T00:00:00+00:001afoo
2012-05-21T00:00:00+00:002bbar
2012-05-22T00:00:00+00:003cbaz
2012-05-23T00:00:00+00:004drazz
2012-05-24T00:00:00+00:005ejazz
" ], "text/plain": [ "\n", "#\n", " a b c \n", "2012-05-01 2 b bar \n", "2012-05-02 3 c baz \n", "2012-05-03 4 d razz \n", "2012-05-04 5 e jazz \n", "2012-05-05 1 a foo \n", "2012-05-06 2 b bar \n", "2012-05-07 3 c baz \n", "2012-05-08 4 d razz \n", "2012-05-09 5 e jazz \n", "2012-05-10 1 a foo \n", "2012-05-11 2 b bar \n", "2012-05-12 3 c baz \n", "2012-05-13 4 d razz \n", "2012-05-14 5 e jazz \n", "2012-05-15 1 a foo \n", " ... ... ... ... \n" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.row['2012-5']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] } ], "metadata": { "kernelspec": { "display_name": "Ruby 2.2.1", "language": "ruby", "name": "ruby" }, "language_info": { "file_extension": ".rb", "mimetype": "application/x-ruby", "name": "ruby", "version": "2.2.1" } }, "nbformat": 4, "nbformat_minor": 0 }