{ "metadata": { "name": "", "signature": "sha256:7fe0e5698985c84a39b76b65a46b97645400543c129f540348a77452fa1efc1e" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Converting Strings To Datetime\n", "\n", "- **Author:** [Chris Albon](http://www.chrisalbon.com/), [@ChrisAlbon](https://twitter.com/chrisalbon)\n", "- **Date:** -\n", "- **Repo:** [Python 3 code snippets for data science](https://github.com/chrisalbon/code_py)\n", "- **Note:**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Datetime formatting" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Type - Description**\n", "- %Y - 4-digit year\n", "- %y - 2-digit year\n", "- %m - 2-digit month [01, 12]\n", "- %d - 2-digit day [01, 31]\n", "- %H - Hour (24-hour clock) [00, 23]\n", "- %I - Hour (12-hour clock) [01, 12]\n", "- %M - 2-digit minute [00, 59]\n", "- %S - Second [00, 61] (seconds 60, 61 account for leap seconds) \n", "- %w - Weekday as integer [0 (Sunday), 6]\n", "- %U - Week number of the year [00, 53]. Sunday is considered the first day of the week, and days before the first Sunday of the year are \u201cweek 0\u201d.\n", "- %W - Week number of the year [00, 53]. 
Monday is considered the first day of the week, and days before the first Monday of the year are \u201cweek 0\u201d.\n", "- %z - UTC time zone offset as +HHMM or -HHMM, empty if time zone naive\n", "- %F - Shortcut for %Y-%m-%d, for example 2012-04-18\n", "- %D - Shortcut for %m/%d/%y, for example 04/18/12" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Import modules" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from datetime import datetime\n", "from dateutil.parser import parse\n", "import pandas as pd" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 24 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a string variable with the war start time" ] }, { "cell_type": "code", "collapsed": false, "input": [ "war_start = '2011-01-03'" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 25 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Convert the string to datetime format" ] }, { "cell_type": "code", "collapsed": false, "input": [ "datetime.strptime(war_start, '%Y-%m-%d')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 26, "text": [ "datetime.datetime(2011, 1, 3, 0, 0)" ] } ], "prompt_number": 26 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a list of strings as dates" ] }, { "cell_type": "code", "collapsed": false, "input": [ "attack_dates = ['7/2/2011', '8/6/2012', '11/13/2013', '5/26/2011', '5/2/2001']" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 27 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Convert attack_dates strings into datetime format" ] }, { "cell_type": "code", "collapsed": false, "input": [ "[datetime.strptime(x, '%m/%d/%Y') for x in attack_dates]" ], "language": "python", "metadata": {}, "outputs": [ { 
"metadata": {}, "output_type": "pyout", "prompt_number": 28, "text": [ "[datetime.datetime(2011, 7, 2, 0, 0),\n", " datetime.datetime(2012, 8, 6, 0, 0),\n", " datetime.datetime(2013, 11, 13, 0, 0),\n", " datetime.datetime(2011, 5, 26, 0, 0),\n", " datetime.datetime(2001, 5, 2, 0, 0)]" ] } ], "prompt_number": 28 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Use parse() to attempt to auto-convert common string formats" ] }, { "cell_type": "code", "collapsed": false, "input": [ "parse(war_start)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 29, "text": [ "datetime.datetime(2011, 1, 3, 0, 0)" ] } ], "prompt_number": 29 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Use parse() on every element of the attack_dates list" ] }, { "cell_type": "code", "collapsed": false, "input": [ "[parse(x) for x in attack_dates]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 30, "text": [ "[datetime.datetime(2011, 7, 2, 0, 0),\n", " datetime.datetime(2012, 8, 6, 0, 0),\n", " datetime.datetime(2013, 11, 13, 0, 0),\n", " datetime.datetime(2011, 5, 26, 0, 0),\n", " datetime.datetime(2001, 5, 2, 0, 0)]" ] } ], "prompt_number": 30 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Use parse(), but specify that the day comes first" ] }, { "cell_type": "code", "collapsed": false, "input": [ "parse(war_start, dayfirst=True)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 31, "text": [ "datetime.datetime(2011, 1, 3, 0, 0)" ] } ], "prompt_number": 31 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a dataframe" ] }, { "cell_type": "code", "collapsed": false, "input": [ "data = {'date': ['2014-05-01 18:47:05.069722', '2014-05-01 18:47:05.119994', '2014-05-02 18:47:05.178768', '2014-05-02 18:47:05.230071', '2014-05-02 18:47:05.230071', 
'2014-05-02 18:47:05.280592', '2014-05-03 18:47:05.332662', '2014-05-03 18:47:05.385109', '2014-05-04 18:47:05.436523', '2014-05-04 18:47:05.486877'], \n", " 'value': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}\n", "df = pd.DataFrame(data, columns = ['date', 'value'])\n", "print(df)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ " date value\n", "0 2014-05-01 18:47:05.069722 1\n", "1 2014-05-01 18:47:05.119994 1\n", "2 2014-05-02 18:47:05.178768 1\n", "3 2014-05-02 18:47:05.230071 1\n", "4 2014-05-02 18:47:05.230071 1\n", "5 2014-05-02 18:47:05.280592 1\n", "6 2014-05-03 18:47:05.332662 1\n", "7 2014-05-03 18:47:05.385109 1\n", "8 2014-05-04 18:47:05.436523 1\n", "9 2014-05-04 18:47:05.486877 1\n", "\n", "[10 rows x 2 columns]\n" ] } ], "prompt_number": 32 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Convert df['date'] from string to datetime" ] }, { "cell_type": "code", "collapsed": false, "input": [ "pd.to_datetime(df['date'])" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 37, "text": [ "0 2014-05-01 18:47:05.069722\n", "1 2014-05-01 18:47:05.119994\n", "2 2014-05-02 18:47:05.178768\n", "3 2014-05-02 18:47:05.230071\n", "4 2014-05-02 18:47:05.230071\n", "5 2014-05-02 18:47:05.280592\n", "6 2014-05-03 18:47:05.332662\n", "7 2014-05-03 18:47:05.385109\n", "8 2014-05-04 18:47:05.436523\n", "9 2014-05-04 18:47:05.486877\n", "Name: date, dtype: datetime64[ns]" ] } ], "prompt_number": 37 } ], "metadata": {} } ] }