{ "metadata": { "celltoolbar": "Slideshow", "name": "", "signature": "sha256:785efca9f09b791cf721f1e717731eb149a8a53e1c505018e7721102915ffdde" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Analyzing Data on Toronto's Public Transportation System" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "This is a project by [Eric Ye](https://ericye16.com/)." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "A nice thing about living in Toronto is that the TTC (Toronto Transit Commission) offers data on their bus and streetcar schedules, stops, and routes to the general public. They offer it both in static, six-week-block form in [Google Transit Feed Specification format](https://developers.google.com/transit/gtfs/) [here](http://www1.toronto.ca/wps/portal/contentonly?vgnextoid=96f236899e02b210VgnVCM1000003dd60f89RCRD&vgnextchannel=1a66e03bb8d1e310VgnVCM10000071d60f89RCRD), as well as real-time data from their Next Vehicle Arrival System [here](http://www1.toronto.ca/wps/portal/contentonly?vgnextoid=4427790e6f21d210VgnVCM1000003dd60f89RCRD&vgnextchannel=1a66e03bb8d1e310VgnVCM10000071d60f89RCRD).\n", "\n", "Combining the two, a good deal of data could be collected and analyzed." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "How do cloud cover, precipitation, temperature, date of week, time of day, bus routes, bus route direction, route ridership, number of stops, number of trips per six weeks and geographic area affect the punctuality as determined by average time between scheduled arrival at stops and actual arrival at stops, and speed of buses and streetcars in the Toronto Transit Commission in Spring-Summer 2014?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Some Facts\n", "* Toronto has over 10,000 stops of some variety\n", "* The TTC operates between ~100 and ~1800 vehicles at any given time\n", "* Vehicles include subways, streetcars, and buses\n", "* 23% of Torontonians use public transit to commute to work every day" ] }, { "cell_type": "heading", "level": 2, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Methodology" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Data was found from:\n", "* Toronto Open Data\n", "* NextBus (Next Vehicle Arrival System)\n", "* TTC" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Collection Method\n", "Every two minutes:\n", "* Poll NextBus Servers\n", "* For all vehicles:\n", " * Position\n", " * Route\n", " * Direction\n", "* Takes ~50 seconds\n", "* For three weeks" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Sampling Method\n", "* Census\n", "\n", "or\n", "* Systematic" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Terms and Definitions\n", "\n", "### Velocity\n", "$$ v_x = \\frac{x_1 - x_0}{\\Delta t} $$\n", "\n", "$$ v_y = \\frac{y_1 - y_0}{\\Delta t} $$\n", "\n", "$$ v = \\sqrt{v_x ^ 2 + v_y ^ 2} $$\n", "\n", "Where $v$ is the final velocity, $(x_0, y_0)$ is the position (in metres) of the vehicle originally, $(x_1, y_1)$ is the position of the vehicle at the next time step and $\\Delta t$ is the time difference in seconds. Note that since data is collected every two minutes, and that vehicles rarely travel in perfectly straight lines, this is really a minimum speed in the timeframe." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Punctuality\n", "Punctuality is defined as how late a vehicle is to its scheduled stop. This calculation is simple:\n", "$$ p = t_a - t_s $$\n", "where $p$ is the punctuality metric, $t_a$ is the time at which the vehicle was found to be **close** to the bus stop, and $t_s$ was the time of the latest scheduled stop before $t_a$.\n", "\n", "Note: **close** is defined as being within a 20x20m square around the coordinates of the stop." ] } ], "metadata": {} } ] }