{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# ECON 323 Final Project\n", "\n", "###### Factors affecting bike share in San Francisco\n", "\n", "## Background \n", "Bike sharing is an increasingly popular service, especially as cities are moving to be more sustainable. [Eren and Uz (2020)](https://www.sciencedirect.com/science/article/abs/pii/S2210670719312387) identified several factors affecting bike sharing, including weather (e.g. seasons, precipitation, and temperature), public transportation, safety, and other temporal factors. Bay Wheels is a bike sharing service in San Francisco provided by Lyft (similar to the Shaw Mobi bikes in Vancouver). Some differences are that Bay Wheels provides both electric bikes and regular bikes, and that the service is accessible through the Lyft application and Clipper cards (similar to Vancouver’s Compass card). Bay Wheels’ dataset is publicly available on Lyft’s website at https://www.lyft.com/bikes/bay-wheels/system-data. This project aims to look at factors that influence bike sharing in San Francisco.\n", "\n", "## Literature Review\n", "According to Eren and Uz (2020), the optimal temperature for bike sharing is around 20 – 30 °C with no precipitation, while the 10-20 °C range is also positively correlated with bike sharing. Eren and Uz also found that bike sharing can be used as a substitute for public transportation in areas with poor public transport options or at times when services are reduced (e.g. late night/early morning). However, bike sharing services are also used as complements to public transportation to reduce travel time. Eren and Uz found that income level and usage of bike share services tend to be strongly positively correlated. However, Guo et al. (2017) found that in Ningbo, China, the use of bike sharing services is highest among lower-income residents, followed by middle-income and high-income residents. 
\n", "\n", "## Hypothesis/Research\n", "\n", "Using Eren and Uz (2020)’s findings as a starting point, I will look at how the following factors affect bike sharing in San Francisco. \n", "\n", "* **Weather** – San Francisco’s mild coastal weather ranges around 8 – 21 °C all year round, with an average of 8 days of precipitation in the winter months and 0 days in the summer months. These numbers make San Francisco’s weather ideal for bike sharing. I will examine whether seasonality still influences bike sharing in San Francisco by cross-referencing weather data with start/end times for bike share sessions.\n", "\n", "* **Public transportation** – I will look at bike share usage near BART (rail/subway), Caltrain, and MUNI (light rail and cable car) stops and by time of day. Since many people working in the city are commuters, I would expect higher usage near major transit stops before/after work hours. \n", "\n", "* **Safety** – I will see whether the crime rate (from San Francisco Police Department data) affects bike share usage. Safety is often cited as a concern for bike share users, and I expect there to be a negative correlation between bike share usage and crime rate. \n", "\n", "* **Temporal factors** - I will break down bike share usage by weekdays/weekends, and also by time of day. \n", "\n", "## Dataset\n", "The main dataset used in this analysis comes from Lyft's Bay Wheels bike share system data from June 2019 to March 2021 (the most recently available). It is available at https://www.lyft.com/bikes/bay-wheels/system-data. I have chosen to exclude data prior to Lyft's acquisition of Bay Wheels in June 2019 (previously Ford GoBike), as the rebrand led to slight changes to the bike share model (e.g. docked and dockless bikes, more electric bikes, etc.) and the rental access methods (i.e. the bikes can now be rented through the Lyft app). Bay Wheels operates in San Francisco, Oakland, Berkeley, Emeryville, and San Jose. 
I will focus this study on bike sharing in San Francisco only. \n", "\n", "Other datasets used:\n", "* [San Francisco GeoPolygon](http://www2.census.gov/geo/tiger/PREVGENZ/ma/ma99/ce00shp/ce99_d00_shp.zip) for filtering to San Francisco data only\n", "* [Historical weather records from San Francisco Bay Area Weather Forecast Office](https://w2.weather.gov/climate/xmacis.php?wfo=mtr)\n", "* [Passenger rail stations data from California's Metropolitan Transportation Commission (MTC)](https://opendata.mtc.ca.gov/datasets/efd75b7bb3c04dbda06c6e7cd73e9336_0?geometry=-122.954%2C37.659%2C-121.976%2C37.849), which includes Bay Area Rapid Transit (BART), Caltrain, and San Francisco Municipal Railway (MUNI - light rail and cable car) data\n", "* [Incident reports from San Francisco Police Department (2018 to Present)](https://data.sfgov.org/Public-Safety/Police-Department-Incident-Reports-2018-to-Present/wg3w-h783)\n", "* [San Francisco Neighborhoods from San Francisco Open Data (DataSF)](https://data.sfgov.org/Geographic-Locations-and-Boundaries/Analysis-Neighborhoods/p5b7-5n3h)\n", "\n", "## Outline\n", "1. **Examining temporal factors:** visualizations of time of day and weekday/weekend usage of bike shares, as well as user types over weekdays/weekends (casual versus subscribed member)\n", "\n", "2. **Examining the role of weather:** visualization of daily temperature, precipitation, and rate of bike shares\n", "\n", "3. **Examining safety and public transportation as factors affecting bike sharing:** map visualization of public transportation (BART, MUNI, Caltrain) stations, crime rate for each neighborhood, and bike share usage based on start location \n", "\n", "4. **Training a linear regression model to predict bike share usage:** comparing a plain linear regression with one that has polynomial transformations applied\n", "\n", "5. 
**Conclusion:** discussion of limitations, areas for further research" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", " | started_at | \n", "ended_at | \n", "start_station_id | \n", "start_station_name | \n", "start_lat | \n", "start_lng | \n", "end_station_id | \n", "end_station_name | \n", "end_lat | \n", "end_lng | \n", "bike_id | \n", "member_casual | \n", "bike_share_for_all_trip | \n", "rental_access_method | \n", "rideable_type | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "2019-06-30 18:16:09.7730 | \n", "2019-07-01 16:57:45.5920 | \n", "109 | \n", "17th St at Valencia St | \n", "37.763316 | \n", "-122.421904 | \n", "56 | \n", "Koshland Park | \n", "37.773414 | \n", "-122.427317 | \n", "1502.0 | \n", "member | \n", "No | \n", "<NA> | \n", "NaN | \n", "
1 | \n", "2019-06-30 18:09:55.8300 | \n", "2019-07-01 14:47:36.6810 | \n", "50 | \n", "2nd St at Townsend St | \n", "37.780526 | \n", "-122.390288 | \n", "101 | \n", "15th St at Potrero Ave | \n", "37.767079 | \n", "-122.407359 | \n", "2526.0 | \n", "casual | \n", "No | \n", "<NA> | \n", "NaN | \n", "
2 | \n", "2019-06-30 15:40:31.0380 | \n", "2019-07-01 08:13:54.3490 | \n", "23 | \n", "The Embarcadero at Steuart St | \n", "37.791464 | \n", "-122.391034 | \n", "30 | \n", "San Francisco Caltrain (Townsend St at 4th St) | \n", "37.776598 | \n", "-122.395282 | \n", "2427.0 | \n", "member | \n", "No | \n", "<NA> | \n", "NaN | \n", "
4 | \n", "2019-06-30 17:21:00.0550 | \n", "2019-07-01 06:55:54.9960 | \n", "15 | \n", "San Francisco Ferry Building (Harry Bridges Pl... | \n", "37.795392 | \n", "-122.394203 | \n", "30 | \n", "San Francisco Caltrain (Townsend St at 4th St) | \n", "37.776598 | \n", "-122.395282 | \n", "1070.0 | \n", "casual | \n", "No | \n", "<NA> | \n", "NaN | \n", "
6 | \n", "2019-06-30 14:31:39.5730 | \n", "2019-07-01 00:53:02.2520 | \n", "6 | \n", "The Embarcadero at Sansome St | \n", "37.804770 | \n", "-122.403234 | \n", "400 | \n", "Buchanan St at North Point St | \n", "37.804272 | \n", "-122.433537 | \n", "1980.0 | \n", "casual | \n", "No | \n", "<NA> | \n", "NaN | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
3944933 | \n", "2021-03-17 15:28:27 | \n", "2021-03-17 15:29:10 | \n", "SF-C28-2 | \n", "Broadway at Battery St | \n", "37.798572 | \n", "-122.400869 | \n", "SF-C28-2 | \n", "Broadway at Battery St | \n", "37.798572 | \n", "-122.400869 | \n", "NaN | \n", "casual | \n", "NaN | \n", "NaN | \n", "classic_bike | \n", "
3944935 | \n", "2021-03-26 11:27:13 | \n", "2021-03-26 11:38:59 | \n", "SF-C28-2 | \n", "Broadway at Battery St | \n", "37.798509 | \n", "-122.400834 | \n", "SF-C28-2 | \n", "Broadway at Battery St | \n", "37.798637 | \n", "-122.400695 | \n", "NaN | \n", "casual | \n", "NaN | \n", "NaN | \n", "electric_bike | \n", "
3944936 | \n", "2021-03-21 11:24:06 | \n", "2021-03-21 11:25:33 | \n", "SF-E18 | \n", "Divisadero St at Clay St | \n", "37.789588 | \n", "-122.440683 | \n", "SF-E18 | \n", "Divisadero St at Clay St | \n", "37.789588 | \n", "-122.440683 | \n", "NaN | \n", "casual | \n", "NaN | \n", "NaN | \n", "classic_bike | \n", "
3944937 | \n", "2021-03-29 20:34:27 | \n", "2021-03-29 20:47:19 | \n", "SF-BB17 | \n", "London St at Geneva Ave | \n", "37.716183 | \n", "-122.440112 | \n", "SF-BB17 | \n", "London St at Geneva Ave | \n", "37.716209 | \n", "-122.440078 | \n", "NaN | \n", "casual | \n", "NaN | \n", "NaN | \n", "electric_bike | \n", "
3944938 | \n", "2021-03-15 18:18:24 | \n", "2021-03-15 18:27:03 | \n", "SF-N27 | \n", "Rhode Island St at 17th St | \n", "37.764478 | \n", "-122.402570 | \n", "SF-N27 | \n", "Rhode Island St at 17th St | \n", "37.764478 | \n", "-122.402570 | \n", "NaN | \n", "casual | \n", "NaN | \n", "NaN | \n", "classic_bike | \n", "
3118591 rows × 15 columns
\n", "\n", " | started_at | \n", "ended_at | \n", "start_station_id | \n", "start_station_name | \n", "start_lat | \n", "start_lng | \n", "end_station_id | \n", "end_station_name | \n", "end_lat | \n", "end_lng | \n", "... | \n", "member_casual | \n", "bike_share_for_all_trip | \n", "rental_access_method | \n", "rideable_type | \n", "date | \n", "year | \n", "month | \n", "day_of_week | \n", "is_weekend | \n", "hour_of_day | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "2019-06-30 18:16:09.7730 | \n", "2019-07-01 16:57:45.5920 | \n", "109 | \n", "17th St at Valencia St | \n", "37.763316 | \n", "-122.421904 | \n", "56 | \n", "Koshland Park | \n", "37.773414 | \n", "-122.427317 | \n", "... | \n", "member | \n", "No | \n", "<NA> | \n", "NaN | \n", "2019-06-30 | \n", "2019 | \n", "6 | \n", "6 | \n", "True | \n", "18 | \n", "
1 | \n", "2019-06-30 18:09:55.8300 | \n", "2019-07-01 14:47:36.6810 | \n", "50 | \n", "2nd St at Townsend St | \n", "37.780526 | \n", "-122.390288 | \n", "101 | \n", "15th St at Potrero Ave | \n", "37.767079 | \n", "-122.407359 | \n", "... | \n", "casual | \n", "No | \n", "<NA> | \n", "NaN | \n", "2019-06-30 | \n", "2019 | \n", "6 | \n", "6 | \n", "True | \n", "18 | \n", "
2 | \n", "2019-06-30 15:40:31.0380 | \n", "2019-07-01 08:13:54.3490 | \n", "23 | \n", "The Embarcadero at Steuart St | \n", "37.791464 | \n", "-122.391034 | \n", "30 | \n", "San Francisco Caltrain (Townsend St at 4th St) | \n", "37.776598 | \n", "-122.395282 | \n", "... | \n", "member | \n", "No | \n", "<NA> | \n", "NaN | \n", "2019-06-30 | \n", "2019 | \n", "6 | \n", "6 | \n", "True | \n", "15 | \n", "
4 | \n", "2019-06-30 17:21:00.0550 | \n", "2019-07-01 06:55:54.9960 | \n", "15 | \n", "San Francisco Ferry Building (Harry Bridges Pl... | \n", "37.795392 | \n", "-122.394203 | \n", "30 | \n", "San Francisco Caltrain (Townsend St at 4th St) | \n", "37.776598 | \n", "-122.395282 | \n", "... | \n", "casual | \n", "No | \n", "<NA> | \n", "NaN | \n", "2019-06-30 | \n", "2019 | \n", "6 | \n", "6 | \n", "True | \n", "17 | \n", "
6 | \n", "2019-06-30 14:31:39.5730 | \n", "2019-07-01 00:53:02.2520 | \n", "6 | \n", "The Embarcadero at Sansome St | \n", "37.804770 | \n", "-122.403234 | \n", "400 | \n", "Buchanan St at North Point St | \n", "37.804272 | \n", "-122.433537 | \n", "... | \n", "casual | \n", "No | \n", "<NA> | \n", "NaN | \n", "2019-06-30 | \n", "2019 | \n", "6 | \n", "6 | \n", "True | \n", "14 | \n", "
5 rows × 21 columns
\n", "\n", " | month | \n", "year | \n", "temp_average | \n", "precipitation | \n", "
---|---|---|---|---|
0 | \n", "1 | \n", "2019 | \n", "12.768817 | \n", "0.374445 | \n", "
1 | \n", "1 | \n", "2020 | \n", "11.568100 | \n", "0.200742 | \n", "
2 | \n", "1 | \n", "2021 | \n", "11.818996 | \n", "0.239252 | \n", "
3 | \n", "2 | \n", "2019 | \n", "9.990079 | \n", "0.652236 | \n", "
4 | \n", "2 | \n", "2020 | \n", "13.534483 | \n", "0.000000 | \n", "
\n", " | is_weekend | \n", "count | \n", "temp_average | \n", "precipitation | \n", "month | \n", "member_casual_member | \n", "hour_of_day_01 | \n", "hour_of_day_02 | \n", "hour_of_day_03 | \n", "hour_of_day_04 | \n", "... | \n", "hour_of_day_16 | \n", "hour_of_day_17 | \n", "hour_of_day_18 | \n", "hour_of_day_19 | \n", "hour_of_day_20 | \n", "hour_of_day_21 | \n", "hour_of_day_22 | \n", "hour_of_day_23 | \n", "rideable_type_docked_bike | \n", "rideable_type_electric_bike | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "0.0 | \n", "3.0 | \n", "11.388889 | \n", "0.0 | \n", "4.0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "... | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "1 | \n", "
1 | \n", "0.0 | \n", "6.0 | \n", "11.388889 | \n", "0.0 | \n", "4.0 | \n", "0 | \n", "1 | \n", "0 | \n", "0 | \n", "0 | \n", "... | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "1 | \n", "
2 rows × 31 columns
\n", "\n", " | 1 | \n", "is_weekend | \n", "temp_average | \n", "precipitation | \n", "month | \n", "member_casual_member | \n", "hour_of_day_01 | \n", "hour_of_day_02 | \n", "hour_of_day_03 | \n", "hour_of_day_04 | \n", "... | \n", "hour_of_day_22^2 | \n", "hour_of_day_22 hour_of_day_23 | \n", "hour_of_day_22 rideable_type_docked_bike | \n", "hour_of_day_22 rideable_type_electric_bike | \n", "hour_of_day_23^2 | \n", "hour_of_day_23 rideable_type_docked_bike | \n", "hour_of_day_23 rideable_type_electric_bike | \n", "rideable_type_docked_bike^2 | \n", "rideable_type_docked_bike rideable_type_electric_bike | \n", "rideable_type_electric_bike^2 | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "1.0 | \n", "0.0 | \n", "11.388889 | \n", "0.0 | \n", "4.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "... | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "
1 | \n", "1.0 | \n", "0.0 | \n", "11.388889 | \n", "0.0 | \n", "4.0 | \n", "0.0 | \n", "1.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "... | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "
2 | \n", "1.0 | \n", "0.0 | \n", "11.388889 | \n", "0.0 | \n", "4.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "0.0 | \n", "0.0 | \n", "... | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "0.0 | \n", "0.0 | \n", "
3 | \n", "1.0 | \n", "0.0 | \n", "11.388889 | \n", "0.0 | \n", "4.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "0.0 | \n", "0.0 | \n", "... | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "
4 | \n", "1.0 | \n", "0.0 | \n", "11.388889 | \n", "0.0 | \n", "4.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "0.0 | \n", "... | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "
5 rows × 496 columns
\n", "