{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# NYC taxi data in winter and summer\n", "\n", "A client has approached your team for help in investigating taxi customers' seasonal spending habits in New York City. They want to know: **Do yellow taxi passengers in New York City tip drivers more in the winter or summer?**\n", "\n", "Your team is in the [capturing](../../data-science/data-science-lifecycle/introduction.md#Capturing) stage of the Data Science Lifecycle, and you are in charge of handling the dataset. You have been provided with a notebook and data to explore.\n", "\n", "We will use Python to load yellow taxi trip data from the [NYC Taxi & Limousine Commission](https://docs.microsoft.com/en-us/azure/open-datasets/dataset-taxi-yellow?tabs=azureml-opendatasets).\n", "You can also open the taxi data file in a text editor or in spreadsheet software such as Excel." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Instructions\n", "\n", "- Assess whether the data in this dataset can help answer the client's question.\n", "- Explore the [NYC Open Data catalog](https://data.cityofnewyork.us/browse?sortBy=most_accessed&utf8=%E2%9C%93). Identify an additional dataset that could help answer the client's question.\n", "- Write three questions that you would ask the client to clarify and better understand the problem.\n", "\n", "Refer to the [dataset's dictionary](https://www1.nyc.gov/assets/tlc/downloads/pdf/data_dictionary_trip_records_yellow.pdf) and [user guide](https://www1.nyc.gov/assets/tlc/downloads/pdf/trip_record_user_guide.pdf) for more information about the data." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "\n", "# Relative path to the taxi trip data file\n", "path = '../../assets/data/taxi.csv'\n", "\n", "# Load the CSV file into a DataFrame\n", "df = pd.read_csv(path)\n", "\n", "# Print the DataFrame\n", "print(df)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Rubric\n", "\n", "Exemplary | Adequate | Needs Improvement\n", "--- | --- | ---\n", "--- | --- | ---\n", "\n", "## Acknowledgments\n", "\n", "Thanks to Microsoft for creating the open-source course [Data Science for Beginners](https://github.com/microsoft/Data-Science-For-Beginners), which inspired the majority of the content in this chapter." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.9.13 64-bit", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" }, "name": "04-nyc-taxi-join-weather-in-pandas", "notebookId": 1709144033725344, "vscode": { "interpreter": { "hash": "aee8b7b246df8f9039afb4144a1f6fd8d2ca17a180786b69acc140d282b71a49" } } }, "nbformat": 4, "nbformat_minor": 2 }