{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Exploring for answers\n", "\n", "This is a continuation of the previous section's [assignment](./nyc-taxi-data-in-winter-and-summer.ipynb), where we took a brief look at the dataset. Now we will take a deeper look at the data.\n", "\n", "Again, the question the client wants answered: **Do yellow taxi passengers in New York City tip drivers more in the winter or summer?**\n", "\n", "Your team is in the [Analyzing](../../data-science/data-science-lifecycle/analyzing.md) stage of the Data Science Lifecycle, where you are responsible for doing exploratory data analysis (EDA) on the dataset. You have been provided a notebook and a dataset that contains 200 taxi transactions from January and July 2019.\n", "\n", "## Instructions\n", "\n", "The data below is from the [Taxi & Limousine Commission](https://docs.microsoft.com/en-us/azure/open-datasets/dataset-taxi-yellow?tabs=azureml-opendatasets). Refer to the [dataset's dictionary](https://www1.nyc.gov/assets/tlc/downloads/pdf/data_dictionary_trip_records_yellow.pdf) and [user guide](https://www1.nyc.gov/assets/tlc/downloads/pdf/trip_record_user_guide.pdf) for more information about the data.\n", "\n", "Use some of the techniques in this section to do your own EDA in the notebook (add cells if you'd like) and answer the following questions:\n", "\n", "- What other influences in the data could affect the tip amount?\n", "- Which columns will most likely not be needed to answer the client's question?\n", "- Based on what has been provided so far, does the data seem to provide any evidence of seasonal tipping behavior?\n", "\n", "Use the cells below to do your own exploratory data analysis." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "\n", "path = '../../data/taxi.csv'\n", "\n", "# Load the CSV file into a DataFrame\n", "df = pd.read_csv(path)\n", "\n", "# Print the DataFrame\n", 
"print(df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Rubric\n", "\n", "Exemplary | Adequate | Needs Improvement\n", "--- | --- | -- |\n", "--- | --- | -- |\n", "\n", "## Acknowledgments\n", "\n", "Thanks to Microsoft for creating the open-source course [Data Science for Beginners](https://github.com/microsoft/Data-Science-For-Beginners). It inspired the majority of the content in this chapter." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.9.13 64-bit", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.9.13" }, "orig_nbformat": 4, "vscode": { "interpreter": { "hash": "aee8b7b246df8f9039afb4144a1f6fd8d2ca17a180786b69acc140d282b71a49" } } }, "nbformat": 4, "nbformat_minor": 2 }