{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import folium" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": " observed \\\n0 Ed L. was salmon fishing with a companion in P... \n1 heh i kinda feel a little dumb that im reporti... \n2 I was on my way to Claremont from Lebanon on R... \n3 I was northeast of Macy Nebraska along the Mis... \n4 While this incident occurred a long time ago, ... \n\n location_details \\\n0 East side of Prince William Sound \n1 the road is off us rt 80, i dont know the exit... \n2 Close to Claremont down 120 not far from Kings... \n3 Latitude & Longitude : 42.158230 -96.344197 \n4 Ward County, Just outside of a the Minuteman T... \n\n county state season \\\n0 Valdez-Chitina-Whittier County Alaska Fall \n1 Warren County New Jersey Fall \n2 Sullivan County New Hampshire Summer \n3 Thurston County Nebraska Spring \n4 Ward County North Dakota Spring \n\n title latitude longitude \\\n0 NaN NaN NaN \n1 NaN NaN NaN \n2 Report 55269: Dawn sighting at Stevens Brook o... 43.41549 -72.33093 \n3 Report 59757: Possible daylight sighting of a ... 42.15685 -96.34203 \n4 Report 751: Hunter describes described being s... 48.25422 -101.31660 \n\n date number ... precip_intensity precip_probability precip_type \\\n0 NaN 1261.0 ... NaN NaN NaN \n1 NaN 438.0 ... NaN NaN NaN \n2 2016-06-07 55269.0 ... 0.001 0.7 rain \n3 2018-05-25 59757.0 ... 0.000 0.0 NaN \n4 2000-04-21 751.0 ... NaN NaN rain \n\n pressure summary uv_index visibility \\\n0 NaN NaN NaN NaN \n1 NaN NaN NaN NaN \n2 998.87 Mostly cloudy throughout the day. 6.0 9.70 \n3 1008.07 Partly cloudy in the morning. 10.0 8.25 \n4 1011.47 Partly cloudy until evening. 
6.0 10.00 \n\n wind_bearing wind_speed location \n0 NaN NaN NaN \n1 NaN NaN NaN \n2 262.0 0.49 POINT(-72.33093000000001 43.415490000000005) \n3 193.0 3.33 POINT(-96.34203000000001 42.15685) \n4 237.0 11.14 POINT(-101.3166 48.254220000000004) \n\n[5 rows x 29 columns]", "text/html": "
<table>
<thead>
<tr><th></th><th>observed</th><th>location_details</th><th>county</th><th>state</th><th>season</th><th>title</th><th>latitude</th><th>longitude</th><th>date</th><th>number</th><th>...</th><th>precip_intensity</th><th>precip_probability</th><th>precip_type</th><th>pressure</th><th>summary</th><th>uv_index</th><th>visibility</th><th>wind_bearing</th><th>wind_speed</th><th>location</th></tr>
</thead>
<tbody>
<tr><th>0</th><td>Ed L. was salmon fishing with a companion in P...</td><td>East side of Prince William Sound</td><td>Valdez-Chitina-Whittier County</td><td>Alaska</td><td>Fall</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1261.0</td><td>...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>
<tr><th>1</th><td>heh i kinda feel a little dumb that im reporti...</td><td>the road is off us rt 80, i dont know the exit...</td><td>Warren County</td><td>New Jersey</td><td>Fall</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>438.0</td><td>...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>
<tr><th>2</th><td>I was on my way to Claremont from Lebanon on R...</td><td>Close to Claremont down 120 not far from Kings...</td><td>Sullivan County</td><td>New Hampshire</td><td>Summer</td><td>Report 55269: Dawn sighting at Stevens Brook o...</td><td>43.41549</td><td>-72.33093</td><td>2016-06-07</td><td>55269.0</td><td>...</td><td>0.001</td><td>0.7</td><td>rain</td><td>998.87</td><td>Mostly cloudy throughout the day.</td><td>6.0</td><td>9.70</td><td>262.0</td><td>0.49</td><td>POINT(-72.33093000000001 43.415490000000005)</td></tr>
<tr><th>3</th><td>I was northeast of Macy Nebraska along the Mis...</td><td>Latitude &amp; Longitude : 42.158230 -96.344197</td><td>Thurston County</td><td>Nebraska</td><td>Spring</td><td>Report 59757: Possible daylight sighting of a ...</td><td>42.15685</td><td>-96.34203</td><td>2018-05-25</td><td>59757.0</td><td>...</td><td>0.000</td><td>0.0</td><td>NaN</td><td>1008.07</td><td>Partly cloudy in the morning.</td><td>10.0</td><td>8.25</td><td>193.0</td><td>3.33</td><td>POINT(-96.34203000000001 42.15685)</td></tr>
<tr><th>4</th><td>While this incident occurred a long time ago, ...</td><td>Ward County, Just outside of a the Minuteman T...</td><td>Ward County</td><td>North Dakota</td><td>Spring</td><td>Report 751: Hunter describes described being s...</td><td>48.25422</td><td>-101.31660</td><td>2000-04-21</td><td>751.0</td><td>...</td><td>NaN</td><td>NaN</td><td>rain</td><td>1011.47</td><td>Partly cloudy until evening.</td><td>6.0</td><td>10.00</td><td>237.0</td><td>11.14</td><td>POINT(-101.3166 48.254220000000004)</td></tr>
</tbody>
</table>
<p>5 rows × 29 columns</p>
" }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv('data/bfro_reports_geocoded.csv')\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "This is great! We have the observed column which has details of the observation, location details of where the observation happened, county, state, season the observation happened in, the report title, latitude and longitude, date etc.\n", "\n", "Let's take a look at the shape of the dataset." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "(4747, 29)" }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.shape" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "4,747 records. Not bad for a little analysis.\n", "\n", "Next let's see what the columns in the dataset are." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "Index(['observed', 'location_details', 'county', 'state', 'season', 'title',\n 'latitude', 'longitude', 'date', 'number', 'classification', 'geohash',\n 'temperature_high', 'temperature_mid', 'temperature_low', 'dew_point',\n 'humidity', 'cloud_cover', 'moon_phase', 'precip_intensity',\n 'precip_probability', 'precip_type', 'pressure', 'summary', 'uv_index',\n 'visibility', 'wind_bearing', 'wind_speed', 'location'],\n dtype='object')" }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.columns" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "Ok, looks like we have plenty of geo data to plot maps with. Let's first start by looking at which states have the most observations." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "Washington 563\nCalifornia 402\nFlorida 292\nOhio 276\nOregon 241\nIllinois 232\nTexas 215\nMichigan 208\nMissouri 141\nGeorgia 121\nColorado 121\nKentucky 112\nPennsylvania 111\nNew York 102\nWest Virginia 100\nArkansas 95\nAlabama 91\nTennessee 91\nOklahoma 85\nArizona 84\nIdaho 81\nWisconsin 80\nNorth Carolina 79\nIndiana 75\nVirginia 72\nMinnesota 69\nNew Jersey 63\nUtah 57\nIowa 56\nMontana 45\nKansas 42\nLouisiana 40\nNew Mexico 40\nSouth Carolina 38\nMaryland 34\nMassachusetts 27\nWyoming 27\nMississippi 21\nAlaska 20\nConnecticut 15\nNebraska 15\nMaine 14\nNew Hampshire 13\nSouth Dakota 11\nVermont 9\nNevada 7\nRhode Island 5\nDelaware 5\nNorth Dakota 4\nName: state, dtype: int64" }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['state'].value_counts()" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "Washington and California sure love their Bigfoot eh?\n", "\n", "For an analysis that's a little bit more intuitive, let's plot this out really quick. I'll use the Altair plotting library but you can use whatever you want." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/html": "\n
\n", "text/plain": "alt.Chart(...)" }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import altair as alt\n", "\n", "bf_states = alt.Chart(df).mark_bar().encode(\n", "    x=alt.X('count(state):Q'),\n", "    y=alt.Y('state:N', sort='-x')\n", ")\n", "\n", "bf_states" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "Some of these location frequencies make sense to me: more heavily wooded states seem to have higher sighting volumes. It is weird to me that Bigfoot doesn't appear in Alaska very frequently. Perhaps that is due to a lower population of people available to spot Bigfoot? Who knows for sure.\n", "\n", "Now let's take a look at which season Sasquatch is seen in." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/html": "\n
\n", "text/plain": "alt.Chart(...)" }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bf_seasons = alt.Chart(df).mark_bar().encode(\n", "    x='count(season):Q',\n", "    y=alt.Y('season:N', sort='-x')\n", ")\n", "\n", "bf_seasons" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "Looks like summer and fall are the big winners for Sassy sightings. I wonder if we see a similar trend in the number of folks camping by season. I bet we do. Just a thought.\n", "\n", "Ok... ready for some maps? Me too!\n", "\n", "First things first, let's get a map centered on North America. For this part, let's use the slimmer dataset provided in `bfro_report_locations.csv`." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": " number title classification \\\n0 637 Report 637: Campers' encounter just after dark... Class A \n1 2917 Report 2917: Family observes large biped from car Class A \n2 7963 Report 7963: Sasquatch walks past window of ho... Class A \n3 9317 Report 9317: Driver on Alcan Highway has noon,... Class A \n4 13038 Report 13038: Snowmobiler has encounter in dee... Class A \n\n timestamp latitude longitude \n0 2000-06-16T12:00:00Z 61.5000 -142.9000 \n1 1995-05-15T12:00:00Z 55.1872 -132.7982 \n2 2004-02-09T12:00:00Z 55.2035 -132.8202 \n3 2004-06-18T12:00:00Z 62.9375 -141.5667 \n4 2004-02-15T12:00:00Z 61.0595 -149.7853 ", "text/html": "
<table>
<thead>
<tr><th></th><th>number</th><th>title</th><th>classification</th><th>timestamp</th><th>latitude</th><th>longitude</th></tr>
</thead>
<tbody>
<tr><th>0</th><td>637</td><td>Report 637: Campers' encounter just after dark...</td><td>Class A</td><td>2000-06-16T12:00:00Z</td><td>61.5000</td><td>-142.9000</td></tr>
<tr><th>1</th><td>2917</td><td>Report 2917: Family observes large biped from car</td><td>Class A</td><td>1995-05-15T12:00:00Z</td><td>55.1872</td><td>-132.7982</td></tr>
<tr><th>2</th><td>7963</td><td>Report 7963: Sasquatch walks past window of ho...</td><td>Class A</td><td>2004-02-09T12:00:00Z</td><td>55.2035</td><td>-132.8202</td></tr>
<tr><th>3</th><td>9317</td><td>Report 9317: Driver on Alcan Highway has noon,...</td><td>Class A</td><td>2004-06-18T12:00:00Z</td><td>62.9375</td><td>-141.5667</td></tr>
<tr><th>4</th><td>13038</td><td>Report 13038: Snowmobiler has encounter in dee...</td><td>Class A</td><td>2004-02-15T12:00:00Z</td><td>61.0595</td><td>-149.7853</td></tr>
</tbody>
</table>
" }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bf_locations = pd.read_csv('data/bfro_report_locations.csv')\n", "bf_locations.head()" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "This is good. We don't want the entire dataset while trying to manipulate some maps." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "", "text/html": "
Make this Notebook Trusted to load map: File -> Trust Notebook
" }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "folium.Map(location=[37.0902, -95.7129], zoom_start=4)" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "What I did above is passed latitude and longitude coordinates and a zoom parameter to the Map object. This tells folium where to center the map and at what zoom. If we set the zoom to 1 the map gets zoomed all the way out." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "", "text/html": "
Make this Notebook Trusted to load map: File -> Trust Notebook
" }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "folium.Map(location=[37.0902, -95.7129], zoom_start=1)" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ " If we set the zoom to 20, we get a nice tight view of an \"Industrial Area\" in Dearing, Kansas. Never been there... seems nice.\n", "\n", "Why did I pick this location you ask? Turns out, this is really close to the center of the United States wher most of the BF sightings in the dataset took place." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "", "text/html": "
Make this Notebook Trusted to load map: File -> Trust Notebook
" }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "folium.Map(location=[37.0902, -95.7129], zoom_start=20)" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "Ok ok, this is all great but let's get some Sass locations in there.\n", "\n", "Next thing we need to do is learn how to place a marker." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "", "text/html": "
Make this Notebook Trusted to load map: File -> Trust Notebook
" }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# This is our map lovingly centered on Dearing Kansas at a zoom of 15\n", "kansas_map = folium.Map(location=[37.0902, -95.7129], zoom_start=15)\n", "# Add the marker to the map...\n", "folium.Marker(location=[37.0902, -95.7129]).add_to(kansas_map)\n", "# Display the map with the marker that's been added\n", "kansas_map" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "There you have it, a blue marker at the public access of some lake in Kansas. Sweet!\n", "\n", "Now to add all the locations that Bigfoot has been spotted at, we need to create a list of tuples containing latitude and longitude coordinates where Bigfoot has been spotted like this:\n", "\n", "`[(lat, long),\n", " (lat, long),\n", " (lat, long)...]`" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": "[(61.5, -142.9),\n (55.1872, -132.7982),\n (55.2035, -132.8202),\n (62.9375, -141.5667),\n (61.0595, -149.7853)]" }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "locations = list(zip(bf_locations['latitude'].values, bf_locations['longitude'].values))\n", "locations[:5] # Showing first 5 locations as an example" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "Now we just need to use these tuples as marker placements for the map." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "pycharm": { "name": "#%%\n" }, "scrolled": true }, "outputs": [ { "data": { "text/plain": "", "text/html": "
Make this Notebook Trusted to load map: File -> Trust Notebook
" }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bf_map = folium.Map(location=[37.0902, -95.7129], zoom_start=4) # Base Map of US (centered)\n", "# For each location in the list of lat and long tuples we created above:\n", "for location in locations:\n", " # Add a marker to the bf_map\n", " folium.Marker(location=[location[0], location[1]]).add_to(bf_map)\n", "\n", "bf_map" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "There you have it. That's a lot of Notorious B.I.G. Foot sightings. Even a couple in the Pacific Ocean! That's neat! At some point we'll need to examine those for validity but for now we'll just assume Bigfoot is better known as Sassy Surfer out there." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "bf_map.save('bf_locations_markers.html')" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "bf_map.save('bf_locations_markers.jpeg')" ] }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.11" } }, "nbformat": 4, "nbformat_minor": 1 }