{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "


\n", "

Interactive Crime Map Algorithm Workbook; 05/06/19

\n", "

By Frogtown Crusader (Abu Nayeem)

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Table of Contents\n", "\n", "* [Intro](#intro)\n", "* [General Data Setup](#setup)\n", "* Decoding Intersection \n", " * [Hardcode Intersection Key Steps](#hard)\n", " * [Intersection Key Setup](#setup_inter)\n", " * [Intersection Table Setup](#setup_table)\n", " * [Merging Intersection Table](#merge_inter)\n", "* Decoding Address\n", " * [Intro/setup](#setup_add)\n", " * [Google Geocoder In-depth explanation](#geo_explain)\n", " * [Annotated Geocoder Example](#ann_workbook)\n", " * [Bare Geocoder Workbook](#workbook)\n", " * [Geocoder Aggregation](#agg)\n", " * [Address Key Preparation](#address_key)\n", " * [Algorithm Maintenance](#maintenance)\n", "* Alternative Algorithm: Absolute Decoder\n", " * [Decoder Steps](#abs_decode)\n", " * [Visualizing Outliers](#out)\n", " * [Decoder Cleanup](#decode_clean)\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Intro \n", "\n", "This notebook documents the steps needed to prepare interactive crime data for Saint Paul and to build a proxy algorithm that assigns geo-coordinates to each incident. This workbook focuses on tuning the proxy algorithm. \n", "\n", "The [Crime Incident Report - Dataset](https://information.stpaul.gov/Public-Safety/Crime-Incident-Report-Dataset/gppb-g9cg) was obtained from the Saint Paul website and is publicly available. The report contains incidents from Aug 14 2014 through the most recent date, as released by the Saint Paul Police Department.\n", "\n", "### Proxy Algorithm\n", "\n", "**Challenge:**\n", "* How can we find the geo-coordinates of a masked location column?\n", "\n", "**Values of Column:**\n", "* '45X University Ave' i.e. Masked Address\n", "* 'Victoria Street and Avon Avenue' i.e. Intersection\n", "\n", "**Strategy:** The algorithm treats the two cases separately: I split the data into Intersection and Masked Address subsets, process each one, and then combine them back. 
(More details as we code!)\n", "* PreCoding: For __Intersection Key__, we will find the geocoordinates of all potential intersections of interest and save them as a Key Table\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Data Setup \n", "\n", "I will be using the Socrata API to download the data. The advantage is that I don't have to store the data locally. The disadvantage is that the API is clunky, and the column order and datatypes can change!" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import warnings\n", "warnings.filterwarnings('ignore')\n", "import matplotlib as mpl\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline \n", "import plotly\n", "from pygeocoder import Geocoder #GeoCoding Algorithm\n", "import folium\n", "from IPython.display import HTML\n", "from IPython.display import display\n", "import json # library to handle JSON files\n", "import requests # library to handle requests\n", "from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe\n", "from sodapy import Socrata" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "ename": "NameError", "evalue": "name 'results' is not defined", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mNameError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mpf\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mpd\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mDataFrame\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfrom_records\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mresults\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 2\u001b[0m 
\u001b[0mpf\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mshape\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 3\u001b[0m \u001b[1;31m#Grid_Index= [66,67, 68, 86, 87,88,89, 90, 91, 92,106,107,108,109,110]\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 4\u001b[0m \u001b[0mcols\u001b[0m\u001b[1;33m=\u001b[0m \u001b[1;33m[\u001b[0m\u001b[1;34m'Block'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;34m'CallDispCode'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;34m'CallDisposition'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;34m'Case'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;34m'Code'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m'Count'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;34m'Date'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;34m'Grid'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;34m'Incident'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;34m'IncType'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;34m'Neighborhood'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;34m'NNum'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;34m'Time'\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[0mpf\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mcolumns\u001b[0m\u001b[1;33m=\u001b[0m \u001b[0mcols\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;31mNameError\u001b[0m: name 'results' is not defined" ] } ], "source": [ "pf = pd.DataFrame.from_records(results)\n", "pf.shape\n", "#Grid_Index= [66,67, 68, 86, 87,88,89, 90, 91, 92,106,107,108,109,110]\n", "cols= ['Block','CallDispCode','CallDisposition','Case','Code', 'Count','Date','Grid','Incident','IncType','Neighborhood','NNum','Time']\n", "pf.columns= cols\n", "pf = pf.astype({\"Case\": int, \"Code\": int, \"Grid\":float, \"NNum\":int,\"Count\":int})\n", "pf=pf.query(\"Grid in (66,67, 68, 86, 87,88,89, 90, 91, 92,106,107,108,109,110)\")\n", "pf.shape\n", "\n", "\n", "['Armatage', 'Lind - Bohanon', 'McKinley', 'Harrison', 'Hawthorne',\n", " 'Como', 
'Folwell', 'Howe', 'Willard - Hay', 'North Loop',\n", " 'Linden Hills', 'Northeast Park', 'Elliot Park', 'Standish',\n", " 'Near - North', 'East Harriet', 'Powderhorn Park', 'Audubon Park',\n", " 'Longfellow', 'Lowry Hill East', 'Marshall Terrace', 'Jordan',\n", " 'Keewaydin', 'Beltrami', 'Northrop', 'Field', 'Hale', 'Logan Park',\n", " 'Sheridan', 'St. Anthony West', 'St. Anthony East', 'Tangletown',\n", " nan, 'Minnehaha', 'Fulton', 'Morris Park', 'Wenonah', 'Kenny',\n", " 'Windom', 'Hiawatha', 'Seward', 'Loring Park', 'Windom Park',\n", " 'East Phillips', 'King Field', 'East Isles', 'West Maka Ska',\n", " 'Central', 'Diamond Lake', 'Lowry Hill',\n", " 'Nicollet Island - East Bank', 'Victory', 'Lyndale', 'Holland',\n", " 'East Bde Maka Ska', 'Marcy Holmes', 'Bottineau', 'Bryant',\n", " 'South Uptown', 'Midtown Phillips', 'Bancroft', 'Whittier',\n", " 'Prospect Park - East River Road', 'Ventura Village',\n", " 'Downtown West', 'Page', 'Phillips West', 'Webber - Camden',\n", " 'Corcoran', 'Waite Park', 'Lynnhurst', 'Columbia Park', 'Cooper',\n", " 'Cedar Riverside', 'Cedar - Isles - Dean',\n", " 'University of Minnesota', 'Cleveland', 'Ericsson',\n", " 'Downtown East', 'Bryn - Mawr', \"Steven's Square - Loring Heights\",\n", " 'Kenwood', 'Shingle Creek', 'Regina', 'Mid - City Industrial',\n", " 'Sumner - Glenwood', 'Camden Industrial']" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "WARNING:root:Requests made without an app_token will be subject to strict throttling limits.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "(26802, 13)\n", "Block object\n", "CallDispCode object\n", "CallDisposition object\n", "Case int32\n", "Code int32\n", "Count int32\n", "Date object\n", "Grid float64\n", "Incident object\n", "IncType object\n", "Neighborhood object\n", "NNum int32\n", "Time object\n", "dtype: object\n" ] } ], "source": [ "import pandas as pd\n", "import numpy 
as np\n", "import warnings\n", "warnings.filterwarnings('ignore')\n", "import matplotlib as mpl\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline \n", "import plotly\n", "from pygeocoder import Geocoder #GeoCoding Algorithm\n", "import folium\n", "from IPython.display import HTML\n", "from IPython.display import display\n", "import json # library to handle JSON files\n", "import requests # library to handle requests\n", "from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe\n", "from sodapy import Socrata\n", "\n", "#New Upload Method Get Information from Socrata API\n", "client = Socrata(\"information.stpaul.gov\", None)\n", "\n", "#Easier to bulk upload\n", "results = client.get(\"gppb-g9cg\", limit=1000000)\n", "df = pd.DataFrame.from_records(results)\n", "# Find Max Date Value\n", "results = client.get(\"gppb-g9cg\", limit=1)\n", "r_max = pd.DataFrame.from_records(results)\n", "\n", "\n", "#NOT USED SPECIFIC ENTRY\n", "# Upload data based on grid; I couldnt figure out Socrata API\n", "#results = client.get(\"gppb-g9cg\", limit=5000, grid=(66))\n", "#r66 = pd.DataFrame.from_records(results)\n", "#results = client.get(\"gppb-g9cg\", limit=5000, grid=(67))\n", "#r67 = pd.DataFrame.from_records(results)\n", "#results = client.get(\"gppb-g9cg\", limit=5000, grid=(68))\n", "#r68 = pd.DataFrame.from_records(results)\n", "#results = client.get(\"gppb-g9cg\", limit=5000, grid=(86))\n", "#r86 = pd.DataFrame.from_records(results)\n", "#results = client.get(\"gppb-g9cg\", limit=5000, grid=(87))\n", "#r87 = pd.DataFrame.from_records(results)\n", "#results = client.get(\"gppb-g9cg\", limit=5000, grid=(88))\n", "#r88 = pd.DataFrame.from_records(results)\n", "#results = client.get(\"gppb-g9cg\", limit=5000, grid=(89))\n", "#r89 = pd.DataFrame.from_records(results)\n", "#results = client.get(\"gppb-g9cg\", limit=5000, grid=(90))\n", "#r90 = pd.DataFrame.from_records(results)\n", "#results = client.get(\"gppb-g9cg\", limit=5000, 
grid=(91))\n", "#r91 = pd.DataFrame.from_records(results)\n", "#results = client.get(\"gppb-g9cg\", limit=5000, grid=(92))\n", "#r92 = pd.DataFrame.from_records(results)\n", "#results = client.get(\"gppb-g9cg\", limit=5000, grid=(106))\n", "#r106 = pd.DataFrame.from_records(results)\n", "#results = client.get(\"gppb-g9cg\", limit=5000, grid=(107))\n", "#r107 = pd.DataFrame.from_records(results)\n", "#results = client.get(\"gppb-g9cg\", limit=5000, grid=(108))\n", "#r108 = pd.DataFrame.from_records(results)\n", "#results = client.get(\"gppb-g9cg\", limit=5000, grid=(109))\n", "#r109 = pd.DataFrame.from_records(results)\n", "#results = client.get(\"gppb-g9cg\", limit=5000, grid=(110))\n", "#r110 = pd.DataFrame.from_records(results)\n", "#Combine all the datatables \n", "#df= pd.concat([r66,r67,r68,r86,r87,r88,r89,r90,r91,r92,r106,r107,r108,r109,r110],ignore_index=True)\n", "\n", "#rename columns [Note the order of Columns have changed]\n", "cols= ['Block','CallDispCode','CallDisposition','Case','Code', 'Count','Date','Grid','Incident','IncType','Neighborhood','NNum','Time']\n", "df.columns= cols\n", "df = df.astype({\"Case\": int, \"Code\": int, \"Grid\":float, \"NNum\":int,\"Count\":int})\n", "#select respective Grids of interest\n", "df=df.query(\"Grid in (66,67, 68, 86, 87,88,89, 90, 91, 92,106,107,108,109,110)\")\n", "\n", "\n", "# Old Method\n", "#df_crime = pd.read_csv('Datasets/Crime_Incident_Report_-_Dataset.csv')\n", "#cols= ['Case','Date','Time','Code','IncType','Incident','Grid','NNum','Neighborhood','Block','CallDispCode','CallDisposition', 'Count']\n", "#df.columns= cols\n", "#print(df_crime.head())\n", "\n", "print(df.shape)\n", "print(df.dtypes)\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create New Variables " ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<p>Table preview omitted (5 rows × 25 columns); the same rows appear in the text/plain output.</p>
" ], "text/plain": [ " Block CallDispCode CallDisposition Case Code Count \\\n", "0 127x seminary av A Advised 18032373 9954 1 \n", "1 117x piercebutler rd A Advised 17275739 9954 1 \n", "2 121x piercebutler rd RR Report Written 17274739 535 1 \n", "3 127x hewitt av A Advised 17182301 9954 1 \n", "4 127x hewitt av A Advised 17156986 9954 1 \n", "\n", " Date Grid Incident \\\n", "0 2018-02-14T00:00:00.000 66.0 Proactive Police Visit \n", "1 2017-11-28T00:00:00.000 66.0 Proactive Police Visit \n", "2 2017-11-27T00:00:00.000 66.0 Burglary \n", "3 2017-08-04T00:00:00.000 66.0 Proactive Police Visit \n", "4 2017-07-07T00:00:00.000 66.0 Proactive Police Visit \n", "\n", " IncType ... DayofWeek Weekend Month Day \\\n", "0 Proactive Police Visit ... 2 0 2 14 \n", "1 Proactive Police Visit ... 1 0 11 28 \n", "2 Burglary, Forced Entry, Day, Commercial ... 0 0 11 27 \n", "3 Proactive Police Visit ... 4 0 8 4 \n", "4 Proactive Police Visit ... 4 0 7 7 \n", "\n", " DayYear Day_Max TimeHour Hour LateNight Intersection \n", "0 45 121 2019-05-11 21:43:46 21 0 0 \n", "1 332 121 2019-05-11 20:50:00 20 0 0 \n", "2 331 121 2019-05-11 13:30:00 13 0 0 \n", "3 216 121 2019-05-11 20:24:51 20 0 0 \n", "4 188 121 2019-05-11 23:58:06 23 1 0 \n", "\n", "[5 rows x 25 columns]" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Add Time Variables\n", "df= df[df.Case != 18254093] #messed up time variable\n", "\n", "#Convert Date to Datetime!\n", "from datetime import datetime\n", "\n", "df['DateTime']= pd.to_datetime(df['Date']) # Create new column called DateTime\n", "df['Year']= df['DateTime'].dt.year #create year column\n", "df['DayofWeek']=df['DateTime'].dt.dayofweek #create day of the week column where default 0=Monday\n", "df['Weekend'] = df['DayofWeek'].apply(lambda x: 1 if (x>4) else 0) #Create a weekend category\n", "df['Month'] = df['DateTime'].dt.month # Create Month Category\n", "df['Day'] = df['DateTime'].dt.day #Create Day of the Current 
month\n", "df['DayYear'] = df['DateTime'].dt.dayofyear #Create Day of the year (1-366)\n", "\n", "#Find max-day:\n", "r_max['DateTime']= pd.to_datetime(r_max['date'])\n", "r_max['DayYear'] = r_max['DateTime'].dt.dayofyear #Create Day of the year (1-366)\n", "df['Day_Max'] = r_max.iloc[0,-1] #selects the most recent day (last column is DayYear); NOTE: the data is sorted chronologically\n", "\n", "#Hour Data\n", "df['TimeHour']= pd.to_datetime(df['Time'])\n", "df['Hour'] = df['TimeHour'].dt.hour.astype(int) #Create Hour Column\n", "df['LateNight'] = df['Hour'].apply(lambda x: 1 if (x>21 or x<5) else 0) #Late-night designation: 10 PM to 5 AM (hours 22-23 and 0-4)\n", "\n", "#Creating the intersection Column. Note: the Block column has the address information\n", "df.Block = df.Block.astype(str) #first change the type to string\n", "df['Block']= df['Block'].str.lower() #lowercase string to create uniformity\n", "\n", "#While scanning the data I noticed that all intersections had \"&\" \n", "df['Intersection'] = df['Block'].apply(lambda x: 1 if '&' in x else 0) #1 if the row is an intersection, else 0\n", "\n", "df.head(5)" ] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "### Hardcoding Intersection Key \n", "\n", "0) Create a data sheet (e.g. Excel or Google Sheets [preferred]) \n", " * Set up four columns: IntersectionID (used as the index), IntersectionName1, IntersectionName2, and the combined geocoordinates (latitude and longitude in one cell) \n", " * A grid column is not included because that data can be messy, and it's not clear which grid a boundary intersection belongs to\n", " * Do not worry about lowercase and uppercase\n", " * The actual location data is not consistent in the order in which it names an intersection, so I have written post-processing code so you don't need to enter each intersection a second time\n", " * Avoid double-counting when entering the data! Don't worry, we will perform some debugging and error checks\n", " \n", "1) List all possible intersections of interest. 
Use the [police grid boundaries](https://information.stpaul.gov/Public-Safety/Saint-Paul-Police-Grid-Shapefile/ykwt-ie3e) when selecting intersections of interest.\n", " * To address the boundary problem, I've included the police grids neighboring Frogtown to ensure all relevant points are mapped. However, the boundary problem still exists at the outer boundaries.\n", "\n", "2) Strategy for Setup: Urban areas are often organized in a grid, so intersections can follow a pattern where two avenues have almost the same intersection pairs. You can copy and paste some of the columns, etc. See picture below\n", " * Make sure you name the intersections in the order in which you actually scroll down the map to enter data. (Saves a lot of time in data entry)\n", " * Go to Google Maps in a web browser and point to an intersection until some geocoordinates come up, then click on the geocoordinates hyperlink. From that window, copy and paste the geocoordinates. You could manually enter the latitude and longitude separately, but that makes the process much more tedious. The post-processing code can split them much more readily\n", "\n", "3) Since the intersection key is a static document, I recommend exporting it as a CSV and saving it to your machine" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setup Intersection Key \n", "\n", "In this step, I'll load the Intersection Key file and prepare it for joining with the primary dataset. The primary key is the name of the intersection; the format I decided to go with is 'name1_name2'. However, 'name2_name1' is also valid and should have the same coordinates. The final dataframe has an IndexKey column used for the join and an OutputKey column that gives each intersection a single canonical name. \n", "\n", "**Note:** I discovered the bugs in the code during the data exploration phase." ] }, { "cell_type": "code", "execution_count": 402, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table><thead><tr><th></th><th>IntersectionID</th><th>Intersection1</th><th>Intersection2</th><th>Coordinates</th></tr></thead><tbody><tr><th>0</th><td>1</td><td>lexington</td><td>front</td><td>44.970295, -93.146572</td></tr><tr><th>1</th><td>2</td><td>lexington</td><td>stinson</td><td>44.969316, -93.146529</td></tr></tbody></table>
" ], "text/plain": [ " IntersectionID Intersection1 Intersection2 Coordinates\n", "0 1 lexington front 44.970295, -93.146572\n", "1 2 lexington stinson 44.969316, -93.146529" ] }, "execution_count": 402, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Setting up the Coordinate Key\n", "\n", "#Load the hand-built key (note this prep can be done in Excel as well)\n", "df_key= pd.read_csv('Datasets/Frog_key - Sheet1.csv')\n", "\n", "#convert to lowercase\n", "df_key['Intersection1']= df_key['Intersection1'].str.lower()\n", "df_key['Intersection2']= df_key['Intersection2'].str.lower()\n", "#remove spaces inside the names; found out when debugging!\n", "df_key['Intersection1']= df_key['Intersection1'].str.replace(' ', '', regex=True)\n", "df_key['Intersection2']= df_key['Intersection2'].str.replace(' ', '', regex=True)\n", "\n", "df_key.head(2)\n" ] }, { "cell_type": "code", "execution_count": 403, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table><thead><tr><th></th><th>Intersection1</th><th>Intersection2</th><th>Coordinates</th><th>Int1_2</th><th>Int2_1</th><th>OutputKey</th></tr></thead><tbody><tr><th>0</th><td>lexington</td><td>front</td><td>44.970295, -93.146572</td><td>lexington_front</td><td>front_lexington</td><td>lexington_front</td></tr><tr><th>1</th><td>lexington</td><td>stinson</td><td>44.969316, -93.146529</td><td>lexington_stinson</td><td>stinson_lexington</td><td>lexington_stinson</td></tr></tbody></table>
" ], "text/plain": [ " Intersection1 Intersection2 Coordinates Int1_2 \\\n", "0 lexington front 44.970295, -93.146572 lexington_front \n", "1 lexington stinson 44.969316, -93.146529 lexington_stinson \n", "\n", " Int2_1 OutputKey \n", "0 front_lexington lexington_front \n", "1 stinson_lexington lexington_stinson " ] }, "execution_count": 403, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Create a dataframe and new columns for both possible name orders\n", "A=df_key[['Intersection1','Intersection2', 'Coordinates']].copy() #copy to avoid SettingWithCopyWarning\n", "A['Int1_2']= A['Intersection1']+ '_' + A['Intersection2'] #int1_int2\n", "A['Int2_1']= A['Intersection2']+ '_' + A['Intersection1'] #int2_int1\n", "A['OutputKey']= A['Int1_2'] #create an output key based on one intersection pair\n", "A.head(2)\n", "\n", "#A.query('Int==\"marshall_victoria\"')\n", "#A.query('Intersection1==\"marshall\"')\n", "#Intersection_key.query('IndexKey==\"marshall_victoria\"')\n" ] }, { "cell_type": "code", "execution_count": 404, "metadata": {}, "outputs": [], "source": [ "# Take a subset of data and rename the Int columns to IndexKey\n", "H1=A[['Int1_2','Coordinates','OutputKey']]\n", "H1.columns= ['IndexKey','Coordinates','OutputKey']\n", "H2=A[['Int2_1','Coordinates','OutputKey']]\n", "H2.columns= ['IndexKey','Coordinates','OutputKey']\n", "\n", "#Finally, stack the two tables so either name order can be looked up\n", "#(pd.concat replaces DataFrame.append, which was removed in pandas 2.0)\n", "Intersection_key=pd.concat([H1, H2], ignore_index=True)\n", "Intersection_key.tail(2)\n", "Intersection_key.to_csv('Intersection_key_clean.csv', encoding='utf-8', index=False)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setup Intersection DataTable \n", "\n", "We have prepared the key, but we still need to prepare the data table so it can be matched on the IndexKey. This will require several string splits."
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The intersection table dimension are (4948, 25)\n", "182 dale st n & edmund\n", "257 milton st n & thomas\n", "258 thomas av & stalbans\n", "266 energy la & norris\n", "315 arundel st & university\n", "368 hubbard av & syndicate\n", "416 thomas av & milton\n", "467 thomas av & milton\n", "539 chatsworth st n & university\n", "601 dale st n & marshall\n", "Name: Block, dtype: object\n" ] } ], "source": [ "# Create a new dataframe containing only intersections\n", "dfI=df.query('Intersection ==1').copy() #copy so new columns can be added without warnings\n", "print('The intersection table dimension are ' + str(dfI.shape))\n", "print(dfI.Block.head(10))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Strategy:** Do you see a pattern?\n", "\n", "1) Split the string into two sections on the first ' '; the first section holds the first intersection name\n", "\n", "2) Split the string into two sections on '& '; the second section holds the second intersection name\n", "\n", "3) Note: The street type (e.g. av, st) and direction do not matter for our purposes, since it is unlikely that a street and an avenue with the same name share the same paired intersection" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<p>Table preview omitted (3 rows × 30 columns); the same rows appear in the text/plain output.</p>
" ], "text/plain": [ " Case Date Time Code \\\n", "0 19088395 04/30/2019 2019-04-30T08:00:00.000 861 \n", "1 19088075 04/29/2019 2019-04-29T21:40:24.000 9954 \n", "2 19088071 04/29/2019 2019-04-29T21:39:13.000 9954 \n", "\n", " IncType Incident Grid NNum \\\n", "0 Assault, Domestic, Opposite Sex Simple Asasult Dom. 89.0 7 \n", "1 Proactive Police Visit Proactive Police Visit 87.0 7 \n", "2 Proactive Police Visit Proactive Police Visit 88.0 7 \n", "\n", " Neighborhood Block ... Day_Max \\\n", "0 7 - Thomas/Dale(Frogtown) dale st n & edmund ... 120 \n", "1 7 - Thomas/Dale(Frogtown) milton st n & thomas ... 120 \n", "2 7 - Thomas/Dale(Frogtown) thomas av & stalbans ... 120 \n", "\n", " TimeHour Hour LateNight Intersection Inter2 Inter1 \\\n", "0 2019-04-30 08:00:00 8 0 1 edmund dale \n", "1 2019-04-29 21:40:24 21 0 1 thomas milton \n", "2 2019-04-29 21:39:13 21 0 1 stalbans thomas \n", "\n", " IndexKey Coordinates OutputKey \n", "0 dale_edmund 44.958439, -93.126376 dale_edmund \n", "1 milton_thomas 44.959361, -93.139031 milton_thomas \n", "2 thomas_stalbans 44.959350, -93.128908 stalbans_thomas \n", "\n", "[3 rows x 30 columns]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Split the strings\n", "new=dfI['Block'].str.split(\"& \", n = 1, expand = True) \n", "dfI['Inter2']= new[1]\n", "new=dfI['Block'].str.split(\" \", n = 1, expand = True) #Note the code specifies the first time a space occured\n", "dfI['Inter1']=new[0]\n", "\n", "#Create the IndexKey; recall we prepared the IntersectionKey where we consider any order\n", "dfI['IndexKey']= dfI['Inter1']+ '_' + dfI['Inter2']\n", "dfI.reset_index()\n", "dfI=pd.merge(dfI, Intersection_key, on='IndexKey', how='left')\n", "dfI.head(3)\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Merging Intersection Data \n", "\n", "Check which intersections have not matched and the respective count" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { 
"scrolled": true }, "outputs": [], "source": [ "#Check if any missing values\n", "\n", "#find the subset of rows whose IndexKey did not match the key table\n", "B= dfI[dfI['Coordinates'].isnull()]\n", "C=B[['Neighborhood','IndexKey']]\n", "#C=C.query('Neighborhood != \"7 - Thomas/Dale(Frogtown)\"')\n", "#C.groupby(['Neighborhood','IndexKey']).sum()\n", "#C.IndexKey.value_counts()" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['Case', 'Date', 'Time', 'Code', 'IncType', 'Incident', 'Grid', 'NNum',\n", " 'Neighborhood', 'Block', 'CallDispCode', 'CallDisposition', 'Count',\n", " 'DateTime', 'Year', 'DayofWeek', 'Weekend', 'Month', 'Day', 'DayYear',\n", " 'Day_Max', 'TimeHour', 'Hour', 'LateNight', 'Intersection', 'Inter2',\n", " 'Inter1', 'IndexKey', 'Coordinates', 'OutputKey', 'Latitude',\n", " 'Longitude'],\n", " dtype='object')" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Drop rows with missing coordinates\n", "dfI=dfI[dfI['Coordinates'].notnull()].copy() #copy to avoid SettingWithCopyWarning\n", "\n", "# Separate Latitude and Longitude \n", "new=dfI['Coordinates'].str.split(\",\", n = 1, expand = True) \n", "# new[0] holds the latitude text and new[1] the longitude text\n", "dfI['Latitude']= pd.to_numeric(new[0]) #pd.to_numeric converts the text to float\n", "dfI['Longitude']= pd.to_numeric(new[1])\n" ] }, { "cell_type": "code", "execution_count": 158, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<p>Table preview omitted (5 rows × 27 columns); the same rows appear in the text/plain output.</p>
" ], "text/plain": [ " Case Date Time Code IncType \\\n", "0 19078070 04/17/2019 15:05 9954 Proactive Police Visit \n", "1 19078068 04/17/2019 15:03 9954 Proactive Police Visit \n", "2 19078110 04/17/2019 16:08 9954 Proactive Police Visit \n", "3 19078182 04/17/2019 17:22 9954 Proactive Police Visit \n", "4 19078441 04/17/2019 23:48 9954 Proactive Police Visit \n", "\n", " Incident Grid NNum Neighborhood \\\n", "0 Proactive Police Visit 109.0 8 8 - Summit/University \n", "1 Proactive Police Visit 110.0 8 8 - Summit/University \n", "2 Proactive Police Visit 89.0 7 7 - Thomas/Dale(Frogtown) \n", "3 Proactive Police Visit 87.0 7 7 - Thomas/Dale(Frogtown) \n", "4 Proactive Police Visit 87.0 7 7 - Thomas/Dale(Frogtown) \n", "\n", " Block ... Month Day DayYear Day_Max TimeHour \\\n", "0 arundel_central ... 4 17 107 107 2019-04-30 15:05:00 \n", "1 farrington_fuller ... 4 17 107 107 2019-04-30 15:03:00 \n", "2 mackubin_university ... 4 17 107 107 2019-04-30 16:08:00 \n", "3 lexington_university ... 4 17 107 107 2019-04-30 17:22:00 \n", "4 milton_thomas ... 4 17 107 107 2019-04-30 23:48:00 \n", "\n", " Hour LateNight Intersection Latitude Longitude \n", "0 15 0 1 44.953081 -93.118654 \n", "1 15 0 1 44.953989 -93.113264 \n", "2 16 0 1 44.955842 -93.121236 \n", "3 17 0 1 44.955826 -93.146539 \n", "4 23 1 1 44.959361 -93.139031 \n", "\n", "[5 rows x 27 columns]" ] }, "execution_count": 158, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Final Load\n", "dfI['Block']=dfI['OutputKey'] #for practical purposes it makes sense\n", "Drop_col=['Inter2','Inter1', 'IndexKey', 'Coordinates', 'OutputKey']\n", "dfI_Final=dfI.drop(Drop_col, axis=1,)\n", "dfI_Final.head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Figuring out the Address Key \n", "\n", "So how do we get geocoordinates from masked address?\n", "\n", "The intended strategy was to fill in the missing values with numericals and have a geo-coder application convert it to coordinates. 
First, I tried 'Nominatim', a built-in geocoder option. It failed badly, even for actual addresses. Google's API, on the other hand, is very good at approximating addresses, including those that don't necessarily exist. It was not perfect, but it had a success rate of around 96%. The drawback of the Google API is that the process is not fully automated, or at least I am not aware of how to automate it.\n", "\n", "Note: If I truly wanted to go overkill, I could create a centroid boundary within which the address must be located, based on the geo-coordinates of the intersections previously mapped out. An address outside the boundary would then be flagged as incorrectly matched.\n", "\n", "#### Basic Setup\n", "\n", "I will be applying a geocoder function to an entire column. It took some tinkering to wrap the Google API in a function where the API key is supplied as a default argument. I borrowed the code from elsewhere, but below I break down the output and explain how to get the raw data and other information.\n", "\n", "Let's look at the output for a single location: '490 Grotto Street, St. 
Paul, MN, 55103'" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'results': [{'address_components': [{'long_name': '490',\n", " 'short_name': '490',\n", " 'types': ['street_number']},\n", " {'long_name': 'Grotto Street North',\n", " 'short_name': 'Grotto St N',\n", " 'types': ['route']},\n", " {'long_name': 'West Frogtown',\n", " 'short_name': 'West Frogtown',\n", " 'types': ['neighborhood', 'political']},\n", " {'long_name': 'Saint Paul',\n", " 'short_name': 'St Paul',\n", " 'types': ['locality', 'political']},\n", " {'long_name': 'Ramsey County',\n", " 'short_name': 'Ramsey County',\n", " 'types': ['administrative_area_level_2', 'political']},\n", " {'long_name': 'Minnesota',\n", " 'short_name': 'MN',\n", " 'types': ['administrative_area_level_1', 'political']},\n", " {'long_name': 'United States',\n", " 'short_name': 'US',\n", " 'types': ['country', 'political']},\n", " {'long_name': '55104', 'short_name': '55104', 'types': ['postal_code']}],\n", " 'formatted_address': '490 Grotto St N, St Paul, MN 55104, USA',\n", " 'geometry': {'location': {'lat': 44.9560645, 'lng': -93.1311927},\n", " 'location_type': 'RANGE_INTERPOLATED',\n", " 'viewport': {'northeast': {'lat': 44.95741348029149,\n", " 'lng': -93.12984371970849},\n", " 'southwest': {'lat': 44.95471551970849, 'lng': -93.13254168029151}}},\n", " 'place_id': 'Eic0OTAgR3JvdHRvIFN0IE4sIFN0IFBhdWwsIE1OIDU1MTA0LCBVU0EiGxIZChQKEgnBnPtqgir2hxGHkz1OfBptEhDqAw',\n", " 'types': ['street_address']}],\n", " 'status': 'OK'}" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#First component URL\n", "geocode_url = \"https://maps.googleapis.com/maps/api/geocode/json?address=490 Grotto Street, St. 
Paul, MN, 55103\"\n", "# Second component of the URL: append the API key as the &key= parameter\n", "geocode_url = geocode_url + \"FILLME\"\n", "# Send the request\n", "output = requests.get(geocode_url)\n", "# Results will be in JSON format - convert to dict using requests functionality\n", "output = output.json()\n", "\n", "output" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is a messy JSON response, but it is organized as nested dictionaries and lists. Most of the information sits under the first key, 'results', which holds a list. Each item in that list is a dictionary, which may itself contain nested dictionaries and lists.\n", "\n", "### Explaining geo-coder/json file \n", "\n", "I will briefly explain what a JSON file is and how we use Python to extract information from it. JSON is a text format for exchanging structured data on the web; like HTML, it is what a server sends back in response to a request. The geocoder is essentially retrieving the information I would get by entering the address in the Google Maps application. I annotate the structure of the JSON response below. \n", "\n", "**Note**: \n", "\n", "1) The results have a hidden list layer, accessed with '[0]'. The address selected here returned only one listing; if there were two listings, the second would be selected with '[1]'. Even when only one entry is returned, we still need to select the first list item. 
\n", "\n", "2) The position of the zip code entry within address_components varies, so selecting it with a fixed index either crashes the code or returns the wrong field. Selecting the final index, [-1], usually returns the zip code entry. Note: I found this out from debugging. \n", " * I was still getting the incorrect index in some cases, so I added a conditional that searches for the entry whose type is 'postal_code'." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# dictionary (top level)\n", "{'results':\n", "    # list: the hidden layer. One item per match, so even a single match is accessed with [0]\n", "    [\n", "        # dictionary\n", "        {\n", "            'address_components':\n", "            # list\n", "            [\n", "                # dictionary\n", "                {\n", "                    'long_name': '490',\n", "                    'short_name': '490',\n", "                    'types': ['street_number']\n", "                },  # end list entry 1\n", "                {'long_name': 'Grotto Street North',\n", "                 'short_name': 'Grotto St N',\n", "                 'types': ['route']},\n", "                {'long_name': 'West Frogtown',\n", "                 'short_name': 'West Frogtown',\n", "                 'types': ['neighborhood', 'political']},\n", "                {'long_name': 'Saint Paul',\n", "                 'short_name': 'St Paul',\n", "                 'types': ['locality', 'political']},\n", "                {'long_name': 'Ramsey County',\n", "                 'short_name': 'Ramsey County',\n", "                 'types': ['administrative_area_level_2', 'political']},\n", "                {'long_name': 'Minnesota',\n", "                 'short_name': 'MN',\n", "                 'types': ['administrative_area_level_1', 'political']},\n", "                {'long_name': 'United States',\n", "                 'short_name': 'US',\n", "                 'types': ['country', 'political']},\n", "                {'long_name': '55104', 'short_name': '55104', 'types': ['postal_code']}\n", "            ],  # end list\n", "\n", "            'formatted_address': '490 Grotto St N, St Paul, MN 55104, USA',\n", "\n", "            'geometry':\n", "            # dictionary\n", "            {'location':\n", "                # dictionary\n", "                {'lat': 44.9560645, 'lng': -93.1311927},\n", "             'location_type': 'RANGE_INTERPOLATED',\n", "             'viewport': {'northeast': {'lat': 44.95741348029149, 'lng': -93.12984371970849},\n", "                          'southwest': {'lat': 44.95471551970849, 'lng': -93.13254168029151}\n", "                         }\n", "            },\n", "            'place_id': 'Eic0OTAgR3JvdHRvIFN0IE4sIFN0IFBhdWwsIE1OIDU1MTA0LCBVU0EiGxIZChQKEgnBnPtqgir2hxGHkz1OfBptEhDqAw',\n", "            'types': ['street_address']\n", "        }\n", "    ],\n", "    # second top-level dictionary entry\n", "    'status': 'OK'\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that there are many nested dictionaries and lists in the data:\n", "* To select a list value, we point to the list index with []\n", "* To select a dictionary value, we use .get(dictionary_key)" ] }, { "cell_type": "code", "execution_count": 110, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The latitude is 44.9560645\n", "The zip is 55104\n" ] } ], "source": [ "#Get Latitude\n", "print('The latitude is ' + str(output.get('results')[0].get('geometry').get('location').get('lat')))\n", "#Get ZipCode\n", "print('The zip is ' + str(output.get('results')[0].get('address_components')[-1].get('long_name'))) #selected the last entry in address_components\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import geocoder \n", "import requests\n", "#geocoder.google(\"1022 edmund avenue west, St. Paul, MN, 55104\", key=API_KEY)\n", "\n", "def get_google_results(address, api_key='FillMe', return_full_response=False):\n", "    \"\"\"\n", "    Get geocode results from Google Maps Geocoding API.\n", "\n", "    Note that in the case of multiple Google geocode results, this function returns details of the FIRST result.\n", "\n", "    @param address: String address as accurate as possible. For Example \"18 Grafton Street, Dublin, Ireland\"\n", "    @param api_key: String API key if present from google. \n", "                    If supplied, requests will use your allowance from the Google API. If not, you\n", "                    will be limited to the free usage of 2500 requests per day.\n", "    @param return_full_response: Boolean to indicate if you'd like to return the full response from google. This\n", "                    is useful if you'd like additional location details for storage or parsing later.\n", "    \"\"\"\n", "    # Set up your Geocoding url\n", "    geocode_url = \"https://maps.googleapis.com/maps/api/geocode/json?address={}\".format(address)\n", "    if api_key is not None:\n", "        geocode_url = geocode_url + \"&key={}\".format(api_key)\n", "\n", "    # Ping google for the results:\n", "    results = requests.get(geocode_url)\n", "    # Results will be in JSON format - convert to dict using requests functionality\n", "    results = results.json()\n", "#    results =results['formatted_address']\n", "\n", "    # if there's no results or an error, return empty results.\n", "    if len(results['results']) == 0:\n", "        output = {\n", "            \"formatted_address\" : None,\n", "            \"latitude\": None,\n", "            \"longitude\": None,\n", "            \"zip\": None\n", "        }\n", "    else:\n", "        answer = results['results'][0]\n", "        # The position of the postal code varies, so search for it by type;\n", "        # zip_code stays None when the response has no postal_code component\n", "        zip_code = None\n", "        for component in answer.get('address_components'):\n", "            if 'postal_code' in component.get('types'):\n", "                zip_code = component.get('long_name')\n", "        output = {\n", "            \"formatted_address\" : answer.get('formatted_address'),\n", "            \"latitude\": answer.get('geometry').get('location').get('lat'),\n", "            \"longitude\": answer.get('geometry').get('location').get('lng'),\n", "            \"zip\": zip_code\n", "        }\n", "\n", "    # Append some other details: \n", "    output['input_string'] = address\n", "    # number of results returned\n", "    output['number_of_results'] = len(results['results'])\n", "    # was the request successfully executed\n", "    output['status'] = results.get('status')\n", "    if return_full_response is True:\n", "        output['response'] = results\n", "\n", "    return output\n", "\n", "get_google_results('500 Grotto St N, Saint Paul, MN, 55104')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Geocoder Algorithm Annotated Steps " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The limitation of the Google API is 
that requests time out when many entries are submitted at once (100 is too many), so I run the algorithm in batches of about 50. Along with the geo-coordinates, the Google API returns the formatted address it matched, which can be used to determine whether it is a good match.\n", "\n", "Supplying the correct zip code is very important for the accuracy of the algorithm.\n" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(222, 4)" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dfW=df.query('Intersection==0')\n", "dfW=dfW.query('Grid in [86.0]').copy() #Perform algorithm separately for each grid; copy so new columns are added to an independent frame\n", "\n", "#I tried different specifications to improve accuracy; in retrospect, I should have given the xx value a different value just for graphical purposes\n", "dfW['Block1']= dfW.Block.str.replace('xx','05')\n", "dfW['Block1']= dfW.Block1.str.replace('x ','0 ') #notice the space\n", "\n", "#special case: the previous replacement also turns 'ravoux ' into 'ravou0 ', so undo it\n", "dfW['Block1']= dfW.Block1.str.replace('ravou0','ravoux')\n", "\n", "# Set it up in address format, using a general guess at the zip code for this grid\n", "dfW['Address']= dfW['Block1'] + ', Saint Paul, MN, 55104'\n", "\n", "# Create a datatable showing original, transformed, and full address\n", "A=dfW[['Grid','Block','Block1','Address']].sort_values('Address')\n", "#Note: the groupby gets unique addresses\n", "A=A.groupby(['Grid','Address','Block']).count()\n", "A=A.reset_index()\n", "A.shape\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Based on the above dimensions, we create bins of about 50 entries each " ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [], "source": [ "#Creating bins of about 50 rows (.loc slices include both endpoints, so B holds rows 0 through 50)\n", "B=A.loc[0:50,:]\n", "C=A.loc[51:100,:]\n", "D=A.loc[101:150,:]\n", "E=A.loc[151:200,:]\n", "F=A.loc[201:,]\n", "#G=A.loc[251:,]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When running the algorithm, wait until one section finishes and 
then run the next one. We save the output in a new column called \"Coordinates\". " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We create a function that runs the algorithm and displays the results. We are looking for results that don't match. You should write down the indexes of the mismatches. Notice the column Fail: despite its name, it is True when the returned zip code is in the expected list and False when it is not. A False value therefore flags an entry that did not fit the zip code parameters, which may also mean the raw entry was misplaced. " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "\n", "def run_geo_algorithm(data,zipcode):\n", "    # Geocode every address, then unpack the returned dictionary into columns\n", "    data['Coordinates']= data['Address'].apply(get_google_results)\n", "    data['For_Address'] = data['Coordinates'].apply(lambda x: x['formatted_address'])\n", "    data['Latitude'] = data['Coordinates'].apply(lambda x: x['latitude']) \n", "    data['Longitude'] = data['Coordinates'].apply(lambda x: x['longitude']) \n", "    data['Zip'] = data['Coordinates'].apply(lambda x: x['zip']) \n", "    data['Results'] = data['Coordinates'].apply(lambda x: x['number_of_results']) \n", "    # Despite the name, Fail is True when the returned zip is in the expected list\n", "    data['Fail'] = data['Coordinates'].apply(lambda x: x['zip'] in zipcode) \n", "    return data[['For_Address', 'Block','Zip','Results','Fail']]\n", "\n" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " For_Address Block Zip \\\n", "201 474 Lexington Pkwy N, St Paul, MN 55104, USA 59x lexington pa n 55104 \n", "202 590 N Syndicate St, St Paul, MN 55104, USA 59x syndicate st n 55104 \n", "203 600 N Dunlap St, St Paul, MN 55104, USA 60x dunlap st n 55104 \n", "204 600 N Griggs St, St Paul, MN 55104, USA 60x griggs st n 55104 \n", "205 600 Hamline Ave N, St Paul, MN 55104, USA 60x hamline av n 55104 \n", "206 474 Lexington Pkwy N, St Paul, MN 55104, USA 60x lexington pa n 55104 \n", "207 610 N Dunlap St, St Paul, MN 55104, USA 61x dunlap st n 55104 \n", "208 610 N Griggs St, St Paul, MN 55104, USA 61x griggs st n 55104 \n", "209 610 Lexington Pkwy N, St Paul, MN 55104, USA 61x lexington pa n 55104 \n", "210 620 Lexington Pkwy N, St Paul, MN 55104, USA 62x lexington pa n 55104 \n", "211 St Paul, MN 55104, USA 63x lexington pa n 55104 \n", "212 St Paul, MN 55104, USA 64x lexington pa n 55104 \n", "213 650 Hamline Ave N, St Paul, MN 55104, USA 65x hamline av n 55104 \n", "214 1096 Grand Ave, St Paul, MN 55105, USA 65x lexington pa n 55105 \n", "215 660 Lexington Pkwy N, St Paul, MN 55104, USA 66x lexington pa n 55104 \n", "216 670 Robert St N, St Paul, MN 55101, USA 67x lexington pa n 55101 \n", "217 690 Bedford St, St Paul, MN 55130, USA 69x bedford st 55130 \n", "218 700 N Griggs St, St Paul, MN 55104, USA 70x griggs st n 55104 \n", "219 700 Hamline Ave N, St Paul, MN 55104, USA 70x hamline av n 55104 \n", "220 1096 Grand Ave, St Paul, MN 55105, USA 70x lexington pa n 55105 \n", "221 720 Hamline Ave N, St Paul, MN 55104, USA 72x hamline av n 55104 \n", "\n", " Results Fail \n", "201 1 True \n", "202 1 True \n", "203 1 True \n", "204 2 True \n", "205 1 True \n", "206 1 True \n", "207 1 True \n", "208 1 True \n", "209 1 True \n", "210 1 True \n", "211 1 True \n", "212 1 True \n", "213 1 True \n", "214 1 False \n", "215 1 True \n", "216 1 False \n", "217 1 False \n", "218 1 True \n", "219 1 True \n", "220 2 False \n", "221 1 True " ] }, 
"execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#run it for each letter\n", "run_geo_algorithm(F,['55104']) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We then set up code collecting all the successful matches and the unsuccessful matches" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [], "source": [ "#C_mess=B.loc[[0,19],:]\n", "#B= B.drop([0,19])\n", "E_mess=E.loc[[195,198],:]\n", "E= E.drop([195,198])\n", "F_mess=F.loc[[201,206,210,211,212,214,220],:]\n", "F= F.drop([201,206,210,211,212,214,220])\n", "\n", "Address_Mess_86= pd.concat([E_mess, F_mess], ignore_index=True)\n", "Add_Com=pd.concat([B,C,D,E,F], ignore_index=True)\n", "CC= Add_Com[['Grid','Block','For_Address','Latitude','Longitude','Fail']]\n", "CC_86=CC.reset_index()\n", "CC_86['index1'] = CC_86.index\n" ] }, { "cell_type": "code", "execution_count": 260, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

91 rows × 8 columns

\n", "
" ], "text/plain": [ " index Grid Block \\\n", "0 1 91.0 x como av \n", "1 3 91.0 x university av w \n", "2 5 91.0 1x acker st e \n", "3 6 91.0 1x acker st w \n", "4 7 91.0 1x como av \n", "5 8 91.0 1x empire dr \n", "6 10 91.0 1x winter st \n", "7 11 91.0 10x como av \n", "8 12 91.0 10x empire dr \n", "9 13 91.0 10x university av w \n", "10 14 91.0 11x como av \n", "11 15 91.0 11x empire dr \n", "12 16 91.0 11x pennsylvania av w \n", "13 17 91.0 11x sherburne av \n", "14 18 91.0 11x university av w \n", "15 19 91.0 11x winter st \n", "16 20 91.0 12x arch st w \n", "17 21 91.0 12x charles av \n", "18 22 91.0 12x como av \n", "19 23 91.0 12x pennsylvania av e \n", "20 24 91.0 12x university av w \n", "21 25 91.0 12x winter st \n", "22 26 91.0 123x beech st \n", "23 27 91.0 13x charles av \n", "24 28 91.0 13x sherburne av \n", "25 29 91.0 141x carling dr \n", "26 30 91.0 15x pennsylvania av w \n", "27 31 91.0 158x montana av e \n", "28 32 91.0 16x pennsylvania av w \n", "29 33 91.0 163x charles av \n", ".. ... ... ... 
\n", "61 65 91.0 6x como av \n", "62 66 91.0 6x empire dr \n", "63 67 91.0 60x capitol bd \n", "64 68 91.0 60x park st \n", "65 69 91.0 60x rice st \n", "66 70 91.0 61x capitol bd \n", "67 71 91.0 61x park st \n", "68 72 91.0 61x rice st \n", "69 73 91.0 62x capitol bd \n", "70 74 91.0 63x capitol bd \n", "71 75 91.0 63x park st \n", "72 76 91.0 64x rice st \n", "73 77 91.0 67x jackson st \n", "74 78 91.0 68x jackson st \n", "75 79 91.0 68x rice st \n", "76 80 91.0 7x como av \n", "77 81 91.0 7x pennsylvania av e \n", "78 82 91.0 70x robert st n \n", "79 85 91.0 72x jackson st \n", "80 88 91.0 74x capitol ht \n", "81 89 91.0 75x capitol bd \n", "82 90 91.0 75x jackson st \n", "83 91 91.0 75x sherburne av \n", "84 100 91.0 8x empire dr \n", "85 101 91.0 8x pennsylvania av w \n", "86 102 91.0 8x winter st \n", "87 104 91.0 80x cedar st \n", "88 105 91.0 81x capitol bd \n", "89 109 91.0 9x empire dr \n", "90 110 91.0 9x pennsylvania av w \n", "\n", " For_Address Latitude Longitude \\\n", "0 217 Como Ave, St Paul, MN 55103, USA 44.961345 -93.109380 \n", "1 298 University Ave W, St Paul, MN 55103, USA 44.955502 -93.112500 \n", "2 10 Acker St E, St Paul, MN 55117, USA 44.963792 -93.099412 \n", "3 10 W Acker St, St Paul, MN 55117, USA 44.963794 -93.100281 \n", "4 10 Como Ave, St Paul, MN 55103, USA 44.958783 -93.099995 \n", "5 10 Empire Dr, St Paul, MN 55103, USA 44.961884 -93.098466 \n", "6 10 Winter St, St Paul, MN 55103, USA 44.959380 -93.099357 \n", "7 100 Como Ave, St Paul, MN 55103, USA 44.958676 -93.104033 \n", "8 100 Empire Dr, St Paul, MN 55103, USA 44.961717 -93.105415 \n", "9 100 University Ave W, St Paul, MN 55155, USA 44.954403 -93.104593 \n", "10 110 Como Ave, St Paul, MN 55103, USA 44.958635 -93.104630 \n", "11 110 Empire Dr, St Paul, MN 55103, USA 44.961914 -93.098667 \n", "12 110 W Pennsylvania Ave, St Paul, MN 55103, USA 44.960755 -93.104028 \n", "13 110 Sherburne Ave, St Paul, MN 55103, USA 44.956844 -93.104657 \n", "14 110 University Ave W, St Paul, MN 
55103, USA 44.955843 -93.105887 \n", "15 110 Winter St, St Paul, MN 55103, USA 44.959688 -93.104550 \n", "16 120 Arch St, St Paul, MN 55130, USA 44.960220 -93.096477 \n", "17 120 Charles Ave, St Paul, MN 55103, USA 44.957587 -93.105056 \n", "18 120 Como Ave, St Paul, MN 55103, USA 44.958733 -93.104886 \n", "19 120 Pennsylvania Ave E, St Paul, MN 55130, USA 44.961626 -93.095043 \n", "20 120 University Ave W, St Paul, MN 55103, USA 44.955843 -93.105887 \n", "21 120 Winter St, St Paul, MN 55103, USA 44.959787 -93.105052 \n", "22 1230 Beech St, St Paul, MN 55106, USA 44.962002 -93.048630 \n", "23 130 Charles Ave, St Paul, MN 55103, USA 44.957590 -93.105497 \n", "24 130 Sherburne Ave, St Paul, MN 55103, USA 44.956850 -93.105463 \n", "25 1410 Carling Dr, St Paul, MN 55108, USA 44.972710 -93.158861 \n", "26 150 W Pennsylvania Ave, St Paul, MN 55103, USA 44.960518 -93.105073 \n", "27 1580 Montana Ave E, St Paul, MN 55106, USA 44.987257 -93.033770 \n", "28 160 W Pennsylvania Ave, St Paul, MN 55103, USA 44.960453 -93.105301 \n", "29 1630 Charles Ave, St Paul, MN 55104, USA 44.957278 -93.169159 \n", ".. ... ... ... 
\n", "61 60 Como Ave, St Paul, MN 55103, USA 44.958952 -93.101977 \n", "62 60 Empire Dr, St Paul, MN 55103, USA 44.961271 -93.102051 \n", "63 600 N Capitol Blvd, St Paul, MN 55103, USA 44.959536 -93.101892 \n", "64 600 Park St, St Paul, MN 55103, USA 44.959447 -93.103396 \n", "65 600 Rice St, St Paul, MN 55103, USA 44.959554 -93.105677 \n", "66 610 N Capitol Blvd, St Paul, MN 55103, USA 44.959804 -93.101754 \n", "67 610 Park St, St Paul, MN 55103, USA 44.959969 -93.103753 \n", "68 610 Rice St, St Paul, MN 55103, USA 44.959697 -93.105678 \n", "69 620 N Capitol Blvd, St Paul, MN 55103, USA 44.960109 -93.101700 \n", "70 630 N Capitol Blvd, St Paul, MN 55103, USA 44.960342 -93.101702 \n", "71 630 Park St, St Paul, MN 55103, USA 44.960332 -93.103413 \n", "72 640 Rice St, St Paul, MN 55103, USA 44.960655 -93.105942 \n", "73 670 Jackson St, St Paul, MN 55130, USA 44.957361 -93.096620 \n", "74 680 Jackson St, St Paul, MN 55101, USA 44.956342 -93.097273 \n", "75 680 Rice St, St Paul, MN 55103, USA 44.962212 -93.105766 \n", "76 70 Como Ave, St Paul, MN 55103, USA 44.958952 -93.102460 \n", "77 70 W Pennsylvania Ave, St Paul, MN 55103, USA 44.960455 -93.102194 \n", "78 700 Robert St N, St Paul, MN 55103, USA 44.956250 -93.098315 \n", "79 720 Jackson St, St Paul, MN 55130, USA 44.960142 -93.098144 \n", "80 740 Capitol Heights, St Paul, MN 55103, USA 44.957766 -93.098975 \n", "81 750 N Capitol Blvd, St Paul, MN 55103, USA 44.960423 -93.102203 \n", "82 750 Jackson St, St Paul, MN 55117, USA 44.961889 -93.097993 \n", "83 750 Sherburne Ave, St Paul, MN 55104, USA 44.956431 -93.131959 \n", "84 80 Empire Dr, St Paul, MN 55103, USA 44.961905 -93.098606 \n", "85 80 W Pennsylvania Ave, St Paul, MN 55103, USA 44.960457 -93.102736 \n", "86 80 Winter St, St Paul, MN 55103, USA 44.959802 -93.102999 \n", "87 800 Cedar St, St Paul, MN 55103, USA 44.958783 -93.100676 \n", "88 810 N Capitol Blvd, St Paul, MN 55103, USA 44.960423 -93.102203 \n", "89 90 Empire Dr, St Paul, MN 55103, USA 44.961908 
-93.098626 \n", "90 90 W Pennsylvania Ave, St Paul, MN 55103, USA 44.960460 -93.103279 \n", "\n", " Fail index1 \n", "0 False 0 \n", "1 False 1 \n", "2 False 2 \n", "3 False 3 \n", "4 True 4 \n", "5 True 5 \n", "6 True 6 \n", "7 True 7 \n", "8 True 8 \n", "9 True 9 \n", "10 True 10 \n", "11 True 11 \n", "12 True 12 \n", "13 True 13 \n", "14 True 14 \n", "15 True 15 \n", "16 False 16 \n", "17 True 17 \n", "18 True 18 \n", "19 False 19 \n", "20 True 20 \n", "21 True 21 \n", "22 False 22 \n", "23 True 23 \n", "24 True 24 \n", "25 False 25 \n", "26 True 26 \n", "27 False 27 \n", "28 True 28 \n", "29 False 29 \n", ".. ... ... \n", "61 True 61 \n", "62 True 62 \n", "63 True 63 \n", "64 True 64 \n", "65 True 65 \n", "66 True 66 \n", "67 True 67 \n", "68 True 68 \n", "69 True 69 \n", "70 True 70 \n", "71 True 71 \n", "72 True 72 \n", "73 False 73 \n", "74 False 74 \n", "75 True 75 \n", "76 True 76 \n", "77 True 77 \n", "78 True 78 \n", "79 False 79 \n", "80 True 80 \n", "81 True 81 \n", "82 True 82 \n", "83 False 83 \n", "84 True 84 \n", "85 True 85 \n", "86 True 86 \n", "87 True 87 \n", "88 True 88 \n", "89 True 89 \n", "90 True 90 \n", "\n", "[91 rows x 8 columns]" ] }, "execution_count": 260, "metadata": {}, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will check whether the data matches the plotted map and classify each outlier as either a miscategorized entry or an entry with incorrect geo-coordinates." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "%matplotlib inline \n", "import folium\n", "\n", "def plot_geomatch(Dat):\n", "    #Plot every row of Dat as a circle marker on a map of the study area\n", "    FG_map = folium.Map(location=[44.958326, -93.122926], zoom_start=14,tiles=\"OpenStreetMap\")\n", "    \n", "    #Datl= Dat.query('Fail = True')\n", "    for index, row in Dat.iterrows(): \n", "        #The popup shows the block, the geocoded address and the row index\n", "        popup_text = \"Block: {}<br> Address: {}<br> Index: {}\"\n", "        popup_text = popup_text.format(row[\"Block\"], row['For_Address'],row['index1'])\n", "        folium.CircleMarker(location=(row[\"Latitude\"],row[\"Longitude\"]),\n", "                            radius=7,\n", "                            color=\"#E37222\",\n", "                            popup=popup_text,\n", "                            fill=True).add_to(FG_map)\n", "    \n", "    return FG_map\n" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "plot_geomatch(CC_86.query('Fail==True'))" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [], "source": [ "#Here is index of false values\n", "FalseIndex= [197,186,187,160]\n", "CC_86.loc[FalseIndex,'Fail']=False\n", "\n", "#Here is index where address needs to be fixed\n", "#MessIndex=[112,164]\n", "#Map_mess=CC_86.loc[MessIndex,:]\n", "\n", "#Address_Mess_86 = None\n", "#Address_Mess_86= pd.concat([Address_Mess_86,Map_mess], ignore_index=True)\n", "\n", "CC_86.to_csv(r'GridFiles/86_Address_Comp.csv',index=False)\n", "Address_Mess_86.to_csv(r'GridFiles/86_Address_Mess.csv',index=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Repeat the previous steps for the remaining grids \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we will run this code for each district; changing the final name and valid zip codes\n", "\n", "Grids:\n", "* 86,87,106,107\n", " - Zip: 55104\n", "* 89,90\n", " - Zip: 55103\n", "* 109\n", " - Zip: 55103, 55102\n", "* 67,88,108\n", " - Zip: 55103, 55104\n", "* 68\n", " - Zip: 55103, 55104, 55107\n", "* 66\n", " - Zip: 55103, 55108,55104\n", "* 110\n", " - Zip: 55103, 55102\n", "* 91\n", " - Zip: 55103, 55155,55117\n", "* 92\n", " - Zip: 55103, 55130,55117,55101" ] }, { "cell_type": "code", "execution_count": 136, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(105, 4)" ] }, "execution_count": 136, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Setup Dataframe\n", "dfW=df.query('Intersection==0')\n", "dfW=dfW.query('Grid in [92.0]') #Perform algorithm separately for each grid\n", "dfW['Block1']= dfW.Block.str.replace('xx','05')\n", "dfW['Block1']= dfW.Block1.str.replace('x ','0 ') #notice the space\n", "dfW['Block1']= dfW.Block1.str.replace('ravou0','ravoux')\n", "dfW['Address']= dfW['Block1'] + ', Saint Paul, MN, 55103'\n", 
"A=dfW[['Grid','Block','Block1','Address']].sort_values('Address')\n", "A=A.groupby(['Grid','Address','Block']).count()\n", "A=A.reset_index()\n", "A.shape" ] }, { "cell_type": "code", "execution_count": 137, "metadata": {}, "outputs": [], "source": [ "#Split Dataframe\n", "\n", "B=A.loc[0:50,:]\n", "C=A.loc[51:100,]\n", "D=A.loc[101:,]\n", "#E=A.loc[151:200,:]\n", "#F=A.loc[201:,]\n", "#G=A.loc[251:,]" ] }, { "cell_type": "code", "execution_count": 141, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
For_AddressBlockZipResultsFail
10180 E Mt Airy St, St Paul, MN 55130, USA8x mtairy st551301True
10290 Arch St, St Paul, MN 55130, USA9x arch st e551301True
10390 E Mt Airy St, St Paul, MN 55130, USA9x mtairy st551301True
104970 3rd St E, St Paul, MN 55106, USA97x 3 st e551061False
\n", "
" ], "text/plain": [ " For_Address Block Zip Results \\\n", "101 80 E Mt Airy St, St Paul, MN 55130, USA 8x mtairy st 55130 1 \n", "102 90 Arch St, St Paul, MN 55130, USA 9x arch st e 55130 1 \n", "103 90 E Mt Airy St, St Paul, MN 55130, USA 9x mtairy st 55130 1 \n", "104 970 3rd St E, St Paul, MN 55106, USA 97x 3 st e 55106 1 \n", "\n", " Fail \n", "101 True \n", "102 True \n", "103 True \n", "104 False " ] }, "execution_count": 141, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Run run_geo_algorithm on each chunk (B, C, D, ...) in turn\n", "run_geo_algorithm(D,['55103','55130','55117','55101']) " ] }, { "cell_type": "code", "execution_count": 146, "metadata": {}, "outputs": [], "source": [ "#Plug in the indexes of the bad addresses; uncomment the block for every chunk that has any\n", "#B_Ind= [0,2,4,9]\n", "#B_mess=B.loc[B_Ind,:]\n", "#B= B.drop(B_Ind)\n", "#C_Ind= [70,75,81,83]\n", "#C_mess=C.loc[C_Ind,:]\n", "#C= C.drop(C_Ind)\n", "#D_Ind= [103,106,107,108,111,112]\n", "#D_mess=D.loc[D_Ind,:]\n", "#D= D.drop(D_Ind)\n", "#F_Ind= [226,227,228]\n", "#F_mess=F.loc[F_Ind,:]\n", "#F= F.drop(F_Ind)\n", "#G_Ind= [270]\n", "#G_mess=G.loc[G_Ind,:]\n", "#G= G.drop(G_Ind)\n", "\n", "#Prepare Data for plotting (only the *_mess frames created above can be concatenated here)\n", "Address_Mess_92= pd.concat([C_mess], ignore_index=True)\n", "Add_Com=pd.concat([B,C,D])\n", "CC= Add_Com[['Grid','Block','For_Address','Latitude','Longitude','Fail']]\n", "CC_92=CC.reset_index()\n", "CC_92['index1'] = CC_92.index\n" ] }, { "cell_type": "code", "execution_count": 385, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "execution_count": 385, "metadata": {}, "output_type": "execute_result" } ], "source": [ "plot_geomatch(CC_92.query('Fail==True'))" ] }, { "cell_type": "code", "execution_count": 154, "metadata": {}, "outputs": [], "source": [ "#Based on the plots perform these steps\n", "\n", "#Update the False Index\n", "#FalseIndex= [0,1,2,3]\n", "#CC_92.loc[FalseIndex,'Fail']=False\n", "\n", "#Here is the index of each address that needs to be fixed\n", "#MessIndex=[143]\n", "#Map_mess=CC_92.loc[MessIndex,:]\n", "\n", "#Address_Mess_92 = None\n", "Address_Mess_92= pd.concat([Address_Mess_92], ignore_index=True)\n", "\n", "#Finally save the results to csv files\n", "CC_92.to_csv(r'GridFiles/92_Address_Comp.csv',index=False)\n", "Address_Mess_92.to_csv(r'GridFiles/92_Address_Mess.csv',index=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once complete, we have a set of csv files for each grid: one with the 'complete' geocoded addresses and one with the addresses that are incorrect." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Decoder Aggregation \n", "\n", "0) Goal: no data entry is lost or gained over the entire process \n", "\n", "1) We have a set of complete addresses and a set of incorrect addresses; we want to stack each set into its own table\n", "\n", "2) Data Cleanliness [Optional]: Make a csv that lists the entries whose grid is incorrectly specified and send it to the database owner\n", " * Why does it matter? If data is misspecified, it impacts aggregation and provides wrong information in data pulls. Data from a public agency should be accurate\n", " \n", "3) Export all the incorrect addresses into a csv file\n", " - Fill in the geo-coordinates manually, as was done for the intersection data\n", " \n", "4) Combine with the correct key\n", "\n", "5) Ensure that the uniqueness requirement is satisfied, 
and then save it to a csv; this is your master decoder \n" ] }, { "cell_type": "code", "execution_count": 401, "metadata": {}, "outputs": [], "source": [ "import glob\n", "import os\n", "\n", "#Combine all the per-grid Comp files into one dataframe\n", "path = r'C:\\Users\\17189\\Documents\\GitHub\\Coursera_Capstone\\Saint Paul Project\\GridFiles' # use your path\n", "all_files = glob.glob(os.path.join(path, \"*Comp.csv\")) # advisable to use os.path.join as this makes concatenation OS independent\n", "df_from_each_file = (pd.read_csv(f) for f in all_files)\n", "Address_Agg= pd.concat(df_from_each_file, ignore_index=True)\n", "\n", "#Create a csv for the incorrectly specified entries\n", "Address_False= Address_Agg.query('Fail==False')\n", "Address_False= Address_False[['Block','Grid']]\n", "Address_False.to_csv(r'Address_False.csv',index=False)\n", "\n", "\n", "#Keep the matched entries and collapse duplicates to one row per block/address/coordinate\n", "Address_Complete= Address_Agg.query('Fail==True')\n", "Address_Complete = Address_Complete.groupby(['Block','For_Address','Latitude','Longitude']).count().reset_index()\n", "Address_Complete.to_csv(r'Address_Complete.csv',index=False)\n", "\n", "\n", "#Create a csv for the messed up entries and prepare it for export\n", "all_files = glob.glob(os.path.join(path, \"*Mess.csv\")) # advisable to use os.path.join as this makes concatenation OS independent\n", "df_from_each_file = (pd.read_csv(f) for f in all_files)\n", "Address_Agg_Mess= pd.concat(df_from_each_file, ignore_index=True)\n", "\n", "#If an address was already corrected in a different sub-section it appears in Address_Complete;\n", "#keep only the entries that still have no match (missing Latitude_y after the left merge)\n", "Z=pd.merge(Address_Agg_Mess,Address_Complete, on='Block', how='left')\n", "Z= Z[Z['Latitude_y'].isna()]\n", "Z.to_csv(r'Address_Agg_Mess.csv',index=False)\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Prepare Address Key \n", "\n", "Assume that all the missing values in the mess table have now been filled in." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "code_folding": 
[] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
BlockAddressLatitudeLongitude
\n", "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [Block, Address, Latitude, Longitude]\n", "Index: []" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# We load the mess and complete data\n", "dfW=df.query('Intersection==0')\n", "#df_key_agg = pd.read_csv('Address_KeySing.csv')\n", "df_keymess= pd.read_csv('Datasets/AddressKeyMess.csv')\n", "\n", "#Prepare the Corrected Address Data to combine\n", "new=df_keymess['Coordinates'].str.split(\",\", n = 1, expand = True) #Separate Coordinates (Depending if collected in pairs)\n", "# Store the two halves as separate Latitude and Longitude columns\n", "df_keymess['Latitude']= pd.to_numeric(new[0]) #pd.to_numeric converts the text to float\n", "df_keymess['Longitude']= pd.to_numeric(new[1])\n", "df_keymess= df_keymess[['Block','Address','Latitude','Longitude']]\n", "D1=df_keymess.groupby('Block').count().reset_index()\n", "#Sanity Check: any Block listed here appears more than once\n", "print(D1.query('Address > 1'))\n", "\n", "#Prepare the Completed Data to combine\n", "#df_key_agg.columns= ['Block','Address','Latitude','Longitude']\n", "\n", "#Merge Both Dataframes\n", "#df_C=df_key2.append(df_keymess, ignore_index=True)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Sanity Check for entire dataset\n", "\n", "Each Block must be a unique entry; otherwise the merge will not be a 1-1 join." ] }, { "cell_type": "code", "execution_count": 99, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
BlockAddressLatitudeLongitude
83642x rice st222
\n", "
" ], "text/plain": [ " Block Address Latitude Longitude\n", "836 42x rice st 2 2 2" ] }, "execution_count": 99, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Count the rows per Block; any count above 1 is a duplicate\n", "D1=df_C.groupby('Block').count().reset_index()\n", "D1.query('Address > 1')\n", "\n", "#B= df_C.drop(df_C.index[528])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Once the key is unique, save it to a csv file\n", "Address_CompleteKey= df_C[['Block','Address','Latitude','Longitude']]\n", "Address_CompleteKey.to_csv(r'Address_CompleteKey.csv',index=False)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The actual loading of the data is done in the data-loading notebook." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Algorithm Maintenance \n", "\n", "This algorithm is meant to decode all existing coordinates and save them in a decoder csv. There are two options: maintain the decoder with regular updates, or create an absolute decoder key that requires no maintenance.\n", "\n", "1) Absolute Decoder Key: We can implement a strategy similar to the intersection model. 
We consider all potential addresses and map their geo-coordinates.\n", " - Advantage: no maintenance is required once the decoder is made, though any address outside the decoder range is not covered\n", " - Disadvantage: You need to consider all major streets on the grid and may need to adjust the zip code \n", " - You need to consider where streets change direction (North and South)\n", " - It is much simpler to visualize and debug.\n", " - I believe this type of mapping is more efficient at handling outliers, building a distance metric (the distance between consecutive addresses), and recording meta-information such as zip code\n", " - You don't need to use the Google API geocoder once the area is sufficiently mapped out\n", "\n", "2) Continuous Maintenance:\n", " - Any data not matched by the decoder key can be separated out and run through the algorithm, which saves the coordinate information and updates the appropriate csv file for the respective grid\n", " - It is unlikely that new address locations will pop up, given that the data already covers 4 years." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Absolute Decoder Key Method \n", "\n", "My recommended strategy for using the decoder code is to map each street and/or avenue and save the correct values into a csv file. To make this time-efficient, I will describe the process of creating an effective workspace.\n", "\n", "\n", "1) To make it easier to find the names of streets/avenues, you can use the intersection decoder. You may be missing some streets, but that is where we test it with the actual data\n", "\n", "2) The range of values depends on the area of interest. If the addresses follow a numerical pattern, you can use the index to set the boundaries of the specific area to map. 
If you are ambitious, you can map the street in its entirety\n", "\n", "3) When cleaning up the datapoints, it is important that you map out the datapoints and plot a correlation graph\n", "\n", "4) One of the challenges of the decoder is streets whose addresses depend on their affiliated direction. One way to narrow down the values is to restrict them to the valid zip codes; alternatively, you can create two separate entries, one for each direction. However, the raw data may not distinguish between them.\n", " - Separation example: University Av N and University Av S" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data Setup\n", "\n", "In designing this function, I replicate the Block column of the police data. The function takes a street, a zip code, and range values that pinpoint the address range. The index is set up so that 1XX falls between 10X and 11X, which is very important!\n" ] }, { "cell_type": "code", "execution_count": 296, "metadata": {}, "outputs": [], "source": [ "def abs_alg(street,zipcode,r_left=0,r_right=135):\n", "    #Builds the candidate blocks and addresses for one street and stores them in the global df\n", "    global df\n", "    df=pd.DataFrame()\n", "    df1=pd.DataFrame()\n", "    numbers = range(r_left,r_right)\n", "    df['Index']= [n for n in numbers]\n", "    df1['Index']= [n for n in numbers]\n", "    #df holds the 'Nx' blocks (address N0); df1 holds the 'Nxx' blocks (address N05)\n", "    df['Block'] = df['Index'].apply(lambda x: str(x) +'x ' + street if x!= 0 else 'x ' + street )\n", "    df1['Block'] = df1['Index'].apply(lambda x: str(x) +'xx ' + street if x!= 0 else 'xx ' + street )\n", "    df['Address'] = df['Index'].apply(lambda x: str(x) +'0 ' + street + ', MN, Saint Paul, ' + zipcode if x!= 0 else '2 ' + street + ', MN, Saint Paul, ' + zipcode)\n", "    df1['Address'] = df1['Index'].apply(lambda x: str(x) +'05 ' + street + ', MN, Saint Paul, ' + zipcode if x!= 0 else '5 ' + street + ', MN, Saint Paul, ' + zipcode)\n", "    #Scale the 'Nxx' index so that, after sorting, 1xx falls between 10x and 11x\n", "    df1['Index'] = df1['Index'] * 10\n", "    df= pd.concat([df, df1.query('Index<140')], ignore_index=True).sort_values(['Index','Address'])\n", "    df=df.reset_index()\n", "    print(df.shape)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Run 
Geo-Coder\n", "\n", "It is important to reset the index so that the rows are indexed in numerical order of address." ] }, { "cell_type": "code", "execution_count": 336, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(151, 4)\n" ] } ], "source": [ "abs_alg('Sherburne Ave','55104')\n", "\n", "# Check the shape first; it is usually consistent for each run\n", "B=df.loc[0:50,:]\n", "C=df.loc[51:100,:]\n", "D=df.loc[101:,]" ] }, { "cell_type": "code", "execution_count": 340, "metadata": {}, "outputs": [], "source": [ "run_geo_algorithm(B,['55104']) \n", "#run_geo_algorithm(C,['55104']) \n", "#run_geo_algorithm(D,['55104']) " ] }, { "cell_type": "code", "execution_count": 341, "metadata": {}, "outputs": [], "source": [ "#Combine the geocoded chunks and prepare them for plotting\n", "Add_Com=pd.concat([B,C,D])\n", "CC= Add_Com[['Block','For_Address','Latitude','Longitude','Fail']]\n", "CC_Sher1=CC.reset_index()\n", "CC_Sher1['index1'] = CC_Sher1.index" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data Visualization\n", "\n", "The map below can help spot blatant outliers, and it can also serve to set the boundaries of the data if desired." ] }, { "cell_type": "code", "execution_count": 406, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "execution_count": 406, "metadata": {}, "output_type": "execute_result" } ], "source": [ "plot_geomatch(CC_Sher1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The scatterplot is an effective method to determine visually if there are outliers for the dataset." ] }, { "cell_type": "code", "execution_count": 343, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAAGoCAYAAADVd+V5AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAIABJREFUeJzt3Xu4JXV95/v3h+7m0hjsRroldIPdijcMjsjGmDAmREWMkIYRHVH0aDRyzjwhIZiJymGCMzHnaAYTmeHJRUSIMd4CUafFQUZBzBMR40ZokebWXkZpOOlGbCM2QZr+nj+qNiwW+7J291770vV+Pc969lq/ql/Vr2qvXZ/9+1WtWqkqJEnqqr3mugGSJM0lg1CS1GkGoSSp0wxCSVKnGYSSpE4zCCVJnWYQSpI6zSCUJHWaQShJ6rTFc92Aecrb7UjaE2SuG7AQ2COUJHWaPUJJnfGxr31/rpuwS173i4fNdRP2aPYIJUmdZo9wBi3U/zYlqcvsEUqSOs0glCR1mkEoSeo0g1CS1GkGoSSp0wxCSVKnGYSSpE4zCCVJnWYQSpI6zSCUJHWaQShJ6jSDUJLUaQahJKnTDEJJUqcZhJKkTjMIJUmdZhBKkjrNIJQkdZpBKEnqNINQktRpBqEkqdMMQklSpxmEkqROMwglSZ1mEEqSOs0glCR1mkEoSeo0g1CS1GkGoSSp0wxCSVKnGYSSpE4zCCVJnWYQSpI6zSCUJHWaQShJ6jSDUJLUaQahJKnTDEJJUqcZhJKkTjMIJUmdZhBKkjrNIJQkdZpBKEnqNINQktRpBqEkqdMMQklSpxmEkqROMwglSZ1mEEqSOi1VNddtmHeSfB44aBeqHgTcO8PNmQnzsV22aTC2aTDzsU0w9+26t6pePofrXxAMwhmUZLSqRua6Hf3mY7ts02Bs02DmY5tg/rZLj+XQqCSp0wxCSVKnGYQz66K5bsAE5mO7bNNgbNNg5mObYP62Sz08RyhJ6jR7hJKkTjMIJUmdZhBKkjpt6EGYZFGSG5Nc0Vd+YZL7J6izd5JLk9ycZEOS4/qmXZTkjiS3JTm1LX9/kpvaxx1JtvXUebhn2vohbaokaQFaPAvrOAu4FThgrCDJCLBskjpvBaiqI5OsBK5MckxV7QTOBbZU1TOS7AUc2M57ds/yfwc4qmd5D1TV8wZt8Mtf/vL6/Oc/P+jskjRfZdAZ98Dj3sDbPtQeYZLVwInAxT1li4DzgbdPUvUI4GqAqtoCbAPG7s7wZuA97bSdVTXe7YteC3x8V9t9773z8U5NkjQ8XT7uDXto9AKawNvZU3YmsL6q7pmk3gbg5CSLk6wFjgYOTTLWi3x3km8kuSzJk3srJnkKsBa4pqd43ySjSa5PcsrubpQkac8xtCBMchLNEOYNPWWHAK8GLpyi+iXAXcAoTZheB+ygGcpdDXylqp4PfBV4X1/d04DLq+rhnrLD2vv9vQ64IMnTxmnvGW1Yjm7dunUaWypJC5PHvcbQPlCf5D3AG2gCbF+ac4QPto9/bWc7DPhOVR0+xbKuA36L5lzj/cDPVdXOJIcCn6+q5/T
MeyPw21V13QTL+mvgiqq6fKL1jYyM1Ojo6EDbKUnz2MDnyfbA497cnyOsqnOqanVVraHppV1TVcur6uCqWtOWbx8vBJMsTbJ/+/x4YEdVbawmtT8LHNfO+hJgY0+9ZwLLaXqKY2XLk+zTPj8IOLa3jiSp22bjqtGBJFkHjFTVecBK4KokO4HNND3LMe8APpLkAmAr8Js9014LfKIe2819NvCBdll7Ae+tKoNQkgR4r9Fx7YFDBJK6yaHRAXhnGUlSpxmEkqROMwglSZ1mEEqSOs0glCR1mkEoSeo0g1CS1GkGoSSp0wxCSVKnGYSSpE4zCCVJnWYQSpI6zSCUJHWaQShJ6jSDUJLUaQahJKnTDEJJUqcZhJKkTjMIJUmdZhBKkjrNIJQkdZpBKEnqNINQktRpBqEkqdMMQklSpxmEkqROMwglSZ029CBMsijJjUmu6Cu/MMn9E9TZO8mlSW5OsiHJcX3TLkpyR5Lbkpzalr8pydYkN7WP3+qp88Ykd7aPNw5pUyVJC9DiWVjHWcCtwAFjBUlGgGWT1HkrQFUdmWQlcGWSY6pqJ3AusKWqnpFkL+DAnnqfrKozexeU5EDgXcAIUMANSdZX1Y9mYNskSQvcUHuESVYDJwIX95QtAs4H3j5J1SOAqwGqaguwjSbIAN4MvKedtrOq7p2iGScAX6iq+9rw+wLw8ulvjSRpTzTsodELaAJvZ0/ZmcD6qrpnknobgJOTLE6yFjgaODTJWC/y3Um+keSyJE/uqXdqkm8muTzJoW3ZKuAHPfPc1ZY9RpIzkowmGd26dev0tlKSFiCPe42hBWGSk2iGMG/oKTsEeDVw4RTVL6EJrFGaML0O2EEzlLsa+EpVPR/4KvC+ts5ngTVV9Vzgi8CHx1Y7zvLrcQVVF1XVSFWNrFixYrCNlKQFzONeY5jnCI8F1iV5BbAvzTnCW4AHgU1JAJYm2VRVh/dWrKodwNljr5NcB9wJ/BDYDny6nXQZ8Ja2zg97FvFB4E/a53cBx/VMWw1cu9tbJ0naIwytR1hV51TV6qpaA5wGXFNVy6vq4Kpa05Zv7w9BgCRLk+zfPj8e2FFVG6uqaHp+x7WzvgTY2M738z2LWEdzgQ7AVcDLkixPshx4WVsmSdKsXDU6kCTrgJGqOg9YCVyVZCewGXhDz6zvAD6S5AJgK/CbbfnvtsvYAdwHvAmgqu5L8m7g6+18f1RV9w17eyRJC0OaTpZ6jYyM1Ojo6Fw3Q5J213jXSIxrDzzuDbzt3llGktRpBqEkqdMMQklSpxmEkqROMwglSZ1mEEqSOs0glCR1mkEoSeo0g1CS1GkGoSSp0wxCSVKnGYSSpE4zCCVJnWYQSpI6zSCUJHWaQShJ6jSDUJLUaQahJKnTDEJJUqcZhJKkTjMIJUmdZhBKkjrNIJQkdZpBKEnqNINQktRpBqEkqdMMQklSpw09CJMsSnJjkiv6yi9Mcv8EdfZOcmmSm5NsSHJc37SLktyR5LYkp7blb0uyMck3k1yd5Ck9dR5OclP7WD+kTZUkLUCLZ2EdZwG3AgeMFSQZAZZNUuetAFV1ZJKVwJVJjqmqncC5wJaqekaSvYAD2zo3AiNVtT3JfwD+K/CadtoDVfW8Gd0qSdIeYag9wiSrgROBi3vKFgHnA2+fpOoRwNUAVbUF2AaMtNPeDLynnbazqu5tn3+pqra381wPrJ65LZEk7amGPTR6AU3g7ewpOxNYX1X3TFJvA3ByksVJ1gJHA4cmGetFvjvJN5JcluTJ49R/C3Blz+t9k4wmuT7JKeOtMMkZ7TyjW7duHXT7JGnB8rjXGFoQJjmJZgjzhp6yQ4BXAxdOUf0S4C5glCZMrwN20Azlrga+UlXPB74KvK9vva+n6T2e31N8WFWNAK8DLkjytP4VVtVFVTVSVSMrVqyY1rZK0kLkca8xzHOExwLrkrwC2JfmHOEtwIPApiQAS5NsqqrDeytW1Q7g7LHXSa4D7gR+CGwHPt1Ouoym9zc
230tpziH+alU92LO8u9uf30lyLXAU8O2Z3FhJ0sI0tB5hVZ1TVaurag1wGnBNVS2vqoOrak1bvr0/BAGSLE2yf/v8eGBHVW2sqgI+CxzXzvoSYGM731HAB4B17XnFsWUtT7JP+/wgmoDeOJSNliQtOLNx1ehAkqyjuerzPGAlcFWSncBm4A09s74D+EiSC4CtwG+25ecDTwAua3ub36+qdcCzgQ+0y9oLeG9VGYSSJADSdLLUa2RkpEZHR+e6GZK0uzLojHvgcW/gbffOMpKkTjMIJUmdZhBKkjrNIJQkdZpBKEnqNINQktRpBqEkqdMMQklSpxmEkqROMwglSZ1mEEqSOs0glCR1mkEoSeo0g1CS1GkGoSSp0wxCSVKnGYSSpE4zCCVJnWYQSpI6zSCUJHWaQShJ6jSDUJLUaQahJKnTDEJJUqcZhJKkTjMIJUmdNvQgTLIoyY1JrugrvzDJ/RPU2TvJpUluTrIhyXF90y5KckeS25Kc2pbvk+STSTYl+VqSNT11zmnLb09ywlA2VJK0IC2ehXWcBdwKHDBWkGQEWDZJnbcCVNWRSVYCVyY5pqp2AucCW6rqGUn2Ag5s67wF+FFVHZ7kNOBPgNckOQI4DXgOcAjwxSTPqKqHZ3YzJUkL0VB7hElWAycCF/eULQLOB94+SdUjgKsBqmoLsA0Yaae9GXhPO21nVd3blp8MfLh9fjnwkiRpyz9RVQ9W1XeBTcALdn/rJEl7gmEPjV5AE3g7e8rOBNZX1T2T1NsAnJxkcZK1wNHAoUnGepHvTvKNJJcleXJbtgr4AUBV7QB+DDypt7x1V1v2GEnOSDKaZHTr1q3T3lBJWmg87jWGFoRJTqIZwryhp+wQ4NXAhVNUv4QmsEZpwvQ6YAfNUO5q4CtV9Xzgq8D7xhY/znJqkvLHFlRdVFUjVTWyYsWKKZonSQufx73GMM8RHgusS/IKYF+ac4S3AA8Cm5pRS5Ym2VRVh/dWbHt0Z4+9TnIdcCfwQ2A78Ol20mU05wahCc5DgbuSLAaeCNzXUz5mNXD3zG2mJGkhG1qPsKrOqarVVbWG5mKVa6pqeVUdXFVr2vLt/SEIkGRpkv3b58cDO6pqY1UV8FnguHbWlwAb2+frgTe2z1/Vrq/a8tPaq0rXAk8H/mkImyxJWoBm46rRgSRZB4xU1XnASuCqJDuBzcAbemZ9B/CRJBcAW4HfbMs/1JZvoukJngZQVbck+TuawNwB/LZXjEqSxqTpNKnXyMhIjY6OznUzJGl3jXeNxLj2wOPewNvunWUkSZ1mEEqSOs0glCR1mkEoSeo0g1CS1GkGoSSp0wxCSVKnGYSSpE4zCCVJnWYQSpI6zSCUJHWaQShJ6jSDUJLUaQahJKnTDEJJUqcZhJKkTjMIJUmdZhBKkjrNIJQkdZpBKEnqNINQktRpBqEkqdMMQklSpw0UhGm8Psl57evDkrxguE2TJGn4Bu0R/gXwS8Br29c/Af58KC2SJGkWLR5wvl+squcnuRGgqn6UZO8htkuSpFkxaI/woSSLgAJIsgLYObRWSZI0SwYNwv8OfBpYmeT/Af4R+H8HqZhkUZIbk1zRV35hkvsnqLN3kkuT3JxkQ5LjeqZdm+T2JDe1j5Vt+ft7yu5Isq2nzsM909YPuM2SpA4YaGi0qj6a5AbgJUCAU6rq1gHXcRZwK3DAWEGSEWDZJHXe2q73yDborkxyTFWN9UJPr6rRvjae3bP83wGO6pn8QFU9b8D2SpI6ZNIeYZIDxx7AFuDjwMeAf27LJpVkNXAicHFP2SLgfODtk1Q9ArgaoKq2ANuAkanW1+O1bVslSZrUVEOjNwCj7c+twB3Ane3zGwZY/gU0gdd7PvFMYH1V3TNJvQ3AyUkWJ1kLHA0c2jP90naY8w+TpLdikqcAa4Freor3TTKa5Pokp4y3wiRntPOMbt26dYBNk6SFzeNeY9IgrKq1VfVU4CrgN6rqoKp6EnAS8KnJ6iY5CdhSVTf0lB0
CvBq4cIp2XQLcRRPCFwDXATvaaadX1ZHAi9rHG/rqngZcXlUP95QdVlUjwOuAC5I8bZxtvaiqRqpqZMWKFVM0T5IWPo97jUEvljmmqv7n2IuquhL41SnqHAusS/I94BPAi4FbgMOBTW350iSb+itW1Y6qOruqnldVJ9OcT7yznba5/fkTmmHa/g/2n0bfsGhV3d3+/A5wLY89fyhJ6rBBg/DeJP8pyZokT0lyLvDDySpU1TlVtbqq1tCE0zVVtbyqDq6qNW359qo6vL9ukqVJ9m+fHw/sqKqN7VDpQW35Epqe6bd66j0TWA58tadseZJ92ucH0QT0xgG3e8595sbNHPvea1j7zs9x7Huv4TM3bp7rJknSHmXQD9S/FngXzUcoAP6BR+8yMyOSrANGquo8YCVwVZKdwGYeHf7cpy1fAiwCvgh8sK+dn6iq6il7NvCBdll7Ae+tqgURhJ+5cTPnfOpmHnioGeXdvO0BzvnUzQCcctSquWyaJO0x8tjMEMDIyEiNjo5OPeOQHfvea9i87YHHla9ath9feeeLd2mZn7lxM+dfdTt3b3uAQ5btxx+c8MzdDtVhLFPd4Htn6DL1LI35ctybQQNv+0A9wiRfor2rTK+q2rWj8R5m7I9587YHWJTwcBWrZuCP+u5xQhCanuFRf/S/2Lb9oWkdPIbRwxzWMmfy4LgQwr+LbVwI751hLXM+rvO+n/5sqMufzwbqESY5uuflvsCpNOftJvss4II1nf+M+v+Ye+23ZBHveeWRu/QG/syNm/n9v9vAwwP8fgZdzzB6mDO9zPH25+7ux5lcnm2cuTbO9/fOMJc5WchNdExZvnQJ7/qN50x3vQP3ip767OfWH//1FbzuFw+bzvLns4G3fZeHRpN8uaqmunJ0QZpOEE70xzxmV/6oJwvXqYz1SPt/rlq236TtHK/uIL3ate/83OOHCmjegd9974mTrm+8A8JYz7rfrh4cF0L4d7WNu/PeGc9C2I+DBOtkx5RdCOFpB+EgFkhYzvjQaO9dZPai+YD7wdNs1B5pouHLMZu3PcCad35uWuFy/lW371IIAo/0IPt/bt72AGGc8e1J6g4yVHXIBAF7yLL9HvO6P/R+7Vkr+PsbNj9uWGyi7R7bz9MdMpro9zPZ722qdUw2ZL3mnZ+b8nfcv/yJDnq965nOdn/mxs0DLXOmtnntOz83UJv6lz/RthfwtHP+57T+KduVbR7Errx/ets03j96/e/xBx56mPOvup1Tjlo16Xb0z6uZM+hVozfQvD9D88H27wJvGVajFpLJDmS9phMuu/OHO5mxX+B0xgCm+sP7gxOeOe5/uH9wwjMfeT3euaCPXv/9x7XjgYcefuTA1++QZfvt0jmlQYN6srb2r2Oq3/lk7Rpv+RP9TsbaOJ3tHpt3IuNt9+5uc+3CNp/zqZs59ehVj/lnqNd0/inblW0e1HTfP/1tms4/elNtR++8w7ZAenwzZtBzhPtW1b/2le1TVQ8OrWVzaKbOEQ5qr8DOYsIQmGmrlu3H3dseGDgQ+4eqxuvdfem2rdy97QGeuN8SEh5zIc9Ew53TMbaPxrMo4U///b8ZNwz/02duHjd0+/d5/89+vUNhg/7Oxxs+m2jYqz8Me4fApjM8tyvDahPVWZSws2rcHvxEptOmVbvw/uhf/lTn0se2GZiwxzvV+/mnP9vBQw/X45bZf16v/4K58Uz2/gIG2g/THJadsaHRBRiOM3uOMMk3qur5U5XtKaZ7GXHvH8Gw7MXMfQHkqvaP/eNf+8HAwTv2BzxR72XpkubeDNsfmruvqewPt+n2fqfSO0z3a89awd9e//2B6wzyT85E4TxInbFQOfuTN025zbvzj9d06kz2z0uvZfstYdsDD02rHdP53U427668R4bxPht0X82Xc4STmUeBOTNBmORgYBXwtzT36Rxb8AHAX1XVs3ajkfPW7nyeZqqLZ3bFWI8HHv2vdrz/VDW7Bj14Sbujt2e+Cx+j8HOEA5j
qHOEJwJuA1cCf9ZT/BPi/p92sDhjvnNnu2ln1yJt/quGY6fQmtHsMQQ1bYMJh/5l2309/xse+NvUox1wZZk9z0iCsqg8DH05yalX9/dBasQcZe8MOcr5gUBOdmD/lqFW79NGGiYz9++TxfXr8p0PDUng7xTGDhvSuBOakQZjk9VX1t8CaJG/rn15VfzZOtc4bL6B29aKa/iswp2PQK1p754fBTtjrUTsH/JymBjd2Qch0TzXst2QR+y7Zix9tn945x/lq1W5c8TpdB+6/93w6vzerpvr2if3bn08Afq7v8YQhtmuPc8pRq3jPK4985I3dO3i9V/tiUfsdw2M/Vy3bb7fuYPEHJzyT/ZYsGmjescCdTp1hWr50Ca9/4WGP7K/+fbNsvyWPXKAzqKVL9mL50iXjLm/gkwnjGDt3Mx/2W79l+y1hyV67s3Wzr/efv+ns1+VLl/CeVx7Ju37jOfPydzFdu/NPsKZnqqHRD7RPv1hVX+mdluTYobVqDzXVUOYw1gePH6Zd1XeJ+Hgn4ac699i/jP6PTYxNG2S9Y+vblfsqDnKedND7vvZeRj+2PT/a/tCkVweOHawm2tdT7b8rNtzzyBWTy5cu4cTn/vzj9ttkdSaz35JF/Od1zwHgP6+/5ZE6E12dOtHvtHcfjPc77P1d9+6j/vWsmuB9sWycj9z0nxPv36+965no1mP9dZaNsz2DbNdE+2Ky99lU76WJ9sVM3qtYg/PjE+PYA6+e0gyZj9+WMKybvmuP4FWjA5jqHOEvAb8MrOg7R3gAzfcBSp0y2736QczHNkkLyVQfn9ib5lzgYprzgmP+BXjVsBolSdJsmeoc4ZeBLyf566r637PUJkmSZs2gN93enuR84Dk030cI+MW8kqSFb9Drzz8K3AasBf4L8D3g60NqkyRJs2bQIHxSVX0IeKiqvlxVbwZeOMR2SZI0KwYdGh370NI9SU4E7qa5/6gkSQvaoEH4x0meCPw+cCHNxyd+b2itkiRplgwUhFU19iVVPwZ+DSCJQShJWvCmd7PGx3rcTbglSVpodicIF9adfCVJGsfuBKFfwCZJWvAmDcIkP0nyL+M8fgIcMsgKkixKcmOSK/rKL0xy/wR19k5yaZKbk2xIclzPtGuT3J7kpvaxsi1/U5KtPeW/1VPnjUnubB9vHKTdkqRumOoWaz832fQBnQXcSnOlKQBJRoBlk9R5a7v+I9uguzLJMVW1s51+elWNd5v0T1bVmb0FSQ4E3gWM0PRib0iyvqp+tMtbJEnaY+zO0OiUkqwGTgQu7ilbBJwPvH2SqkcAVwNU1RZgG02Q7YoTgC9U1X1t+H0BePkuLkuStIcZahACF9AE3s6esjOB9VV1zyT1NgAnJ1mcZC1wNHBoz/RL2+HPP0zSe9HOqUm+meTyJGPzrwJ+0DPPXW3ZYyQ5I8loktGtW7cOvoWStEB53GsMLQiTnARsqaobesoOAV5N86H8yVxCE1ijNGF6HbCjnXZ6VR0JvKh9vKEt/yywpqqeC3wR+PDYasdZ/uMu9Kmqi6pqpKpGVqxYMcAWStLC5nGvMcwe4bHAuiTfAz4BvBi4BTgc2NSWL02yqb9iVe2oqrOr6nlVdTLN+cQ722mb258/AT4GvKB9/cOqerBdxAdpepHQBGpvb3I1zS3iJEkaXhBW1TlVtbqq1gCnAddU1fKqOriq1rTl26vq8P66SZYm2b99fjywo6o2tkOlB7XlS4CTgG+1r3++ZxHraC7QAbgKeFmS5UmWAy9ryyRJGvheo0OXZB0wUlXnASuBq5LsBDbz6PDnPm35EmARzRDoB9tpv9suYwdwH/AmgKq6L8m7efRro/6oqu6bhU2SJC0AqfJz8f1GRkZqdHS8T2dI0oIy8B3A9sDj3sDbPuyrRiVJmtcMQklSpxmEkqROMwglSZ1mEEqSOs0glCR1mkEoSeo0g1CS1GkGoSSp0wxCSVKnGYSSpE4zCCVJnWYQSpI6zSCUJHWaQShJ6jS
DUJLUaQahJKnTDEJJUqcZhJKkTjMIJUmdZhBKkjrNIJQkdZpBKEnqNINQktRpBqEkqdMMQklSpxmEkqROG3oQJlmU5MYkV/SVX5jk/gnq7J3k0iQ3J9mQ5LieadcmuT3JTe1jZVv+tiQbk3wzydVJntJT5+Ge+dcPaVMlSQvQ4llYx1nArcABYwVJRoBlk9R5K0BVHdkG3ZVJjqmqne3006tqtK/OjcBIVW1P8h+A/wq8pp32QFU9bwa2RZK0hxlqjzDJauBE4OKeskXA+cDbJ6l6BHA1QFVtAbYBI5Otq6q+VFXb25fXA6t3veWSpK4Y9tDoBTSBt7On7ExgfVXdM0m9DcDJSRYnWQscDRzaM/3SdpjzD5NknPpvAa7seb1vktEk1yc5ZbwVJjmjnWd069atg2ybJC1oHvcaQwvCJCcBW6rqhp6yQ4BXAxdOUf0S4C5glCZMrwN2tNNOr6ojgRe1jzf0rff1NL3H83uKD6uqEeB1wAVJnta/wqq6qKpGqmpkxYoVg2+oJC1QHvcawzxHeCywLskrgH1pzhHeAjwIbGo7ckuTbKqqw3srVtUO4Oyx10muA+5sp21uf/4kyceAFwB/0873UuBc4Fer6sGe5d3d/vxOkmuBo4BvD2GbJUkLzNB6hFV1TlWtrqo1wGnANVW1vKoOrqo1bfn2/hAESLI0yf7t8+OBHVW1sR0qPagtXwKcBHyrfX0U8AFgXXtecWxZy5Ps0z4/iCagNw5ruyVJC8tsXDU6kCTraK76PA9YCVyVZCewmUeHP/dpy5cAi4AvAh9sp50PPAG4rO1tfr+q1gHPBj7QLmsv4L1VZRBKkoBZCsKquha4dpzyJ/Q8Xw+sb59/D3jmOPP/lObCmfHW8dIJyq8Djpx+qyVJXeCdZSRJnWYQSpI6zSCUJHWaQShJ6jSDUJLUaQahJKnTDEJJUqcZhJKkTjMIJUmdZhBKkjrNIJQkdZpBKEnqNINQktRpBqEkqdMMQklSpxmEkqROMwglSZ1mEEqSOs0glCR1mkEoSeo0g1CS1GkGoSSp0wxCSVKnGYSSpE4zCCVJnWYQSpI6behBmGRRkhuTXNFXfmGS+yeos3eSS5PcnGRDkuN6pl2b5PYkN7WPlW35Pkk+mWRTkq8lWdNT55y2/PYkJwxlQyVJC9LiWVjHWcCtwAFjBUlGgGWT1HkrQFUd2QbdlUmOqaqd7fTTq2q0r85bgB9V1eFJTgP+BHhNkiOA04DnAIcAX0zyjKp6eCY2TpK0sA21R5hkNXAicHFP2SLgfODtk1Q9ArgaoKq2ANuAkSlWdzLw4fb55cBLkqQt/0RVPVhV3wU2AS+Y/tZIkvZEwx4avYAm8Hb2lJ0JrK+qeyaptwE4OcniJGuBo4FDe6Zf2g6L/mEbdgCrgB8AVNUO4MfAk3rLW3e1ZY+R5Iwko0lGt27dOq2NlKSFyONeY2hBmOQkYEtV3dBTdgjwauDCKapfQhNYozRheh2wo512elUdCbyofbxhbPHjLKcmKX9sQdVFVTVSVSOT41+MAAAKz0lEQVQrVqyYonmStPB53GsM8xzhscC6JK8A9qU5R3gL8CCwqe3ILU2yqaoO763Y9ujOHnud5Drgznba5vbnT5J8jGaY829ogvNQ4K4ki4EnAvf1lI9ZDdw941srSVqQhtYjrKpzqmp1Va2huVjlmqpaXlUHV9Watnx7fwgCJFmaZP/2+fHAjqra2A6VHtSWLwFOAr7VVlsPvLF9/qp2fdWWn9ZeVboWeDrwT8PabknSwjIbV40OJMk6YKSqzgNWAlcl2Qls5tHhz33a8iXAIuCLwAfbaR8CPpJkE01P8DSAqrolyd8BG2mGV3/bK0YlSWPSdJrUa2RkpEZH+z+dIUkLznjXSIxrDzzuDbzt3llGktRpBqEkqdMMQklSpxmEkqROMwglSZ1mEEqSOs0glCR1mkEoSeo0g1CS1GkGoSSp0wxCSVKnGYSSpE4zCCVJnWYQSpI6zSC
UJHWaQShJ6jSDUJLUaQahJKnTDEJJUqcZhJKkTjMIJUmdZhBKkjrNIJQkdZpBKEnqNINQktRpBqEkqdMMQklSpw09CJMsSnJjkiv6yi9Mcv8EdfZOcmmSm5NsSHLcOPOsT/KtntefTHJT+/hekpva8jVJHuiZ9lczvImSpAVs8Sys4yzgVuCAsYIkI8CySeq8FaCqjkyyErgyyTFVtbOt/0rgMSFaVa/pWf6fAj/umfztqnre7m6IJGnPM9QeYZLVwInAxT1li4DzgbdPUvUI4GqAqtoCbANG2vpPAN4G/PEE6wzw74GP7/4WSJL2dMMeGr2AJvB29pSdCayvqnsmqbcBODnJ4iRrgaOBQ9tp7wb+FNg+Qd0XAf9cVXf2lK1th2e/nORF41VKckaS0SSjW7dunXrLJGmB87jXGFoQJjkJ2FJVN/SUHQK8GrhwiuqXAHcBozRheh2wI8nzgMOr6tOT1H0tj+0N3gMcVlVH0fQkP5bkgP5KVXVRVY1U1ciKFSum3kBJWuA87jWGeY7wWGBdklcA+9KcI7wFeBDY1IxgsjTJpqo6vLdiVe0Azh57neQ64E7gV4Gjk3yvbfvKJNdW1XHtfIuBV9L0IMeW9WC7TqrqhiTfBp5BE7KSpI4bWo+wqs6pqtVVtQY4DbimqpZX1cFVtaYt394fggBJlibZv31+PLCjqjZW1V9W1SFt3X8L3DEWgq2XArdV1V09y1rRnpckyVOBpwPfGcY2S5IWntm4anQgSdYBI1V1HrASuCrJTmAz8IYBF3Maj79I5leAP0qyA3gY+L+q6r4ZarYkaYFLVc11G+adkZGRGh115FTSgpdBZ9wDj3sDb7t3lpEkdZpBKEnqNINQktRpBqEkqdMMQklSpxmEkqROMwglSZ1mEEqSOs0glCR1mkEoSeo0g1CS1GkGoSSp0wxCSVKnGYSSpE4zCCVJnWYQSpI6zSCUJHWaQShJ6jSDUJLUaamquW7DvJNkK/C/d6HqQcC9M9ycmTAf22WbBmObBjMf2wRz3657q+rlg8yY5PODzrunMQhnUJLRqhqZ63b0m4/tsk2DsU2DmY9tgvnbLj2WQ6OSpE4zCCVJnWYQzqyL5roBE5iP7bJNg7FNg5mPbYL52y718ByhJKnT7BFKkjrNIJQkdZpBOEOSvDzJ7Uk2JXnnHLXh0CRfSnJrkluSnNWWH5jkC0nubH8un4O2LUpyY5Ir2tdrk3ytbdMnk+w9y+1ZluTyJLe1++uX5no/JTm7/b19K8nHk+w7F/spySVJtiT5Vk/ZuPsmjf/evu+/meT5s9im89vf3zeTfDrJsp5p57Rtuj3JCbPVpp5p/zFJJTmofT0r+0m7xiCcAUkWAX8O/DpwBPDaJEfMQVN2AL9fVc8GXgj8dtuOdwJXV9XTgavb17PtLODWntd/Ary/bdOPgLfMcnv+G/D5qnoW8G/ats3ZfkqyCvhdYKSqfgFYBJzG3Oynvwb6P1g90b75deDp7eMM4C9nsU1fAH6hqp4L3AGcA9C+508DntPW+Yv2b3Q22kSSQ4Hjge/3FM/WftIuMAhnxguATVX1nar6GfAJ4OTZbkRV3VNV32if/4Tm4L6qbcuH29k+DJwym+1Ksho4Ebi4fR3gxcDlc9GmJAcAvwJ8CKCqflZV25jj/QQsBvZLshhYCtzDHOynqvoH4L6+4on2zcnA31TjemBZkp+fjTZV1f+qqh3ty+uB1T1t+kRVPVhV3wU20fyNDr1NrfcDbwd6r0Sclf2kXWMQzoxVwA96Xt/Vls2ZJGuAo4CvAU+uqnugCUtg5Sw35wKaA8PO9vWTgG09B7HZ3l9PBbYCl7bDtRcn2Z853E9VtRl4H00v4h7gx8ANzO1+6jXRvpkv7/03A1e2z+esTUnWAZurakPfpPmynzQOg3BmZJyyOftcSpInAH8P/F5V/ctctaNty0nAlqq6obd4nFlnc38tBp4P/GVVHQX8lLkZLn5Ee87tZGAtcAiwP81
wWr/59nmnuf5dkuRcmtMCHx0rGme2obcpyVLgXOC88SaPUzbffpedZRDOjLuAQ3terwbunouGJFlCE4IfrapPtcX/PDYM0/7cMotNOhZYl+R7NEPGL6bpIS5rhwBh9vfXXcBdVfW19vXlNME4l/vppcB3q2prVT0EfAr4ZeZ2P/WaaN/M6Xs/yRuBk4DT69EPRc9Vm55G84/Mhvb9vhr4RpKD57BNGoBBODO+Djy9vcJvb5oT9etnuxHtubcPAbdW1Z/1TFoPvLF9/kbgf8xWm6rqnKpaXVVraPbLNVV1OvAl4FVz1Kb/D/hBkme2RS8BNjKH+4lmSPSFSZa2v8exNs3Zfuoz0b5ZD/wf7VWRLwR+PDaEOmxJXg68A1hXVdv72npakn2SrKW5QOWfht2eqrq5qlZW1Zr2/X4X8Pz2/TZn+0kDqCofM/AAXkFz5dq3gXPnqA3/lma45ZvATe3jFTTn5K4G7mx/HjhH7TsOuKJ9/lSag9Mm4DJgn1luy/OA0XZffQZYPtf7CfgvwG3At4CPAPvMxX4CPk5znvIhmoP5WybaNzRDfn/evu9vprnqdbbatInmvNvYe/2veuY/t23T7cCvz1ab+qZ/DzhoNveTj117eIs1SVKnOTQqSeo0g1CS1GkGoSSp0wxCSVKnGYSSpE4zCKVdlOS6ac5/XNpv39iFdT0ryVeTPJjkP+7KMiSNb/HUs0gaT1X98iyu7j6ab6eY7RuBS3s8e4TSLkpyf/vzuCTX5tHvN/xoe3eYse+pvC3JPwKv7Km7f/t9dl9vb/x9clv+tiSXtM+PTPPdhEuraktVfZ3mw9uSZpBBKM2Mo4Dfo/k+yqcCxybZF/gg8BvAi4CDe+Y/l+Z2c8cAvwac334DxgXA4Un+HXAp8H/WY28fJmmGGYTSzPinqrqrqnbS3O5rDfAsmhtp31nNLZz+tmf+lwHvTHITcC2wL3BYW/9NNLdY+3JVfWX2NkHqJs8RSjPjwZ7nD/Po39ZE9zAMcGpV3T7OtKcD99N8HZOkIbNHKA3PbcDaJE9rX7+2Z9pVwO/0nEs8qv35ROC/Ab8CPCnJq5A0VPYIpSGpqn9NcgbwuST3Av8I/EI7+d005wO/2Ybh92i+V+/9wF9U1R1J3gJ8Kck/0PzTOgocAOxM8nvAETXHX7ws7Qn89glJUqc5NCpJ6jSDUJLUaQahJKnTDEJJUqcZhJKkTjMIJUmdZhBKkjrt/wdd5EQn4ihQkwAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import seaborn as sns\n", "import scipy.stats as stats #used to get correlation coefficient\n", "\n", "j=sns.jointplot(x='index1', y='Latitude', data= CC_Sher1)" ] }, { "cell_type": "code", "execution_count": 345, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAb0AAAGoCAYAAADSNTtsAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAIABJREFUeJzt3XuU5HV55/H3I5AwEMhAGNbQMFyyisp1tHVIMBt1ieAFaEFjBHNZzJIcMWZQiTMriSbhHMkSF91zYgwhQqLEjBoYLhKJxgsbIsQhMzDDAorJgAxGINBKoFea5tk/6ldSjt3Tv6quql91f9+vc+p0Xbuerpnpz3zvkZlIklSCZzVdgCRJw2LoSZKKYehJkoph6EmSimHoSZKKYehJkoph6EmSimHoSZKKYehJkoqxa9MFjAi3pZG0FETTBYw6W3qSpGLY0pO0JP3VLfc1XULPzli9sukSlixbepKkYtjSW4DF/D9JSSqRLT1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEiM5uuoXER8Vlgvx5euh/wcJ/LWShrqm8U67Kmeqxpdg9n5kkN1zDSDL0FiIiNmTnedB2drKm+UazLmuqxJvXK7k1JUjEMPUlSMQy9hbmk6QJmYU31jWJd1lSPNaknjulJkophS0+SVAxDT5JUDENPklQMQ0+SVAxDDzjppJMS8OLFi5fFfqllif7Oq8XQAx5+uOmdgyRpeEr+nWfoSZKKYehJkoph6EmSimHoSZKKYehJkoph6EmSimHoSZKKYehJkoph6EmSimHoSZKKYehJkoph6EmSimHoSZKKYehJkoph6EmSimHoSZKKYehJkoph6EmSimHoSZKKYehJkoph6EmSimHoSZKKYehJkoph6EmSimHoSZKKYehJkoph6EmSimHoSZKKYehJkoph6EmSimHoSZKKYehJkoph6EmSimHoSZKKYehJkoph6EmSimHoSZKKYehJkoph6EmSimHoSZKKYehJkoph6EmSirFrE28aEccAHwF+DNgGnJmZ342IlwCXtJ8GvC8zr5rl9W8D1gA/BazIzIer+wP4EPBq4AngVzPzn/td/4ZN27nohrt5YHKKA5Yv4+XPW8EX73ro+7fPO/FwJlaN9fttJUkLFJk5/DeN+Crwrsz8ckScBRyamb8TEXsAT2bmUxHxk8BtwAGZ+dQOr18FPAp8CRjvCL1XA79JK/RWAx/KzNXz1TM+Pp4bN26sVfuGTdtZd+UWpqZndvq8ZwU8nTBmCEoanqjzpG5+5y0itX72pro3DwdurK5/DjgdIDOf6Ai43YFZEzkzN2X
mtlkeOhX4y2y5GVhehWffXHTD3fMGHrQCD2D75BTrrtzChk3b+1mGJKkHjXRvAluBU4CrgTcAB7UfiIjVwEeBg4Ff2rGVN48x4Jsdt++v7vvWjk+MiLOBswFWrlxZ+w0emJzqopyWqekZLrrhblt7khrT+Ttvv2eP8Ve33NdwRQtzxur6v7c7DaylFxGfj4its1xOBc4CzomIW4G9gCfbr8vMWzLzCODFwLqI2L2bt53lvrlai5dk5nhmjq9YsaL2GxywfFkX5Txjew9hKUn90vk7b6/l+zZdTmMGFnqZeUJmHjnL5erMvCszX5mZLwI+AXxjltffCTwOHNnF295PR6sROBB4YCE/x47OO/Fwlu22S0+vXfX7f2c3pyQ1qJExvYjYv/r6LOB8WjM5iYhDI2LX6vrBtMb+tnXxra8BfjlajgO+k5k/1LW5EBOrxnj/aUexfNluXb/20SemHd+TpAY1NZHlTRHxNeAuWi2xy6r7XwrcFhGbgauAt3bMzLw+Ig6orr89Iu6n1ZK7PSIurV5/PfAvwD3AnwFvHUTxE6vG2PzeV/LBNx7L2PJlBK1Zmm8+biVj83R/Tk3P8M5P3mbwSVIDGlmyMGr6PX33+Au/MO8Y3rLdduH9px3l5BZJ/VRr2v5hzz86L7j8ukHXMlCzTGQZ6SULS1qdcT9bfJI0fIbeANQd95vJdIxPkobI0BuQznG/XWLuVrctPkkaHkNvwCZWjfGBXzhmp92dM5msWb/ZJQ2SNGCG3hC0uzt31uIDlzRI0qAZekNSp8UHdndK0iAZekNUt8U3k8m56zdz/oYtQ6pMkspg6A1Z3RZfAh+/+T7H+SSpjwy9BnSzldmjT0w7yUWS+sTQa0jdJQ1tTnKRpIUz9BrW7u6ss3+Ok1wkaWEMvREwsWqMM49bWSv43MVFknpn6I2ICyaO4uI3HltrnM8WnyT1xtAbIZ3jfHX27XSCiyR1x9AbQd1McnF2pyTVZ+iNsLpr+sDZnZJUh6E34uru4gKtsb73XXPHEKqSpMXJ0FsEumnxTU5N29UpSXMw9BaJbndxsatTkn6YobeIdDO702UNkvTDDL1FqB1+++zhsgZJ6oaht4i99+Qjas/sXLN+M4et+wyHrP0Mx1/4BUNQUpEMvUWsm3E+gKez9XX75JRjfpKKZOgtct2e1tDmmJ+kEhl6S0Q3yxra3LxaUmkMvSWk2+5OcEG7pLIYektMN8sa2ianpm3tSSqCobdEdYbf2PJl8z7f8T1JJTD0lriJVWPctPYVbLvwNXzwjcfO+TzX9EkqgaFXkIlVY/MuaPeoIklLmaFXmG4XtBt+kkbJGatXcsbqlT2/3tArTDdHFYGbV0taWgy9AnW7ps+F7JKWCkOvUN2u6XOii6SlwNArWC9r+h59Yppz12/m/A1bBlydJPWfoaeuwy+Bj998n60+SYuOoafv63bzamd4SlpsDD39kG4nuhh+khYLQ0+z6mXzapc3SBp1hp7m1O7ufPNxK6l7Up/LGySNMkNP87pg4igu7mKGp8sbJI0qQ0+19Lq8we5OSaPE0FNXug0/uzsljRJDTz3pZnnDTKYL2iWNBENPC1J3eUMCV9x8ny0+SY0y9LRgdZc3JJ7QLqlZhp76om53pzM7JTXJ0FNftbs751vX58xOSU0w9NR3E6vGOLPGgnZndkoaNkNPA9Fe0F5nZqfdnZKGZdemC9DSNbFqDIB1V25hanpmp89tb1r9jk9u5umEseXLOO/Ew7//PSSpH2zpaaC63bj66Wx93T455ZifpL4z9DRw3Z7T1zY1PcP7rrljgJVJKo2hp6Hp9pw+gMmpacf7JPWNoaeh8pw+SU0y9DR0vZzY4PIGSf1g6KkxneE3tnzZvM+fybTFJ2lBDD01bmLVGDetfQUffOOx84732eKTtBCGnkZG3fE+F7RL6lUjoRcRx0TEVyJiS0RcGxF7V/e/JCI2V5fbIuJ1c7z+bRFxT0RkROzXcf+ZEXF7dfn
HiDhmWD+T+qOb5Q3tBe2Gn6S6mmrpXQqszcyjgKuA86r7twLjmXkscBLwpxEx264xNwEnAPfucP+/Aj+XmUcDfwBcMojiNXjdLG9wdqekupoKvcOBG6vrnwNOB8jMJzLzqer+3WkdwfZDMnNTZm6b5f5/zMxHq5s3Awf2s2gNV7u7s86Cdsf6JNXRVOhtBU6prr8BOKj9QESsjog7gC3Ab3SEYLfeAvztXA9GxNkRsTEiNj700EM9voUGrZsW30wm567fzPkbtgyhMmlx6fyd99jkI02X05iBhV5EfD4its5yORU4CzgnIm4F9gKebL8uM2/JzCOAFwPrImL3Ht775bRC791zPSczL8nM8cwcX7FiRbdvoSHqZkF7AlfcfJ8tPmkHnb/z9lq+b9PlNGZgpyxk5gnzPOWVABHxXOA1s7z+zoh4HDgS2Fj3fSPiaFpjhq/KzH+vX7FG2cSqMSZWjbFh03bed80dTE5Nz/ncBNas38ya9Zs9rUHSD2hq9ub+1ddnAecDH6luH9qeuBIRB9Ma+9vWxfddCVwJ/FJmfq3PZWsEdLt59fbJKWd4Svq+psb03hQRXwPuAh4ALqvufylwW0RspjWr862Z+TBARFwfEQdU198eEffTmqhye0RcWr3+d4GfAD5cLXuo3ULU4tIe66t7ZoPLGyQBROasEySLMj4+nhs3mo+L0fkbtnDFzffNPs13Dst224X3n3aUXZ5aimr9P/Cw5x+dF1x+3aBrGYozVq9sX631s7sjixa1CyaO4uIezulzeYNUJkNPi14v5/S5lZlUJkNPS0J7WUP7tIZuxvpc2yeVY2BLFqRhay9raKuzvAGeWds3fvC+jvNJS5wtPS1Z3SxvSHCcTyqAoaclr+6Yn4fUSkufoaci1N3KzJmd0tJm6KkY7e7ONx+3cqcTXZzZKS1dhp6KU3dtn7u4SEuPoacieUitVCZDT8Xq9pDaNes3c8jaz3D8hV8wAKVFytBT0XrZzWX75JQL2qVFytBT8bo5pLbNw2qlxal26EXEwRFxQnV9WUTsNbiypOHqXMheN/xc0C4tPrVCLyL+O/Bp4E+ruw4ENgyqKKkp3R5S64J2aXGp29I7Bzge+C5AZn4d2H9QRUlN62aszwXt0uJRN/S+l5lPtm9ExK7Q1bmd0qKz48kNO+OCdmlxqBt6X46I/wEsi4ifBz4FXDu4sqTRMLFqjJvWvoJtF76mVpenC9ql0VY39NYCDwFbgF8HrgfOH1RR0ijqdkG74SeNnlrn6WXm08CfVRepWO3z9t75yduYyfl7+Nu7uXS+VlJzdhp6EbGFnYzdZebRfa9IGnHt8Fp35RampmfmfX57okvnayU1Y76W3murr+dUXz9WfT0TeGIgFUmLQDu86pzMDq2JLueu38zGex/hgomjBl2epDnsdEwvM+/NzHuB4zPztzNzS3VZC5w4nBKl0dTtgnZ3cZGaV3ciy54R8dL2jYj4GWDPwZQkLS7dhJ+7uEjNqjWRBXgL8NGI+PHq9iRw1mBKkhaniVVjTKwaY8Om7Tud6NLexaX9GknDU3f25q3AMRGxNxCZ+Z3BliUtXu0gO3f95jlngbWPKlqzfjNjy5dx3omHG4DSENQKvYj43R1uA5CZvz+AmqRFb2LVGBvvfYQrbr5v3q2Ltk9OsWb9Zn7v2jt478lHGH7SANUd03u84zIDvAo4ZEA1SUvCBRNHcXHNjavBBe3SMNTt3vxA5+2I+CPgmoFUJC0h3a7pg2fCz5af1H+9HiK7B3BYPwuRlqr2xtV1W3xt7d1cbPVJ/VN3TK9zZ5ZdgBXAHwyqKGmp6aXFB+7mIvVb3SULr+24/hTw7cx8agD1SEtWO7QuuuFutk9OEdQ7n8slDlL/1A29CzLzlzrviIiP7XifpJ1rr+Vr27Bpe62tzKamZ3jfNXcYetIC1R3TO6LzRnWI7Iv6X45Ulm52c5mcmnZmp7RAOw29iFgXEY8BR0fEd6vLY8C3gauHUqF
UgM7w29mEF5c1SAuz0+7NzHw/8P6IeH9mrhtSTVKx2t2Xa9Zv3unzPKdPpTtj9cqeXjdfS+951dVPRcQLd7z09I6Sdmpi1Rj77DH/qQ3tmZ22+KT65hvTe0f19QOzXP5ogHVJRXvvyUewbLdd5n3eTKbdnVIX5uvePLv6+vLhlCMJuj+k1u5OqZ66i9NPm+Xu7wBbMvPB/pYkCX7wqKK6yxpcyC7tXDfn6f008MXq9suAm4HnRsTvZ+bHBlCbJOqf0wfPdHeuWb+ZffbYzb07pR3UXaf3NPD8zDw9M08HXgB8D1gNvHtQxUl6xsSqMT7wC8fUGusDlzdIs6kbeodk5rc7bj8IPDczHwHmH3CQ1BftzavnW8jeyY2rpWfU7d78PxFxHfCp6vbpwI0RsScwOZDKJM2qm+7ONsf7pJa6Lb1zgMuBY4FVwF8C52Tm487slJrRbXenyxukmqGXLZ/OzHMzc011vc4G8ZIGqNfuTsNPpaoVehFxWkR8PSK+095/MyK+O+jiJM2vm02rOznWpxLVHdP7n8DJmXnnIIuR1LvOY4vqjvc51qfS1B3T+7aBJy0e3Yz3OdanktRt6W2MiPXABlrr8wDIzCsHUpWkBXMrM+mH1W3p7Q08AbwSOLm6vHZQRUnqj27H+9ontEtLVa2WXmb+t0EXImlwulnb1z6h3S3MtBTVnb15YERcFREPRsS3I+JvIuLAQRcnqb/qjvU5s1NLVd3uzcuAa4ADgDHg2uo+SYtM3bV9U9MzrFm/mUPWfobjL/yCAagloW7orcjMyzLzqepyObBigHVJGqD2WF+dE9oBtk9OOcNTS0Ld0Hs4It4cEbtUlzcD/z7IwiQNXt0T2tvs9tRiVzf0zgJ+Afg34FvA6wEnt0iLXC/bmLUXtBt8Wozq7r15X2aekpkrMnP/zJwAZjtNXdIi07msYZeIWq9xQbsWq7otvdm8o29VSGpct6c2gN2dWnwWEnr1/ks42wsjjomIr0TEloi4NiL2ru5/SURsri63RcTr5nj92yLinojIiNhvlsdfHBEzEfH6XmuUStTu7hxbvgyo94/cBe1aTOpuQzabhRwtdCnwrsz8ckScBZwH/A6wFRjPzKci4ieB2yLi2sx8aofX3wRcB3xpx28cEbsAfwjcsID6pGJ1blwN9TavdkG7FoudtvTaRwjNcnmM1pq9Xh0O3Fhd/xytk9jJzCc6Am535gjWzNyUmdvm+N6/CfwN8OAC6pNU6WZBu+N8GnU7Db3M3Csz957lsldmLqSVuBU4pbr+BuCg9gMRsToi7gC2AL8xSytvThExBrwO+EiN554dERsjYuNDDz3UVfFSabqZ5ek432jq/J332OQjTZfTmIWM6e1URHw+IrbOcjmV1hKIcyLiVmAv4Mn26zLzlsw8AngxsC4idu/ibT8IvDszZ+Z7YmZekpnjmTm+YoXr7KX5dLOg3WUNo6fzd95ey/dtupzGDCz0MvOEzDxylsvVmXlXZr4yM18EfAL4xiyvvxN4HDiyi7cdB/46IrbRWkv44YiY6MOPI6lSd0G7yxo0ihbSRdmziNg/Mx+MiGcB51N1R0bEocA3q4ksB9Ma+9tW9/tm5qEd73E5cF1mbuhn7VLpPKdPi9nAWnrzeFNEfA24C3iAZzavfimtGZubgauAt2bmwwARcX1EHFBdf3tE3A8cCNweEZcO/SeQCtbLOX12d2oURO5kGnIpxsfHc+PGjU2XIS1adZY1tO2zx24ubRicWuunD3v+0XnB5dcNupaBOmP1yh3vqvWzN9XSk7SEdLObS3tpg0cWqQmGnqS+6GXz6u2TUy5v0FAZepL6ppfNqx3v0zAZepL6rtvNq13eoLrOWL1ytvG82gw9SQPRS3enu7lo0Aw9SQPT7u5883Erax/LYnenBsnQkzRwF0wcxcVvPPb7RxbNZybTFp8GopEdWSSVp/PIog2btrPuyi1MTc+9TW67xdd+rdQPtvQkDV3d8T4nuKjfDD1
JjehmecOjT0xz7vrNnL9hy5Cq01Jl6ElqVN3lDQl8/Ob7bPVpQQw9SY1rd3fWWdDusgYthKEnaSR0s6DdZQ3qlaEnaWR0s6DdZQ3qhaEnaaR0s6B9anrGExvUFUNP0khqL2ivu43Z9skplzdoXoaepJHVy6kNTnTRzhh6kkZet6c2ONFFczH0JC0K3SxrgNZEFxe0a0eGnqRFo9sWnwvatSNDT9Ki0m7xtU9sqNPue/SJaSe5CPCUBUmLUOeJDdA6teGdn7yNmcydvq49yaX9PVQeW3qSFr12t2edVt/U9AwX3XD3wGvSaDL0JC0JE6vGOLPmCe3bJ6fs5iyUoSdpyehmQbtjfGUy9CQtKZ0L2ucLPye4lMfQk7QkdYbffAy/chh6kpa0iVVj31/eMB+3MFv6DD1JS955Jx7e1RZma9Zv9tSGJcrQk7TkdXNOX5unNixNhp6kInQzwaWTXZ5Li6EnqSi9hJ+nNiwdhp6kIrXDb5896gXfTKbdnUuAoSepaO89+Yjak1zA7s7Fzg2nJRWtvfH0RTfczfbJKYLWkUQ70+7u7Hy9FgdDT1Lxejm1od3d+XvX3sF7Tz7C8Fsk7N6UpB10c1it3Z2Li6EnSbPoZm3f1PQM77vmjiFUpYUy9CRpDp3LG3aJnR9aNDk17czORcDQk6R51O3udOPq0edEFkmqoT1R5X3X3MHk1PROn9se5+t83VJ3xuqVTZdQiy09SaqpmwXt7uIymgw9SepS3QXt7uIyegw9SepSt6c2ONY3Ogw9SepBLxtXG37NM/QkaQG6WdbQ9ugT05y7fjPnb9gy4Oq0I0NPkvqgm11coLW/58dvvs9W35AZepLUJ72c0O42ZsNl6ElSH3lI7Wgz9CRpALoNv5lMW3xDYOhJ0gC1w+/Nx61kvmkuU9MzXHTD3UOpq1SGniQNwQUTR3FxjVbfA5NTQ6qoTIaeJA1JneUNByxfNuSqymLoSdKQzbW8Ydluu3DeiYc3VFUZPGVBkhrQPn3hohvu5oHJKQ5YvozzTjy8mFMZmmLoSVJDJlaNGXJDZvemJKkYhp4kqRiNhF5EHBMRX4mILRFxbUTsXd3/kojYXF1ui4jXzfH6t0XEPRGREbHfDo+9rHr9HRHx5WH8PJK0EBs2bef4C7/AoWs/w/EXfsEF6gPUVEvvUmBtZh4FXAWcV92/FRjPzGOBk4A/jYjZxh1vAk4A7u28MyKWAx8GTsnMI4A3DKh+SeqLDZu2s+7KLWyfnCKB7ZNT7swyQE2F3uHAjdX1zwGnA2TmE5n5VHX/7rQ2Iv8hmbkpM7fN8tAZwJWZeV/1vAf7WbQk9dtFN9zN1PTMD9znziyD01TobQVOqa6/ATio/UBErI6IO4AtwG90hGAdzwX2iYgvRcStEfHLfatYkgZgrh1Y3JllMAYWehHx+YjYOsvlVOAs4JyIuBXYC3iy/brMvKXqmnwxsC4idu/ibXcFXgS8BjgR+J2IeO4c9Z0dERsjYuNDDz3U408pSQsz1w4s/d6ZpfN33mOTj/T1ey8mAwu9zDwhM4+c5XJ1Zt6Vma/MzBcBnwC+Mcvr7wQeB47s4m3vBz6bmY9n5sO0ulCPmaO+SzJzPDPHV6xY0f0PKEl9cN6Jhw9lZ5bO33l7Ld+3r997MWlq9ub+1ddnAecDH6luH9qeuBIRB9Ma+9vWxbe+GvjZiNg1IvYAVgN39rF0Seqr9sGzY8uXEcDY8mW8/7SjXLQ+IE3tyPKmiDinun4lcFl1/aXA2oiYBp4G3lq12IiI64Ffy8wHIuLtwG8DzwZuj4jrM/PXMvPOiPgscHv1+kszc+sQfy5J6po7swxPZM46QbIo4+PjuXHjxqbLkKSFmu/IPgAOe/7RecHl1/X1jc9YvbKv368HtX52d2SRJBXD0JMkFcPQkyQVw9CTJBXD0JMkFcPQkyQVw5PTJWlEbNi0nYtuuJsHJqc4YPkyzjvxcNfv9ZmhJ0kjoH3EUPvEhfYRQ4DB10d
2b0rSCPCIoeGwpSdJI2CYRwztu+ePjMIOKo2wpSdJI2BYRwyVztCTpBEwrCOGSmf3piSNgPZkFWdvDpahJ0kjwiOGBs/uTUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEiM5uuoXER8RBwbw8v3Q94uM/lLJQ11TeKdVlTPdY0u4cz86T5nhQRn63zvKXI0FuAiNiYmeNN19HJmuobxbqsqR5rUq/s3pQkFcPQkyQVw9BbmEuaLmAW1lTfKNZlTfVYk3rimJ4kqRi29CRJxTD0JEnFMPR6FBEnRcTdEXFPRKxtqIaDIuKLEXFnRNwREb9V3b9vRHwuIr5efd2ngdp2iYhNEXFddfvQiLilqml9RPzIkOtZHhGfjoi7qs/rp5v+nCLi3OrPbWtEfCIidm/ic4qIj0bEgxGxteO+WT+baPnf1d/72yPihUOs6aLqz+/2iLgqIpZ3PLauqunuiDhxWDV1PPauiMiI2K+6PZTPSd0z9HoQEbsAfwy8CngB8KaIeEEDpTwFvDMznw8cB5xT1bEW+PvMfA7w99XtYfst4M6O238IXFzV9CjwliHX8yHgs5n5POCYqrbGPqeIGAPeDoxn5pHALsAv0szndDmw40LluT6bVwHPqS5nA38yxJo+BxyZmUcDXwPWAVR/538ROKJ6zYerf6PDqImIOAj4eeC+jruH9TmpS4Zeb14C3JOZ/5KZTwJ/DZw67CIy81uZ+c/V9cdo/SIfq2r5i+ppfwFMDLOuiDgQeA1waXU7gFcAn26ipojYG/gvwJ8DZOaTmTlJw58TsCuwLCJ2BfYAvkUDn1Nm3gg8ssPdc302pwJ/mS03A8sj4ieHUVNm/l1mPlXdvBk4sKOmv87M72XmvwL30Po3OvCaKhcDvw10zgocyuek7hl6vRkDvtlx+/7qvsZExCHAKuAW4D9l5regFYzA/kMu54O0fgk8Xd3+CWCy4xfWsD+vw4CHgMuqLtdLI2JPGvycMnM78Ee0WgffAr4D3Eqzn1OnuT6bUfm7fxbwt9X1xmqKiFOA7Zl52w4PjcrnpB0Yer2JWe5rbO1HRPwY8DfAmsz8blN1VLW8FngwM2/tvHuWpw7z89oVeCHwJ5m5CnicZrp8v68aIzsVOBQ4ANiTVpfYjkZtTVHTf5ZExHtode1f0b5rlqcNvKaI2AN4D/C7sz08y32j9mdZJEOvN/cDB3XcPhB4oIlCImI3WoF3RWZeWd397XZXSvX1wSGWdDys2gj2AAADVklEQVRwSkRso9Xt+wpaLb/lVTceDP/zuh+4PzNvqW5/mlYINvk5nQD8a2Y+lJnTwJXAz9Ds59Rprs+m0b/7EfErwGuBM/OZRcZN1fRTtP7Tclv19/1A4J8j4tkN1qR5GHq9+SrwnGqm3Y/QGkS/ZthFVGNlfw7cmZn/q+Oha4Bfqa7/CnD1sGrKzHWZeWBmHkLrc/lCZp4JfBF4fUM1/RvwzYg4vLrrvwL/lwY/J1rdmsdFxB7Vn2O7psY+px3M9dlcA/xyNTvxOOA77W7QQYuIk4B3A6dk5hM71PqLEfGjEXEorckj/zToejJzS2bun5mHVH/f7wdeWP19a+xz0jwy00sPF+DVtGaQfQN4T0M1vJRWl8ntwObq8mpaY2h/D3y9+rpvQ/W9DLiuun4YrV9E9wCfAn50yLUcC2ysPqsNwD5Nf07A7wF3AVuBjwE/2sTnBHyC1rjiNK1f3G+Z67Oh1W33x9Xf+y20Zp8Oq6Z7aI2Ttf+uf6T
j+e+parobeNWwatrh8W3AfsP8nLx0f3EbMklSMezelCQVw9CTJBXD0JMkFcPQkyQVw9CTJBXD0JNqiIh/7PL5L4vqhIke3ut5EfGViPheRLyrl+8haXa7zv8USZn5M0N8u0doncAw7A2wpSXPlp5UQ0T8R/X1ZRHxpXjmbL4rqh1V2mcs3hUR/wCc1vHaPauz2L5abXh9anX/OyLio9X1o6J1rt4emflgZn6V1iJoSX1k6EndWwWsoXWW4mHA8RGxO/BnwMnAzwLP7nj+e2htx/Zi4OXARdUpDx8E/nNEvA64DPj1/MHttST1maEnde+fMvP+zHya1nZYhwDPo7WB9Neztc3Rxzue/0pgbURsBr4E7A6srF7/q7S2IPtyZt40vB9BKpNjelL3vtdxfYZn/h3NtadfAKdn5t2zPPYc4D9oHS8kacBs6Un9cRdwaET8VHX7TR2P3QD8ZsfY36rq648DH6J1qvtPRMTrkTRQtvSkPsjM/xcRZwOfiYiHgX8Ajqwe/gNa43e3V8G3jdaZcBcDH87Mr0XEW4AvRsSNtP4zuhHYG3g6ItYAL8iGDwiWlgJPWZAkFcPuTUlSMQw9SVIxDD1JUjEMPUlSMQw9SVIxDD1JUjEMPUlSMf4/Yfv2gAtKDdgAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import seaborn as sns\n", "import scipy.stats as stats #used to get correlation coefficient\n", "\n", "j=sns.jointplot(x='index1', y='Longitude', data= CC_Sher1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It looks like there is a section of the data that is not picking up the right coordinates, lets change the first 50 index of values with a different zip-code to see if that improved the score." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Spotting outliers and data cleanup \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When looking for outliers, **you cannot rely on the graph alone**. We will do another example to illustrate this (i.e. Lexingtion Parkway)" ] }, { "cell_type": "code", "execution_count": 316, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(151, 4)\n" ] } ], "source": [ "#abs_alg('Fuller Ave','55104')\n", "abs_alg('Lexington Parkway','55104')\n", "\n", "B=df.loc[0:50,:]\n", "C=df.loc[51:100,:]\n", "D=df.loc[101:,]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#run_geo_algorithm(B,['55104']) #notice missing values and differnt zip code\n", "#run_geo_algorithm(C,['55104']) \n", "#run_geo_algorithm(D,['55104']) " ] }, { "cell_type": "code", "execution_count": 407, "metadata": { "code_folding": [] }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "execution_count": 407, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Add_Com=pd.concat([B,C,D])\n", "#CC= Add_Com[['Block','For_Address','Latitude','Longitude','Fail']]\n", "#CC_Lex=CC.reset_index()\n", "#CC_Lex['index1'] = CC_Lex.index\n", "plot_geomatch(CC_Lex)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that the number of outliers differ from the map (Zoom out) compared to scatterplot below. The reason is that some of the outliers is mapped within the range of normal value and hence disguised. Thus a statistical approach should be used to find the index of the outliers. " ] }, { "cell_type": "code", "execution_count": 375, "metadata": {}, "outputs": [ { "ename": "TypeError", "evalue": "Object of type ndarray is not JSON serializable", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\IPython\\core\\formatters.py\u001b[0m in \u001b[0;36m__call__\u001b[1;34m(self, obj)\u001b[0m\n\u001b[0;32m 339\u001b[0m \u001b[1;32mpass\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 340\u001b[0m \u001b[1;32melse\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 341\u001b[1;33m \u001b[1;32mreturn\u001b[0m \u001b[0mprinter\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mobj\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 342\u001b[0m \u001b[1;31m# Finally look for special method names\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 343\u001b[0m \u001b[0mmethod\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mget_real_method\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mobj\u001b[0m\u001b[1;33m,\u001b[0m 
\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mprint_method\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\mpld3\\_display.py\u001b[0m in \u001b[0;36m\u001b[1;34m(fig, kwds)\u001b[0m\n\u001b[0;32m 408\u001b[0m \u001b[0mformatter\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mip\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdisplay_formatter\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mformatters\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;34m'text/html'\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 409\u001b[0m formatter.for_type(Figure,\n\u001b[1;32m--> 410\u001b[1;33m lambda fig, kwds=kwargs: fig_to_html(fig, **kwds))\n\u001b[0m\u001b[0;32m 411\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 412\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\mpld3\\_display.py\u001b[0m in \u001b[0;36mfig_to_html\u001b[1;34m(fig, d3_url, mpld3_url, no_extras, template_type, figid, use_http, **kwargs)\u001b[0m\n\u001b[0;32m 249\u001b[0m \u001b[0md3_url\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0md3_url\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 250\u001b[0m \u001b[0mmpld3_url\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mmpld3_url\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 251\u001b[1;33m \u001b[0mfigure_json\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mjson\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdumps\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfigure_json\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mcls\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mNumpyEncoder\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 252\u001b[0m 
\u001b[0mextra_css\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mextra_css\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 253\u001b[0m extra_js=extra_js)\n", "\u001b[1;32m~\\Anaconda3\\lib\\json\\__init__.py\u001b[0m in \u001b[0;36mdumps\u001b[1;34m(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)\u001b[0m\n\u001b[0;32m 236\u001b[0m \u001b[0mcheck_circular\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mcheck_circular\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mallow_nan\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mallow_nan\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mindent\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mindent\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 237\u001b[0m \u001b[0mseparators\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mseparators\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mdefault\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mdefault\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0msort_keys\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0msort_keys\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 238\u001b[1;33m **kw).encode(obj)\n\u001b[0m\u001b[0;32m 239\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 240\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m~\\Anaconda3\\lib\\json\\encoder.py\u001b[0m in \u001b[0;36mencode\u001b[1;34m(self, o)\u001b[0m\n\u001b[0;32m 197\u001b[0m \u001b[1;31m# exceptions aren't as detailed. 
The list call should be roughly\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 198\u001b[0m \u001b[1;31m# equivalent to the PySequence_Fast that ''.join() would do.\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 199\u001b[1;33m \u001b[0mchunks\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0miterencode\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mo\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0m_one_shot\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;32mTrue\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 200\u001b[0m \u001b[1;32mif\u001b[0m \u001b[1;32mnot\u001b[0m \u001b[0misinstance\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mchunks\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m(\u001b[0m\u001b[0mlist\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mtuple\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 201\u001b[0m \u001b[0mchunks\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mlist\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mchunks\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m~\\Anaconda3\\lib\\json\\encoder.py\u001b[0m in \u001b[0;36miterencode\u001b[1;34m(self, o, _one_shot)\u001b[0m\n\u001b[0;32m 255\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mkey_separator\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mitem_separator\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msort_keys\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 256\u001b[0m self.skipkeys, _one_shot)\n\u001b[1;32m--> 257\u001b[1;33m \u001b[1;32mreturn\u001b[0m \u001b[0m_iterencode\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mo\u001b[0m\u001b[1;33m,\u001b[0m 
\u001b[1;36m0\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 258\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 259\u001b[0m def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,\n", "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\mpld3\\_display.py\u001b[0m in \u001b[0;36mdefault\u001b[1;34m(self, obj)\u001b[0m\n\u001b[0;32m 136\u001b[0m numpy.float64)):\n\u001b[0;32m 137\u001b[0m \u001b[1;32mreturn\u001b[0m \u001b[0mfloat\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mobj\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 138\u001b[1;33m \u001b[1;32mreturn\u001b[0m \u001b[0mjson\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mJSONEncoder\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdefault\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mobj\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 139\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 140\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m~\\Anaconda3\\lib\\json\\encoder.py\u001b[0m in \u001b[0;36mdefault\u001b[1;34m(self, o)\u001b[0m\n\u001b[0;32m 177\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 178\u001b[0m \"\"\"\n\u001b[1;32m--> 179\u001b[1;33m raise TypeError(f'Object of type {o.__class__.__name__} '\n\u001b[0m\u001b[0;32m 180\u001b[0m f'is not JSON serializable')\n\u001b[0;32m 181\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;31mTypeError\u001b[0m: Object of type ndarray is not JSON serializable" ] }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYQAAAD8CAYAAAB3u9PLAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAGVhJREFUeJzt3XuQXOV55/Hvw0iYEZiMWNDaukX2wnKxicEMWXZJ7SpkMTdFECok5IJZJ0ZVrlBRYA1GSxVOYrvIruyYmE0cUwQ7jl1mdx2WaEkRrYtLpcqK7YyQhcxFRsZgM2gjsUZZgxTQ5dk/zmnUGnqme2b6crrn+6mampm3+3S/faTu37zvec9zIjORJOmoXndAklQNBoIkCTAQJEklA0GSBBgIkqSSgSBJAgwESVLJQJAkAQaCJKk0r9cdmI4TTzwxV6xY0etuSFJf2bx580uZeVKz+/VVIKxYsYKxsbFed0OS+kpEPN/K/ZwykiQBBoIkqWQgSJIAA0GSVDIQJEnANAIhIoYiYktEPFD+/oWI+H5EfLv8OmuS7a6NiGfKr2vr2s+JiG0RsSMiPhMRMfuXI0maqemMENYCT01ouykzzyq/vj1xg4g4Afgo8K+AnwY+GhELy5s/C6wBTim/Lp5u5yVJ7dNSIETEUuAy4O5pPv5FwNcy80eZ+TLwNeDiiHg7cHxm/l0W1/D8InDFNB9bktRGrY4Q7gBuBg5NaP9ERDweEZ+OiLc02G4J8MO6318o25aUP09slyT1SNNAiIhVwK7M3DzhpnXAacC5wAnARxpt3qAtp2hv9PxrImIsIsZ2797drLuSpBlqZYRwPrA6Ip4D7gUuiIgvZebOLLwGfJ7iGMFELwDL6n5fCrxYti9t0P4mmXlXZo5m5uhJJzUtxSFJmqGmgZCZ6zJzaWauAK4GHs7MXy+PA1CuDroC+E6DzTcC74uIheXB5PcBGzNzJ/DjiDiv3P79wF+15yVJkmZiNuchfDkitgHbgBOBjwNExGhE3A2QmT8CPgb8ffn1+2UbwIcoDlLvAL4HPDiLvkiSZimKRT79YXR0NK12KknTExGbM3O02f08U1mSBBgIkqSSgSBJAgwESVLJQJAkAQaCJKlkIEiSAANBklQyECRJgIEgSSoZCJIkwECQJJUMBEkSYCBIkkoGgiQJMBAkSSUDQZIEGAiSpJKBIEkCYF6vOyBJauz+LeOs37idF/fsY/HIMDdddCpXnL2kY89nIEhSBd2/ZZx1921j3/6DAIzv2ce6+7YBdCwUnDKSpApav3H7G2FQs2//QdZv3N6x5zQQJKmCXtyzb1rt7WAgSFIFLR4ZnlZ7OxgIklRBN110KsPzh45oG54/xE0Xndqx5/SgsiRVTG110b79BxmK4GAmS1xlJElzy8TVRQcz3xgZdDIMwCkjSaqUXqwuqjEQJKlCerG6qMZAkKQK6cXqohoDQZIqpBeri2o8qCxJFdGr1UU1LQdCRAwBY8B4Zq6qa78T+EBmHtdgm6OBzwGjwCFgbWY+Wt72K8B/AhJ4Efj1zHxp5i9FkvrT/VvG+d0NT7Bn3/432rq5uqhmOiOEtcBTwPG1hogYBUam2OY6gMw8MyIWAQ9GxLkUU1V/BJyRmS9FxH8Brgd+d3rdl6TqqP2FP75n3xt/4U/2fcnIMD972kk8sHXnEUFQr7a6qFKBEBFLgcuATwA3lm1DwHrgV4FfmGTTM4CHADJzV0TsoRgtbAECODYi/i9FyOyY+cuQpO6rD4CgmO6oOZg55ffxPfv40jd+0PQ5urG6qKbVEcIdwM3AW+vargc2ZObOiJhsu63A5RFxL7AMOAdYlpnfiogPAduAV4FngN9q9AARsQZYA7B8+fIWuytJ7TdVAORkG81SN1YX1TQNhIhYBezKzM0RsbJsWwxcBaxssvk9wOkUxx6eBzYBByJiPvAh4GzgWeBOYB3w8YkPkJl3AXcBjI6OdmqfS9KkGs3
xd+PDqFuri2paGSGcD6yOiEuBYyimd54AXgN2lKODBRGxIzNPrt8wMw8AN9R+j4hNFKOBs8rbv1e2/3fgllm/Gklqg6lGAt2ycMF8Pvrz7+ra8QNoIRAycx3FX++UI4QP168yKttfmRgGZfsCIDLz1Yi4EDiQmU+WI4wzIuKkzNwNXEhxwFqSuq4XU0GT6UUQ1LT9PISIWA2MZuZtwCJgY0QcAsaBawAy88WI+D3gbyNiP8V00n9od18kqZFuBMBRAYeSpquMHnl6d9eumdxMZPbPtPzo6GiOjY31uhuS+lSjYwHtUguAbp5I1qqI2JyZo83u55nKkgZeJ4KgygEwUwaCpIHR6MSwdh8U7uUcf6cZCJL62mTHA2ongM02DAZxJDAZA0FSX2j01//I8Hxeff0A+w+258Mf5lYATGQgSKq8RpeVBNpyTGAuB8BEBoKkSqofEXTCIB8LmCkDQVIldOvsYINgcgaCpJ7o1MlhtcdqdCLYXJ8SasZAkNQVnQqA+UcFxx0zjz1791fibN9+ZiBIaqturQYCDwS3m4EgaVZauUhMO88QHp4/xO1XnmkIdICBIGlGunmNAJeGdoeBIKkl3bxGgAHQGwaCpCl1YyRgAFSDgSAJ6HxhuNpqoJf37ncpaEUZCNIc08oHv4Xh5iYDQRowjT7wO/3BX2MA9DcDQRogkxWBa/cH/0SWgxgMBoI0QNZv3P5GGHSSI4HBZCBIA+L+LeMdqwxa40hgsBkIUp+qHSt4cc8+fqIsDdEutRGAq4HmFgNBqrDJ6gK9fuAge/cfeuN+My0N4Qe/6hkIUgU1OhlsNnWBLAWtVhgIUgV0qizEkpFhvn7LBW16NA06A0HqoU6WhRieP8RNF53apkfTXGAgSF3UyQJxXihGs2UgSB3QSnmIdoaBy0HVDgaC1EZTHQxuV12gkeH5ROBIQG1nIEiz0I1rBPjXv7rFQJBmoJMHgy0LoV4xEKQWOBLQXNByIETEEDAGjGfmqrr2O4EPZOZxDbY5GvgcMAocAtZm5qN1t/1XYGV5262Z+ZczfiVSG00VAI4ENKimM0JYCzwFHF9riIhRYGSKba4DyMwzI2IR8GBEnJuZh4BbgV2Z+S8j4ijghGn3XmqzTkwFWR5C/aKlQIiIpcBlwCeAG8u2IWA98KvAL0yy6RnAQwCZuSsi9lCMFr4F/AZwWnnbIeClGb8KaZrqC8MtHhnmZ087iQe27pxxTaBGnAJSv2l1hHAHcDPw1rq264ENmbkzIibbbitweUTcCywDzgGWRcR3y9s/FhErge8B12fmP0yz/1JLppoCGt+zjy994wezfg6ngNTvmgZCRKyimNrZXH54ExGLgaso5v+ncg9wOsWxh+eBTcCB8nmXAl/PzBsj4kbgk8A1DZ5/DbAGYPny5S29KKmmk6uBahwJaFBE5tRvj4i4neKD+gBwDMUxhNfKr38q77YceDYzT27yWJuAD1Ici3gFeGtmHoqIZcDfZOa7ptp+dHQ0x8bGmr4ozV2NrhGw/2B7I8CRgPpNRGzOzNFm92s6QsjMdcC68kFXAh+uX2VUtr/SKAwiYgFF6LwaERcCBzLzyfK2/0UxwngY+DngyWZ9kSbTaCTQruMBBoDmirafhxARq4HRzLwNWARsjIhDwDhHTgl9BPiLiLgD2A18oN190eDyvACp/ZpOGVWJU0ZzTytF4mZr4YL5XPZTb+eRp3e/serIkYAGSdumjKRum+yv/3YXiXMKqDsmLvFt1z7v1ON2Q1X7biCoMjqxIshrBPTW/VvGWXffNvbtPwgUS3zX3bcNYFb/Dp163G6oct8NBPVMp48DeAyg99Zv3P7GB1/Nvv0HWb9x+6z+XTr1uN1Q5b4bCOqabtQHAoOgSl7cs29a7b1+3G6oct8NBHVMtwLAYwLVtXhkmPEGH3SLR4Yr+bjdUOW+GwiatU5fLrL2WLXHtkhc/7jpolOPmC8HGJ4/xE0XnVrJx+2GKvfdQNCsTDxA1q6
VQDVO//S32r9bu1fUdOpxu6HKffc8BLWs0UigUwwCqX08D0FtNdlIoF08DiD1noGgljRaKjcbBoBUPQaCGk4FTfw+WwaAVH0GwhzW6Mzg2of/xO/T5Uogqf8YCANsqr/8RzpwrYDh+UPcfuWZfvhLfcpAGECt/OU/22sFeE6ANHgMhAHQjWsD1CwZGebrt1zQwWeQ1CsGQp+buBy0k2FQlbMpJXWGgdDnOrUc1Ckhae4xECqolWWg7T5TeCiCT/3Se/zAl+YwA6ECpjoGMNky0OmGQe0v/8kcyjQMpDnOQOiRyUKg0xeJOf8PHq5s6V1JvWUgdEm3rg1QM9mcf5VL70rqLQOhQ7odADXNloVWufSupN4yEDqgm0tB67X6l/4VZy8xACS9iYEwQ7URQKO/stu5FHSyZaAuC5XUbl4gZ5oalYUAmH9UcNwx83h578xKQky8TKQf8JLaxQvktEmrZSH2H8oZh4FXB5NUBQbCBN04GOy1ASRVkYFQajQV1O7JNANAUpXN2UCwQqgkHWngA6FRXaB2TwUtmH8U+w9lw4vNeNKXpH5xVK870Em18wFqpRpq9X/aNRpYuGA+d/zyWTz5sUtY/4vvYUlZ/mEoAihGBl5BTFK/GOgRQqdKQzc6FuDJXpL6XcuBEBFDwBgwnpmr6trvBD6Qmcc12OZo4HPAKHAIWJuZj064zwbgnZn57hm9gim82KCI23S4GkjSXDKdEcJa4Cng+FpDRIwCI1Nscx1AZp4ZEYuAByPi3Mw8VG5/JfDKtHvdosUjww0rezbjeQGS5qKWjiFExFLgMuDuurYhYD1w8xSbngE8BJCZu4A9FKMFIuI44Ebg4zPpeCtuuuhUhucPNb3fUcWUP0tGhrnjl89iy23vMwwkzTmtjhDuoPjgf2td2/XAhszcGeVB1Aa2ApdHxL3AMuCc8vu3gI8BnwL2TvXEEbEGWAOwfPnyFrtbqK8t1OjqY04FSdJhTQMhIlYBuzJzc0SsLNsWA1cBK5tsfg9wOsWxh+eBTcCBiDgLODkzb4iIFVM9QGbeBdwFRS2jZv2dqNMHe6cqcidJ/aSVEcL5wOqIuBQ4huIYwhPAa8COcnSwICJ2ZObJ9Rtm5gHghtrvEbEJeAb4d8A5EfFc2YdFEfFoZq6c9Svqoollrsf37GPdfdsADAVJfafpMYTMXJeZSzNzBXA18HBmLszMt2XmirJ978QwAIiIBRFxbPnzhcCBzHwyMz+bmYvLbX8G+G6/hQE0Xta6b/9B1m/c3qMeSdLMtf08hIhYDYxm5m3AImBjRBwCxoFr2v18vTTZstbZLneVpF6YViCU5xA82qD9uLqfNwAbyp+fA6as21Dep+3nIHTDZMtavWC9pH400KUrOq3RslZrF0nqVwNduqLTvGC9pEFiIMySNYwkDQqnjCRJgIEgSSoZCJIkwECQJJUMBEkSYCBIkkoGgiQJ8DwEtcAS39LcYCBoSpb4luYOp4w0JUt8S3OHgaApWeJbmjsMBE1pslLelviWBo+BoClZ4luaOzyorClZ4luaOwwENWWJb2lucMpIkgQYCJKkkoEgSQIMBElSyYPKGkjWX5Kmz0DQwLH+kjQzThlp4Fh/SZoZRwgaONOpv+TUknSYIwQNnFbrL9Wmlsb37CM5PLV0/5bxLvRSqh4DQQOn1fpLTi1JR3LKSAOn1fpLlvaWjmQgaCC1Un9p8cgw4w0+/C3trbnKKSPNWZb2lo7UciBExFBEbImIBya03xkRr0yyzdER8fmI2BYRWyNiZdm+ICL+OiKejognIuIPZvUqpBm44uwl3H7lmSwZGSaAJSPD3H7lma4y0pw1nSmjtcBTwPG1hogYBUam2OY6gMw8MyIWAQ9GxLnlbZ/MzEci4mjgoYi4JDMfnF73pdmxtLd0WEsjhIhYClwG3F3XNgSsB26eYtMzgIcAMnMXsAcYzcy9mflI2f468BiwdCYvQJLUHq1OGd1B8cF/qK7temBDZu6cYru
twOURMS8i3gGcAyyrv0NEjAA/TxkckqTeaDplFBGrgF2ZubnuGMBi4CpgZZPN7wFOB8aA54FNwIG6x54HfAX4TGY+O8nzrwHWACxfvrxZdyVJMxSZOfUdIm4HrqH4ID+G4hjCa+XXP5V3Ww48m5knN3msTcAHM/PJ8vd7gFcy87db6ezo6GiOjY21cldJUikiNmfmaLP7NZ0yysx1mbk0M1cAVwMPZ+bCzHxbZq4o2/c2CoNyNdGx5c8XAgfqwuDjwE8AvzOdFyZJ6oy2n4cQEasj4vfLXxcBj0XEU8BHKEYatYPUt1IcdH4sIr4dER9sd18kSa2b1pnKmfko8GiD9uPqft4AbCh/fg5401k+mfkCENPqqaSGrNiqdrF0hdTHvBiQ2snSFVIfs2Kr2slAkPqYFVvVTgaC1MdavRiQ1AoDQepjVmxVO3lQWepjrV4MSGqFgSD1OSu2ql2cMpIkAQaCJKlkIEiSAANBklQyECRJgIEgSSoZCJIkwECQJJUMBEkSYCBIkkoGgiQJMBAkSSUDQZIEGAiSpJKBIEkCDARJUslAkCQBBoIkqWQgSJIAA0GSVDIQJEmAgSBJKhkIkiTAQJAklQwESRIwjUCIiKGI2BIRD0xovzMiXplkm6Mj4vMRsS0itkbEyrrbzinbd0TEZyIiZvwqJEmzNp0RwlrgqfqGiBgFRqbY5jqAzDwTuBD4VETUnvOzwBrglPLr4mn0RZLUZi0FQkQsBS4D7q5rGwLWAzdPsekZwEMAmbkL2AOMRsTbgeMz8+8yM4EvAlfM6BVIktqi1RHCHRQf/Ifq2q4HNmTmzim22wpcHhHzIuIdwDnAMmAJ8ELd/V4o294kItZExFhEjO3evbvF7kqSpqtpIETEKmBXZm6ua1sMXAXc2WTzeyg+7McoQmUTcABodLwgGz1AZt6VmaOZOXrSSSc1664kaYbmtXCf84HVEXEpcAxwPPAE8BqwozwWvCAidmTmyfUbZuYB4Iba7xGxCXgGeBlYWnfXpcCLs3gdkqRZajpCyMx1mbk0M1cAVwMPZ+bCzHxbZq4o2/dODAOAiFgQEceWP18IHMjMJ8tpph9HxHnl6qL3A3/VxtclSZqmVkYI0xIRq4HRzLwNWARsjIhDwDhwTd1dPwR8ARgGHiy/JEk9EsUin/4QEbuB52e4+YnAS23sTidUvY9V7x9Uv49V7x9Uv49V7x9Ur48/mZlND8L2VSDMRkSMZeZor/sxlar3ser9g+r3ser9g+r3ser9g/7oYyOWrpAkAQaCJKk0lwLhrl53oAVV72PV+wfV72PV+wfV72PV+wf90cc3mTPHECRJU5tLIwRJ0hQGPhAi4uKI2F6W2b6l1/0BiIhlEfFIRDwVEU9ExNqy/YSI+FpEPFN+X9jjfh5R8jwi3hER3yz7998i4uge928kIr4aEU+X+/JfV3Af3lD+G38nIr4SEcf0ej9GxD0RsSsivlPX1nC/ReEz5fvn8Yh4b4/6t778d348Iv5nRIzU3bau7N/2iLio0/2brI91t304IjIiTix/7/o+nKmBDoSyIusfA5dQVF79lYg4o7e9Aop6Tv8xM08HzgN+q+zXLcBDmXkKRZXYXgfYxJLn/xn4dNm/l4Hf7EmvDvsj4G8y8zTgPRR9rcw+jIglwG9TnKj5bmCI4mz/Xu/HL/DmcvOT7bdLOFyifg1F2fpe9O9rwLsz86eA7wLrAMr3zdXAu8pt/qR83/eij0TEMopS/z+oa+7FPpyRgQ4E4KeBHZn5bGa+DtwLXN7jPpGZOzPzsfLnH1N8kC2h6Nufl3f7c3pYEnxiyfOyxMgFwFfLu/S6f8cD/xb4M4DMfD0z91ChfViaBwxHxDxgAbCTHu/HzPxb4EcTmifbb5cDX8zCN4CRsnx9V/uXmf+7rI0G8A0O10K7HLg3M1/LzO8DOyje9x01yT4E+DRFZej6g7Nd34czNeiBsAT4Yd3vk5bZ7pWIWAGcDXwT+Oe1cuLl90W
969mbSp7/M2BP3Zuy1/vyncBu4PPltNbdZd2syuzDzBwHPknx1+JO4B+BzVRrP9ZMtt+q+B76DQ6XuqlM/8qyPeOZuXXCTZXpYzODHggtl9nuhYg4DvhL4Hcy8//1uj810aDkOdXbl/OA9wKfzcyzgVfp/RTbEcp5+MuBdwCLgWMppg8mqsz/yQYq9e8eEbdSTLl+udbU4G5d719ELABuBW5rdHODtkr+mw96ILxAcUGemsqU2Y6I+RRh8OXMvK9s/ofaULL8vqtH3auVPH+OYprtAooRw0g59QG935cvAC9k5jfL379KERBV2YcA/x74fmbuzsz9wH3Av6Fa+7Fmsv1WmfdQRFwLrAJ+LQ+vl69K//4FRfBvLd83S4HHIuJtVKePTQ16IPw9cEq5quNoioNPG3rcp9p8/J8BT2XmH9bdtAG4tvz5WnpUEnySkue/BjwC/GKv+weQmf8H+GFEnFo2/RzwJBXZh6UfAOdFUQY+ONzHyuzHOpPttw3A+8uVMucB/9jkKokdEREXAx8BVmfm3rqbNgBXR8Rborgq4ynAt7rdv8zclpmL6i4J8ALw3vL/aSX2YUsyc6C/gEspViV8D7i11/0p+/QzFEPGx4Fvl1+XUszTP0RxEaGHgBMq0NeVwAPlz++keLPtAP4H8JYe9+0siqvxPQ7cDyys2j4Efg94GvgO8BfAW3q9H4GvUBzT2E/xwfWbk+03iumOPy7fP9soVkz1on87KObha++XP627/61l/7YDl/RqH064/TngxF7tw5l+eaayJAkY/CkjSVKLDARJEmAgSJJKBoIkCTAQJEklA0GSBBgIkqSSgSBJAuD/A9IpQV2gVZE2AAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import mpld3\n", "import scipy.stats as stats #used to get correlation coefficient\n", "mpld3.enable_notebook()\n", "import seaborn as sns\n", "from mpld3 import plugins\n", "\n", "CC_Lex=pd.read_csv('CC_Lex.csv')\n", "x=CC_Lex.index1.values.tolist()\n", "y=CC_Lex.Latitude.values.tolist()\n", "\n", "fig, ax = plt.subplots()\n", "points = ax.scatter(x,y,\n", " s=50, alpha=0.3)\n", "\n", "labels = [\"Point {0}\".format(i) for i in range(151)]\n", "tooltip = plugins.PointLabelTooltip(points, labels)\n", "\n", "plugins.connect(fig, tooltip)\n", "\n", "scatter= plt.scatter(CC_Lex.index1, CC_Lex.Latitude)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data Cleanup \n", "\n", "As oppose to create a separate csv files with the messed up addresses and manually fixing them, you can correct the respective coordinate. This is acceptable if the coordinates are linear. \n", "\n", "Consider documenting the file changes prior to exporting the final block csv file. We can automate fixes without evening search up the data using the average of values preceding and succeeding index." 
] }, { "cell_type": "code", "execution_count": 442, "metadata": {}, "outputs": [ { "ename": "TypeError", "evalue": "Object of type ndarray is not JSON serializable", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\IPython\\core\\formatters.py\u001b[0m in \u001b[0;36m__call__\u001b[1;34m(self, obj)\u001b[0m\n\u001b[0;32m 339\u001b[0m \u001b[1;32mpass\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 340\u001b[0m \u001b[1;32melse\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 341\u001b[1;33m \u001b[1;32mreturn\u001b[0m \u001b[0mprinter\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mobj\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 342\u001b[0m \u001b[1;31m# Finally look for special method names\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 343\u001b[0m \u001b[0mmethod\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mget_real_method\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mobj\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mprint_method\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\mpld3\\_display.py\u001b[0m in \u001b[0;36m\u001b[1;34m(fig, kwds)\u001b[0m\n\u001b[0;32m 408\u001b[0m \u001b[0mformatter\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mip\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdisplay_formatter\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mformatters\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;34m'text/html'\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 409\u001b[0m formatter.for_type(Figure,\n\u001b[1;32m--> 410\u001b[1;33m lambda fig, 
kwds=kwargs: fig_to_html(fig, **kwds))\n\u001b[0m\u001b[0;32m 411\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 412\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\mpld3\\_display.py\u001b[0m in \u001b[0;36mfig_to_html\u001b[1;34m(fig, d3_url, mpld3_url, no_extras, template_type, figid, use_http, **kwargs)\u001b[0m\n\u001b[0;32m 249\u001b[0m \u001b[0md3_url\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0md3_url\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 250\u001b[0m \u001b[0mmpld3_url\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mmpld3_url\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 251\u001b[1;33m \u001b[0mfigure_json\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mjson\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdumps\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfigure_json\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mcls\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mNumpyEncoder\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 252\u001b[0m \u001b[0mextra_css\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mextra_css\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 253\u001b[0m extra_js=extra_js)\n", "\u001b[1;32m~\\Anaconda3\\lib\\json\\__init__.py\u001b[0m in \u001b[0;36mdumps\u001b[1;34m(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)\u001b[0m\n\u001b[0;32m 236\u001b[0m \u001b[0mcheck_circular\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mcheck_circular\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mallow_nan\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mallow_nan\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mindent\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mindent\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 237\u001b[0m 
\u001b[0mseparators\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mseparators\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mdefault\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mdefault\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0msort_keys\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0msort_keys\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 238\u001b[1;33m **kw).encode(obj)\n\u001b[0m\u001b[0;32m 239\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 240\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m~\\Anaconda3\\lib\\json\\encoder.py\u001b[0m in \u001b[0;36mencode\u001b[1;34m(self, o)\u001b[0m\n\u001b[0;32m 197\u001b[0m \u001b[1;31m# exceptions aren't as detailed. The list call should be roughly\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 198\u001b[0m \u001b[1;31m# equivalent to the PySequence_Fast that ''.join() would do.\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 199\u001b[1;33m \u001b[0mchunks\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0miterencode\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mo\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0m_one_shot\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;32mTrue\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 200\u001b[0m \u001b[1;32mif\u001b[0m \u001b[1;32mnot\u001b[0m \u001b[0misinstance\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mchunks\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m(\u001b[0m\u001b[0mlist\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mtuple\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 201\u001b[0m \u001b[0mchunks\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mlist\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mchunks\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", 
"\u001b[1;32m~\\Anaconda3\\lib\\json\\encoder.py\u001b[0m in \u001b[0;36miterencode\u001b[1;34m(self, o, _one_shot)\u001b[0m\n\u001b[0;32m 255\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mkey_separator\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mitem_separator\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msort_keys\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 256\u001b[0m self.skipkeys, _one_shot)\n\u001b[1;32m--> 257\u001b[1;33m \u001b[1;32mreturn\u001b[0m \u001b[0m_iterencode\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mo\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m0\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 258\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 259\u001b[0m def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,\n", "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\mpld3\\_display.py\u001b[0m in \u001b[0;36mdefault\u001b[1;34m(self, obj)\u001b[0m\n\u001b[0;32m 136\u001b[0m numpy.float64)):\n\u001b[0;32m 137\u001b[0m \u001b[1;32mreturn\u001b[0m \u001b[0mfloat\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mobj\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 138\u001b[1;33m \u001b[1;32mreturn\u001b[0m \u001b[0mjson\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mJSONEncoder\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdefault\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mobj\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 139\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 140\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m~\\Anaconda3\\lib\\json\\encoder.py\u001b[0m in \u001b[0;36mdefault\u001b[1;34m(self, o)\u001b[0m\n\u001b[0;32m 177\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 
178\u001b[0m \"\"\"\n\u001b[1;32m--> 179\u001b[1;33m raise TypeError(f'Object of type {o.__class__.__name__} '\n\u001b[0m\u001b[0;32m 180\u001b[0m f'is not JSON serializable')\n\u001b[0;32m 181\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;31mTypeError\u001b[0m: Object of type ndarray is not JSON serializable" ] }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAbUAAAGoCAYAAADB4nuYAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAIABJREFUeJzt3X+03HV95/Hnm5sACUiDQhpDyAkUF0ERkEulpdulWkSBBqyi/KhrrUe23bKLoFhZOWxX8XS3KRXLaavUQm3Fiq2aTd3VtAVjW1OloTFGwEBqkXJhm1BNK5JSQt77x3wHx5u5985M7nd+fOb5OOeeO/cz8537nuEyr3y+38+PyEwkSSrBAYMuQJKk+WKoSZKKYahJkophqEmSimGoSZKKYahJkophqEmSimGoSZKKYahJkoqxYNAF9InLpkgadTHoAkaBPTVJUjHGpacmqTAf+/LDgy6hJ5e+bOWgSyiaPTVJUjHsqc1iVP8lKEnjyp6aJKkYhpokqRiGmiSpGIaaJKkYhpokqRiGmiSpGIaaJKkYhpokqRiGmiSpGIaaJKkYhpokqRiGmiSpGIaaJKkYhpokqRiGmiSpGIaaJKkYhpokqRiGmiSpGIaaJKkYhpokqRiGmiSpGIaaJKkYhpokqRiGmiSpGIaaJKkYhpokqRiGmiSpGIaaJKkYhpokqRiGmiSpGIaaJKkYhpokqRiGmiSpGIaaJKkYhpokqRiGmiSpGIaaJKkYhpokqRiGmiSpGIaaJKkYhpokqRiGmiSpGIaaJKkYhpokqRiGmiSpGIaaJKkYhpokqRiGmiSpGIaaJKkYkZmDrqF2EfE54IgeDj0CeHyey9lf1tQZa+rMMNYEw1nXoGt6PDNfNcDfPxLGItR6FRGbMnNy0HW0sqbOWFNnhrEmGM66hrEm7cvTj5KkYhhqkqRiGGqzu2XQBbRhTZ2xps4MY00wnHUNY02axmtqkqRi2FOTJBXDUJMkFcNQkyQVw1CTJBVjLELtVa96VQJ++eWXX6P81bFCP/M6Mhah9vjjw7bajiTVZ5w/88Yi1CRJ48FQkyQVw1CTJBXDUJMkFcNQkyQVw1CTJBXDUJMkFcNQkyQVw1CTJBXDUJMkFcNQkyQVw1CTJBXDUJMkFcNQkyQVw1CTJBXDUJMkFcNQkyQVw1CTJBXDUJMkFcNQkyQVw1CTJBXDUJMkFcNQkyQVw1CTJBXDUJMkFcNQkyQVw1CTJBXDUJMkFcNQkyQVw1CTJBXDUJMkFcNQkyQVw1CTJBXDUJMkFcNQkyQVw1CTJBWj9lCLiImI2BwRn5nWfnNEPDHDMQdGxG0RsTUitkTEWS33vSEivhoR90bEr9ZcviRphPSjp3YlcH9rQ0RMAktmOeatAJl5EnA2cGNEHBARzwPWAK/IzBcBPxgRr6inbEnSqKk11CJiBXAe8OGWtgkawfTOWQ49EbgTIDN3ALuASeBY4IHM3Fk97s+B185/5ZKkUVR3T+0mGuG1t6XtCmBdZj42y3FbgAsiYkFEHAOcBhwNbAdeGBGrImIBcGHVvo+IuDwiNkXEpp07d7Z7iCQVw8+8htpCLSLOB3Zk5j0tbcuBi4Cb5zj8VuARYBONYNwI7MnMbwO/ANwB/
CXwELCn3RNk5i2ZOZmZk0ceeeR+vhpJGm5+5jUsqPG5zwRWR8S5wMHAYcC9wFPA9ogAWBwR2zPzuNYDM3MPcFXz54jYCDxY3fcnwJ9U7ZcDz9T4GiRJI6S2nlpmXpuZKzJzFXAxcFdmHp6ZyzJzVdX+5PRAA4iIxRFxSHX7bBq9tPuqn5dW3w8H/jMt1+skSeOtzp5aVyJiNTCZmdcDS4H1EbEXmALe2PLQD0TEydXt92TmA30uVZI0pPoSapm5AdjQpv3QltvrgHXV7YeA42d4rkvqqFGSNPpcUUSSVAxDTZJUDENNklQMQ02SVAxDTZJUDENNklQMQ02SVIyhmXwtSSVbu3mKNeu38eiu3SxfsohrzjmeC089atBlFcdQk6Sard08xbWf2srupxtL1U7t2s21n9oKYLDNM08/SlLN1qzf9mygNe1++hnWrN82oIrKZU9Nkmr26K7dXbXvr29999/42JcfnvfnvfRlK+f9OeebPTVJqtnyJYu6alfvDDVJqtk15xzPooUT39e2aOEE15zTdt127QdPP0pSzZqDQRz9WD9DTZL64MJTjzLE+sDTj5KkYhhqkqRiGGqSpGIYapKkYjhQRJJq5JqP/WWoSVJNXPOx/zz9KEk1cc3H/jPUJKkm/V7zUYaaJNXGNR/7z1CTpJq45mP/OVBEkmrimo/9Z6hJUo1c87G/PP0oSSqGoSZJKoahJkkqhqEmSSqGoSZJKoajHyWpJi5m3H+GmiTVwMWMB8PTj5JUAxczHozaQy0iJiJic0R8Zlr7zRHxxAzHHBgRt0XE1ojYEhFntdx3SdX+1Yj4XEQcUfNLkKSuuZjxYPTj9OOVwP3AYc2GiJgElsxyzFsBMvOkiFgKfDYiTqcRwh8ATszMxyPiV4ErgF+uqXZJ6ljzGtrULMHlYsb1qjXUImIFcB7wPuDqqm0CWANcCrxmhkNPBO4EyMwdEbELmAQ2AwEcEhH/RCMot9f5GiQJGoH1y+vuZdfupwE4IGBvwkQEz2QSQM7xHC5mXL+6e2o3Ae8EntPSdgWwLjMfi4iZjtsCXBARHweOBk4Djs7MuyPiF4CtwHeBB4FfrKt4SeOrtdfVLrD2Vg3PZOPGXIE2EcGv/PRJDhKpWW2hFhHnAzsy857mNbGIWA5cBJw1x+G3AicAm4BvAhuBPRGxEPgF4FTgG8DNwLXADW1+/+XA5QArV67c/xckaSxM75HB3IHVib2ZtQZa62feEcvGNzjr7KmdCayOiHOBg2mcKrwXeArYXvXSFkfE9sw8rvXAzNwDXNX8OSI20uiVnVLd/3dV+yeAd7X75Zl5C3ALwOTk5Hz8TUoqWLswm091X0tr/cw79oSXjO1nXm2jHzPz2sxckZmrgIuBuzLz8MxclpmrqvYnpwcaQEQsjohDqttnA3sy8z5gCjgxIo6sHno2jUEoktSz5pyyugLNa2n9MzSTryNiNTCZmdcDS4H1EbGXRpC9ESAzH42I/wH8RUQ8TePU5M8OqGRJBVi7eYq3f2LLs9fG5ktzIMlRriTSV5Hz/B9yGE1OTuamTZsGXYakITN91Y9uTB/92PxeY4jNOLJuumNPeEne8HufmfuBQ+TSl8059qGj1z80PTVJ6qduemj2ukaHoSZp7Fy3diu3f+nhOUc1Hr54If/9p15kiI0QQ03SWFm7eWrOQJuI4MbXn2yYjSBDTVLxOlm+qmnRwgknSY8wQ01S0boZDOKqH6PPUJNUrG4GgwR4yrEAhpqk4nS7OkgAl52x0kArgKEmqRi9LHXloJCyGGqSRl6v6zY6KKQ8hpqkkdbpnLPpnEhdJkNN0sjqZM5ZK3tm5TPUJI2sNeu3dRxorg4yHgw1SSNp7eapjiZTG2bjxVCTNFI6HRRimI0nQ03SSOg0zJpzzm648KT+FKahYqhJGkqt6zUGdHzt7P1vOMXe2Rgz1CQNlXY9sk4D7agliwy0MWeoSRoavc45g8Zw/WvOOX7ea9JoMdQkDYVu5
5y1clCImgw1SUOhmzlnTYaZpjPUJA1cp3POmgwzzcRQkzRQzU08Z3NAwN50vUbNzVCT1Hetw/Vn45wzdctQk9RXzZ7Z7qefmfOxzjlTtww1SX2zdvMUb//EFp7JuYeEOOdMvTDUJNWu2008nXOmXhlqkmrTy47UExHueaaeGWqSatHL6iBu4qn9ZahJmle99M7A4fqaH4aapHnRS5jZM9N8M9Qk7Zdee2auCqI6GGqSemKYaRgZapK60muYuTqI+sFQk9SRXsMM7J2pfww1SXPqdfNOw0z9VnuoRcQEsAmYyszzW9pvBt6cmYe2OeZA4EPAJLAXuDIzN0TEc4C/bHnoCuCjmfm2Ol+DNM562bzTMNOg9KOndiVwP3BYsyEiJoElsxzzVoDMPCkilgKfjYjTM/M7wCktz3MP8Klaqpb07FqNnQaaYaZBqzXUImIFcB7wPuDqqm0CWANcCrxmhkNPBO4EyMwdEbGLRq/t7pbnfgGwlO/vuUmaB91ePzPMNCzq7qndBLwTeE5L2xXAusx8LCJmOm4LcEFEfBw4Gjit+n53y2MuAe7I7GC5b0kd62ZrGMNMw6a2UIuI84EdmXlPRJxVtS0HLgLOmuPwW4ETaFyL+yawEdgz7TEXA2+c5fdfDlwOsHLlyu5fgDSGOt0axuH5w6f1M++IZeP7j4w6e2pnAqsj4lzgYBrX1O4FngK2V720xRGxPTOPaz0wM/cAVzV/joiNwIMtP58MLMjMe2b65Zl5C3ALwOTkpL05aRbdnG6ciODG159s72zItH7mHXvCS8b2M6+2UMvMa4FrAaqe2jtaRz9W7U9MD7SqfTEQmfndiDgb2JOZ97U85BLgD+uqXRoXvexz5lqNGmZDM08tIlYDk5l5PY0BIOsjYi8wxb6nGV8PnNvnEqUiuLyVStaXUMvMDcCGNu2HttxeB6yrbj8EzLjtbWYeO981SqXrNcw83ahRMjQ9NUnza+3mKdas38bUrt0EdL0aCHi6UaPHUJMK065H1kugebpRo8hQk0bcfPTIWhlmGmUdhVo0xt9fBhybme+JiJXAssy8e45DJdVo+kTp/Qk0w0wl6LSn9ls0FhZ+OfAe4DvAJ4HTa6pLUgfWrN/W0cofszHMVJJOQ+1lmfnSiNgMkJnfrlbSlzRAj+7a3dNxBplK1WmoPV0tRJwAEXEkjZ6bpAFavmQRUx0E2wEBexOOWrKIa8453jBTsToNtd8APg0sjYj3Aa8DrqutKkkdueac42ddfNgemcZNR6GWmbdXe5e9gsZaphdm5v21ViapIwctOODZULNHpnE3a6hFxHNbftxBy3qLEfHczPxWXYVJml27LWIOWuBkaY23A+a4/x4a27/cA+wEHqCxWv7Oqk3SADS3iJl+2nH308+wZv22AVUlDd6sPbXMPAYgIj5IY2PP/1v9/GrgJ+svT1KrTtZv7HVEpFSCTgeKnJ6ZP9/8ITM/GxHvrakmSZVeVgtZvmRR3WVJQ6vTUHs8Iq4DPkrj/6ufAf6ptqqkMdfr+o2LFk5wzTkzbnAhFW+ua2pNlwBH0hjWv5bGfmeX1FWUNM6aA0B62SLGQSIad50O6f8WcGXNtUhjrzkA5JnsbhVHt4iRGjpd0PjztDn7kZkvn/eKpDF13dqt3P6lh7telNgJ1tL3dHpN7R0ttw8GXgvsmf9ypPHT647Uhpm0r05PP06fk/bFiPhCDfVIY6XdBOqZuFqINLdOTz+2rixyAHAasKyWiqQx0en1s4kIbnz9yYaY1IFOTz/eQ+OaWtA47fj3wFvqKkoqWTenGwMMNKkLnYbaCZn5r60NEXFQDfVIxer22lkAl52x0kCTutBpqG0EXjqt7a/btEmappeBIA4CkXoz1yr9y4CjgEURcSqNfzwCHAYsrrk2aWT1srwVeP1M2l9z9dTOAX4WWAH8ekv7d4D/VlNN0kibPqKx00BzArW0/+Zapf8jwEci4rWZ+ck+1SSNrF5XBPF0ozQ/5jr9+
DOZ+VFgVURcPf3+zPz1NodJY6nZQ+sm0AwzaX7NdfrxkOr7oW3u63Y1H6lY3fbQDDOpHnOdfvxQdfPPM/OLrfdFxJm1VSWNiG5HNjoQRKpXp0P6b2bf4fvt2qSx0MswfQeCSPWb65rajwA/Chw57ZraYcBEnYVJw6qb9Rqbw/ldr1Hqj7l6agfSuJ62AHhOS/u/AK+rqyhpWHVz7cxTjVL/zXVN7QvAFyLi9zLzm32qSRo63Z5u9FSjNBidXlN7MiLWAC+isZ8a4CahGg/dbt7pyEZpcDoNtduBO4DzgZ8H3gTsrKsoaVis3TzVcaAZZtLgHdDh456Xmb8LPJ2ZX8jMnwPO6OTAiJiIiM0R8Zlp7TdHxBMzHHNgRNwWEVsjYktEnDXtvlsi4oGI+HpEvLbD1yB1pXn9bK5Am4jgpjecwubrX2mgSQPWaU+teSHhsYg4D3iUxnqQnbgSuJ/GiEkAImISWDLLMW8FyMyTImIp8NmIOD0z9wLvBnZk5r+LiAOA587yPFLXurl+5rUzabh0Gmo3RMQPAG+nMT/tMOBtcx0UESuA84D3AVdXbRPAGuBS4DUzHHoicCdAZu6IiF3AJHA38HPAC6v79gKPd/gapFl1OxjE043S8Oko1DKzeerwn4GfAIiIOUMNuAl4J98/HeAKYF1mPhYR7Y+CLcAFEfFx4GjgNODoiHiguv+91SnJvwOuyMx/7OR1SO30unnnDReeVG9hkrrWaU+tnatphFZbEXE+jdOE9zSviUXEcuAi4Kw5nvtW4ARgE/BNGpuU7qnqXQF8MTOvriaE/xrwxja//3LgcoCVK1d287o0JnpZFcS5ZxpWrZ95Rywb37/P/Qm1GbtZlTOB1RFxLo1pAIcB9wJPAdurXtriiNiemce1HpiZe4Crnv1FERuBB4F/Ap4EPl3d9UfAW9r98sy8BbgFYHJy0sWX9axewgy8fqbh1vqZd+wJLxnbz7xORz+2M+ublpnXZuaKzFwFXAzclZmHZ+ayzFxVtT85PdAAImJxRBxS3T4b2JOZ92VmAn/C93p6rwDu24/XoDFz3dqtXHXHV7oOtMMXLzTQpBEw19qP36F9eAWwaD4LiYjVwGRmXg8sBdZHxF5giu8/vfhLwB9ExE005sq9eT7rULm6mXPW5GAQabTMtUzWc2a7v1OZuQHY0Kb90Jbb64B11e2HgONneK5vAj8+H3VpfHQ656zJMJNG0/5cU5OGnsP0pfFiqKlIhpk0ngw1Fafb/c6ccyaVw1BTUdzvTBpvhpqK4H5nksBQ04jrZSK118+kchlqGlndXDsDw0waB4aaRs7azVOsWb+NqV27O3q8186k8WGoaWT0cqrRa2fSeDHUNPR6XYDY043S+DHUNNS6vW4Ghpk0zgw1Da1u5pw1LVm0kM3Xv7LGqiQNs/3ZekaqTbOH1k2gLVo4wS+vflGNVUkadvbUNHS66aEdELA34agli7jmnOM95SiNOUNNQ6ObASFeN5PUjqGmodDpgBDnnEmajaGmgepmIrVzziTNxVDTQHQ792wiwkCTNCdDTX3X7dwze2iSOmWoqa+6nXvmgBBJ3TDUVKvWa2YBdD7rzInUkrpnqKkW7a6ZdRNoTqSW1AtDTfOul/UawYnUkvafoaZ5t2b9tq4CzblnkuaLoaZ50+3mneDIRknzy1DTfnO/M0nDwlDTfunm+pnXzCTVzVBTz7qdc/aNXzmv5ookjTv3U1NPut3v7Kgli2quSJIMNfWomxGOixZOcM05x9dckSR5+lE9WLt5as4Rjl4/kzQIhprm1M1SV845kzRIhppmdd3ardz+pYefDbLZAs05Z5IGzVBTW73MPTPQJA2aoaZ99LJ241FLFhlokgbOUBtzrdfLJiJ4JvPZ751ydKOkYVF7qEXEBLAJmMrM81vabwbenJmHtjnmQOBDwCSwF7gyMzdU920Ang80h9+9MjN31PkaStTu9GIzyLoJNJe6kjRM+tFTuxK4Hzis2RARk
8CSWY55K0BmnhQRS4HPRsTpmbm3uv+yzNxUV8El63Wdxukcqi9pGNUaahGxAjgPeB9wddU2AawBLgVeM8OhJwJ3AmTmjojYRaPXdned9Zau133OmuyVSRp2dffUbgLeCTynpe0KYF1mPhYRMx23BbggIj4OHA2cVn1vhtptEfEM8Enghsx9z5dFxOXA5QArV66ch5cy+nrZ52xvJsvtlUlDr/Uz74hl4/v/am2hFhHnAzsy856IOKtqWw5cBJw1x+G3AifQuBb3TWAjsKe677LMnIqI59AItTcCvz/9CTLzFuAWgMnJyc4vEhXMfc6kcrV+5h17wkvG9jOvzp7amcDqiDgXOJjGNbV7gaeA7VUvbXFEbM/M41oPzMw9wFXNnyNiI/Bgdd9U9f07EfEx4IdpE2r6nuZ1tLm4tJWkUVfbgsaZeW1mrsjMVcDFwF2ZeXhmLsvMVVX7k9MDDSAiFkfEIdXts4E9mXlfRCyIiCOq9oXA+cDX6noNJWheR5ttYMjhixdy0xtO4Ru/ch43veEUAK664yuc+T/vYu3mqX6VKkn7bWjmqUXEamAyM68HlgLrI2IvMEXjFCPAQVX7QmAC+HPgdwZR76jo5Dra5utfCew7kGRq126u/dRWAHttkmr1sS8/vE/bpS/rfjxEX0KtmmO2oU37oS231wHrqtsPAfvM5s3M79IYNKIOPTrHdbTWfc7aBeDup59hzfpthpqkkeB+aoVbPsvmnNNXApkpAOcKRkkaFoZa4a4553gWLZzYp/3wxQv3Gd04UwDOFoySNEyG5pqaujN9ZZDmyMXp6zdO/z7byMZrzjl+n8nZrusoaZQYaiNmpmWu9lazUqav39j6vRlQM10fa7avWb+NR3ftdtK1pJFjqI2Q/V3mqpNBHxeeepQhJmlkGWojYu3mKd7+iS1draDfjoM+JJXMUBty87WqfpODPiSVzFAbUvMdZuCgD0nlM9SGTK9hNtfoR9dzlDQODLUh0u1AkIkIbnz9yQaVJFUMtSHR7UAQt4aRpH0ZakPgurVbuf1LD9PpuEZ3oJak9gy1AVu7earjQDPMJGl2htqArVm/bc5AM8wkqTOG2oDNNhnagSCS1B1DrQ/Wbp6acT3F5UsWMdUm2AIMNEnqklvP1Kw5TH9q126Sxm7Sb7vjK5z6nj9l7eaptlvDBHDZGSsNNEnqkj21Gs02TP/bTz7N2+74yj6Tpp0kLUm9M9Rq0M2qIK1bxsy1NYwkaXaefpxn163dylV3fKWnNRubW8NIknpjT22ezNcCxG4NI0m9M9Tmwf5u3tnKrWEkqXeG2n7qZs3GhQcECyeCJ5/e2/Z+t4aRpP1jqPWo29ONrauCNOetTe3a7ahHSZpHhloPujnd2JxzdsOFJz3bduGpRxleklQDQ61L3ZxudM1GSeovQ61D3ZxudM1GSRoMQ20O3V47c/NOSRocQ62N1oEc3fB0oyQNlqE2TS9zzjzdKEnDwWWyplmzfltXgbZo4YSBJklDwp7aNN0sU+XpRknD6LmHHMilL1s56DIGwlCbZqZNO1sZZpI0nDz9OE27TTubAviZM1ay+fpXGmiSNITsqU3TDCuXsZKk0VN7qEXEBLAJmMrM81vabwbenJmHtjnmQOBDwCSwF7gyMzdMe8w64NjMfPF81+wyVpI0mvrRU7sSuB84rNkQEZPAklmOeStAZp4UEUuBz0bE6Zm5tzr+p4En6itZkjSKar2mFhErgPOAD7e0TQBrgHfOcuiJwJ0AmbkD2EWj10ZEHApcDdxQT9WSpFFV90CRm2iEV+sGYlcA6zLzsVmO2wJcEBELIuIY4DTg6Oq+9wI3Ak/O9osj4vKI2BQRm3bu3NnzC5CkUeBnXkNtoRYR5wM7MvOelrblwEXAzXMcfivwCI1rcTcBG4E9EXEKcFxmfnqu35+Zt2TmZGZOHnnkkb2+DEkaCX7mNdR5Te1MYHVEnAscTOOa2r3AU8D2iABYHBHbM/O41gMzcw9wV
fPniNgIPAj8B+C0iHioqn1pRGzIzLNqfB2SpBFRW08tM6/NzBWZuQq4GLgrMw/PzGWZuapqf3J6oAFExOKIOKS6fTawJzPvy8zfzszl1bE/BjxgoEmSmoZmnlpErAYmM/N6YCmwPiL2AlPAGwdanCRpJPQl1Ko5ZhvatB/acnsdsK66/RBw/BzP+RAw73PUJEmjy2WyJEnFMNQkScUw1CRJxTDUJEnFMNQkScUw1CRJxTDUJEnFMNQkScUw1CRJxTDUJEnFMNQkScUw1CRJxTDUJEnFMNQkScUw1CRJxTDUJEnFMNQkScUw1CRJxTDUJEnFMNQkScUw1CRJxTDUJEnFMNQkScUw1CRJxTDUJEnFMNQkScUw1CRJxTDUJEnFMNQkScUw1CRJxTDUJEnFMNQkScUw1CRJxTDUJEnFMNQkScWoPdQiYiIiNkfEZ6a13xwRT8xwzIERcVtEbI2ILRFxVst9n6va7o2ID0bERM0vQZI0IvrRU7sSuL+1ISImgSWzHPNWgMw8CTgbuDEimrW+PjNPBl4MHAlcNO8VS5JGUq2hFhErgPOAD7e0TQBrgHfOcuiJwJ0AmbkD2AVMVj//S/WYBcCBQM574ZKkkVR3T+0mGuG1t6XtCmBdZj42y3FbgAsiYkFEHAOcBhzdvDMi1gM7gO8Af9zuCSLi8ojYFBGbdu7cuZ8vQ5KGm595DbWFWkScD+zIzHta2pbTOF148xyH3wo8AmyiEYwbgT3NOzPzHOD5wEHAy9s9QWbekpmTmTl55JFH7s9LkaSh52dew4Ian/tMYHVEnAscDBwG3As8BWyPCIDFEbE9M49rPTAz9wBXNX+OiI3Ag9Me868RsQ64APizGl+HJGlE1NZTy8xrM3NFZq4CLgbuyszDM3NZZq6q2p+cHmgAEbE4Ig6pbp8N7MnM+yLi0Ih4ftW+ADgX+Hpdr0GSNFrq7Kl1JSJWA5OZeT2wFFgfEXuBKeCN1cMOAdZFxEHABHAX8MFB1CtJGj59CbXM3ABsaNN+aMvtdcC66vZDwPFtHv+PwOk1lSlJGnGRWf6I+IjYCXyzh0OPAB6f53L2lzV1xpo6M4w1wXDWNeiaHs/MV3XywIj4XKePLc1YhFqvImJTZk4Ouo5W1tQZa+rMMNYEw1nXMNakfbn2oySpGIaaJKkYhtrsbhl0AW1YU2esqTPDWBMMZ13DWJOm8ZqaJKkY9tQkScUw1CRJxTDU2oiIV0XEtojYHhHvGlANR0fE5yPi/mpD1Cur9udGxJ9FxIPV98MHUNv3bfwaEcdExJermu6IiAMHUNOSiPjjiPh69Z79yKDfq4i4qvpv97WI+MOIOLjf71VE3BoROyLiay1tbd+XaPiN6u/+qxHx0j7WtKb6b/fViPh0RCxpue/aqqZtEXFOv2pque8dEZERcUT1c1/eJ/XGUJum2u/tN4FX09jX7ZKIOHEApewB3p6ZJwBnAL/fFOlqAAAFUUlEQVRY1fEu4M7MfAGNPecGEbrTN379X8D7q5q+DbxlADV9APhcZr4QOLmqb2DvVUQcBfxXGku/vZjGsm4X0//36veA6ZNwZ3pfXg28oPq6HPjtPtb0Z8CLM/MlwAPAtQDV3/zFwIuqY34r6tntvl1NRMTRNDYqfriluV/vk3pgqO3rh4HtmfmNzPw34OM0dgLoq8x8LDP/trr9HRof0kdVtXykethHgAv7WVdM2/g1GtstvJzv7Ws3iJoOA34c+F2AzPy3zNzFgN8rGsvQLaoW314MPEaf36vM/AvgW9OaZ3pfLgB+Pxu+BCxpLiBed02Z+afV7hwAXwJWtNT08cx8KjP/HthO4//R2muqvJ/GnpCtI+r68j6pN4bavo4C/qHl50eqtoGJiFXAqcCXgR9sbrBafV/a53Kmb/z6PGBXywfSIN6vY4GdwG3VadEPV7s8DOy9yswp4Ndo/Av/MeCfgXsY/HsFM78vw/K3/3PAZ6vbA6upWmR9KjO3TLtrWN4ntWGo7
SvatA1s3kNEHAp8EnhbZv7LoOqoatln41eG4/1aALwU+O3MPBX4LoM5Lfus6jrVBcAxwHIaO0y8us1Dh2lOzcD/W0bEu2mcer+92dTmYbXXFBGLgXcD17e7u03bMP13HGuG2r4eAY5u+XkF8OggComIhTQC7fbM/FTV/I8te8o9H9jRx5KaG78+ROO07Mtp9NyWVKfYYDDv1yPAI5n55ernP6YRcoN8r34S+PvM3JmZTwOfAn6Uwb9XMPP7MtC//Yh4E3A+cFl+bwLtoGr6IRr/INlS/b2vAP42IpYNsCZ1wFDb198AL6hGqR1I4yL1un4XUV2r+l3g/sz89Za71gFvqm6/Cfjf/appho1fLwM+D7xuEDVVdf0/4B8iorld0SuA+xjge0XjtOMZ0djwNlpqGuh7VZnpfVkH/MdqdN8ZwD83T1PWLSJeBfwSsDozn5xW68URcVBEHENjcMbdddeTmVszc2nLhsaPAC+t/tYG9j6pA5np17QvGjtqPwD8HfDuAdXwYzROaXwV+Er1dS6Na1h3Ag9W3587oPrOAj5T3T6WxgfNduCPgIMGUM8pwKbq/VoLHD7o9wr4HzR2Zv8a8AfAQf1+r4A/pHFN72kaH8xvmel9oXFa7Terv/utNEZu9qum7TSuUzX/1j/Y8vh3VzVtA17dr5qm3f8QcEQ/3ye/evtymSxJUjE8/ShJKoahJkkqhqEmSSqGoSZJKoahJkkqhqEmARGxscvHnxXVLgU9/K4XRsRfR8RTEfGOXp5DUnsL5n6IVL7M/NE+/rpv0VjBv98LLEvFs6cmARHxRPX9rIjYEN/bm+32akWQ5j57X4+IvwJ+uuXYQ6r9uP6mWlD5gqr96oi4tbp9UjT2VVucmTsy829oTPSVNI8MNWlfpwJvo7Gf3rHAmRFxMPA7wE8B/x5Y1vL4d9NYMux04CeANdUuATcBx0XEa4DbgP+U378ElKR5ZqhJ+7o7Mx/JzL00lmxaBbyQxgLFD2ZjGZ6Ptjz+lcC7IuIrwAbgYGBldfzP0lgi6wuZ+cX+vQRpPHlNTdrXUy23n+F7/5/MtKZcAK/NzG1t7nsB8ASN7Wck1cyemtSZrwPHRMQPVT9f0nLfeuC/tFx7O7X6/gPAB2jsyv28iHgdkmplT03qQGb+a0RcDvyfiHgc+CvgxdXd76Vx/eyrVbA9RGNfsPcDv5WZD0TEW4DPR8Rf0PjH5CbgMGBvRLwNODEHvAmsVAJX6ZckFcPTj5KkYhhqkqRiGGqSpGIYapKkYhhqkqRiGGqSpGIYapKkYvx/AvW1yIhBJOsAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "\n", "Index=[12,40,62,67,91,111,122,145]\n", "\n", "for i in Index:\n", " CC_Lex.iloc[[i],[3]]= (CC_Lex.iloc[[i-1],[3]].values + CC_Lex.iloc[[i+1],[3]].values)/2 \n", " \n", "import seaborn as sns\n", "import scipy.stats as stats #used to get correlation coefficient\n", "\n", "# Plot the graph to see that outliers are taken care of\n", "j=sns.jointplot(x='index1', y='Latitude', data= CC_Lex)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }