{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Application 2: CollabMap Data Quality\n", "Assessing the quality of crowdsourced data in CollabMap from their provenance\n", "\n", "* **Goal**: To determine if the provenance network analytics method can identify trustworthy data (i.e. buildings, routes, and route sets) contributed by crowd workers in [CollabMap](https://collabmap.org/).\n", "* **Classification labels**: $\\mathcal{L} = \\left\\{ \\textit{trusted}, \\textit{uncertain} \\right\\} $.\n", "* **Training data**:\n", " - Buildings: 5175\n", " - Routes: 4997\n", " - Route sets: 4710\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reading data\n", "The CollabMap dataset is provided in the [`collabmap/depgraphs.csv`](collabmap/depgraphs.csv) file. Each row corresponds to a building, route, or route set created in the application:\n", "* `id`: the identifier of the data entity (i.e. building/route/route set).\n", "* `trust_value`: the beta trust value calculated from the votes for the data entity.\n", "* The remaining columns provide the provenance network metrics calculated from the dependency provenance graph of the entity." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " trust_value entities agents activities nodes edges \\\n", "id \n", "Route41053.0 0.833333 9 0 6 15 26 \n", "RouteSet9042.1 0.600000 6 0 3 9 15 \n", "Building19305.0 0.428571 6 0 4 10 13 \n", "Building1136.0 0.428571 6 0 4 10 13 \n", "Building24156.0 0.833333 9 0 5 14 24 \n", "\n", " diameter assortativity acc acc_e ... \\\n", "id ... \n", "Route41053.0 3 -0.272207 0.891091 0.809409 ... \n", "RouteSet9042.1 2 -0.412974 0.879630 0.847222 ... \n", "Building19305.0 2 -0.527046 0.901235 0.822222 ... \n", "Building1136.0 2 -0.527046 0.901235 0.822222 ... \n", "Building24156.0 3 -0.363937 0.838034 0.757639 ... \n", "\n", " mfd_e_a mfd_e_ag mfd_a_e mfd_a_a mfd_a_ag mfd_ag_e \\\n", "id \n", "Route41053.0 1 0 2 0 0 0 \n", "RouteSet9042.1 1 0 1 0 0 0 \n", "Building19305.0 1 0 1 0 0 0 \n", "Building1136.0 1 0 1 0 0 0 \n", "Building24156.0 2 0 2 2 0 0 \n", "\n", " mfd_ag_a mfd_ag_ag mfd_der powerlaw_alpha \n", "id \n", "Route41053.0 0 0 2 -1.00000 \n", "RouteSet9042.1 0 0 1 -1.00000 \n", "Building19305.0 0 0 1 3.19876 \n", "Building1136.0 0 0 1 3.19876 \n", "Building24156.0 0 0 2 -1.00000 \n", "\n", "[5 rows x 23 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv(\"collabmap/depgraphs.csv\", index_col='id')\n", "df.head()" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " trust_value entities agents activities nodes \\\n", "count 14882.000000 14882.000000 14882.0 14882.000000 14882.000000 \n", "mean 0.766706 13.384693 0.0 6.793375 20.178067 \n", "std 0.115301 17.165677 0.0 7.247706 24.147888 \n", "min 0.153846 2.000000 0.0 0.000000 2.000000 \n", "25% 0.750000 5.000000 0.0 2.000000 7.000000 \n", "50% 0.800000 9.000000 0.0 5.000000 14.000000 \n", "75% 0.833333 14.000000 0.0 9.000000 22.000000 \n", "max 0.965517 178.000000 0.0 70.000000 248.000000 \n", "\n", " edges diameter assortativity acc acc_e \\\n", "count 14882.000000 14882.000000 14882.000000 14882.000000 14882.000000 \n", "mean 39.118868 2.771267 -0.363791 0.806123 0.762426 \n", "std 59.648535 0.917298 0.238658 0.203627 0.200090 \n", "min 1.000000 1.000000 -1.000000 0.000000 0.000000 \n", "25% 10.000000 2.000000 -0.500000 0.820309 0.757639 \n", "50% 24.000000 3.000000 -0.330835 0.849790 0.809409 \n", "75% 40.000000 3.000000 -0.251256 0.880083 0.854159 \n", "max 706.000000 13.000000 0.494008 1.000000 1.000000 \n", "\n", " ... mfd_e_a mfd_e_ag mfd_a_e mfd_a_a \\\n", "count ... 14882.000000 14882.0 14882.000000 14882.000000 \n", "mean ... 1.545424 0.0 1.742575 0.987166 \n", "std ... 1.044079 0.0 1.012615 1.391763 \n", "min ... 0.000000 0.0 0.000000 0.000000 \n", "25% ... 1.000000 0.0 1.000000 0.000000 \n", "50% ... 1.000000 0.0 2.000000 0.000000 \n", "75% ... 2.000000 0.0 2.000000 2.000000 \n", "max ... 
13.000000 0.0 12.000000 13.000000 \n", "\n", " mfd_a_ag mfd_ag_e mfd_ag_a mfd_ag_ag mfd_der powerlaw_alpha \n", "count 14882.0 14882.0 14882.0 14882.0 14882.000000 14882.000000 \n", "mean 0.0 0.0 0.0 0.0 1.802782 -0.226061 \n", "std 0.0 0.0 0.0 0.0 0.938974 1.590865 \n", "min 0.0 0.0 0.0 0.0 1.000000 -1.000000 \n", "25% 0.0 0.0 0.0 0.0 1.000000 -1.000000 \n", "50% 0.0 0.0 0.0 0.0 2.000000 -1.000000 \n", "75% 0.0 0.0 0.0 0.0 2.000000 -1.000000 \n", "max 0.0 0.0 0.0 0.0 12.000000 4.674298 \n", "\n", "[8 rows x 23 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Labelling data\n", "Based on its trust value, we categorise the data entity into two sets: _trusted_ and _uncertain_. Here, the threshold for the trust value, whose range is [0, 1], is chosen to be 0.75." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " trust_value entities agents activities nodes edges \\\n", "id \n", "Route41053.0 0.833333 9 0 6 15 26 \n", "RouteSet9042.1 0.600000 6 0 3 9 15 \n", "Building19305.0 0.428571 6 0 4 10 13 \n", "Building1136.0 0.428571 6 0 4 10 13 \n", "Building24156.0 0.833333 9 0 5 14 24 \n", "\n", " diameter assortativity acc acc_e ... \\\n", "id ... \n", "Route41053.0 3 -0.272207 0.891091 0.809409 ... \n", "RouteSet9042.1 2 -0.412974 0.879630 0.847222 ... \n", "Building19305.0 2 -0.527046 0.901235 0.822222 ... \n", "Building1136.0 2 -0.527046 0.901235 0.822222 ... \n", "Building24156.0 3 -0.363937 0.838034 0.757639 ... \n", "\n", " mfd_e_ag mfd_a_e mfd_a_a mfd_a_ag mfd_ag_e mfd_ag_a \\\n", "id \n", "Route41053.0 0 2 0 0 0 0 \n", "RouteSet9042.1 0 1 0 0 0 0 \n", "Building19305.0 0 1 0 0 0 0 \n", "Building1136.0 0 1 0 0 0 0 \n", "Building24156.0 0 2 2 0 0 0 \n", "\n", " mfd_ag_ag mfd_der powerlaw_alpha label \n", "id \n", "Route41053.0 0 2 -1.00000 Trusted \n", "RouteSet9042.1 0 1 -1.00000 Uncertain \n", "Building19305.0 0 1 3.19876 Uncertain \n", "Building1136.0 0 1 3.19876 Uncertain \n", "Building24156.0 0 2 -1.00000 Trusted \n", "\n", "[5 rows x 24 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "trust_threshold = 0.75\n", "df['label'] = df.apply(lambda row: 'Trusted' if row.trust_value >= trust_threshold else 'Uncertain', axis=1)\n", "df.head() # The new label column is the last column below" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Having used the trust valuue to label all the data entities, we remove the `trust_value` column from the data frame." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(14882, 23)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# We will not use trust value from now on\n", "df.drop('trust_value', axis=1, inplace=True)\n", "df.shape # the dataframe now has 23 columns (22 metrics + label)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Filtering data\n", "We split the dataset into three: buildings, routes, and route sets." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "((5175, 23), (4997, 23), (4710, 23))" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_buildings = df.filter(like=\"Building\", axis=0)\n", "df_routes = df.filter(regex=r\"^Route\\d\", axis=0)\n", "df_routesets = df.filter(like=\"RouteSet\", axis=0)\n", "df_buildings.shape, df_routes.shape, df_routesets.shape # The number of data points in each dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Balancing Data\n", "This section explores the class balance of each of the three datasets and balances them using the [SMOTE Oversampling Method](https://www.jair.org/media/953/live-953-2037-jair.pdf)."
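, "\n", "\n",
"The `balance_smote` helper used in this section is imported from the local `analytics` module, and its code is not shown in this notebook. As a rough, hypothetical sketch only (assuming the third-party `imbalanced-learn` package and a `label` column; `balance_smote_sketch` is an illustrative name, not the notebook's function), SMOTE oversampling of a labelled data frame could look like this:\n",
"\n",
"```python\n",
"import numpy as np\n",
"import pandas as pd\n",
"from imblearn.over_sampling import SMOTE  # assumption: imbalanced-learn is installed\n",
"\n",
"\n",
"def balance_smote_sketch(df, label_col='label', seed=0):\n",
"    # Hypothetical stand-in for analytics.balance_smote\n",
"    X = df.drop(columns=[label_col])  # numeric metric columns\n",
"    y = df[label_col]                 # class labels\n",
"    # Synthesise minority-class points until both classes have the same size\n",
"    X_res, y_res = SMOTE(random_state=seed).fit_resample(X, y)\n",
"    balanced = pd.DataFrame(np.asarray(X_res), columns=X.columns)\n",
"    balanced[label_col] = np.asarray(y_res)\n",
"    return balanced\n",
"```\n",
"\n",
"With its default settings, SMOTE oversamples the minority class up to the size of the majority class, which is consistent with the balanced sizes printed in the cells that follow (e.g. 2 × 4491 = 8982 rows for buildings)."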
] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from analytics import balance_smote" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Buildings" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Trusted 4491\n", "Uncertain 684\n", "Name: label, dtype: int64" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_buildings.label.value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Balancing the building dataset:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Original data shapes: (5175, 22) (5175,)\n", "Balanced data shapes: (8982, 22) (8982,)\n" ] } ], "source": [ "df_buildings = balance_smote(df_buildings)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Routes" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "Trusted 3908\n", "Uncertain 1089\n", "Name: label, dtype: int64" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_routes.label.value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Balancing the route dataset:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Original data shapes: (4997, 22) (4997,)\n", "Balanced data shapes: (7816, 22) (7816,)\n" ] } ], "source": [ "df_routes = balance_smote(df_routes)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Route Sets" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Trusted 3019\n", "Uncertain 1691\n", "Name: label, dtype: int64" ] }, "execution_count": 12, "metadata": {}, "output_type": 
"execute_result" } ], "source": [ "df_routesets.label.value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Balancing the route set dataset:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Original data shapes: (4710, 22) (4710,)\n", "Balanced data shapes: (6038, 22) (6038,)\n" ] } ], "source": [ "df_routesets = balance_smote(df_routesets)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cross Validation\n", "\n", "We now run the cross validation tests on the three balanaced datasets (`df_buildings`, `df_routes`, and `df_routesets`) using all the features (`combined`), only the generic network metrics (`generic`), and only the provenance-specific network metrics (`provenance`). Please refer to [Cross Validation Code.ipynb](Cross%20Validation%20Code.ipynb) for the detailed description of the cross validation code." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from analytics import test_classification" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Building Classification\n", "\n", "We test the classification of buildings, collect individual accuracy scores `results` and the importance of every feature in each test in `importances` (both are Pandas Dataframes). These two tables will also be used to collect data from testing the classification of routes and route sets later." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 90.03% ±0.0576 <-- combined\n", "Accuracy: 90.06% ±0.0557 <-- generic\n", "Accuracy: 89.90% ±0.0581 <-- provenance\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " Accuracy Metrics Data Type\n", "2995 0.891982 provenance Building\n", "2996 0.885301 provenance Building\n", "2997 0.918708 provenance Building\n", "2998 0.887528 provenance Building\n", "2999 0.902004 provenance Building" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Cross validation test on building classification\n", "res, imps = test_classification(df_buildings)\n", "\n", "# adding the Data Type column\n", "res['Data Type'] = 'Building'\n", "imps['Data Type'] = 'Building'\n", "\n", "# storing the results and importance of features\n", "results = res\n", "importances = imps\n", "\n", "# showing a few newest rows\n", "results.tail()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Route Classification" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 96.98% ±0.0368 <-- combined\n", "Accuracy: 96.63% ±0.0404 <-- generic\n", "Accuracy: 95.97% ±0.0433 <-- provenance\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " Accuracy Metrics Data Type\n", "5995 0.943734 provenance Route\n", "5996 0.960358 provenance Route\n", "5997 0.964194 provenance Route\n", "5998 0.957692 provenance Route\n", "5999 0.958974 provenance Route" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Cross validation test on route classification\n", "res, imps = test_classification(df_routes)\n", "\n", "# adding the Data Type column\n", "res['Data Type'] = 'Route'\n", "imps['Data Type'] = 'Route'\n", "\n", "# storing the results and importance of features\n", "results = results.append(res, ignore_index=True)\n", "importances = importances.append(imps, ignore_index=True)\n", "\n", "# showing a few newest rows\n", "results.tail()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Route Set Classification" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 95.70% ±0.0493 <-- combined\n", "Accuracy: 95.20% ±0.0526 <-- generic\n", "Accuracy: 95.34% ±0.0500 <-- provenance\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " Accuracy Metrics Data Type\n", "8995 0.950331 provenance Route Set\n", "8996 0.947020 provenance Route Set\n", "8997 0.968543 provenance Route Set\n", "8998 0.965232 provenance Route Set\n", "8999 0.950166 provenance Route Set" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Cross validation test on route classification\n", "res, imps = test_classification(df_routesets)\n", "\n", "# adding the Data Type column\n", "res['Data Type'] = 'Route Set'\n", "imps['Data Type'] = 'Route Set'\n", "\n", "# storing the results and importance of features\n", "results = results.append(res, ignore_index=True)\n", "importances = importances.append(imps, ignore_index=True)\n", "\n", "# showing a few newest rows\n", "results.tail()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Saving experiments' results (optional)\n", "\n", "Optionally, we can save the test results to save time the next time we want to re-explore them:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": true }, "outputs": [], "source": [ "results.to_pickle(\"collabmap/results.pkl\")\n", "importances.to_pickle(\"collabmap/importances.pkl\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next time, we can reload the results as follows:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "((9000, 3), (3000, 23))" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "results = pd.read_pickle(\"collabmap/results.pkl\")\n", "importances = pd.read_pickle(\"collabmap/importances.pkl\")\n", "results.shape, importances.shape # showing the shape of the data (for checking)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "### Charting the accuracy scores" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": true }, 
"outputs": [], "source": [ "%matplotlib inline\n", "import seaborn as sns\n", "sns.set_style(\"whitegrid\")\n", "sns.set_context(\"paper\", font_scale=1.4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Converting the accuracy score from [0, 1] to percentage, i.e [0, 100]:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Accuracy Metrics Data Type\n", "0 90.444444 combined Building\n", "1 89.866370 combined Building\n", "2 90.311804 combined Building\n", "3 90.200445 combined Building\n", "4 88.307350 combined Building" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "results.Accuracy = results.Accuracy * 100\n", "results.head()" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from matplotlib.font_manager import FontProperties\n", "fontP = FontProperties()\n", "fontP.set_size(12)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAY0AAAEJCAYAAABohnsfAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XlclWX+//HXATwqioEa2Ji5pqKJk04qXxdwVxQJconc\nKlMcl3CdwjVRyxotl8rGyjWkxERwwSkTq3EbMwYsUbPGMkwgFRXEDsv5/eHPMxIgB5QDxPv5ePjw\nnPvc13197nNx8+G6r/u+boPZbDYjIiJiBbuyDkBERCoOJQ0REbGakoaIiFhNSUNERKympCEiIlZT\n0hAREauVatLIysri6aef5uDBgwAkJyczcuRI+vfvz9NPP83Fixct682aNYv+/fvj6+tLQkJCgduL\niYlhwIAB9OnTh7feeqs0QxcRkQKUWtI4ffo0I0aMIC4uzrJswYIF+Pn5ERMTg4+PD4sXLwYgLCyM\n7OxsYmJiWLZsGdOnTyc7OzvP9lJTU1myZAkbNmxg165dHD58mC+//LK0whcRkQKUWtLYsmUL48aN\nw8PDA7jZmzh06BCDBg0C4PHHHyc2NpasrCxiY2Px9/cHoHnz5ri6uuZJNgAHDhygQ4cO1K1blypV\nquDn58fu3btLK3wRESlAqSWNOXPm0LNnT8v7tLQ0HB0dMRqNABiNRhwdHbl06RLJycm4urpa1nV1\ndeXChQt5tpeSkoKbm1uedZKTk0srfBERKYCDrSrKzc0tcLmdnR0FzWRiZ5c3nxVU3mAw3LHOY8eO\nFSNCERG5pX379gUut1nSqF27NtevX8dkMmE0GjGZTFy/fh1nZ2fc3NxITU2ladOmwM3xi9t7FQD1\n6tXj0KFDlvepqanUq1evyHoL23ERESnYnf7gttklt1WqVKFjx45ER0cDEBUVRadOnahSpQre3t5E\nRkYCcObMGc6dO2cZC7nF09OTI0eOkJKSQlZWFtHR0Xh7e9sqfBERwcb3acyfP5+dO3cyYMAAIiMj\nmTt3LgAjRozAwcGBAQMGMGXKFJYsWYLRaCQ5ORk/Pz8A3NzceOGFF3j22WcZOHAgrVq1onfv3rYM\nX0Sk0jP8kadGP3bsmE5PiYgU051+d+qOcBERsZqShoiIWE1JQ0RErKakISIiVlPSEBERq9ns5r4/\nss4vBZbKdg+
8FF4q27VGjx49WLduHQ0bNsyzfMWKFffscufC6riX3tq7uVS2O7HXU6Wy3bsRHh5O\nbm4uw4cPt3ndl25cK7Vt167mVGrbluJT0pBiCQ4OLusQpBCBgaXzx4vI7ZQ0/iA2bdpEWFgY9vb2\ntG/fnueff565c+fy888/YzAYGDt2LL6+vmzbto39+/eTmppKcnIy3bt35/7772fv3r1kZmayatUq\nmjRpAsDbb7/N6dOncXBwIDQ0FHd3d1588UXat2+Pp6cn48ePp02bNnz77bdUrVqV119/nQYNGvDt\nt9/y8ssvc/36dRwdHZk7dy4tW7bkl19+YebMmVy5coWHH36Y3377rYy/tdK3YsUKdu3ahZOTE02b\nNqVBgwa0a9eO5cuXk52dTe3atQkNDaV+/fqMHDmSNm3aEBcXR3JyMpMmTSIgIIDMzEwWLVrEiRMn\nyM7OZujQoYwcOZIjR47w6quvAjen2XF3dyc7O5upU6dy5MgRXnnlFXJycqhduzbLli2jbt26Zfxt\nlJ4jR46wYsUKqlWrRlJSEo888ggvv/wynTt3pl27dvz8889s2bKF8PBwIiMjsbe3x8PDgzlz5rB1\n61ZOnjxpeVTD+vXrSUpK4sUXX2TZsmUcOnSI7OxsunfvztSpU0lKSir0Zz8mJoZ169Zx48YNbty4\nQWhoKJ06dSq0bU0mE4sWLeLIkSM4ODgwcuRInnzyyUKPofJAYxp/AN9++y3r1q0jPDycnTt3kpaW\nxsCBA2nbti07duxg7dq1LFu2jFOnTgHw9ddfs3r1aqKjo4mIiMDJyYmtW7fSvXt3IiIiLNtt0KAB\nkZGRTJo0ib/97W/56v3uu+948skniY6Opm3btoSFhZGVlcWLL77IkiVLiIyMZPbs2ZbeycKFC+nd\nuzc7duxgyJAh/Prrr7b5gspIbGwsX3zxBdHR0XzwwQd8//33XL58mSVLlvCPf/yDyMhIAgMDmTNn\njqXM9evXCQ8PZ9WqVZaEsHr1aho1akRkZCQRERHs2rXLMjfQDz/8wNq1a3n77bct2zCZTEybNo3Q\n0FB27NiBt7c3mzZtsu3Ol4Hjx48za9Ys9uzZQ1ZWFhs3buTatWuMGDGC3bt38/XXXxMTE0NERATR\n0dEYDAbefPNNBgwYQGxsLCaTCYDo6Gj8/f3ZunUrN27cYNu2bWzfvp2zZ8+yY8cOoOCffbPZzIcf\nfmg5toKCgnjvvfcs8RXUtmFhYVy8eJHdu3fz4YcfEhYWxqVLlwo9hsoD9TT+AI4cOUL37t1xcXEB\nYOXKlXTq1ImhQ4cCULduXXr06MGRI0eoWbMm7du3x9nZGQBnZ2c6d+4MQP369fnmm28s2w0ICADA\ny8uLmTNncvny5Tz1Ojs707ZtW+Dmc1Di4uL473//y7lz55g0aZJlvYyMDC5fvsyhQ4d4+eWXgZtz\niTVo0KA0vo5y48CBAwwcOJBq1aoBMGjQIKKiorhw4QLPPPMMAGazmYyMDEsZLy8v4Ob3mZaWBsC/\n/vUvMjMz2blzJ3Dz+zx58iTNmjWjcePGlra85fTp09SuXdsyf9utuv7o2rVrR7NmzQDw8/Nj8+bN\nluUAhw4dYsCAAdSoUQOAYcOGMW/ePGbOnMmjjz7KF198QZMmTcjKyqJVq1a88847nDhxgscffxyA\nGzdu0LBhQ9q1a1fgz77BYOCtt95i3759/Pe//+Xf//53nhm8C2rbI0eO4O/vj729PU5OTuzYsYPT\np08XegzdOsbLkpLGH4C9vX2eaeIvXrzIlStX8qxjNpvJyckBbk4eeTsHh4J/DOzt7fOU/325qlWr\nWl4bDAbMZjO5ubk88MADREVFWT67cOECzs7OGAyGPFPc3779P6KCpv03m820bdvW8hdodna25bHH\n8L/v9Pb2NJvNLFmyxPJL6tKlS9SoUYP//Oc/loR0u9+3Z2ZmJr/++usfPknfv
t9ms9ny83XrOyqo\nLW49IfTxxx9n9+7dNGrUyJIkcnNzmTZtGj4+PgBcuXIFBwcHLl++XODPfkZGBgEBAQwaNIjHHnuM\nFi1asGHDBst6BbXt74/dc+fO3fEYKg90euoP4LHHHuPzzz/n2rVrmM1m5s+fT25uLlu2bAHg119/\n5bPPPqNDhw7F2u6trvinn35Kw4YNqVmzZpFlmjRpwvXr1y3Phd+7dy8jRowAoEuXLnz88ccAxMXF\n8dNPPxUrnoqmc+fO7Nmzh99++w2TyURMTAw9e/bk+PHjfPfdd8DN0xMzZsy443Y6depkOf1x9epV\nhgwZQnx8fKHrN27cmCtXrnDixAkAPvroozynr/6ovv76a3755Rdyc3PZvn27pQd9i6enJ7t27SIj\nIwOz2cxHH31Ex44dgZu9gPj4ePbs2YOvr69l/Y8++giTyYTJZGLcuHF88sknhdZ/9uxZACZMmECn\nTp344osvCn2O0C0dOnRg586d5Obmkp6ezujRozEajYUeQ+WBehp/AK1atWLMmDE89dRT5Obm0qFD\nB7744gsWLFiAr68vOTk5TJ48mdatW1vGNazx888/4+fnR/Xq1XnttdesKmM0Glm5ciWLFy9myZIl\nVKlSheXLl2MwGJg7dy4vvPAC0dHRNG7cmIceeqiku1wheHl5cfz4cfz9/alRowYuLi5UrVqV1157\njZkzZ5KTk8N9993HK6+8csftTJw4kUWLFuHr60tWVhZPPfUUHTp04MiRIwWuf2tgdt68eZhMJu6/\n/36r268ic3V1ZdasWZw/f56OHTvy1FNPWU6Hws32OHXqFMOGDSM7O5s2bdoQEhIC3Py59fb25scf\nf7Q8RXTYsGH89NNP+Pv7k52dTY8ePXj88cdJSkoqsP6WLVvyyCOP0K9fPxwdHfnLX/5CUlLSHRNH\nYGAgZ8+exc/Pj9zcXP7617/SpEmTQo+h8kCz3IqUkvj4eE6fPs2QIUMwm808//zzPPHEE3oOTCk4\ncuQIy5cvJzy87O5t+iPRLLciZaBRo0Z8+umn+Pr6MmjQIBo1aqSEIRWeehoiIpKHehoiInJPKGmI\niIjVyuTqqTVr1vDxxx9jNBrx8fGhS5cuee6KvXXjy+eff56n3NWrV+nWrVueCe4iIiIwGo22CVxE\npJKzedI4ePAgUVFRbN26FUdHRyZOnEjTpk0tN7KYTCaGDBlS4LXrCQkJdO/enTfeeMPWYYuICGVw\neurEiRN07twZJycn7O3t6dq1K3v37rV8vnbtWlq1akXXrl3zlY2PjycpKYnBgwczbNgwjh49asvQ\nRUQqPZv3NFq3bk1kZCSXL1/G0dGRffv2WW7vT09PZ+PGjURGRhZY1sHBgf79+/PMM8+QkJDA+PHj\n2blzJ7Vr17blLoiIVFo2Txqenp4MHjyYUaNG4ezsjKenp2VKhOjoaLp164abm1uBZYOCgiyvPTw8\n8PDw4KuvvqJPnz6F1peYmHhvd0BEpBKzedJIT0+nV69elpk3161bR/369YGbc6w8++yzhZYNCwuj\nd+/eltv8oehJ79zd3e9B1CIilcetqfcLYvMxjVsPMMnKyiI9PZ2IiAj69euH2Wzm+PHjlmmMCxIX\nF0dYWBgAZ86c4ZtvvrFMOCYiIqXP5j2NFi1aMGjQIPz8/MjJyWHUqFF06NCBixcvYmdnh6OjY571\nw8PDSUlJITg4mJCQEEJCQhg4cCAGg4GlS5daNfOqiIjcG5pGRERE8rjT705NjS5/CJ1fCixROeOx\ni9hdzyHX0R5T+zrFLn/gJc2qKpWLphERERGrqachlVpJehcilZl6GiIiYjX1NETuwlt7Nxe7zN53\nP+bqrzcn5axV15leY58oVvmJvZ4qdp0i94p6GiIiYjX1NERsrLg9C5HyRD0NERGxmpKGiIhYTUlD\nRESspqQhIiJW00C4SCUxZswYzp49S6NGj
Xj//ffLOhypoJQ0RCqYSzeulahcjjnX8n9JtlG7mlOJ\n6pU/FiUNkUri9bdXlnUI8gegpCEiZUozFFcsGggXERGrqachIhWSZiguG+ppiIiI1dTTEBGxoZJe\n/TZtwvOc+/EnGjR8qEQXNdyrq9/KpKexZs0a+vbti6+vL6tXrwZgw4YNdOvWDT8/P/z8/Fi6dGm+\ncllZWcyaNYv+/fvj6+tLQkKCrUMXkUpuzJgx9OzZkzFjxpR1KGXC5j2NgwcPEhUVxdatW3F0dGTi\nxIl88sknJCQksGjRIrp161Zo2bCwMLKzs4mJieH06dNMnDiRmJgYHBzUYRKR4inJs1AALqWnWf4v\nyTYCu/iWqN7ycsm0zX/bnjhxgs6dO+PkdLOr1LVrV/bu3Ut8fDwZGRksXbqUFi1aMGfOHO677748\nZWNjYxk/fjwAzZs3x9XVlbi4OB577DFb74aIVFKVfWp7myeN1q1bExkZyeXLl3F0dGTfvn2YTCYa\nNWrE1KlTad68Oa+99hqhoaEsW7YsT9nk5GRcXV0t711dXblw4cId60tMTCyV/RCpbHQsVWz3qv1s\nnjQ8PT0ZPHgwo0aNwtnZGU9PT+Lj4/nHP/5hWScoKIgePXrkK2s2m/Mts7O787CMu7v73QctIjqW\nKrjitN+xY8cK/czmA+Hp6en06tWLHTt2sGnTJqpXr07dunUJD//f3Zlms7nAcQo3NzdSU1Mt71NT\nU3Fzc7NJ3CIiUgZJIykpifHjx5OVlUV6ejoRERH4+fmxfPlyvvvuOwA2btxIr1698pX19vYmMjIS\ngDNnznDu3Dk8PDxsGr+ISGVm89NTLVq0YNCgQfj5+ZGTk8OoUaPo0KEDr732GtOmTSMrK4umTZvy\nyiuvABAeHk5KSgrBwcGMGDGCBQsWMGDAAAwGA0uWLMFoNNp6F0REKq0yuVY1KCiIoKCgPMu8vLzw\n8vLKt25g4P8mMzMajSxevLjU4xMRkYJpGhEREbGakoaIiFhNSUNERKympCEiIlZT0hAREaspaYiI\niNWUNERExGpKGiIiYjUlDRERsZqShoiIWE1JQ0RErKakISIiVlPSEBERqxU5y63JZOLw4cP88MMP\nGAwGmjZtiqenJ/b29raIT0REypE7Jo2wsDBWr16Nm5sbDRs2JDs7m+3bt5Oamsr48eMZMWKEreIU\nEZFyoNCk8fzzz9O0aVMiIiJ44IEH8nyWnJzMBx98wIQJE3j77bdLPUgRESkfCk0a06dPp2HDhgV+\n5ubmxvTp0zl79mxpxSUiIuVQoQPhBSUMk8lERkaG5X2jRo1KJSgRESmfrL56avfu3Xh7e9O9e3fW\nrFlzV5WuWbOGvn374uvry+rVqwE4cOAA/v7+DBo0iFGjRnHu3Ll85a5evcqf//xn/Pz8LP9MJtNd\nxSIiItYr9PTU9evXcXR0tLyPjo5m//79AAQEBDBu3LgSVXjw4EGioqLYunUrjo6OTJw4kd27d7N4\n8WI++OADGjduzJYtW1i4cGG+5JSQkED37t154403SlS3iIjcnUJ7GhMmTGD37t2W946Ojqxdu5aN\nGzdStWrVEld44sQJOnfujJOTE/b29nTt2pXdu3cze/ZsGjduDIC7uzu//PJLvrLx8fEkJSUxePBg\nhg0bxtGjR0sch4iIFF+hPY13332XtWvXMn78eGbOnMncuXNZv349JpOJ5cuXl7jC1q1bExkZyeXL\nl3F0dGTfvn2YzWZ8fHwAyM7OZsWKFfTs2TN/sA4O9O/fn2eeeYaEhATGjx/Pzp07qV27dqH1JSYm\nljhWEfkfHUsV271qv0KTRpUqVQgKCiIpKYlXX32V+vXrM3ny5DynrErC09OTwYMHM2rUKJydnfH0\n9CQ+Ph6AGzduMGPGDMxmMxMmTMhXNigoyPLaw8MDDw8PvvrqK/r06VNofe7u7ncVr4jcpGOpYitO\n+x07d
qzQzwo9PZWRkcHmzZs5dOgQS5cu5S9/+QtjxozJc8qqJNLT0+nVqxc7duxg06ZNVK9enfr1\n65OWlsbo0aOpWrUqq1evxmg05isbFhZGSkpKnmW6M11ExHYKTRrPP/88p06d4t///jdz5syhZ8+e\nrF+/nlOnTvHcc8+VuMKkpCTGjx9PVlYW6enpRERE0K9fPyZMmEDbtm1ZtmxZgQkDIC4ujrCwMADO\nnDnDN998Q8eOHUsci4iIFE+hp6dSU1N5//33AfDz8wOgatWqTJ06lR9++KHEFbZo0YJBgwbh5+dH\nTk4Oo0aNIjMzk2PHjnHt2jVLXS4uLqxfv57w8HBSUlIIDg4mJCSEkJAQBg4ciMFgYOnSpdSsWbPE\nsYiISPEUmjQaNmzI2LFj+e2332jfvn2ez5o0aXJXlQYFBeUZnwA4depUgesGBgZaXtepU+eu7xER\nEZGSKzRpLF++nC+//BKj0cj//d//2TImEREppwpNGufPn8fb2/uOhc+dO0eDBg3udUwiIlJOFToQ\n/uqrr/Lmm2+Smpqa77Nff/2VN954g8WLF5dqcCIiUr4U2tNYtWoVYWFhBAQE8Kc//YkHH3yQnJwc\nfvrpJ1JSUggKCiI4ONiWsYqISBkrNGkYDAZGjBjB0KFDOXToEN9//z12dnYEBATg6elJlSpVbBmn\niIiUA0U+7tVoNOLl5YWXl5ct4hERkXLM6qnRRURElDRERMRqShoiImK1IpNGnz592LBhA+np6baI\nR0REyrEik8arr77KiRMn6NmzJ/PmzeP06dO2iEtERMqhIq+eevTRR3n00UdJS0sjKiqKiRMnUq9e\nPUaPHk2vXr1sEaOIiJQTVo1p/Pbbb8TGxrJnzx4MBgOdO3dmw4YNzJ07t7TjExGRcqTInsaiRYuI\njo6mTZs2BAUF4eXlhcFg4Nlnn6VLly4sXLjQFnGKiEg5UGTSyM3NJTw8nKZNm+ZZbjQaWbFiRakF\nJiIi5U+Rp6eCg4OJiYkBbs5qO3fuXMuVVJ6enqUbnYiIlCtFJo3Zs2dbkoSzszOOjo7Mmzev1AMT\nEZHyp8ik8eOPP/Liiy8C4OTkREhICGfOnCn1wEREpPwpMmmYTCZu3LhheX/jxg3MZvNdVbpmzRr6\n9u2Lr68vq1evBuD06dMMGTKEfv36ERwczPXr1/OVy8rKYtasWfTv3x9fX18SEhLuKg4RESmeIpNG\n7969GTVqFBs3bmTTpk08/fTT9O7du8QVHjx4kKioKLZu3cr27duJj4/nk08+YebMmUybNo09e/bQ\nqFEjSzK5XVhYGNnZ2cTExLBs2TKmT59OdnZ2iWMREZHiKTJpTJs2DT8/Pw4dOsTRo0fx9/dn8uTJ\nJa7wxIkTdO7cGScnJ+zt7enatSubNm0iLS3NMrA+ePBgdu/ena9sbGws/v7+ADRv3hxXV1fi4uJK\nHMu9MGbMGHr27MmYMWMqRb0iUrkVecmtnZ0dw4cPZ/jw4ZZl6enp1KxZs0QVtm7dmsjISC5fvoyj\noyP79u2jSpUquLm5WdZxdXUlOTk5X9nk5GRcXV3zrHfhwoU71peYmGhVXM99FGrlHvxOA6h2Fk42\nuE7nlwKLXfypLr4lqvZSeprl/7f2bi52+R71Hy1RvVJ5WXssSfl0r9qvyKTx+eefs3z5cq5du4bZ\nbCY3N5fLly/zn//8p0QVenp6MnjwYEaNGoWzszOenp4cPnw433oGgyHfsoLGUuzs7txZcnd3L1Gc\n1jIeu2j539S+TqnWdbteY5+4q/Il+V7GjBnD2bNnAWjUqBHvv//+XcUgFUtpH0tSuorTfseOHSv0\nsyKTxssvv8xzzz3Htm3bePbZZ9mzZw8PPPCA1ZX/Xnp6Or169eKZZ54BYN26ddSvXz9P4khJSaFe\nvXr5yrq5uZGammq50TA1NTVPD6Us2DJR3EuXblwrdpkcc26e18XdRu1
qTsWuU0TKlyLHNIxGI0OG\nDOHRRx/FxcWFJUuWcODAgRJXmJSUxPjx48nKyiI9PZ2IiAj8/f2pWbMmhw4dAuDjjz8u8PGy3t7e\nREZGAnDmzBnOnTuHh4dHiWOR4nn97ZVE7NpOxK7tvP72yrIOR0TKQJFJw9HREYAGDRrw/fffU6VK\nFXJzc4soVbgWLVowaNAg/Pz8eOKJJxg+fDgdOnRg2bJlLF++HB8fH06cOEFwcDAA4eHhlulKRowY\ngYODAwMGDGDKlCksWbIEo9FY4lhERKR4ijw99fDDDzNnzhxGjx7N1KlTuXz58l0lDYCgoCCCgoLy\nLGvevDkfffRRvnUDA/83uGw0Glm8ePFd1S0iIiVXZE9j7ty5dO/enYcffpjRo0dz/PhxQkNLeKWR\niIhUaEX2NBYsWMDLL78MwJAhQxgyZEipByUiIuVTkT2N48eP2yIOERGpAIrsadSvX5/AwEDat29P\ntWrVLMsnTZpUqoGJiEj5U2TScHFxwcXFhYsXL9oiHhERKceKTBqvvPKKLeIQEZEKoMikMXLkyAKn\n9Ni4cWOpBCQiIuVXkUkjICDA8jorK4u9e/fSpk2bUg1KRETKpyKTxq2pyG9/P2rUqLuaHl1ERCqm\nIi+5/T17e3t+/fXX0ohFRETKuSJ7GiEhIXnenzx5kiZNmpRaQCIiUn5ZdZ/G7dq0acOgQYNKLSAR\nESm/ikwakyZN4sCBA3Tu3JlLly5x4MCBEj+1T0REKrYixzSWLVvGO++8A4DJZOKDDz7gzTffLPXA\nRESk/Ckyaezbt8/yWM969eqxadMm9uzZU+qBiYhI+VNk0sjOzs7zoCOj0VjgzX4iIvLHV+SYhru7\nO6GhoQwdOhSDwcC2bdto3ry5LWITEZFypsiexrx587h48SLDhw9n9OjR/Prrr8yZM8cWsYmISDlT\nZE+jdu3aLFy4kFq1anHjxg0uXryIi4vLXVUaFRXFmjVrMBgMtGnThsDAQObOnWv5PC0tDYDPP/88\nT7mrV6/SrVs3GjZsaFkWERGh54SLiNhIkT2NnTt3WuafOn/+PE888QR79+4tcYUZGRksWrSIDRs2\nsGPHDtLS0jh16hRRUVFERUURERGBs7MzixYtylc2ISGB7t27W9aNiopSwhARsaEik8aaNWssM9o2\nadKEyMhI3nrrrRJXmJubS25uLpmZmeTk5GAymahatarl87Vr19KqVSu6du2ar2x8fDxJSUkMHjyY\nYcOGcfTo0RLHISIixVfk6anc3Fz+9Kc/Wd4/8MADmM3mElfo5OTElClT8PHxwdHRkWbNmjFgwAAA\n0tPT2bhxI5GRkQUH6+BA//79eeaZZ0hISGD8+PHs3LmT2rVrlzgeERGxXpFJw8nJiT179tCvXz8A\n9u7di5OTU4krTExMZMuWLcTGxlKrVi1mzJjBO++8w8SJE4mOjqZbt264ubkVWDYoKMjy2sPDAw8P\nD7766iv69Olzx/qkfFBbVGxqv4rtXrVfkUlj9uzZTJgwgXnz5gE3k8iqVatKXOGBAwfo1KkTdevW\nBW4+r2PTpk3AzYT07LPPFlo2LCyM3r174+rqallmb29/x/rc3d1LHKvcW2qLik3tV7EVp/2OHTtW\n6GdFjmk88sgjxMbGsn79ej744AO6devG8OHDra7891q2bMnhw4fJyMgAYP/+/TzyyCOYzWaOHz9O\nu3btCi0bFxdHWFgYAGfOnOGbb76hY8eOJY5FRESKp8ieBsDx48dZu3Ytn332GS1atODvf/97iSvs\n0qULiYmJBAQEYDQaad26NS+88AKXLl3Czs4OR0fHPOuHh4eTkpJCcHAwISEhhISEMHDgQAwGA0uX\nLtXkiSIiNnTHpLF3717ee+89Tpw4gbe3Ny4uLmzbtu2uKx07dixjx47Ns6x69eocOXIk37qBgYGW\n13Xq1GHNmjV3Xb+IiJRMoUmjb9+
+VK9eHX9/f1avXo2Liws9e/a0ZWwiIlLOFDqmceveibS0NNLT\n020WkIiIlF+F9jSio6M5evQomzZton///rRr147MzEyysrKoUqWKLWMUEZFy4o5XTz322GOsXLmS\nTz/9lD9aW1XSAAASsklEQVT/+c+YzWZ69OjBe++9Z6v4RESkHCnyklu4eRf4tGnT+Pzzz5k6dSq7\nd+8u7bhERKQcsipp3GI0GgkICLgnV1CJiEjFU6ykISIilZuShoiIWE1JQ0RErKakISIiVlPSEBER\nqylpiIiI1ZQ0RETEakoaIiJiNSUNERGxmpKGiIhYTUlDRESspqQhIiJWs+oZ4fdaVFQUa9aswWAw\n0KZNG0JDQ9m8eTPvv/8+Li4uAHTt2pUZM2bkKZeVlcX8+fOJi4vDwcGBxYsX4+HhURa7ICJSKdk8\naWRkZLBo0SJiYmKoU6cOEyZMYPv27SQkJLBo0SK6detWaNmwsDCys7OJiYnh9OnTTJw4kZiYGBwc\nyiT3iYhUOjY/PZWbm0tubi6ZmZnk5ORgMpmoWrUq8fHxbN68mUGDBjFz5kyuXLmSr2xsbCz+/v4A\nNG/eHFdXV+Li4my9CyIilZbNk4aTkxNTpkzBx8eHzp07c+PGDfr160ejRo2YOnUqUVFR1K1bl9DQ\n0Hxlk5OTcXV1tbx3dXXlwoULtgxfRKRSs/l5ncTERLZs2UJsbCy1atVixowZvPvuu3keIRsUFESP\nHj3ylTWbzfmW2dndOe8lJibefdByT6gtKja1X8V2r9rP5knjwIEDdOrUibp16wIQEBDAunXrqF27\nNoGBgcDN5FDQOIWbmxupqak0bdoUgNTUVNzc3O5Yn7u7+z3eAykptUXFpvar2IrTfseOHSv0M5uf\nnmrZsiWHDx8mIyMDgP3799O0aVOWL1/Od999B8DGjRvp1atXvrLe3t5ERkYCcObMGc6dO6erp0RE\nbMjmPY0uXbqQmJhIQEAARqOR1q1bM3/+fLy8vJg2bRpZWVk0bdqUV155BYDw8HBSUlIIDg5mxIgR\nLFiwgAEDBmAwGFiyZAlGo9HWuyAiUmmVybWqY8eOZezYsXmWeXl54eXllW/dW6esAIxGI4sXLy71\n+EREpGC6I1xERKympCEiIlZT0hAREaspaYiIiNWUNERExGpKGiIiYjUlDRERsZqShoiIWE1JQ0RE\nrKakISIiVlPSEBERqylpiIiI1ZQ0RETEakoaIiJiNSUNERGxmpKGiIhYTUlDRESspqQhIiJWK5PH\nvUZFRbFmzRoMBgNt2rQhNDSUf//73yxdupScnBycnZ1ZvHgxDRo0yFPu6tWrdOvWjYYNG1qWRURE\n6DnhIiI2YvOkkZGRwaJFi4iJiaFOnTpMmDCBjz/+mFWrVvHBBx/QuHFjtmzZwsKFC1mzZk2esgkJ\nCXTv3p033njD1mGLiAhlcHoqNzeX3NxcMjMzycnJwWQyUbVqVWbPnk3jxo0BcHd355dffslXNj4+\nnqSkJAYPHsywYcM4evSorcMXEanUbN7TcHJyYsqUKfj4+ODo6EizZs0YNGgQ9vb2AGRnZ7NixQp6\n9uyZP1gHB/r3788zzzxDQkIC48ePZ+fOndSuXdvWuyEiUinZPGkkJiayZcsWYmNjqVWrFjNmzOCd\nd95h4sSJ3LhxgxkzZmA2m5kwYUK+skFBQZbXHh4eeHh48NVXX9GnT5871iflg9qiYlP7VWz3qv1s\nnjQOHDhAp06dqFu3LgABAQFs2rSJtLQ0goKCePDBB3n99dcLHNwOCwujd+/euLq6Wpbd6qEUxt3d\n/d7ugJSY2qJiU/tVbMVpv2PHjhX6mc3HNFq2bMnhw4fJyMgAYP/+/TzyyCNMmDCBtm3bsmzZskKv\nhoqLiyMsLAyAM2fO8M0339CxY0ebxS4iUtnZvKfRpUsXEhMTCQgIwGg00rp1a7y9vXnnnXe4du0a\
nfn5+ALi4uLB+/XrCw8NJSUkhODiYkJAQQkJCGDhwIAaDgaVLl1KzZk1b74KISKVVJvdpjB07lrFj\nx+ZZdurUqQLXDQwMtLyuU6dOvstwRUTEdnRHuIiIWE1JQ0RErKakISIiVlPSEBERqylpiIiI1ZQ0\nRETEakoaIiJiNSUNERGxmpKGiIhYTUlDRESspqQhIiJWU9IQERGrKWmIiIjVlDRERMRqShoiImI1\nJQ0REbGakoaIiFhNSUNERKympCEiIlYrk6QRFRXFgAEDGDhwICEhIWRlZXH69GmGDBlCv379CA4O\n5vr16/nKZWVlMWvWLPr374+vry8JCQllEL2ISOVl86SRkZHBokWL2LBhAzt27CAtLY3t27czc+ZM\npk2bxp49e2jUqBGrV6/OVzYsLIzs7GxiYmJYtmwZ06dPJzs729a7ICJSadk8aeTm5pKbm0tmZiY5\nOTmYTCYcHBxIS0vD09MTgMGDB7N79+58ZWNjY/H39wegefPmuLq6EhcXZ9P4RUQqMwdbV+jk5MSU\nKVPw8fHB0dGRZs2a8dBDD+Hm5mZZx9XVleTk5Hxlk5OTcXV1zbPehQsX7ljfsWPHrIprpe8MK/fg\nj+G/3562fZ2luO3K1H5l0XZQeu1XmdoOKn772TxpJCYmsmXLFmJjY6lVqxYzZszg0KFD+dYzGAz5\nlpnN5nzL7OwK7yy1b9/+7oIVEZE8bH566sCBA3Tq1Im6detiNBoJCAjgyJEjpKamWtZJSUmhXr16\n+cq6ubnlWS81NTVPD0VEREqXzZNGy5YtOXz4MBkZGQDs37+fdu3aUbNmTUuP4+OPP8bLyytfWW9v\nbyIjIwE4c+YM586dw8PDw3bBi4hUcgZzQed8Stm7777L1q1bMRqNtG7dmvnz53Pu3Dnmzp3LtWvX\nePDBB1m2bBlOTk6Eh4eTkpJCcHAwJpOJBQsW8J///AeDwcDs2bMtg+ciIlL6yiRpiIhIxWTzgfDK\n7ueff6Zfv340bdoUs9lMdnY27u7uLFq0iOrVqxdY5rPPPiMhIYGpU6fSo0cP1q1bR8OGDfOs8+KL\nL9K+fXuGDBmCn58f27Ztw97e3ha7VKnd3p5w82KN9PR0vLy8mDt37h0v1ChIbGws33//Pc8991xp\nhFsp3Os2gZK1y9GjR3n99dfJyMggJyeH9u3bM2vWLKpVq1ZomXPnzvHWW2+xZMmSYsdoK0oaZaB2\n7dpERUUBN3+gJ0yYQEREBKNGjSpw/Z49e9KzZ0+rt39r22Ibt7cnQHp6OgMHDqR79+5069atWNv6\n5ptvdMPqPXAv2wSK3y4mk4kpU6awefNmGjZsSE5ODnPnzmXFihW88MILhZY7f/48P/74Y7HjsyXN\nPVXGsrKyMJlM3H///bz44otERERYPuvWrRs///wz27ZtY8aMvNeym0wmQkJC6Nu3LyNHjuTcuXOW\nz1q0aEF2djarVq1i1qxZjBo1ip49e7JgwQLLOitWrKBPnz488cQT/O1vf2PVqlWlv7OVxKVLl8jM\nzMTZ2Znt27fj6+uLr68vkydP5tKlSwD06NHD8svhxx9/pEePHpw6dYoPP/yQrVu3EhYWRmZmJrNn\nz8bf3x9fX182bdpUlrtVod3eJkCpt0tmZibXrl2zXPBjb2/PlClT6Nu3ryWeyZMnExAQwOOPP265\nmTk0NJSTJ08SEhJS6t9JSamnUQYuXbqEn58fAL/88gv16tWjc+fOfP7551ZvY/PmzaSnpxMTE8PF\nixct2/u9xMREPvzwQ7Kzs+nbty9PPvkk58+f54svviA6Ohqz2cyIESNo0KDBPdm3yuhWe5pMJi5f\nvkyzZs2YP38+1apVY+XKlURERFCnTh3eeustQkNDWb58eYHbadGiBU8++STZ2dkMHz6c119/nUaN\nGrF48WJu3LjB008/TatWrXT/kRUKaxMPDw9Onz5d6u1y3333M
WnSJIYOHcpDDz1Ehw4d6NGjh6WX\ns3jxYvr374+Pjw9Xr15l6NChtGnThnnz5rF8+XJeeeUVm3xPJaGkUQZu7zrn5OSwdOlSpk6dyv33\n32/1Ng4fPkxAQAB2dnbcf//9eHt7F7hep06dqFq1KlWrVuXBBx/kypUrHDhwgIEDB1rOrQ4aNIir\nV6/e9X5VVrfa02w2s3r1av75z3/SrVs3oqKi8Pb2pk6dOgAEBgbSr18/q7f7r3/9i8zMTHbu3Anc\nnLft5MmTShpWKKxN4OZYgy3aZdy4cQwdOpSDBw9y+PBhZs6ciZ+fH7NmzeJf//oX3333Hf/4xz+A\nm2cOvvvuO2rUqHEvdr9UKWmUMXt7e/z8/Bg6dCgDBgzIc9d7UedQc3NzLa8dHApuSqPRaHltMBgw\nm83Y2dkVeHe93B2DwcBf//pXDh06xGuvvcbDDz+cb53b2/RWGxTWzmazmSVLltC2bVvg5l/PFeGX\nSnny+zYJDQ3Nc9zccq/bJS4ujm+//ZYRI0bg4+ODj48Po0aNYvDgwcyaNQuz2czatWupW7cucPNG\nZWdnZ77++ut7st+lSWMa5cCXX35Jq1atcHZ25uTJkwB89dVXXLx4sdAyXbt2Zfv27eTk5HD58uVi\nndrq3Lkze/bs4bfffsNkMhETE1PgtC1SfLfuH9q2bRuPPfYYsbGxlnbcvHkzHTt2BMDFxcXS1p98\n8omlvL29PTk5OcDNXmJYWBhms5mrV68yZMgQ4uPjbbxHFd/tbXLy5Ek6dOhQ6u3i4uLCm2++ybff\nfmtZdubMGdzd3S3buDUWcv78eQYOHMiFCxfy1FNeqadRBm4f08jKyqJu3bosWbIEe3t7pk2bxsCB\nA2nTpg0tW7YsdBvDhg3jzJkz+Pj4UKdOHZo3b251/V5eXhw/fhx/f39q1KiBi4sLVatWvev9kpta\ntmyJn58fL7/8MpMnT+bpp58mJyeHhx56iNDQUACmTJnCwoULWbNmDb169bKU7dixI8HBwdSqVYuJ\nEyeyaNEifH19ycrK4qmnnqJDhw5ltVsV2u1tsnHjxlJvl0aNGvH3v/+dl156ibS0NOzt7WnWrBlv\nvPEGAHPnzmX+/Pn4+vqSk5PDrFmzaNCgAU5OTly9epVJkybx5ptv2u4LKgbd3FcJxcfHWx56ZTab\nef7553niiScKHRcREblFSaMSunLlCjNnzuSXX34Bbs7pNX369DKOSkQqAiUNERGxmgbCRUTEakoa\nIiJiNSUNERGxmi65Ffn/WrRoQdOmTalSpQpms5msrCy6du3KtGnT7jgz6S3PPfccCxYsoH79+lbX\n+eqrr3Lw4EEAfvrpJ+677z7uu+8+4OZlmX/5y19KtjMipUQD4SL/X4sWLfjss8948MEHgZvTQ0yf\nPh2j0cjKlSuLXb64Ro4cib+/PwEBASUqL2IL6mmIFKJGjRrMmzePHj168N///pfGjRuzbt06du3a\nRXZ2NmlpaQwbNoy//vWvlkuWx40bxxtvvIGjoyMLFiwgPT2d1NRU6tSpw9KlS3nooYesrj85OZk+\nffqwd+9ey7xkY8aMwdfXl9zcXKKjo3FwcCA5OZkqVarw0ksvWR5//N5777Fr1y7MZjP33Xcfs2fP\nLtYNoCKF0ZiGyB386U9/skzvkpyczKeffsr69evZvn0769evZ+XKlaSlpbFs2TIA1qxZQ4sWLYiI\niKBnz558+OGHfPbZZzRo0KDYU5u7ubnRo0cPPv74Y+DmVN2JiYn4+PgAcOzYMaZNm8aOHTsYPXo0\nkydPJjs7m6ioKI4dO8ZHH33E9u3bCQoKIigoqMA5l0SKSz0NkSIYDAaqV6+Om5sbK1as4JNPPuGn\nn37ihx9+IDc3l4yMDMtzGm6ZOnUqhw8fZt26dfz0008kJCSUaHxi5MiRzJgxg3HjxvHhhx8yePBg\nyySUHTt2pFWrVgD4+fmxc
OFCTp06xb59+zhx4gRDhgyxbCcrK4vz58+X+NSZyC1KGiJ3kJSUxOXL\nl3n44Yc5efIkY8eOZeTIkbRr146hQ4fyz3/+s8AZg//2t79x5coVBgwYwGOPPYajo+MdJ6AsTLt2\n7XB2dubLL79kx44deR7S9fuZjXNzc7G3t8dsNjNy5EjLo0nNZrPluS0id0unp0QKkZaWRmhoKAMH\nDqR+/focPnyYpk2bMm7cOLp168a+ffuA/01Rf/sMpbGxsYwbNw4/Pz/q1avH/v37Szx76ciRIwkN\nDeXPf/4zDzzwgGX5kSNHLE+Z2759u2Xiyq5du7J161bL0+i2bt1KYGBguZ89VSoG9TREbjNu3Diq\nVKmCwWAgJycHb29vJk+eDICvry+fffYZffv2pVq1ari7u9OwYUPOnj3LQw89RN++fXn22Wd59dVX\nmTFjBi+88AK1atXCzs6ORx99lNOnT5copgEDBhAaGsrw4cPzLK9Xrx6hoaGkpKRQq1YtVq9ejZ2d\nHYMHD+bixYuMHDkSOzs7atSowTvvvEOVKlXu+vsR0SW3IuXcF198wdKlS4mOjrYs27ZtG5GRkXpu\nuNicehoi5djTTz/Nzz//zNKlS8s6FBFAPQ0RESkGDYSLiIjVlDRERMRqShoiImI1JQ0REbGakoaI\niFhNSUNERKz2/wC2BHoX53CmZwAAAABJRU5ErkJggg==\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "pal = sns.light_palette(\"seagreen\", n_colors=3, reverse=True)\n", "plot = sns.barplot(x=\"Data Type\", y=\"Accuracy\", hue='Metrics', palette=pal, errwidth=1, capsize=0.02, data=results)\n", "plot.set_ylim(80, 100)\n", "plot.legend(loc='upper center', bbox_to_anchor=(0.5, 1.0), ncol=3)\n", "plot.set_ylabel('Accuracy (%)')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Saving the chart above to `Fig4.eps` to be included in the paper:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": true }, "outputs": [], "source": [ "plot.figure.savefig(\"figures/Fig4.eps\")" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "### Charting the importance of features\n", "\n", "In this section, we explore the relevance of each features in classifying the data quality of CollabMap buildings, routes, and route sets. To do so, we analyse the feature importance values provided by the decision tree training done above - the `importances` data frame." 
] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Rename the columns to math notation, consistent with the metric symbols used in the paper\n", "feature_name_maths_mapping = {\n", " \"entities\": \"$n_e$\", \"agents\": \"$n_{ag}$\", \"activities\": \"$n_a$\", \"nodes\": \"$n$\", \"edges\": \"$e$\",\n", " \"diameter\": \"$d$\", \"assortativity\": \"$r$\", \"acc\": \"$\\\\mathsf{ACC}$\",\n", " \"acc_e\": \"$\\\\mathsf{ACC}_e$\", \"acc_a\": \"$\\\\mathsf{ACC}_a$\", \"acc_ag\": \"$\\\\mathsf{ACC}_{ag}$\",\n", " \"mfd_e_e\": \"$\\\\mathrm{mfd}_{e \\\\rightarrow e}$\", \"mfd_e_a\": \"$\\\\mathrm{mfd}_{e \\\\rightarrow a}$\",\n", " \"mfd_e_ag\": \"$\\\\mathrm{mfd}_{e \\\\rightarrow ag}$\", \"mfd_a_e\": \"$\\\\mathrm{mfd}_{a \\\\rightarrow e}$\",\n", " \"mfd_a_a\": \"$\\\\mathrm{mfd}_{a \\\\rightarrow a}$\", \"mfd_a_ag\": \"$\\\\mathrm{mfd}_{a \\\\rightarrow ag}$\",\n", " \"mfd_ag_e\": \"$\\\\mathrm{mfd}_{ag \\\\rightarrow e}$\", \"mfd_ag_a\": \"$\\\\mathrm{mfd}_{ag \\\\rightarrow a}$\",\n", " \"mfd_ag_ag\": \"$\\\\mathrm{mfd}_{ag \\\\rightarrow ag}$\", \"mfd_der\": \"$\\\\mathrm{mfd}_\\\\mathit{der}$\", \"powerlaw_alpha\": \"$\\\\alpha$\"\n", "}\n", "importances.rename(columns=feature_name_maths_mapping, inplace=True)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": true }, "outputs": [], "source": [ "grouped = importances.groupby(\"Data Type\")  # Group the importance values by data type" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Building Axes(0.1,0.15;0.235294x0.75)\n", "Route Axes(0.382353,0.15;0.235294x0.75)\n", "Route Set Axes(0.664706,0.15;0.235294x0.75)\n", "dtype: object" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" }, { "data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA8IAAAF0CAYAAADo54DsAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzs3Xl0U3X+//EXhdIFEFQEdXRYtZWWtbWAIgpY9Mc6Ioii\nBUE8FGdGhUGWQRGVryyiILIKjqg4bJVFKIsIAwIzgtQBLJUZgaqgglbABdq00Pv7g2kkdElumpuk\nuc/HOR7JzefevG9v+up9JzefVDIMwxAAAAAAADYRFugCAAAAAADwJxphAAAAAICt0AgDAAAAAGyF\nRhgAAAAAYCs0wgAAAAAAW6ERBgAgBPGlEABCVTDmWzDWhLLRCMNrHTt2VExMjPO/uLg4dezYUdOm\nTVNBQYGp7UydOlWStGLFCsXExMjhcJQ6PiUlRcOGDZMk7dq1SzExMTp8+HD5dgYALJSSkuKSlzEx\nMYqPj1f79u01duxY/fTTTz59vBkzZmjFihU+3SYAXMrf2Sb5Lt+OHDmi4cOHq23btoqPj1fHjh01\nYcIEnTx50tR2fvrpJz355JPKzs4ud03wryqBLgAVW8+ePdWvXz9JksPh0MGDBzV9+nSdP39eI0aM\n8GgbM2fO1OWXX+7V48fFxWnp0qW67rrrvFofAPzllltu0RNPPOG8nZeXp/3792vWrFn66aefNHPm\nTJ891qxZszR+/HifbQ8ASuPPbJN8k2/Hjx/XAw88oLi4OI0fP161atXSkSNHNG/ePP3rX//SihUr\nFBER4dG2Dh48qPXr1+vPf/5zuWqC/9EIo1zq1KmjFi1aOG+3bt1aOTk5WrZsmceNcJMmTbx+/OrV\nq7s8PgAEq1q1ahXLqzZt2ujMmTOaN2+ezpw5o2rVqgWoOgDwTkXMtrS0NIWHh2vevHkKDw+XdOEc\ntnnz5rrnnnu0YcMG9ezZM8BVwmpcGg2fq1GjhvPfo0eP1n333edy/+LFixUTE+O8ffGl0ZcqKCjQ\nxIkT1aZNGyUlJWnevHku9196aXRKSoqmTp2qyZMnq02bNmrZsqX+8pe/6Ndff3Wuk5ubq3Hjxql1\n69ZKSkrS5MmTNWbMGI0ePbrc+w4AZlWvXl3Sb58vW7dune655x41b95cnTp10vz5810+exYTE6PF\nixe7bOO+++5zZlhRvo4fP14pKSnOMe+9957uvvtuxcfHq0uXLlq/fr2l+wXA3i7NNil48u3HH38s\nVpt04c2ZkSNHqkGDBs5lv/zyi5555hm1bt1aLVq0UGpqqr755htJF85D+/fvL0nq0qWLXnvtNQ9+\nMggWNMIoF8MwdO7cOZ07d055eXnau3evFi9erL59+/pk+xMmTNDy5cv1+OOPa+LEidq4caM+/fTT\nMtdZvHixsrOzNWXKFI0cOVIffPCB5syZ47x/zJgx2rhxo0aMGKEXX3xRO3bs0Nq1a31SLwCU5uK8\nPHfunH755Rdt27ZNb775ptq3b6/q1atr0aJFGj58uJKSkjRr1izdc889evXVV/XSSy95/DhLly6V\nJA0ePFjPPvusc9nTTz+tjh07as6cObr11ls1bNgwbdmyxZJ9BWAfnmSbpKDKt1tvvVU//PCDHnjg\nAS1dulRHjx513vfII4+oWbNmkqTCwkINGTJE27Zt05gxYzR16lT98MMP6t+/v86cOaO4uDiNGzdO\nkjRt2jT16dPH3A8PAcWl0SiXBQsWaMGCBS7LGjVqpEceeaTc2z59+rSWL1+usWPHOj+HHB8fr06d\nOpW5XrVq1TRz5kxVqXLh6b1nzx599NFHeuqpp5Sdna3169frtddeU+fOnSVJTZs21Z133lnuegGg\nLOvXry/2LkW1atV01113afTo0Tp//rxee+019e7dW2PGjJEktWvXTpUqVdKcOXM0ePBgXXHFFW4f\np+gSxeuuu06NGzdWYWGhZsyYoT59+mjkyJGS
pNtuu02nT5/Wq6++qo4dO/p4TwHYibtskxR0+Xbn\nnXfqqaee0muvveZsZH/3u98pOTlZgwcP1lVXXSVJ2r59uzIyMrR06VLnYyclJemOO+7Q8uXL9fDD\nD6tx48aSLrxbffXVV5v62SGweEcY5XLPPfcoLS1NaWlpWrx4sSZNmiSHw6GHH35Y+fn55dr2vn37\ndP78ebVv3965rG7dumrevHmZ68XFxTmb4KJ1zp49K0n65JNPFBYWpg4dOrjc37Jly3LVCgDutGvX\nTmlpaVq+fLnGjRunyMhI9erVSy+++KJq1qypI0eO6PTp07r77rtd1uvSpYsKCgq0b98+rx43Oztb\nOTk5at++vcu7NrfddpsOHjyo06dP+2L3ANiUu2yTFJT5NnjwYG3fvl1TpkxR9+7dlZ+fr4ULF6pb\nt2764osvJEm7d+9WrVq1FB8f79x2dHS0WrRooV27dnlVM4IH7wijXGrXrq2mTZs6b7dq1UoNGjRQ\n3759tXnz5nJt++eff5akYjNKX3nllWWuFxkZ6XI7LCzM+RmQU6dOqUaNGs6JEYrUrl27XLUCgDuX\nXXaZMy+bNWumatWqadSoUbrqqqs0ZMgQ59eMXJpHRZl38VwHZpw6dUqS9Mc//rHE+3NyclSrVi2v\ntg0A7rJNUtDm22WXXaaePXuqZ8+eMgxDH374oUaOHKnp06dr1qxZOn36tE6fPq24uLhi6158/ouK\niUYYPlc0kcHRo0dVqVIlnT9/3uX+ondn3Sl6FfHHH390fr5EuhCmnlw+U5I6dero559/Vn5+vqpW\nrepcfvLkSS5nAeBXf/jDH7R69WrNnDlTd999tzPzcnJyXMYV3S66X7rwubWLlZWrl112mSRp4sSJ\nuuGGG4rdz9fPAfClS7OtXr16QZVv58+fV6dOnTR48GA99NBDzuWVKlVScnKyevTooU8++cS5/d/9\n7nd69dVXi20nKiqq1LpQMXBpNHzuwIEDki6ET3R0tE6cOOEyK19GRoZH22nZsqXCw8P1wQcfOJed\nPn1a+/fv97q2li1bqlKlStq6datz2alTp7R3716vtwkA3hozZozOnTunl19+WQ0bNlStWrW0YcMG\nlzHr169X5cqVnZO3REdH6/jx4877T548qS+//NJlnbCw3/68F203JydHTZs2df73n//8R6+//rrL\nWADwhYuzTVJQ5VvlypVVu3ZtLVu2TA6Ho9j9X331lfNzvy1bttT333+vK6+80rnt+Ph4vfnmm9q5\nc6dze6iYeEcY5fL99987m0jDMHTs2DFNnz5d119/vTp16qSoqCgtWrRIkyZNUocOHbR161aPG+Ea\nNWro4Ycf1muvvabw8HDVr19fr7/+erF3mM2oX7++unTponHjxunnn3/WFVdcoXnz5snhcKhSpUpe\nbxcAvHHjjTfqD3/4g1asWKH9+/frscce08SJE1WtWjW1b99ee/fu1Zw5c5SSkuK8vO+2227TsmXL\nFBMTo2rVqmn27NnFvqPzsssu0+7du9WyZUvFxsZqyJAhmj59ugoKCpSQkKCDBw9q2rRp6tGjh8vV\nMQDgCxdn27///W+1bNkyqPJt9OjRGjhwoPr06aOUlBTVr19fP/30k1atWqV9+/ZpyZIlkqQOHTro\nhhtu0ODBg/XYY4/piiuu0LJly7Rp0yY98MADkn772tDNmzcrIiKCq2wqEBphlMvq1au1evVqSRde\nobv88svVpk0bDR8+XBEREerQoYMef/xxLV68WEuXLtXtt9+uZ599VsOGDfNo+8OHD1d4eLjmz5+v\n3Nxc3Xvvvc6Z/Lz13HPP6cUXX9SkSZMUFhamvn37qkqVKoqOji7XdgHAG08++aTWrVunKVOmaPHi\nxYqIiNCbb76pd999V9dcc42GDRumQYMGOcePHTtWzz77rMaOHauaNWvqkUce0WeffeayzaFDh+rV\nV1/VoUOH
tGbNGg0aNEiRkZF66623NGfOHNWpU0cDBw4s9XN1AFBel2bbgAEDgibfEhMTtXz5cs2d\nO1czZszQqVOnVL16dSUlJWn58uXOd4TDw8P1xhtv6KWXXtILL7wgh8OhG2+8UXPmzNHNN98sSbrh\nhhvUtWtXzZgxQ8ePH3fOQo3gV8m49JukgRB28uRJ7dy5U506dXI2vufPn1fHjh01aNAgDRgwIMAV\nAgAAALAa7wjDVqpWrapnn31WW7ZsUZ8+fVRYWKjly5fr7Nmz6tKlS6DLAwAAAOAHvCMM29mzZ4+m\nT5+uzz//XIZhqGXLlnrqqacUGxsb6NIAAAAA+AGNMAAAAADAVvjOBAAAAACArdAIAwAAAABsxdRk\nWUXfcbhjx44S71+7dq2mTZumH3/8Ua1bt9b//d//qXbt2h5t29PvlgWASyUkJAS6BJ8jEwGUB7kI\nAL8pMRMNDxQWFhrLly83EhISjKSkpBLHfP7550arVq2MvXv3Grm5ucZf//pXY/DgwZ5s3jAMw9iz\nZ0+xZbm5ucaePXuM3Nxcj7djdh2rx1MTNQVyvB1qKik7QkFp+xXqxzNUawqFfaCmilOTnXLRDsez\nvOOpiZrsXlNpmejRpdFz587V22+/rdTU1FLHrFmzRp06dVLz5s0VGRmpESNGaPv27crJyfG6cwcA\nAAAAwNc8ujT63nvvVWpqqnbv3l3qmCNHjqhly5bO25dffrlq1qyp7Oxsjy+PzsvLc7ntcDhc/u8J\ns+tYPZ6aqCmQ4+1aU6i4NBMlex7PUKgpFPaBmip2TaGivOeKoXA8Q2EfqImaAl2TZPLrk3bt2qXH\nH39cu3btKnbfww8/rI4dO6p///7OZXfccYeef/55tW/f3u22+dwHAG/xWTgAcEUuAsBvSspEU5Nl\nlSUyMrLYq3S5ubmKjo72eBtxcXEutx0Ohw4dOqTGjRsrIiLCo22YXcfq8dRETeyDtY9x4MABjx6z\nIro0E6XQP56hWlMo7AM1VZya7JSLdjieobgP1ERN/qyptEz0WSPcqFEjZWdnO2+fPHlSP/30kxo1\nauTxNiIjI0tcHhERUep9pTG7jtXjqYmaAjnerjVVdGXtqx2PZyjUFAr7QE0Vu6aKzlfniqFwPENh\nH6iJmgJZk8++R7hbt2764IMPtGfPHjkcDr3yyitq3769Lr/8cl89BAAAAAAA5Vaud4THjRsnSXr+\n+ed100036YUXXtDYsWP1ww8/KDExURMnTvRJkQAAAAAA+IqpRrh169YuE2U9//zzLvd36dJFXbp0\n8U1lAAAAAABYwGeXRgMAAAAAUBHQCAMAAAAAbIVGGAAAAABgKzTCAAAAAABboREGAAAAANgKjTAA\nAAAAwFZohAEAAAAAtkIjDAAAAACwFRphAAAAAICt0AgDAAAAAGyFRhgAAAAAYCs0wgAAAAAAW6ER\nBgAAAADYCo0wAAAAAMBWaIQBAAAAALZCIwwAAAAAsBUaYQAAAACArdAIAwAAAABshUYYAAAAAGAr\nNMIAAAAAAFuhEQYAAAAA2AqNMAAAAADAVmiEAQAAAAC2QiMMAAAAALAVGmEAAAAAgK3QCAMAAAAA\nbIVGGAAAAABgKzTCAAAAAABboREGAAAAANgKjTAAAAAAwFZohAEAAAAAtkIjDAAAAACwFRphAAAA\nAICt0AgDAAAAAGyFRhgAAAAAYCs0wgAAAAAAW6ERBgAAAADYCo0wAAAAAMBWaIQBAAAAALZCIwwA\nAAAAsBUaYQAAAACArXjUCGdlZal3795q0aKFevbsqb1795Y4bvbs2brtttuUmJioRx55REePHvVp\nsQAAAAAAlJfbRtjhcCg1NVW9evXSJ598opSUFA0dOlRnzpxxGbdlyxatWrVK7733nv71r3/p97//\nvcaOHWtZ4QAAAAAAeMNtI/zxxx8rLCxM/fr1U3h4uHr37q3atWtr27ZtLu
O+/PJLFRYWqrCwUIZh\nqHLlyoqMjLSscAAAAAAAvFHF3YDs7Gw1atTIZVmDBg105MgRl2Vdu3bV0qVLdfvtt6ty5cqqU6eO\nFi9ebKqYvLw8l9sOh8Pl/54wu47V46mJmgI53q41hYpLM1Gy5/EMhZpCYR+oqWLXFCrKe64YCscz\nFPaBmqgp0DVJUiXDMIyyBsyePVtZWVmaOXOmc9nIkSNVp04djRgxwrns6NGjmj17toYMGaKrrrpK\nEydO1KFDh7R48WJVqlTJbSEZGRkeFw0AF0tISAh0CT5HJgIoD3IRAH5TUia6fUc4Kiqq2KtveXl5\nio6Odlk2YcIEJScnq379+pKkp59+Wq1atdJ///tfxcTEeFRgXFycy22Hw6FDhw6pcePGioiI8Ggb\nZtexejw1URP7YO1jHDhwwKPHrIguzUQp9I9nqNYUCvtATRWnJjvloh2OZyjuAzVRkz9rKi0T3TbC\nDRs21KJFi1yWZWdnq1u3bi7Lvv32W+Xn5ztvh4WFKSwsTFWquH0Ip9I+UxwREWH688Zm17F6PDVR\nUyDH27Wmiq6sfbXj8QyFmkJhH6ipYtdU0fnqXDEUjmco7AM1UVMga3I7WVbbtm2Vn5+vd955RwUF\nBUpLS1NOTo7atWvnMu6OO+7QG2+8oaNHjyo/P18vv/yybrjhBjVo0MBU8QAAAAAAWMltI1y1alXN\nnz9f6enpSkpK0qJFizRnzhxFR0dr8ODBmjt3riTpz3/+szp37qx+/frptttu09dff61Zs2YpLMyj\nryoGAAAAAMAvPLpuOTY2VkuWLCm2fMGCBc5/V61aVaNGjdKoUaN8Vx0AAAAAAD7G27UAAAAAAFuh\nEQYAAAAA2AqNcAhISEhQYmKioqKiFB8fH+hyAAAAACCo0QiHgIyMDNUbtVa5ubnKzMwMdDkAAAAA\nENQ8/5JfBJ3mz32gn3ILnLdjx2+WJNWMCte+ZzsHqiwAAAAACGo0whVYwdVTVCPyRPHleXUl0QgD\nAAAAQElohCuws9nDSlxeMyrcz5UAAAAAQMVBI1yBfTmpqyQpLy9PseM36+D4ToqMjAxwVQAAAAAQ\n3JgsKwQkJCToq8ndmDUaAAAAADzAO8IhICMjQwcOHFBcXBzvCAMAAACAG7wjDAAAAACwFRphAAAA\nAICt0AgDAAAAAGyFRhgAAAAAYCs0wgAAAAAAW6ERBgAAAADYCo0wAAAAAMBWaIQBAAAAALZCIwwA\nAAAAsBUaYQAAAACArdAIAwAAAABshUYYAAAAAGArNMIAAAAAAFuhEQYAAAAA2AqNMADA1uLj4xUV\nFaXExEQlJCQEuhwAAOAHNMIAAFvLzMxUbm6u6o1aq4yMjECXAwAA/IBGGAAAAABgK1UCXQAAAIHS\ndH6yVPW4JKnGTdLNS/93R/7V+uzRTYErDAAAWIpGGABgW78cfrLE5TWjwv1cCQAA8CcaYQCAbX05\nqaskKS8vT7HjN+vg+E6KjIwMcFUAAMBqfEYYAGBrRbNGfzW5G7NGAwBgE7wjDACwtczMTOXl5enA\ngQOKi4sLdDkAAMAPeEcYAAAAAGArNMIAAAAAAFuhEQYAAAAA2AqNMAAAAADAVmiEAQAAAAC2QiMM\nAAAAALAVGmEAAAAAgK3QCAMAAAAAbMWjRjgrK0u9e/dWixYt1LNnT+3du7fEcZs2bdLdd9+tli1b\n6r777tPBgwd9WiwAAAAAAOXlthF2OBxKTU1Vr1699MknnyglJUVDhw7VmTNnXMZlZWXpr3/9qyZM\nmKCMjAzdeeedeuKJJywrHAAAAAAAb7hthD/++GOFhYWpX79+Cg8PV+/evVW7dm1t27bNZdySJUvU\np08fJSYmKiwsTAMHDtTLL7+swsJCy4oHAAAAAMCsKu4GZGdnq1GjRi7LGjRooCNHjrgsy8rK0h13\n3KH+/fvrP//5j5o0aaJx48YpLMzzjy
Hn5eW53HY4HC7/94TZdaweT03UFMjxdq0pVFyaiZI9j2co\n1BQK+0BNFbumUFHec8VQOJ6hsA/URE2BrkmSKhmGYZQ1YPbs2crKytLMmTOdy0aOHKk6depoxIgR\nzmXJycnKzc3VnDlzFBMToxkzZujDDz/U2rVrVaWK235bGRkZHhcNABdLSEgIdAk+RyYCKA9yEQB+\nU1Imuu1Qo6Kiir36lpeXp+joaJdlVatWVXJyspo2bSpJeuKJJ7Rw4UIdOXJEN954o0cFxsXFudx2\nOBw6dOiQGjdurIiICI+2YXYdq8dTEzWxD9Y+xoEDBzx6zIro0kyUQv94hmpNobAP1FRxarJTLtrh\neIbiPlATNfmzptIy0W0j3LBhQy1atMhlWXZ2trp16+ayrEGDBsrPz3feNgzD+Z+nIiMjS1weERFR\n6n2lMbuO1eOpiZoCOd6uNVV0Ze2rHY9nKNQUCvtATRW7porOV+eKoXA8Q2EfqImaAlmT2w/wtm3b\nVvn5+XrnnXdUUFCgtLQ05eTkqF27di7j7rnnHq1atUr79+9XQUGBpk+frvr163v8bjAAAAAAAP7g\nthGuWrWq5s+fr/T0dCUlJWnRokWaM2eOoqOjNXjwYM2dO1eS1KlTJz3zzDMaNWqUkpKStH//fs2a\nNUuVKlWyfCcAAAAAAPCU+1msJMXGxmrJkiXFli9YsMDlds+ePdWzZ0/fVAYAAAAAgAU8/24jAAAA\nAABCAI0wAAAAAMBWaIQBAAAAALZCIwwAAAAAsBUaYQAAAACArdAIAwAAAABshUYYAAAAAGArNMIA\nAAAAAFuhEQYAAAAA2AqNMAAAAADAVmiEAQAAAAC2QiMMAAAAALAVGmEAAAAAgK3QCAMAAAAAbIVG\nGAAAAABgKzTCAAAAAABboREGAAAAANgKjTAAAAAAwFZohAEAAAAAtkIjDAAAAACwFRphAAAAAICt\n0AgDAAAAAGyFRhgAAAAAYCs0wgAAAAAAW6ERBgAAAADYCo0wAAAAAMBWaIQBAAAAALZCIwwAAAAA\nsBUaYQAAAACArdAIAwAAAABshUYYAAAAAGArNMIAAAAAAFuhEQYAAADgtYSEBCUmJioqKkrx8fGB\nLgfwCI0wAAAAAK9lZGSo3qi1ys3NVWZmZqDLATxCIwwAAAAAsBUaYQAAAACArdAIAwAAAABshUYY\nAAAAAGArNMIAAAAAAFuhEQYAAABCVHx8vKKiopSYmKiEhIRAlwMEDRphAAAAIERlZmYqNzdX9Uat\nVUZGRqDLAYJGFU8GZWVlady4cTp06JDq1aun5557Ti1atCh1fFpaml566SXt2rXLZ4UCAAAACB7N\nn/tAP+UWOG/Hjt/s/HfNqHDte7ZzIMoCPOK2EXY4HEpNTVVqaqr69Omj1atXa+jQofrwww9VrVq1\nYuOPHj2qSZMmqXLlypYUDAAAAMC90hpVXzWpP+UW6MtJXZWXl6cDBw4oLi5OkZGRkqT6o9PLvX3A\nSm4b4Y8//lhhYWHq16+fJKl379566623tG3bNnXp0sVl7Pnz5zVy5Ej17dtXaWlp1lQMAAAAwK2C\nq6eoRuSJ4svz6kri3VrYm9tGODs7W40aNXJZ1qBBAx05cqTY2Ndff1033HCD2rdv71UjnJeX53Lb\n4XC4/N8TZtexejw1UVMgx9u1plBxaSZK9jyeoVBTKOwDNVXsmkJFec8VQ+F4mhl/NnuYDo7vJIfD\noUOHDqlx48aKiIhQ7PjNJf6N8eYx8vLySh3vq8fwx3hqsldNklTJMAyjrAGzZ89WVlaWZs6c6Vw2\ncuRI1alTRyNGjHAuy8zM1FNPPaW0tDRlZmbq8ccfN/UZYT68D8BboTgLJpkIoDzIRUjSvcuPl7i8\nenglvfWHuj7Z/nt9rjZ9H+BvJWWi23eEo6Kiir2ak5eXp+joaJfbo0eP1oQJE0r83LCn4uLiXG5f\n+u
qVJ8yuY/V4aqIm9sHaxzhw4IBHj1kRXZqJUugfz1CtKRT2gZoqTk12ykU7HM/yjD/4v5+Xw+FQ\n84k7tG9MO98+xvLjiouLK3n8/+7zxX74Yzw1hW5NpWWi20a4YcOGWrRokcuy7OxsdevWzXk7MzNT\nR48e1ZAhQyRd+Kxwbm6uEhMT9f777+vaa6/1aAeKPlx/qYiIiFLvK43ZdaweT03UFMjxdq2poitr\nX+14PEOhplDYB2qq2DVVdL46VwyF4xks+3Dx/ZeO9+Sx7PJzoqbgq8nt9wi3bdtW+fn5euedd1RQ\nUKC0tDTl5OSoXbt2zjGJiYnat2+f9uzZoz179mju3LmqWbOm9uzZ43ETDAAAAMC34uPjFRUVpa8m\ndwvJS+YBb7lthKtWrar58+crPT1dSUlJWrRokebMmaPo6GgNHjxYc+fO9UedAAAAAEzKzMxUbm6u\n9uzZw+esgYu4vTRakmJjY7VkyZJiyxcsWFDi+NatW5uaKAsAAAAAAH9x+44wAAAAAAChhEYYAAAA\nAGArNMIAAAAAAFuhEQYAAAAA2AqNMAAAAADAVmiEAQAAAAC2QiMMAAAAALAVGmEAAAAAgK3QCAMA\nAAAAbIVGGAAAAABgKzTCAAAAAABboREGAAAAANgKjTAAAAAAwFZohAEAAAAAtkIjDAAAAACwFRph\nAAAAAICt0AgDAAAAAGyFRhgAAAAAYCs0wgAAAAAAW6ERBgAAAADYCo0wAAAAAMBWaIQBAAAAALZC\nIwwAAAAAsBUaYQAAAACArdAIAwAAAABshUYYAAAAAGArNMIAAAAAAFuhEQYAAAAA2AqNMAAAAADA\nVmiEAQAAAAC2QiMMAAAAALAVGmEAAAAAgK3QCAMAAAAAbIVGGAAAAABgKzTCAAAAAABboREGAAAA\nANgKjTAAAAAAwFZohAEAAAAAtkIjDAAAAACwFRphAAAAAICt0AgDAAAAAGzFo0Y4KytLvXv3VosW\nLdSzZ0/t3bu3xHGzZ8/WHXfcocTERKWkpOi///2vT4sFAAAAAKC83DbCDodDqamp6tWrlz755BOl\npKRo6NChOnPmjMu4FStWaPXq1XrnnXf08ccfq23bthoyZIgKCwstKx4AAAAAALPcNsIff/yxwsLC\n1K9fP4WHh6t3796qXbu2tm3b5jLu1KlTSk1N1fXXX68qVaqof//++vbbb3X8+HHLigcAAAAAwKwq\n7gZkZ2erUaNGLssaNGigI0eOuCx75JFHXG5v2bJFtWrV0tVXX+1xMXl5eS63HQ6Hy/89YXYdq8dT\nEzUFcryqQLbvAAAgAElEQVRdawoVl2aiZM/jGQo1hcI+UFPFrilUlPdcMRSOZ7DtQ15eXqnjS/o7\n5o+avBlPTfaqSZIqGYZhlDVg9uzZysrK0syZM53LRo4cqTp16mjEiBElrrN7924NGTJEzz//vLp3\n7+5RIRkZGR4XDQAXS0hICHQJPkcmAigPchH+cO/y43qvT8lvepV1H+BvJWWi23eEo6Kiir2ak5eX\np+jo6BLHr1q1Ss8995yeeeYZj5vgInFxcS63HQ6HDh06pMaNGysiIsKjbZhdx+rx1ERN7IO1j3Hg\nwAGPHrMiujQTpdA/nqFaUyjsAzVVnJrslIt2OJ5BvQ/LjysuLq7k8f+7z+81eTmemkK3ptIy0W0j\n3LBhQy1atMhlWXZ2trp161Zs7KxZs/T2229r9uzZatu2rUdFXywyMrLE5REREaXeVxqz61g9npqo\nKZDj7VpTRVfWvtrxeIZCTaGwD9RUsWuq6Hx1rhgKxzNY9uHi+y8d78lj2eXnRE3BV5PbybLatm2r\n/Px8vfPOOyooKFBaWppycnLUrl07l3Hvvfee3nrrLf3973/3qgkGAAAAAMAf3DbCVatW1fz585We\nnq6kpCQtWrRIc+bMUXR0tAYPHqy5c+dKkl5//XWdOXNGvXv3VsuW
LZ3/HT582PKdAAAAAADAU24v\njZak2NhYLVmypNjyBQsWOP+9ceNG31UFAAAAAIBF3L4jDAAAAABAKKERBgAAAADYCo0wAAAAAMBW\naIQBAAAAALZCIwwAAAAAsBUaYQAAAACArdAIAwAAAABshUYYAAAAAGArNMIAAAAAAFuhEQYAAJZI\nSEhQYmKioqKiFB8fH+hyAABwohEGACAEBGPTmZGRoXqj1io3N1eZmZmBLicoxcfHKyoqKuiOHQCE\nuiqBLgAAAJRfRkaGYsdv1sHxnRQZGRnocuChzMxM5eXlcexQIUU3mKamb43+bUHmxffVldTV7zUB\nnqIRBgAAAGDa2exh+nJSV+Xl5enAgQOKi4tzvphTf3R6gKsDykYjDABABdb8uQ/0U26B83bs+M2S\npJpR4dr3bGdqCmL8nBAKXBre5ced/6wZFR6AagDP0QgDAFCB/ZRbUOI7MoF8NyYYawpG/JxQ0X05\n6cKlz1zej4qIybIAAAAAALZCIwwAAADAawkJCfpqcjdmPkeFwqXRAAAAALyWkZFRbLIsINjxjjAA\nAAAAwFZohAEAAAAAtkIjDAAAAACwFRphAAAAAICt0AgDAAAAAGyFRhgAAAAAYCs0wgAAAAAAW6ER\nBgAAAOCUkJCgxMRERUVFKT4+PtDlAJagEQYAAADglJGRoXqj1io3N1eZmZmBLgewRJVAFwAAAAAg\n8Jo/94F+yi1w3o4dv1mSVDMqXPue7RyosgBL0AgDAAAAUMHVU1Qj8kTx5Xl1JdEII7TQCAMAAADQ\n2exhJS6vGRXu50oA69EIAwAAANCXk7pKkvLy8hQ7frMOju+kyMjIAFcFWIPJsgAAQY3ZSwHAvxIS\nEvTV5G7kLkKard8Rjo+P14EDB5y34+LimBkPAIKEc9KW7lNUr/uFZb/+bzmTtgCAdTIyMnTgwAHF\nxcXxjjBClm0b4ebPfaBfu01WvW6/LeMECwCCR2mTthTmXy0mbQEAAOVh20b4p9wCfTmpq/Ly8lxe\n8ao/Oj3QpQEAJGUN+VDShat1srKynP/OzNxU5noXX+3TpEkTlyt/AAAAJD4jDAAIchkZGdqzZ49y\nc3M9+vhKZmamcnNzVW/UWmVkZFhSE59bBuAr5AkQGLZ9RxgAEFqcnym+SOz4zZIufPWHLz/2kpGR\nwYyqgA344woTO+YJV+4gGNAIAwD8xsqTn9I+Uyz57nPFlzbbnjbaCQkJl1zezcSMQEWQmZnp/Cqh\njPGdfLptb/MkFFj5cwU8RSMMAPAbK09+ij5TLFn3HZjeTuAVTO/4FJ18f/vGYyrI+VqSFF7797r2\nkdm2OAEHrGTmxb7S8qQgr65CeUJAO78AgOBCIwwAFUQwXkpmpiZ/nfxcXFPCGt/+nIqabU8b7abz\nk6WqxyVJNW6Sbl560Z35V+uzR8ue+MsKRSffMVMvk3Tx5xFHMyM38D/efNTi0m8kOSOp/uj0Usef\nzR5my4lb+UYA+Jq350ceNcJZWVkaN26cDh06pHr16um5555TixYtio1buHCh3njjDZ05c0YdO3bU\n888/r+joaBO7Aas1f+4DfT7zUd4FQIXCZaUX+ONSMrM/azM1+evkp6imohNLKyQkJOirrCxFTS77\n5/TL4SdL3UbNqHBLanPn4mae7wmtmMjEC6x8cbDo20Wk4r8rpTWq3nwjict9yy+8aBaobPAXsy8o\nAmUx+wLUxdw2wg6HQ6mpqUpNTVWfPn20evVqDR06VB9++KGqVavmHPePf/xDb7zxht5++23Vrl1b\nw4cP15QpUzR+/Pjy7Bt8rODqKabfBQjGd6FgD85X5LtPUb3uF5bZ9fu+rX431ezP2pt3S0Lp5Ccj\nI8OjJvLiE+mKvs8IvCbz7lTlyBMKGxmm+Iv+jjeZd6fLRwPsoDwnv56IbjBNTd8a7bows+i+upK6\nul/HzXhv8yEUXgix8sod2Et5
XmR32wh//PHHCgsLU79+/SRJvXv31ltvvaVt27apS5cuznGrV69W\n79691aBBA0nSE088oZSUFD3zzDOqXLmymf2BhbKGfGj6UkYr/9AAZeHyqd9Y/Vkysz/rsiamKqsm\nu578ePoOMlCW0r5bOyvTXk2wZP3fh7PZw0q9r7R3bEtbx1fv8Jb2gmVFPC/zx5U7sAezvc3F3DbC\n2dnZatSokcuyBg0a6MiRIy7Ljhw5ouTkZJcxZ8+e1YkTJ3Tttdd6VExeXp7L7VatWunzzz+XdGGn\nPPk+SIfD4fJ/d49X0vhL6yhPTRe/anfTTTfp008/dVuXmX3wZvyePXvkcDh06NAhNW7cuMz9Lf0P\nTV3l5bUvdT2rj50/xgdjTaGwD2bW+XTAWkklPZ/WlPm8DRUX72OpJ1iRVcr8WZj9WTscDjWfuEP7\nxrRTREREsTrc1eOuJjP5Y3YfyrOO1eP/+c9/Ovc5IiLC5/vtzd8zq3+u/vgba7Ymf4w3u443Pydv\nnk+homhfzWZWEU+PzcGLPurh6WMUrWNVTWXPjF/2eVlF/z3xZnwwnrMHY02hcOy8ObeQpEqGYRhl\nDZg9e7aysrI0c+ZM57KRI0eqTp06GjFihHNZcnKyRo8erU6dLoRAYWGhbrrpJq1bt65YI12Si3cw\nZc8oVS7hF/18Xl29kzi5xPXNrlPaeMNRV28leD7eHzWV9Rj+cN999zlf+GjYsKGWLVtW6lhf7bM3\n61g9npr8X5OnEhISvF43WJUU+lb+LnrzGN6Mh3tW/z0LxpqC8W9sMJ6LmGGHXAyVTCR3y1aRfnep\nyf81eaqkTHTbCL/55pvauXOnFixY4Fz2+OOPKzY2Vo899phzWffu3TV06FDn5dJnzpxRq1attG3b\nNl199dVui8vIyCh2acTFnX3RK2rumF3H6vHURE3sg7WPceDAgZA94SvpcrFQP56hWlMo7AM1VZya\n7JSLdjieobgP1ERN/qyptEx0e2l0w4YNtWjRIpdl2dnZ6tatm8uyRo0auVwunZ2drcsuu0x16tTx\nqHhJpU4QEBERYXpyEbPrWD2emqgpkOPtWlNFV9a+2vF4hkJNobAP1FSxa6rofHWuGArHMxT2gZqo\nKZA1hbkb0LZtW+Xn5+udd95RQUGB0tLSlJOTo3bt2rmM69Gjh5YuXaovvvhCv/76q2bMmKFu3bop\nLMztQwAAAAAA4Dduu9SqVatq/vz5Sk9PV1JSkhYtWqQ5c+YoOjpagwcP1ty5cyVJHTt21KOPPqoh\nQ4bojjvuUI0aNTRy5EjLdwAAAAAAADPcXhotSbGxsVqyZEmx5Rd/bliS+vfvr/79+/umMgAAAAAA\nLMB1ywAAAAAAW6ERBgAAAADYCo0wAAAAAMBWaIQBAAAAALZCIwwAAAAAsBUaYQAAAACArdAIAwAA\nAABspZJhGEagi5CkjIyMQJcAoIJKSEgIdAk+RyYCKA9yEQB+U1ImBk0jDAAAAACAP3BpNAAAAADA\nVmiEAQAAAAC2QiMMAAAAALAVGmEAAAAAgK3QCAMAAAAAbIVGGAAAAABgKzTCAAAAAABboREGAAAA\nANgKjTAAAAAAwFaqBLqAsnzzzTcqLCzU9ddfH7Aazp8/rx07dkiSbrzxRl1zzTUerXf48GHt379f\n1atXV3Jycolj9u/fr9jYWFWtWtVUPR06dNDmzZsVHh7u8XrlZfZYTJs2TVu3bpVhGIqJiVF8fLwG\nDBhQ6vj169erSZMmqlevnqV1+UOw1WSmHn8dN3eOHj0aND+/YBIszy1vctGqTCyqJ9hzkUwMnpqs\nPG6SNblIJpYsWJ5bZOIFwfi7FSzPkYsFW03Beo7vji9yMWgb4UmTJunYsWP6+uuvtXLlSm3cuFFd\nunQpdbw3P2RPDuSTTz6p3bt3q3r16vr22291xRVXKD4+XvHx8frzn/9c4nZXrFihCRMmqFmzZt
q3\nb5/+/e9/KzMzU+fPn1fz5s2d4wYOHCiHw6EGDRqoSZMmuummm9SkSRM1adJE1atXL3HblStXVkRE\nhM6fP+9RuOXk5Gj37t0l/uzWrVun1q1b68orryxzG2aPhSRt375dq1evVkpKijp37qxNmzaVOf6j\njz7S5MmTVaNGDdWoUUPNmjVTly5d1KxZM5/V5Y9fXCtr8kc9/jhuknT8+HFdffXVpd4/aNAgdejQ\nQUOHDtXll1/uXL5+/Xr9v//3/8rcdqgKlkyUzOeilZkomctFX2SiFJy/W95kdbBlkNU5bfVxk7w7\ndmSied48380+X+yQiVJgzhX98bsVCplodU3Beo7vLhMl3+Ri0F4a/dlnn2nmzJmqWbOmKleurLS0\ntDLHf/TRRxowYIC6d++ufv36adKkSdq/f3+Z6xQdyJo1a6pz5846cOBAsTE7duzQ+++/r82bN2v3\n7t2aOnWqEhMTdfjw4VK3O2vWLC1cuFALFy5UlSoXXmuIiIjQlClTXMZlZGRo/fr1euKJJ1S3bl1N\nnTpV/fv3180336zk5GQ9/vjjmjt3brHt9+/fXwsWLChz34rMnz9fX331VYn3HTt2TPPnz3e7DbPH\nQpLzCRkWFqbk5GSdP3++zPHfffedtmzZojVr1mjgwIE6efKk5s+fryVLlvisLk+O98W8eU5ZWZM/\n6vHHcZOksWPHlnn/ypUrdfjwYSUnJ2vmzJnatGmT3nrrLT399NNlrhfKgiUTJfO5aHUmSp7noi8y\nUQrO3y1vsjrYMsjqnLb6uEneHTsy0Txvnu9mny92yEQpMOeK/vjdCoVMtLqmYD3Hd5eJkm9yMWgb\n4fDwcDkcDlWqVEmSVFBQUOZ4b37InhzIunXrOsfVqFFDbdu21aOPPqrp06eXut1ff/3V+SpHUf31\n69fXoUOHio29/vrrdeedd+ro0aMaPXq09u/fr02bNqlXr17avn27du7cWWydWbNmae7cuUpNTVV6\nerqOHTtWai3btm1Tnz59SryvV69e+sc//lHqukXMHgtJat++vc6ePav69etr5cqVysrKKnN8Xl6e\n8+efnJysX375Ra+++qpWr17ts7r88YtrZU3+qMcfx02SYmJiynwxadiwYTp69KhuueUWZWVlafjw\n4VqwYIFefPHFMrcbyoIlEyXzuWh1Jkqe56IvMlEKzt8tb7I62DLI6py2+rhJ3h07MtE8b57vZp8v\ndshEKTDniv743QqFTLS6pmA9x3eXiZJvcjFoG+FBgwZp0KBB+vHHH5Wenu52vDc/ZE8O5MMPP6zF\nixebqr1Ro0batWuXJMkwDElSlSpVdO7cuVLX+eijj/TQQw+patWquu666zR06FBNnDixxEtRXn31\nVQ0fPlzVqlXTjBkzlJycrDZt2uiRRx4pNjYnJ0e1a9cu8TFr166tnJwct/tj9lhI0oABAxQdHa0R\nI0bo0KFDeuqpp8ocP3DgQA0aNEirV6/W6tWr9c033ygsLExhYaU/Rc3W5Y9fXCtr8kc9/jhukrRr\n1y7169dPw4YN01tvvaWMjAyX+z/99FOtWrVKM2bM0OzZs7VmzRpde+21+vbbb93uQ6gKlkyUzOei\n1ZkoeZ6LvshEKTh/t7zJ6mDLIKtz2urjJnl37MhE87x5vpt9vtghE6XAnCv643crFDLR6pqC9Rzf\nXSZKPspFI4h99dVXxqxZs4xp06YZ3377bZljN2zYYDz00EPGqlWrjFWrVhndu3c3DMMw+vXr5/Zx\nfv75Z2PKlCnG5s2bi9136623Gk2bNjWGDx9upKenG0ePHnW7ve3btxu33HKLsWnTJuPmm282DMMw\nNm/ebHTr1q3UdZKTk439+/e7LCsoKDBuu+02t4/3yy+/GB9//LHxt7/9rdh9bdq0MU6cOFHieidO\nnDCSkpLcbt8wzB0Lbx09etSYPXu28corrxiHDx82CgoKjG
nTpvm8rrKO98W8fU5ZVZM/6zHDm+Nm\nGIbx5ZdfGpmZmUZaWprxwgsvuNzXvXt34/Dhwy7LTp065fHzNVQFQyYahvlc9HcmGkbpueirTDSM\n4Pzd8ramYMsgK3M6WP+ekYnmmT2W3j6HQzkTDSO0zxVDJROtrCkYj5thlJ2JhuGbXAyaRvjcuXPG\n1q1bja1bt3p9ENz9kL15jJ07dxrz5s0znnjiCSM5OdmIjY01WrdubQwaNKjM9VavXm0kJSUZsbGx\nRq9evYwWLVoY6enppY5PT083OnToYGzdutXlsVu3bu1RnaV57LHHjJdffrnE+1555RXjscceK9f2\nf/jhh1L3Kz093cjJyfFoO6+88orRo0cPo3v37saIESOMhQsXlqsuX/G2ybNLPd4et4kTJxp//OMf\nje7duxvnzp0z1q5d63L/qlWrjI4dOxqbN282zp07ZxiGYXz++edG27Ztfb4PwSpYM9EwvMtFMpFM\ntEtN3hw7MtE9X2SiYZT9fLFjJhqGtbnoq0wsqiXYcjHY8icYa7IiEw3DN7kYNLNGlzbjXtOmTfWn\nP/3J7fqXzqi2fft2DRgwQE8++aTbxyhrBuhbbrlFt9xyi/P2r7/+qszMTH3++edl1tOjRw8lJydr\n9+7dOnXqlJo2bapGjRqVOr5Lly7Ky8vTyJEjVVhYqCuvvFJHjx5Vamqq230vS2pqqh588EGdPHlS\nXbt2Vd26dXXixAmlp6drzZo1+vvf/+4c682sgfPnz1etWrVKfOxjx45p//79Gj16tNs6vZk90Gqe\nPKdCtZ4TJ06obt26bsd5e9w+++wzvfvuu0pJSVHlypX13nvvqWvXrs77e/bsqbNnz2rMmDHKz8/X\nVVddpW+//Vb9+vXzep8qmmDNRMm7XKyImSiZz0UyMTRr8jQTJe+OHZnoXnkzUXL/fLFjJkrWniv6\nKhOl4MtFO2eiZO25ortMlHyTi0HTCO/YsUMbNmxQ3bp19csvvygzM1OZmZluZ4ws4skPubyPIUnV\nq1dXmzZt1KZNG7djo6KidPvtt3u87V69eqlLly7avXu3fvzxRzVu3FhNmzb1eP2SNG3aVHPmzNHz\nzz+vtLQ0VapUSYZhqF69epozZ47i4uKcY70Jq23btmnRokWl7s+DDz7oUcBdOhHAhg0bPN1FUyco\nZsaXJ3CtqMlf9cyaNUvz5s3Ta6+95vb56+1x82RyhgceeED33XefMjIy9N133+maa67RzTff7NH2\nQ0FFyUTJ81ysaJkomc/FipiJnq4TbJlYnpqsykTJu2NHJrrni7xy93yxYyZK1p4r+ioTJe9zMRQy\n0eqazNZj9bmip5N4lTcXg2ayLG9mZ76YVTNA+1tkZKTat2+ve+65xyfhJkm33nqrNm7cqA0bNujd\nd9/Vhg0btHHjRpdXLyXvZg301cQz3sweKF34RUxOTta2bdt8Pt6bqf2trMkf9WzZskWnT59WgwYN\n9O677+r48eNljvf2uHk6OUPlypWVlJSknj17KikpyRmIdkAmXhDITJTM52JFy0Qz6wRbJnpbk5WZ\nKHl37MhE93yRV+6eL3bNRMm6c0VfZaLk3e9WKGSi1TWZrccf54pmJvEqTy4GTSPszezMF7NqBuiK\nLicnR+vWrZN0YWr+Vq1aqX79+pIuXMLy448/uow1G1bh4eH6/vvvS1zn+++/d34/njvezB5o9hfR\nH7+4Vtbkj3o6dOigsWPHKioqSvPmzXP7ZebeHLeifZk4caK6du2qL774oth3J4JMtIqZTCwabyYX\nK1Imml0n2DLRm5qszkTJu2NHJrrni7xy93yxYyZK1p4r+ioTJfO/W6GQiVbX5E09/jhX9Fsmevxp\nYot5MztzSXw9A3RF9+KLLxqzZ88u8b558+YZEydOdN72ZtZAqyeeKUthYaFhGIbRt29f5799Ob6I\np7OX+qsmK+sp0rdvX4
/HmjFo0CBj6tSpRnp6upGdnW3JY4QKMtEaZjLRMMznYkXKRG/XCbZMNFMT\nmVhx+TKvSnu+2DETDcPac0UyMbhr8jYTi9bxNX9nYtB8RnjKlCnKzMxUVlaWpk+frqNHj6pmzZqK\ni4vTG2+8UWz8+fPntWPHDknSjTfeqGuuuUbShUtZSnulwexjhAIzn81o1aqVFi1apOHDhxcb++67\n7yoxMbHYcqsnninLxZc+eHIZhNnxRcp6TgWiJivrsdrUqVN18OBBrVu3TlOnTlVUVJRq1qypJk2a\n6Omnnw50eUGFTLSG2c+rmc3FipSJ3q4TbJlopiYyseLyJq/M5qIdM1Gy9lzRbCZKvsvFUMhEq2uy\neyYGTSNsdsY9f84AXZGZuYTFm7CyeuIZBIZhGKbGezrJwuWXX662bdtq7ty5+vDDDxUWFqb333+/\nzC+ftysy0RpmL+szm4tkYmgym4mSZ7lIJnrOm7wym4t2zETJ2nNFs5kokYsVhRXnin7PRMvfc7ZI\nixYtjOPHjxuGceHt/3/+85/G66+/bjzxxBMBriy4mL2EZceOHUbnzp2NmJgYIzY21oiJiTE6d+5s\n7Ny50+1jZWdnGxkZGWVeynDXXXcZP/zwQ4n3/fDDD0bnzp3dPs6l7rvvPkvHFz3PrHwMM+P9Uc+y\nZcs8Hjtz5kyjadOmLt9t6E7fvn0Nh8PhvO3ue7nhHpnoGW8+AuJtLlaUTDS7TrBlomGYr8nKTDQM\n87lIJlqDXPSMv84VPclEw/B9LoZCJppdx+pMNAxrzxX9lYkVthG+6667XH5AKJm3n83wNKzMflF6\nQkJCmdtr1apVmfeXxOwJitVNnpU1+aseT23evNmYMGGC0aNHD+PRRx81vvvuO4/W+/DDD43+/fsb\n77//vpGWlmZ06dLFkvrshEz0THk+r+ZJLlbETDSzTrBlorc1WZWJhuFdLpKJ1iAXPWPluaLZTDQM\n3+diKGSi1TXZNRMrjx8/frw17zVbq0qVKtq7d69atGgR6FKC2u9//3uNGzdO33zzjaKiolRQUKCD\nBw9q3rx5WrZsmSZNmqQ6depIunBpzNatW3XDDTeoVq1auuaaa5yXpqxbt05XXHGFoqOjXbb/6quv\nyuFwlPidXf/4xz+0c+dOtWvXzrnszTffVI8ePVStWrVi47///nstXbpUjz76qKl9LOmyGl+M37Jl\niw4dOqSzZ8/qyJEjuvnmm1W9evWA1eSPetavX6+qVauWeknSperXr6/27dtr9erVWrhwoWrUqOHR\neg0bNlRSUpI++eQT5eTkaNiwYbriiis8WhclIxM9YyYTJfO5WBEz0dN1gi0Ty1OTVZkoeZeLZKI1\nyEXPWHmuaDYTJd/nYihkotU1manHH+eK/srESobhxQdfgkC7du30888/Kzk5WZ06dVKzZs103XXX\nBbqsoLRz5049//zz+uqrr1w+m/Hss8+6fBZm4sSJqlWrloYOHVpsG6+//rpOnjxZ7DMZd999txYt\nWlTiZ0tycnL04IMPauPGjc5lf/zjH9WoUaMSJ1mYNm2aDh06pFmzZnm8b998840KCwt1/fXX+3y8\nYRiqVKmS7r//fi1evDjgkwh4U8/69evVpEkT1atXz6PHGDNmjP71r3+pRo0aqlGjhpo1a6YuXbqo\nWbNmZa53//33a8mSJS7Lvv32W48e89prr/VoHMpGJnrO00yUzOdiRctEM+sEWyZ6U5O/MlEqnotk\nov+Ri56z6lzRbCZKvs1FMtG3mSj57lwxGDIxaCbLMsuuM/t5o+hL0r/88kudPHlSV1xxherXr68v\nvvhCEyZMcM7CZnY2Vcn6iWfKMmnSJB07dkxff/21Vq5cqY0bN5Y4w6C3472ZSW/atGnaunWrDMNQ\nTEyM4uPjNWDAAJ+M96aejz76SJMnT/Y4rL777jtt2bJFYWFh2rRpkzZt2qT58+fr1ltv
1f333+/R\nYxbp2LGjR3WG+iQk/kImes7TTJTM52JFykSz6wRbJnpTE5loL+Si56w6VzSbiZLvcjEUMtHsOlZn\nouS7XAyGTKywjbBdZ/Yrj/r16+vaa6/VunXrNHr0aO3bt8/lciFvwqroi9IvvpSwSElflO7N7IGl\n+eyzz/Tuu+8qJSVFlStXVlpaWpkBZ3a8N7Zv367Vq1crJSVFnTt31qZNm3w63iyzYZWXl6fz588r\nLCxMycnJWrFihWbNmqUHH3ywzHAr6cKSbdu2Of/90UcfaeXKlfrTn/6k6667TseOHdPs2bP1hz/8\nwTc7CjLRC+4yUTKfixUpE71dxwy7ZqJUPBfJRP8jF83z9bmi2UyUfJeLoZCJ3q7jKW+aWl+dKwZD\nJlbYRvhS1atXV5s2bdSmTZtAlxKUDh06pCVLlmjNmjXKzc1VYWGh5s+f7/K5DG/CypvvHi7tVUez\nwsPD5XA4nK8mFRQU+HR8ETOfHrj88sslyRkOGzZs8Ol4s/WYDauBAwdq0KBB6t27t6QLlweFhYUp\nLLxjStkAAA75SURBVCyszMcpGn+xi6fIf+ONN/Tuu+86v/fv97//vW688UY99NBDJa6L8iMTy+ZJ\nJkrmc7EiZaK36wRbJpqpyV+ZKBXPRTIx8MjFsllxruhNJkq+ycVQyERv17EqEyXfnSsGQyaGTCOM\nkq1atUrLli3Tp59+qpiYGP3pT39S9+7d1a1bN910000uY70JK7OXr1z8Jen169d3CTUzX5IuSYMG\nDdKgQYN0+vRppaen+3x8ETO/gO3bt9fZs2dVv359rVy5UllZWT4db7Yes2F11113KS4uTmvWrFFe\nXp6mTZumc+fOlTjJxcX69OlT5v05OTmKjIx0WRYZGVniVQaAlcxkomQ+FytSJnq7TrBlopma/JWJ\nUtm5SCYimFh5rujNZc6+ysVQyERv17EqEyVrzhUDlYkVdrIseCY2Nla1atXS5MmTdfvttzuXt2vX\nTqtXr3YJks8++0wPPvigevToUWpYlXQ5ipUTz7jz9ddfa+3atcrPz1ffvn11zTXX+HS8t3755RfN\nnTtXCQkJ6tixo8/Hm3Hs2DFnWPXo0UP16tXTzJkz9eSTT5Z7255OspCamqrw8HCNGjVK1157rb75\n5hu99NJLys/P19y5c8tdB+ApM5koeZeLFSkTvV3HLLtkouRZLpKJCCZWnyuayUTJt7kYKpno7Tqe\nsHMm0giHuDlz5mj58uX6/vvvdfvtt+vee+/VHXfc4ZzG/NKTPrNhdbGSJlhYunSpy8QzZmcPNDuj\nnK9noCtp5sBA12Qls5M4eDpzYE5Ojv7yl79o165dzsuNkpKS9PLLL5f6WSPACmYzUfI+F4MhE71d\npzRkovuJbTzJRTIRwcRf54qeZKJkLhdDIRN9XZPVrDhXDFQm0gjbgGEY+uijj7R8+XJt3bpVtWrV\n0s8//6ylS5eWeCmg5HlYXSo/P1/r1q3TkiVLnBMsLF682Hl/YmKi9uzZU+r6CQkJysjIcN6OjY01\nNaOc2fFlKW3mwEDVZPVMg9KFGR9XrFihlJQU9e/fX5s2bdKUKVNKHf/www/rb3/7m8skC7m5uaVO\nsnDixAmdOHFCdevWdflsCOBP3mSi5F0uBjoTvV2nJMGWiZL1s1KbzUTJXC6SiQgW/jpXdJeJkrlc\nDIVM9GVNVmeiZO25ot8z0YCtnDhxwpg5c6bRoUMHo0mTJsbjjz9e6liHw2GsXLnS6Nu3rxEbG2vc\nf//9pY794osvjBdeeMFISkoymjZtasTFxRnbt28vNq5NmzbGiRMnSq0tKSnJZdnx48ed/y1btsx4\n4IEHjJ07dxpfffWVsXPnTuPBBx80li9f7vX4svTr188wDMN46KGHDMMwjIEDBwa0pnvuucdZzwcf\nfGA89dRTPl9n0KBBhmEYRv/+/Q3DMIzhw4eXOb5v
375Gfn6+83Zqaqpx/vx54/777zd69erlXP7a\na6+5rRUIBDOZaBie52KwZKK365Qk2DLRMMxnnNWZaBil52KLFi2cy8hEBDMrzhU9zUTDMJeLoZCJ\nvqzJ6kw0DN+dKwZDJtII21RhYaGxdetWY+jQocXuMxNWK1euNB544AEjJibG6NGjh/H2228bp06d\nMm699VYjJyen2PjHHnvMePnll0vc1iuvvGI89thjpdZ81113FdvmDz/8YNx1110+GX+pAQMGGHl5\neUZKSophGL8FXaBq8uaEzOw6CxcuNM6cOWOMGzfOWLFihXH33XeXOX7Dhg3GQw89ZKxatcpYtWqV\n0b17d8MwLvxxSEhIMAoLCw3DMIyWLVu6rRUIpLIy0TA8z8VgzkRv1ykSbJloGOYzzupMNIzSc/Gm\nm24iE1Gh+OJc0WwmGob3uRgKmVjemqzORMPw3bliMGQis0bbVKVKlXT77be7TIpgdjZVSRo9erRq\n1aqlefPmuWyrNOX5knSzM8qVdwY6T2YO9GdN/phpsOhymBEjRmju3Ll66qmnyhxf1syBkZGRevDB\nB9WoUSPl5///9u7nJaoujuP4x6dVrSooKSJoVRAVoQi6cSOtMgIX/QvCCANBIJEgpNlKisQ25VoI\nUYKojS3aBS3dtKwRyVaZtXkq51k8jj8b55xz7zn33HvfLxAe8pzxexfz4Zwe59O/GhkZ+etr3L9/\nv+VzAL79LRMl+1yMORNd9zTElomS/1Zq20yUmufi6dOnyUTkShpnRdtMlNxzsQiZmHSmEE39aZ0V\nY8hEPiOMLbZtqlLY4hnbRrk0GuhaNQdmMVPSpsELFy4Y7UlayLC2tqbZ2VnVajUtLCyov7//r+sm\nJiYS/RzAJ9tcjDkTXffsFGMmSslaqclEwFyITJTccrEImZjGTFLypv4Q5V0xZCIXYWxxDat6oOIZ\n20Y5k/VJ3+hpzxSiaTCLoprBwUH+SRDkkksuxpqJJntiy0TJfyt1mpkomeUimYi8CpmJkl0uFiET\nfczk8gyhz4pZZSIXYeySJKwk6evXr3rx4oXm5ua0urqqvr4+PX78uOl6k/bAvWwb5Q5an9YbPa2Z\nQjQNrq6ubv3Zu3fvND8/r6GhIZ05c0bLy8uanp7WzZs3jf4xdtPmwD9//uj169daWlrSz58/d32P\nXwNE7JLkYoyZeNCe2DLRZaYsM1Eyy0UyEXkWMhMl+1wsQiamOZPLM4Q+K2aWiaE/lIz8sG1T3Smt\n4hnb5mHb9S4tfT5nCt00GKqo5t69e/Xu7u56tVqtDw8P7/oC8sQ1F7PKRNs9sWWiy0xZZmK9bpaL\nZCKKwlcm1utmuViETPQ9U9JG6hBnxawykYswWjIJK1O27YG2zcNJmopN3+ihZgrRNNjR0VH/8ePH\nrj9bX1+vd3R0GM1o2hzY2dlZX1lZMXpNIA/SykXfmei6p16PLxNtZnJdnzQT63WzXCQTUTRZnRWL\nkIkhZnJd35jN91kxq0ykNRotNWtTdWHbHnjlyhWrRjnb9TuZtvSFmilE02BnZ6eGh4f3FTJ0dnY2\n/Rk7mTYHHjt2TMePHzd6TSAP0spF35noukeKLxNtZnJdnzQTJbNcJBNRNFmdFYuQiSFmcl0vhTkr\nZpWJ/wT/iSi1arWqI0eOqFKpqFKp6O3bt9rY2Gi6fnJyUr29vapvfpT99+/ff/1yXb9T442+vLys\njY0N1Wo13b17d98bPdRMpvMk2TM2Nqbv37+rr69PFy9e1LVr17S2tqaxsbGtNSsrKy2/1tfXdefO\nnabNhNVqVePj4/r27VvT2YEy8p2Jrnuk+DLRZibX9SaZKCXPRTIRaM4mF4uQiSFmcl0vhTkrZpWJ\nlGUhuLpjyYJto5ztepdmP58z+WgabMZ3odjHjx81NDSk5eVlHTp0aNf3lpaWWr42UGShMtF2T2yZ\n6DKTj0yUkuci
mQgczCUXi5CJvmdyzUTJ71kxq0zkIoxM2bQH2jbKuTbQ2TT7hZgpzaZBSRoYGNDc\n3JwkaWpqSkNDQy1fq8G1OfDGjRs6f/68rl+/rsOHD+/6XldXl9EzAWXgMxNd98SWibYztVpvm4mN\n12twyUUyETBnmotFyMQQM5msD31WzCoT+dVoZOrkyZOqVCpaXFzU9PS0fv361XTt6OioHjx4oC9f\nvhj9Cp3J+oGBga3/npqakiS1t7fr8uXLRkGS9kwu89ju+fTp09av3szMzLR8xkZItre36/nz53ry\n5Il6enp09uxZ9fT06NGjR3r27NmBr1Gr1fTw4UP19vaqq6tr1xeAbT4z0WRPbJnoMpPvTGy8XpJc\nJBMBc6a5WIRM9DGTyzOEPitmlYmUZSEKJiULb9680cuXL3Xq1Cmj1zRZ33ijt7W1aWZmxuj/BPic\nyWUe2z0hi2oaLl26pM+fP+vcuXMtngaA5CcTTfbElokuM4XMRMktF8lEwF6rXCxCJvqYyeUZQp8V\ns8pELsLIDdtGOZP1SQ9Aac8UomlwcnJSs7OzqtVqknTg30ru5doc2N3drcHBQd26dUsnTpzY9b3+\n/n7jnw9gm0vLpo8MSjKTj5wOmYmSWy6SiUD6ipCJPmZyeYbQZ8WsMpHPCCM3Xr16pffv3+v27ds6\nevRoKuvX1ta23ugLCwtN32wTExNBZnKZJ8kzhCgUk9S0TbqtrU2Li4vGPx/ANtv8MdkTWya6zBQy\nEyW3XCQTgfQVIRN9zJT0GUKcFbPKRC7CyA3bRjnb9S4HIJ8z+W4alMIV1QBIn0vLpu8Mii2nQ2Wi\nRC4CWStCJvqeyeUZinxW5CKM3LBtlLNd7/JG9zlTiKbBkZERLS4uqqura988jb8ZdGlUBeCfS8um\n7wyKLad9ZKJELgIxKkIm+p7J5RmKfFbkM8LIjVqtpvn5+X1/O5bW+tHR0aZv9CxmcpnHdk+IQjEA\nftjmj+2e2DLRZSYfmSiRi0CMipCJvmdyeYYinxW5CCM3bBvlbNe7NPv5nMlH0+BeIQrFAPjh0rLp\nO4Niy2kfmSiRi0CMipCJvmdyeYYinxW5CCM3bBvlbNe7NPv5nMlH0+Be1WpV4+PjB5Y4JG1UBeCH\nS8um7wyKLad9ZKJELgIxKkIm+p7J5RmKfFbkM8LIDdtGOdv1Ls1+Pmfy0TS4V4hCMQB+uLRs+s6g\n2HLadyZK5CIQiyJkou+ZXJ6hyGdFLsLAJpcDUGzz2O4JUSgGIJ9iy0SXmXxnokQuAmVRxkyUin1W\n5CIMbHI5AMU2j+2eq1ev6sOHD8YlDqaNqgDyL7ZMdJnJdyZK5CJQFmXMRKnYZ0U+Iwxscmn288l3\n06AUplAMQD7FlomS/1Zql2IbchEohzJmolTssyIXYWCTywHIJ99Ng1KYQjEA+RRbJkr+W6ldim3I\nRaAcypiJUrHPivxqNLDp6dOnWlhYsDoAxTaP7Z4QhWIA8im2THSZyXcmSuQiUBZlzESp2GdFLsLA\nJpcDkE++mwZdxFgUAcCP2DJR8t9K7YJcBMqBTDSTp0zkIgzAWIxFEQCQJXIRALblKRO5CAMw5tKo\nCgBFRi4CwLY8ZeI/WQ8AID8aJQsAgP+RiwCwLU+ZSGs0AGMujaoAUGTkIgBsy1Mm8qvRAIzFWBQB\nAFkiFwFgW54ykYswAAAAAKBU+IwwAAAAAKBUuAgDAAAAAEqFizAAAAAAoFS4CAMAAAAASoWLMAAA\nAACgVLgIAwAAAABKhYswAAAAAKBU/gO81WP/dSRMmwAAAABJRU5ErkJggg==\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "sns.set_context(\"talk\")\n", "grouped.boxplot(figsize=(16, 5), 
layout=(1, 3), rot=90)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The box plots above show the importance of each feature in classifying the quality of CollabMap buildings, routes, and route sets. Next, we find the three most relevant features for each data type to report in the paper." ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Calculate the mean importance of each feature for each data type\n", "imp_means = grouped.mean()" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
BuildingRouteRoute Set
0$r$$\\mathsf{ACC}$$r$
1$\\mathsf{ACC}$$d$$\\mathsf{ACC}_e$
2$n_a$$\\mathrm{mfd}_\\mathit{der}$$e$
\n", "
" ], "text/plain": [ " Building Route Route Set\n", "0 $r$ $\\mathsf{ACC}$ $r$\n", "1 $\\mathsf{ACC}$ $d$ $\\mathsf{ACC}_e$\n", "2 $n_a$ $\\mathrm{mfd}_\\mathit{der}$ $e$" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Top three features per data type, ranked by mean importance\n", "pd.DataFrame(\n", " {row_name: row.sort_values(ascending=False)[:3].index.values\n", " for row_name, row in imp_means.iterrows()\n", " }\n", ")" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "The table above lists, for each dataset, the three metrics with the highest mean importance as reported by the decision tree classifiers during training." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Retest the classifications using minimal sets of features (Extra)\n", "\n", "Knowing the three most important features in each experiment, we re-run the experiments using only those features." ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from analytics import cv_test" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 89.96% ±0.0570 <-- Building\n" ] } ], "source": [ "res, imps = cv_test(df_buildings[['assortativity', 'acc', 'activities']], df_buildings.label, test_id=\"Building\")" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 95.09% ±0.0455 <-- Route\n" ] } ], "source": [ "res, imps = cv_test(df_routes[['acc', 'diameter', 'mfd_der']], df_routes.label, test_id=\"Route\")" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 95.05% ±0.0541 <-- Routeset\n" ] } ], "source": [ "res, imps = cv_test(df_routesets[['assortativity', 'acc_e', 'entities']], df_routesets.label, test_id=\"Routeset\")" ] }, { "cell_type": "markdown", 
"metadata": { "collapsed": true }, "source": [ "As shown above, using only _three metrics_ per dataset, the classifiers still achieve high accuracy: 90%, 95%, and 95% for buildings, routes, and route sets, respectively. This is very close to the accuracy obtained with all the metrics (90%, 97%, and 96%, respectively). In certain applications, we can therefore use the relevance analysis above to select a smaller set of metrics for classification, reducing the computational cost without significantly affecting classification performance." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.2" } }, "nbformat": 4, "nbformat_minor": 2 }