{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# CIC Subpopulation Construction\n", "\n", "To create a representative model of the agent interactions, we use subpopulation modeling. We cluster all agents on the following features, taken from the full-population transaction data recorded on the xDai blockchain from January through May 11, 2020 (s means source, t means target):\n", "* s_location - source individual location\n", "* s_business_type - source individual business type\n", "* t_location - target individual location\n", "* t_business_type - target individual business type\n", "* weight - amount of tokens exchanged\n", "* s_bal - source individual CIC wallet balance\n", "* t_bal - target individual CIC wallet balance\n", "\n", "Essentially, we are performing a graph zoom operation, bundling nodes together based on their similarity. Nodes are constant with edges being transitive. The algorithm we use for this graph zoom operation is k-means clustering. Based on our descriptive statistical analysis and the Gap Statistic, introduced by Stanford researchers Tibshirani, Walther, and Hastie in their 2001 [paper](https://web.stanford.edu/~hastie/Papers/gap.pdf), we determined that 50 clusters are representative of the subpopulations. All of the flows inside a bundle become part of that bundle's self-loop flow. For example, within cluster 1, agent a can transact with agent b. This is not reflected in our model, because it is an intra-cluster rather than an inter-cluster interaction.\n", " \n", "## Graph Model of Current Spend Activity\n", "\n", "We modeled the CIC transaction data as a weighted directed graph $G(N,E)$, with source and target agents as the nodes $N$ and the transactions between them as the edges $E$. The token amount is the weight of each edge $(i,j) \\in E$ and denotes the actual CIC flow from agent $i$ to agent $j$.\n", "\n", "The observed data shows the actual payments between network actors that are transacting in CIC. 
The observed data does not show shilling payments between actors, actors' utility, or demand. We only observe actual CIC spending between agents.\n", "\n", "\n", "## Saving Clustering Results\n", "At the bottom of this notebook, we calculate the median, 1st quartile, 3rd quartile, mean, standard deviation, utility type ordering, and utility type probabilities for each cluster. These values can then be copied into ```subpopulation_clusters.py``` in the simulation folders for use in the simulations." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# import libraries\n", "import networkx as nx\n", "import pandas as pd\n", "import numpy as np\n", "from sklearn.cluster import KMeans\n", "from gap_statistic import OptimalK\n", "from sklearn.decomposition import PCA\n", "\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Dump as of 5-15-2020\n", "Jan - May 11 2020 xDai Blockchain data\n", "https://www.grassrootseconomics.org/research" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# load the public transaction and user tables from the data dump\n", "transactions = pd.read_csv('data/sarafu_xDAI_tx_all_pub_all_time_12May2020.csv')\n", "users = pd.read_csv('data/sarafu_xDAI_users_all_pub_all_time_12May2020.csv')" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>id</th>\n", " <th>timeset</th>\n", " <th>transfer_subtype</th>\n", " <th>source</th>\n", " <th>s_gender</th>\n", " <th>s_location</th>\n", " <th>s_business_type</th>\n", 
" <th>target</th>\n", " <th>t_gender</th>\n", " <th>t_location</th>\n", " <th>t_business_type</th>\n", " <th>tx_token</th>\n", " <th>weight</th>\n", " <th>type</th>\n", " <th>token_name</th>\n", " <th>token_address</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>1</td>\n", " <td>2020-01-25 19:13:17.731529</td>\n", " <td>DISBURSEMENT</td>\n", " <td>0xBDB3Bc887C3b70586BC25D04d89eC802b897fC5F</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>System</td>\n", " <td>0x245fc81fe385450Dc0f4787668e47c903C00b0A1</td>\n", " <td>female</td>\n", " <td>GE Office</td>\n", " <td>Savings Group</td>\n", " <td>NaN</td>\n", " <td>18000.000000</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>2</td>\n", " <td>2020-01-25 19:13:19.056070</td>\n", " <td>DISBURSEMENT</td>\n", " <td>0xBDB3Bc887C3b70586BC25D04d89eC802b897fC5F</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>System</td>\n", " <td>0xC1697C1326fD192438515fE2F7E4cCb0C705C5d2</td>\n", " <td>male</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>NaN</td>\n", " <td>9047.660892</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>3</td>\n", " <td>2020-01-25 19:13:20.288346</td>\n", " <td>DISBURSEMENT</td>\n", " <td>0xBDB3Bc887C3b70586BC25D04d89eC802b897fC5F</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>System</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>male</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>NaN</td>\n", " <td>25378.726002</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>4</td>\n", " <td>2020-01-25 19:13:21.478850</td>\n", " <td>DISBURSEMENT</td>\n", " 
<td>0xBDB3Bc887C3b70586BC25D04d89eC802b897fC5F</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>System</td>\n", " <td>0xD95954e3fCd2f09A6Be5931D24f731eFa63BF435</td>\n", " <td>male</td>\n", " <td>G.E</td>\n", " <td>Farming/Labour</td>\n", " <td>NaN</td>\n", " <td>4495.932576</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>5</td>\n", " <td>2020-01-26 07:48:43.042684</td>\n", " <td>DISBURSEMENT</td>\n", " <td>0xBDB3Bc887C3b70586BC25D04d89eC802b897fC5F</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>System</td>\n", " <td>0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72</td>\n", " <td>male</td>\n", " <td>Home</td>\n", " <td>Farming/Labour</td>\n", " <td>NaN</td>\n", " <td>400.000000</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " id timeset transfer_subtype \\\n", "0 1 2020-01-25 19:13:17.731529 DISBURSEMENT \n", "1 2 2020-01-25 19:13:19.056070 DISBURSEMENT \n", "2 3 2020-01-25 19:13:20.288346 DISBURSEMENT \n", "3 4 2020-01-25 19:13:21.478850 DISBURSEMENT \n", "4 5 2020-01-26 07:48:43.042684 DISBURSEMENT \n", "\n", " source s_gender s_location \\\n", "0 0xBDB3Bc887C3b70586BC25D04d89eC802b897fC5F NaN None \n", "1 0xBDB3Bc887C3b70586BC25D04d89eC802b897fC5F NaN None \n", "2 0xBDB3Bc887C3b70586BC25D04d89eC802b897fC5F NaN None \n", "3 0xBDB3Bc887C3b70586BC25D04d89eC802b897fC5F NaN None \n", "4 0xBDB3Bc887C3b70586BC25D04d89eC802b897fC5F NaN None \n", "\n", " s_business_type target t_gender \\\n", "0 System 0x245fc81fe385450Dc0f4787668e47c903C00b0A1 female \n", "1 System 0xC1697C1326fD192438515fE2F7E4cCb0C705C5d2 male \n", "2 System 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 male \n", "3 System 0xD95954e3fCd2f09A6Be5931D24f731eFa63BF435 male \n", "4 System 0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72 male \n", "\n", " t_location 
t_business_type tx_token weight type token_name \\\n", "0 GE Office Savings Group NaN 18000.000000 directed Sarafu \n", "1 GE Nairobi Farming/Labour NaN 9047.660892 directed Sarafu \n", "2 GE Nairobi Farming/Labour NaN 25378.726002 directed Sarafu \n", "3 G.E Farming/Labour NaN 4495.932576 directed Sarafu \n", "4 Home Farming/Labour NaN 400.000000 directed Sarafu \n", "\n", " token_address \n", "0 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 \n", "1 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 \n", "2 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 \n", "3 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 \n", "4 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "transactions.head()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'STANDARD': 0.5085207861177824,\n", " 'DISBURSEMENT': 0.35574873997902784,\n", " 'RECLAMATION': 0.13483070053783444,\n", " 'AGENT_OUT': 0.0008997733653553429}" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "transactions.transfer_subtype.value_counts(normalize=True).to_dict()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Based on the data dictionary provided by Grassroots Economics (GE), the transfer subtype codes are:\n", "\n", "* DISBURSEMENT = a transfer from GE\n", "* RECLAMATION = a transfer back to GE\n", "* STANDARD = a trade between users\n", "* AGENT_OUT = a group account cashing out\n", "\n", "\n", "For the purposes of our analysis, we subset to STANDARD transactions made in the Sarafu token. 
" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "transactions_subset = transactions[transactions['transfer_subtype'] == 'STANDARD']\n", "transactions_subset = transactions_subset[transactions_subset['token_name']=='Sarafu']" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>id</th>\n", " <th>start</th>\n", " <th>label</th>\n", " <th>gender</th>\n", " <th>location</th>\n", " <th>held_roles</th>\n", " <th>business_type</th>\n", " <th>bal</th>\n", " <th>xDAI_blockchain_address</th>\n", " <th>confidence</th>\n", " <th>...</th>\n", " <th>otxns_in</th>\n", " <th>otxns_out</th>\n", " <th>ounique_in</th>\n", " <th>ounique_out</th>\n", " <th>svol_in</th>\n", " <th>svol_out</th>\n", " <th>stxns_in</th>\n", " <th>stxns_out</th>\n", " <th>sunique_in</th>\n", " <th>sunique_out</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>1</td>\n", " <td>2020-01-25 19:10:50.218686</td>\n", " <td>1</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>ADMIN</td>\n", " <td>System</td>\n", " <td>8.916761e+06</td>\n", " <td>0xBDB3Bc887C3b70586BC25D04d89eC802b897fC5F</td>\n", " <td>0.000000</td>\n", " <td>...</td>\n", " <td>19917</td>\n", " <td>52610</td>\n", " <td>9</td>\n", " <td>19862</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>2</td>\n", " <td>2018-10-23 09:09:58</td>\n", " <td>2</td>\n", " <td>female</td>\n", " <td>GE Office</td>\n", " <td>TOKEN_AGENT</td>\n", 
" <td>Savings Group</td>\n", " <td>1.800000e+05</td>\n", " <td>0x245fc81fe385450Dc0f4787668e47c903C00b0A1</td>\n", " <td>0.000000</td>\n", " <td>...</td>\n", " <td>134</td>\n", " <td>16</td>\n", " <td>68</td>\n", " <td>0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>3</td>\n", " <td>2018-10-21 14:20:57</td>\n", " <td>3</td>\n", " <td>male</td>\n", " <td>GE Nairobi</td>\n", " <td>BENEFICIARY</td>\n", " <td>Farming/Labour</td>\n", " <td>5.666089e+01</td>\n", " <td>0xC1697C1326fD192438515fE2F7E4cCb0C705C5d2</td>\n", " <td>0.000000</td>\n", " <td>...</td>\n", " <td>2</td>\n", " <td>2</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0.0</td>\n", " <td>9007.0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>4</td>\n", " <td>2018-10-21 15:38:30</td>\n", " <td>4</td>\n", " <td>male</td>\n", " <td>GE Nairobi</td>\n", " <td>BENEFICIARY</td>\n", " <td>Farming/Labour</td>\n", " <td>1.173773e+04</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>0.100000</td>\n", " <td>...</td>\n", " <td>6</td>\n", " <td>1</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>20619.0</td>\n", " <td>50449.0</td>\n", " <td>20</td>\n", " <td>15</td>\n", " <td>11</td>\n", " <td>5</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>5</td>\n", " <td>2018-10-23 14:10:27</td>\n", " <td>5</td>\n", " <td>male</td>\n", " <td>G.E</td>\n", " <td>BENEFICIARY</td>\n", " <td>Farming/Labour</td>\n", " <td>7.297263e+03</td>\n", " <td>0xD95954e3fCd2f09A6Be5931D24f731eFa63BF435</td>\n", " <td>0.405063</td>\n", " <td>...</td>\n", " <td>15</td>\n", " <td>1</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>127393.3</td>\n", " <td>168905.0</td>\n", " <td>158</td>\n", " <td>208</td>\n", " <td>84</td>\n", " <td>65</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "<p>5 rows × 22 columns</p>\n", "</div>" ], 
"text/plain": [ " id start label gender location held_roles \\\n", "0 1 2020-01-25 19:10:50.218686 1 NaN None ADMIN \n", "1 2 2018-10-23 09:09:58 2 female GE Office TOKEN_AGENT \n", "2 3 2018-10-21 14:20:57 3 male GE Nairobi BENEFICIARY \n", "3 4 2018-10-21 15:38:30 4 male GE Nairobi BENEFICIARY \n", "4 5 2018-10-23 14:10:27 5 male G.E BENEFICIARY \n", "\n", " business_type bal xDAI_blockchain_address \\\n", "0 System 8.916761e+06 0xBDB3Bc887C3b70586BC25D04d89eC802b897fC5F \n", "1 Savings Group 1.800000e+05 0x245fc81fe385450Dc0f4787668e47c903C00b0A1 \n", "2 Farming/Labour 5.666089e+01 0xC1697C1326fD192438515fE2F7E4cCb0C705C5d2 \n", "3 Farming/Labour 1.173773e+04 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 \n", "4 Farming/Labour 7.297263e+03 0xD95954e3fCd2f09A6Be5931D24f731eFa63BF435 \n", "\n", " confidence ... otxns_in otxns_out ounique_in ounique_out svol_in \\\n", "0 0.000000 ... 19917 52610 9 19862 0.0 \n", "1 0.000000 ... 134 16 68 0 0.0 \n", "2 0.000000 ... 2 2 1 0 0.0 \n", "3 0.100000 ... 6 1 1 0 20619.0 \n", "4 0.405063 ... 
15 1 1 0 127393.3 \n", "\n", " svol_out stxns_in stxns_out sunique_in sunique_out \n", "0 0.0 0 0 0 0 \n", "1 0.0 0 0 0 0 \n", "2 9007.0 0 1 0 1 \n", "3 50449.0 20 15 11 5 \n", "4 168905.0 158 208 84 65 \n", "\n", "[5 rows x 22 columns]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "users.head()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'Farming/Labour': 0.43367860016090104,\n", " 'Food/Water': 0.22863032984714401,\n", " 'Shop': 0.1406878519710378,\n", " 'Fuel/Energy': 0.06365647626709574,\n", " 'None': 0.0621983105390185,\n", " 'Transport': 0.04379525341914722,\n", " 'Education': 0.014380530973451327,\n", " 'Savings Group': 0.006335478680611424,\n", " 'Health': 0.00331858407079646,\n", " 'Environment': 0.001910699919549477,\n", " 'System': 0.0012067578439259854,\n", " 'Staff': 0.00010056315366049879,\n", " 'Chama': 5.0281576830249393e-05,\n", " 'Game': 5.0281576830249393e-05}" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "users['business_type'].value_counts(normalize=True).to_dict()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Combine user and transaction tables\n", "\n", "Join each wallet balance from the user table onto the transaction table twice, once on the source address and once on the target address, so that every transaction carries both `s_bal` and `t_bal`." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "user_subset = users[['bal','xDAI_blockchain_address']]" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>id</th>\n", " <th>timeset</th>\n", " <th>transfer_subtype</th>\n", " <th>source</th>\n", " <th>s_gender</th>\n", " <th>s_location</th>\n", " <th>s_business_type</th>\n", " <th>target</th>\n", " <th>t_gender</th>\n", " <th>t_location</th>\n", " <th>t_business_type</th>\n", " <th>tx_token</th>\n", " <th>weight</th>\n", " <th>type</th>\n", " <th>token_name</th>\n", " <th>token_address</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>72647</th>\n", " <td>170140</td>\n", " <td>2020-04-30 10:43:45.170528</td>\n", " <td>STANDARD</td>\n", " <td>0xC1697C1326fD192438515fE2F7E4cCb0C705C5d2</td>\n", " <td>male</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>male</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>NaN</td>\n", " <td>9007.0</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " </tr>\n", " <tr>\n", " <th>72648</th>\n", " <td>10</td>\n", " <td>2020-01-26 08:26:22.521902</td>\n", " <td>STANDARD</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>male</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72</td>\n", " <td>male</td>\n", " <td>Home</td>\n", " 
<td>Farming/Labour</td>\n", " <td>NaN</td>\n", " <td>100.0</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " </tr>\n", " <tr>\n", " <th>72649</th>\n", " <td>11</td>\n", " <td>2020-01-26 08:27:26.757372</td>\n", " <td>STANDARD</td>\n", " <td>0xD95954e3fCd2f09A6Be5931D24f731eFa63BF435</td>\n", " <td>male</td>\n", " <td>G.E</td>\n", " <td>Farming/Labour</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>male</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>NaN</td>\n", " <td>2.0</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " </tr>\n", " <tr>\n", " <th>72650</th>\n", " <td>13</td>\n", " <td>2020-01-26 08:32:05.154096</td>\n", " <td>STANDARD</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>male</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72</td>\n", " <td>male</td>\n", " <td>Home</td>\n", " <td>Farming/Labour</td>\n", " <td>NaN</td>\n", " <td>23.0</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " </tr>\n", " <tr>\n", " <th>72651</th>\n", " <td>15</td>\n", " <td>2020-01-26 08:38:42.186525</td>\n", " <td>STANDARD</td>\n", " <td>0x4AfD04b9eD17759B362c8C929207Fe7ad81C39d3</td>\n", " <td>male</td>\n", " <td>Test</td>\n", " <td>Health</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>male</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>NaN</td>\n", " <td>12.0</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " </tr>\n", " <tr>\n", " <th>...</th>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " 
<td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " </tr>\n", " <tr>\n", " <th>147810</th>\n", " <td>208035</td>\n", " <td>2020-05-11 08:52:34.504171</td>\n", " <td>STANDARD</td>\n", " <td>0x97F5165b544e0869ba3Be80D7eEe8b73a0270Dfe</td>\n", " <td>Unknown gender</td>\n", " <td>kilibole</td>\n", " <td>Farming/Labour</td>\n", " <td>0x5CAaA1f7dC13235Fe181D0307e682c387e75a6ec</td>\n", " <td>Unknown gender</td>\n", " <td>Kilibole</td>\n", " <td>Food/Water</td>\n", " <td>NaN</td>\n", " <td>20.0</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " </tr>\n", " <tr>\n", " <th>147811</th>\n", " <td>208021</td>\n", " <td>2020-05-11 08:49:20.768559</td>\n", " <td>STANDARD</td>\n", " <td>0x9a05d12df366cE3aa1420c6DFFD0db9ce4ba77Fc</td>\n", " <td>Unknown gender</td>\n", " <td>Kikomani</td>\n", " <td>Food/Water</td>\n", " <td>0xb44279a1d11A2bc4b1b3D08D3BEAb8278cc86985</td>\n", " <td>Unknown gender</td>\n", " <td>Bofu</td>\n", " <td>Shop</td>\n", " <td>NaN</td>\n", " <td>350.0</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " </tr>\n", " <tr>\n", " <th>147812</th>\n", " <td>208459</td>\n", " <td>2020-05-11 10:11:18.699013</td>\n", " <td>STANDARD</td>\n", " <td>0x2e44845BE57687bFdcdd26044bB7CdD575781336</td>\n", " <td>male</td>\n", " <td>Miyani</td>\n", " <td>Shop</td>\n", " <td>0xfCF20a412eB6DD345237C7BEeBab53B424b98297</td>\n", " <td>male</td>\n", " <td>Miyani</td>\n", " <td>Shop</td>\n", " <td>NaN</td>\n", " <td>400.0</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " </tr>\n", " <tr>\n", " <th>147813</th>\n", " <td>208395</td>\n", " <td>2020-05-11 10:01:04.805823</td>\n", " <td>STANDARD</td>\n", " <td>0xAc4DB7728940e76BCd98Bb8E60671916f3B7576A</td>\n", " <td>male</td>\n", " <td>Kilifi</td>\n", " <td>Farming/Labour</td>\n", " 
<td>0x2f99a653F5dc201eA97578A6a203BC4db1eaD2FF</td>\n", " <td>Unknown gender</td>\n", " <td>KIlifi</td>\n", " <td>Education</td>\n", " <td>NaN</td>\n", " <td>20.0</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " </tr>\n", " <tr>\n", " <th>147814</th>\n", " <td>208396</td>\n", " <td>2020-05-11 10:01:10.449068</td>\n", " <td>STANDARD</td>\n", " <td>0x2f99a653F5dc201eA97578A6a203BC4db1eaD2FF</td>\n", " <td>Unknown gender</td>\n", " <td>KIlifi</td>\n", " <td>Education</td>\n", " <td>0xAc4DB7728940e76BCd98Bb8E60671916f3B7576A</td>\n", " <td>male</td>\n", " <td>Kilifi</td>\n", " <td>Farming/Labour</td>\n", " <td>NaN</td>\n", " <td>20.0</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "<p>75167 rows × 16 columns</p>\n", "</div>" ], "text/plain": [ " id timeset transfer_subtype \\\n", "72647 170140 2020-04-30 10:43:45.170528 STANDARD \n", "72648 10 2020-01-26 08:26:22.521902 STANDARD \n", "72649 11 2020-01-26 08:27:26.757372 STANDARD \n", "72650 13 2020-01-26 08:32:05.154096 STANDARD \n", "72651 15 2020-01-26 08:38:42.186525 STANDARD \n", "... ... ... ... \n", "147810 208035 2020-05-11 08:52:34.504171 STANDARD \n", "147811 208021 2020-05-11 08:49:20.768559 STANDARD \n", "147812 208459 2020-05-11 10:11:18.699013 STANDARD \n", "147813 208395 2020-05-11 10:01:04.805823 STANDARD \n", "147814 208396 2020-05-11 10:01:10.449068 STANDARD \n", "\n", " source s_gender \\\n", "72647 0xC1697C1326fD192438515fE2F7E4cCb0C705C5d2 male \n", "72648 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 male \n", "72649 0xD95954e3fCd2f09A6Be5931D24f731eFa63BF435 male \n", "72650 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 male \n", "72651 0x4AfD04b9eD17759B362c8C929207Fe7ad81C39d3 male \n", "... ... ... 
\n", "147810 0x97F5165b544e0869ba3Be80D7eEe8b73a0270Dfe Unknown gender \n", "147811 0x9a05d12df366cE3aa1420c6DFFD0db9ce4ba77Fc Unknown gender \n", "147812 0x2e44845BE57687bFdcdd26044bB7CdD575781336 male \n", "147813 0xAc4DB7728940e76BCd98Bb8E60671916f3B7576A male \n", "147814 0x2f99a653F5dc201eA97578A6a203BC4db1eaD2FF Unknown gender \n", "\n", " s_location s_business_type \\\n", "72647 GE Nairobi Farming/Labour \n", "72648 GE Nairobi Farming/Labour \n", "72649 G.E Farming/Labour \n", "72650 GE Nairobi Farming/Labour \n", "72651 Test Health \n", "... ... ... \n", "147810 kilibole Farming/Labour \n", "147811 Kikomani Food/Water \n", "147812 Miyani Shop \n", "147813 Kilifi Farming/Labour \n", "147814 KIlifi Education \n", "\n", " target t_gender \\\n", "72647 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 male \n", "72648 0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72 male \n", "72649 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 male \n", "72650 0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72 male \n", "72651 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 male \n", "... ... ... \n", "147810 0x5CAaA1f7dC13235Fe181D0307e682c387e75a6ec Unknown gender \n", "147811 0xb44279a1d11A2bc4b1b3D08D3BEAb8278cc86985 Unknown gender \n", "147812 0xfCF20a412eB6DD345237C7BEeBab53B424b98297 male \n", "147813 0x2f99a653F5dc201eA97578A6a203BC4db1eaD2FF Unknown gender \n", "147814 0xAc4DB7728940e76BCd98Bb8E60671916f3B7576A male \n", "\n", " t_location t_business_type tx_token weight type token_name \\\n", "72647 GE Nairobi Farming/Labour NaN 9007.0 directed Sarafu \n", "72648 Home Farming/Labour NaN 100.0 directed Sarafu \n", "72649 GE Nairobi Farming/Labour NaN 2.0 directed Sarafu \n", "72650 Home Farming/Labour NaN 23.0 directed Sarafu \n", "72651 GE Nairobi Farming/Labour NaN 12.0 directed Sarafu \n", "... ... ... ... ... ... ... 
\n", "147810 Kilibole Food/Water NaN 20.0 directed Sarafu \n", "147811 Bofu Shop NaN 350.0 directed Sarafu \n", "147812 Miyani Shop NaN 400.0 directed Sarafu \n", "147813 KIlifi Education NaN 20.0 directed Sarafu \n", "147814 Kilifi Farming/Labour NaN 20.0 directed Sarafu \n", "\n", " token_address \n", "72647 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 \n", "72648 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 \n", "72649 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 \n", "72650 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 \n", "72651 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 \n", "... ... \n", "147810 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 \n", "147811 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 \n", "147812 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 \n", "147813 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 \n", "147814 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 \n", "\n", "[75167 rows x 16 columns]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "transactions_subset" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "transactions_subset_v1 = transactions_subset.merge(user_subset, how='left', left_on='source', right_on='xDAI_blockchain_address')\n", "transactions_subset_v1['s_bal'] = transactions_subset_v1['bal']\n", "del transactions_subset_v1['bal']\n", "transactions_subset_v1['s_xDAI_blockchain_address'] = transactions_subset_v1['xDAI_blockchain_address']\n", "del transactions_subset_v1['xDAI_blockchain_address']" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "transactions_subset_v2 = transactions_subset_v1.merge(user_subset, how='left', left_on='target', right_on='xDAI_blockchain_address')" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "transactions_subset_v2 = transactions_subset_v1.merge(user_subset, how='left', left_on='target', right_on='xDAI_blockchain_address')\n", 
"transactions_subset_v2['t_bal'] = transactions_subset_v2['bal']\n", "del transactions_subset_v2['bal']\n", "transactions_subset_v2['t_xDAI_blockchain_address'] = transactions_subset_v2['xDAI_blockchain_address']\n", "del transactions_subset_v2['xDAI_blockchain_address']" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>id</th>\n", " <th>timeset</th>\n", " <th>transfer_subtype</th>\n", " <th>source</th>\n", " <th>s_gender</th>\n", " <th>s_location</th>\n", " <th>s_business_type</th>\n", " <th>target</th>\n", " <th>t_gender</th>\n", " <th>t_location</th>\n", " <th>t_business_type</th>\n", " <th>tx_token</th>\n", " <th>weight</th>\n", " <th>type</th>\n", " <th>token_name</th>\n", " <th>token_address</th>\n", " <th>s_bal</th>\n", " <th>s_xDAI_blockchain_address</th>\n", " <th>t_bal</th>\n", " <th>t_xDAI_blockchain_address</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>170140</td>\n", " <td>2020-04-30 10:43:45.170528</td>\n", " <td>STANDARD</td>\n", " <td>0xC1697C1326fD192438515fE2F7E4cCb0C705C5d2</td>\n", " <td>male</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>male</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>NaN</td>\n", " <td>9007.0</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " <td>56.660892</td>\n", " <td>0xC1697C1326fD192438515fE2F7E4cCb0C705C5d2</td>\n", " <td>11737.726002</td>\n", " 
<td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>10</td>\n", " <td>2020-01-26 08:26:22.521902</td>\n", " <td>STANDARD</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>male</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72</td>\n", " <td>male</td>\n", " <td>Home</td>\n", " <td>Farming/Labour</td>\n", " <td>NaN</td>\n", " <td>100.0</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " <td>11737.726002</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>902.500000</td>\n", " <td>0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>11</td>\n", " <td>2020-01-26 08:27:26.757372</td>\n", " <td>STANDARD</td>\n", " <td>0xD95954e3fCd2f09A6Be5931D24f731eFa63BF435</td>\n", " <td>male</td>\n", " <td>G.E</td>\n", " <td>Farming/Labour</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>male</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>NaN</td>\n", " <td>2.0</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " <td>7297.262576</td>\n", " <td>0xD95954e3fCd2f09A6Be5931D24f731eFa63BF435</td>\n", " <td>11737.726002</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>13</td>\n", " <td>2020-01-26 08:32:05.154096</td>\n", " <td>STANDARD</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>male</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72</td>\n", " <td>male</td>\n", " <td>Home</td>\n", " <td>Farming/Labour</td>\n", " <td>NaN</td>\n", " <td>23.0</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " 
<td>11737.726002</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>902.500000</td>\n", " <td>0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>15</td>\n", " <td>2020-01-26 08:38:42.186525</td>\n", " <td>STANDARD</td>\n", " <td>0x4AfD04b9eD17759B362c8C929207Fe7ad81C39d3</td>\n", " <td>male</td>\n", " <td>Test</td>\n", " <td>Health</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>male</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>NaN</td>\n", " <td>12.0</td>\n", " <td>directed</td>\n", " <td>Sarafu</td>\n", " <td>0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4</td>\n", " <td>448.000000</td>\n", " <td>0x4AfD04b9eD17759B362c8C929207Fe7ad81C39d3</td>\n", " <td>11737.726002</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " id timeset transfer_subtype \\\n", "0 170140 2020-04-30 10:43:45.170528 STANDARD \n", "1 10 2020-01-26 08:26:22.521902 STANDARD \n", "2 11 2020-01-26 08:27:26.757372 STANDARD \n", "3 13 2020-01-26 08:32:05.154096 STANDARD \n", "4 15 2020-01-26 08:38:42.186525 STANDARD \n", "\n", " source s_gender s_location \\\n", "0 0xC1697C1326fD192438515fE2F7E4cCb0C705C5d2 male GE Nairobi \n", "1 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 male GE Nairobi \n", "2 0xD95954e3fCd2f09A6Be5931D24f731eFa63BF435 male G.E \n", "3 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 male GE Nairobi \n", "4 0x4AfD04b9eD17759B362c8C929207Fe7ad81C39d3 male Test \n", "\n", " s_business_type target t_gender \\\n", "0 Farming/Labour 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 male \n", "1 Farming/Labour 0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72 male \n", "2 Farming/Labour 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 male \n", "3 Farming/Labour 0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72 male \n", "4 Health 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 male \n", "\n", " t_location t_business_type 
tx_token weight type token_name \\\n", "0 GE Nairobi Farming/Labour NaN 9007.0 directed Sarafu \n", "1 Home Farming/Labour NaN 100.0 directed Sarafu \n", "2 GE Nairobi Farming/Labour NaN 2.0 directed Sarafu \n", "3 Home Farming/Labour NaN 23.0 directed Sarafu \n", "4 GE Nairobi Farming/Labour NaN 12.0 directed Sarafu \n", "\n", " token_address s_bal \\\n", "0 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 56.660892 \n", "1 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 11737.726002 \n", "2 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 7297.262576 \n", "3 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 11737.726002 \n", "4 0x0Fd6e8F2320C90e9D4b3A5bd888c4D556d20AbD4 448.000000 \n", "\n", " s_xDAI_blockchain_address t_bal \\\n", "0 0xC1697C1326fD192438515fE2F7E4cCb0C705C5d2 11737.726002 \n", "1 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 902.500000 \n", "2 0xD95954e3fCd2f09A6Be5931D24f731eFa63BF435 11737.726002 \n", "3 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 902.500000 \n", "4 0x4AfD04b9eD17759B362c8C929207Fe7ad81C39d3 11737.726002 \n", "\n", " t_xDAI_blockchain_address \n", "0 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 \n", "1 0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72 \n", "2 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 \n", "3 0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72 \n", "4 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "transactions_subset_v2.head()" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "# subset the data into the needed columns for clustering\n", "combined = transactions_subset_v2[['source','s_location','s_business_type','target','t_location',\n", " 't_business_type','weight','s_bal','t_bal']]" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe 
tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>source</th>\n", " <th>s_location</th>\n", " <th>s_business_type</th>\n", " <th>target</th>\n", " <th>t_location</th>\n", " <th>t_business_type</th>\n", " <th>weight</th>\n", " <th>s_bal</th>\n", " <th>t_bal</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>0xC1697C1326fD192438515fE2F7E4cCb0C705C5d2</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>9007.0</td>\n", " <td>56.660892</td>\n", " <td>11737.726002</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72</td>\n", " <td>Home</td>\n", " <td>Farming/Labour</td>\n", " <td>100.0</td>\n", " <td>11737.726002</td>\n", " <td>902.500000</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>0xD95954e3fCd2f09A6Be5931D24f731eFa63BF435</td>\n", " <td>G.E</td>\n", " <td>Farming/Labour</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>2.0</td>\n", " <td>7297.262576</td>\n", " <td>11737.726002</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72</td>\n", " <td>Home</td>\n", " <td>Farming/Labour</td>\n", " <td>23.0</td>\n", " <td>11737.726002</td>\n", " <td>902.500000</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>0x4AfD04b9eD17759B362c8C929207Fe7ad81C39d3</td>\n", " <td>Test</td>\n", " <td>Health</td>\n", " 
<td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>12.0</td>\n", " <td>448.000000</td>\n", " <td>11737.726002</td>\n", " </tr>\n", " <tr>\n", " <th>...</th>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " <td>...</td>\n", " </tr>\n", " <tr>\n", " <th>75162</th>\n", " <td>0x97F5165b544e0869ba3Be80D7eEe8b73a0270Dfe</td>\n", " <td>kilibole</td>\n", " <td>Farming/Labour</td>\n", " <td>0x5CAaA1f7dC13235Fe181D0307e682c387e75a6ec</td>\n", " <td>Kilibole</td>\n", " <td>Food/Water</td>\n", " <td>20.0</td>\n", " <td>0.000000</td>\n", " <td>5.000000</td>\n", " </tr>\n", " <tr>\n", " <th>75163</th>\n", " <td>0x9a05d12df366cE3aa1420c6DFFD0db9ce4ba77Fc</td>\n", " <td>Kikomani</td>\n", " <td>Food/Water</td>\n", " <td>0xb44279a1d11A2bc4b1b3D08D3BEAb8278cc86985</td>\n", " <td>Bofu</td>\n", " <td>Shop</td>\n", " <td>350.0</td>\n", " <td>0.000000</td>\n", " <td>800.000000</td>\n", " </tr>\n", " <tr>\n", " <th>75164</th>\n", " <td>0x2e44845BE57687bFdcdd26044bB7CdD575781336</td>\n", " <td>Miyani</td>\n", " <td>Shop</td>\n", " <td>0xfCF20a412eB6DD345237C7BEeBab53B424b98297</td>\n", " <td>Miyani</td>\n", " <td>Shop</td>\n", " <td>400.0</td>\n", " <td>0.000000</td>\n", " <td>800.000000</td>\n", " </tr>\n", " <tr>\n", " <th>75165</th>\n", " <td>0xAc4DB7728940e76BCd98Bb8E60671916f3B7576A</td>\n", " <td>Kilifi</td>\n", " <td>Farming/Labour</td>\n", " <td>0x2f99a653F5dc201eA97578A6a203BC4db1eaD2FF</td>\n", " <td>KIlifi</td>\n", " <td>Education</td>\n", " <td>20.0</td>\n", " <td>400.000000</td>\n", " <td>500.000000</td>\n", " </tr>\n", " <tr>\n", " <th>75166</th>\n", " <td>0x2f99a653F5dc201eA97578A6a203BC4db1eaD2FF</td>\n", " <td>KIlifi</td>\n", " <td>Education</td>\n", " <td>0xAc4DB7728940e76BCd98Bb8E60671916f3B7576A</td>\n", " <td>Kilifi</td>\n", " <td>Farming/Labour</td>\n", " <td>20.0</td>\n", " 
<td>500.000000</td>\n", " <td>400.000000</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "<p>75167 rows × 9 columns</p>\n", "</div>" ], "text/plain": [ " source s_location s_business_type \\\n", "0 0xC1697C1326fD192438515fE2F7E4cCb0C705C5d2 GE Nairobi Farming/Labour \n", "1 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 GE Nairobi Farming/Labour \n", "2 0xD95954e3fCd2f09A6Be5931D24f731eFa63BF435 G.E Farming/Labour \n", "3 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 GE Nairobi Farming/Labour \n", "4 0x4AfD04b9eD17759B362c8C929207Fe7ad81C39d3 Test Health \n", "... ... ... ... \n", "75162 0x97F5165b544e0869ba3Be80D7eEe8b73a0270Dfe kilibole Farming/Labour \n", "75163 0x9a05d12df366cE3aa1420c6DFFD0db9ce4ba77Fc Kikomani Food/Water \n", "75164 0x2e44845BE57687bFdcdd26044bB7CdD575781336 Miyani Shop \n", "75165 0xAc4DB7728940e76BCd98Bb8E60671916f3B7576A Kilifi Farming/Labour \n", "75166 0x2f99a653F5dc201eA97578A6a203BC4db1eaD2FF KIlifi Education \n", "\n", " target t_location t_business_type \\\n", "0 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 GE Nairobi Farming/Labour \n", "1 0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72 Home Farming/Labour \n", "2 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 GE Nairobi Farming/Labour \n", "3 0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72 Home Farming/Labour \n", "4 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 GE Nairobi Farming/Labour \n", "... ... ... ... \n", "75162 0x5CAaA1f7dC13235Fe181D0307e682c387e75a6ec Kilibole Food/Water \n", "75163 0xb44279a1d11A2bc4b1b3D08D3BEAb8278cc86985 Bofu Shop \n", "75164 0xfCF20a412eB6DD345237C7BEeBab53B424b98297 Miyani Shop \n", "75165 0x2f99a653F5dc201eA97578A6a203BC4db1eaD2FF KIlifi Education \n", "75166 0xAc4DB7728940e76BCd98Bb8E60671916f3B7576A Kilifi Farming/Labour \n", "\n", " weight s_bal t_bal \n", "0 9007.0 56.660892 11737.726002 \n", "1 100.0 11737.726002 902.500000 \n", "2 2.0 7297.262576 11737.726002 \n", "3 23.0 11737.726002 902.500000 \n", "4 12.0 448.000000 11737.726002 \n", "... ... ... ... 
\n", "75162 20.0 0.000000 5.000000 \n", "75163 350.0 0.000000 800.000000 \n", "75164 400.0 0.000000 800.000000 \n", "75165 20.0 400.000000 500.000000 \n", "75166 20.0 500.000000 400.000000 \n", "\n", "[75167 rows x 9 columns]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "combined" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "source = combined.source.values\n", "target = combined.target.values\n", "# remove the source and target variables for clustering\n", "del combined['source']\n", "del combined['target']" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "# create dummy variables of the categorical variables \n", "updated = pd.get_dummies(combined)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# define how many clusters to test\n", "clustersToTest = [10,20,25,30,40,50]\n", "# calculate the optimal number of clusters using the Gap Statistic -https://statweb.stanford.edu/~gwalther/gap\n", "optimalK = OptimalK(parallel_backend='joblib')\n", "n_clusters = optimalK(X=updated, cluster_array=clustersToTest)" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>n_clusters</th>\n", " <th>gap_value</th>\n", " <th>gap*</th>\n", " <th>ref_dispersion_std</th>\n", " <th>sk</th>\n", " <th>sk*</th>\n", " <th>diff</th>\n", " <th>diff*</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>10.0</td>\n", " <td>3.363615</td>\n", 
" <td>1.414036e+14</td>\n", " <td>2.510135e+12</td>\n", " <td>0.019816</td>\n", " <td>1.633046e+14</td>\n", " <td>0.200252</td>\n", " <td>1.555745e+14</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>20.0</td>\n", " <td>3.177815</td>\n", " <td>9.154344e+13</td>\n", " <td>1.200491e+12</td>\n", " <td>0.014451</td>\n", " <td>1.057143e+14</td>\n", " <td>0.187308</td>\n", " <td>1.033105e+14</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>25.0</td>\n", " <td>3.001553</td>\n", " <td>7.603468e+13</td>\n", " <td>7.675192e+11</td>\n", " <td>0.011046</td>\n", " <td>8.780176e+13</td>\n", " <td>0.106216</td>\n", " <td>8.643467e+13</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>30.0</td>\n", " <td>2.904411</td>\n", " <td>6.720920e+13</td>\n", " <td>5.602132e+11</td>\n", " <td>0.009074</td>\n", " <td>7.760920e+13</td>\n", " <td>0.230329</td>\n", " <td>7.539618e+13</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>40.0</td>\n", " <td>2.686358</td>\n", " <td>5.289592e+13</td>\n", " <td>6.017818e+11</td>\n", " <td>0.012277</td>\n", " <td>6.108290e+13</td>\n", " <td>-1.248115</td>\n", " <td>6.018132e+13</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " n_clusters gap_value gap* ref_dispersion_std sk \\\n", "0 10.0 3.363615 1.414036e+14 2.510135e+12 0.019816 \n", "1 20.0 3.177815 9.154344e+13 1.200491e+12 0.014451 \n", "2 25.0 3.001553 7.603468e+13 7.675192e+11 0.011046 \n", "3 30.0 2.904411 6.720920e+13 5.602132e+11 0.009074 \n", "4 40.0 2.686358 5.289592e+13 6.017818e+11 0.012277 \n", "\n", " sk* diff diff* \n", "0 1.633046e+14 0.200252 1.555745e+14 \n", "1 1.057143e+14 0.187308 1.033105e+14 \n", "2 8.780176e+13 0.106216 8.643467e+13 \n", "3 7.760920e+13 0.230329 7.539618e+13 \n", "4 6.108290e+13 -1.248115 6.018132e+13 " ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "optimalK.gap_df.head()" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "data": { 
"image/png": "\n", "text/plain": [ "<Figure size 432x288 with 1 Axes>" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(optimalK.gap_df.n_clusters, optimalK.gap_df.gap_value, linewidth=3)\n", "plt.scatter(optimalK.gap_df[optimalK.gap_df.n_clusters == n_clusters].n_clusters,\n", " optimalK.gap_df[optimalK.gap_df.n_clusters == n_clusters].gap_value, s=250, c='r')\n", "plt.grid(True)\n", "plt.text(20, 4, 'Clusters: {}'.format(str(clustersToTest)), horizontalalignment='center',verticalalignment='center')\n", "plt.xlabel('Cluster Count')\n", "plt.ylabel('Gap Value')\n", "plt.title('Gap Values by Cluster Count')\n", "plt.savefig('gap_statistic.png')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute clusters based on the following features:\n", "* s_location\n", "* s_business_type\n", "* t_location\n", "* t_business_type\n", "* weight, the token amount exchanged\n", "* s_bal\n", "* t_bal\n", "\n", "\n", "\"The KMeans algorithm clusters data by trying to separate samples in n groups of equal variance, minimizing a criterion known as the inertia or within-cluster sum-of-squares (see below). This algorithm requires the number of clusters to be specified. It scales well to large number of samples and has been used across a large range of application areas in many different fields.\n", "\n", "The k-means algorithm divides a set of $N$ samples $X$ into $K$ disjoint clusters $C$, each described by the mean $\\mu_j$ of the samples in the cluster.
The means are commonly called the cluster “centroids”; note that they are not, in general, points from $X$, although they live in the same space.\" - https://scikit-learn.org/stable/modules/clustering.html#k-means" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/aclarkdata/anaconda3/lib/python3.7/site-packages/sklearn/cluster/_kmeans.py:974: FutureWarning: 'n_jobs' was deprecated in version 0.23 and will be removed in 0.25.\n", " \" removed in 0.25.\", FutureWarning)\n" ] } ], "source": [ "kmeans = KMeans(n_clusters=50, random_state=1,n_jobs=-1).fit(updated.values)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/aclarkdata/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:2: SettingWithCopyWarning: \n", "A value is trying to be set on a copy of a slice from a DataFrame.\n", "Try using .loc[row_indexer,col_indexer] = value instead\n", "\n", "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", " \n" ] } ], "source": [ "# add the clusters back to the combined dataframe\n", "combined['cluster'] = kmeans.labels_" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/aclarkdata/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:2: SettingWithCopyWarning: \n", "A value is trying to be set on a copy of a slice from a DataFrame.\n", "Try using .loc[row_indexer,col_indexer] = value instead\n", "\n", "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", " \n", "/home/aclarkdata/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:3: SettingWithCopyWarning: \n", "A value is trying to be set on a copy of a
slice from a DataFrame.\n", "Try using .loc[row_indexer,col_indexer] = value instead\n", "\n", "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", " This is separate from the ipykernel package so we can avoid doing imports until\n" ] } ], "source": [ "# add back the source and target variables\n", "combined['source'] = source\n", "combined['target'] = target" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>s_location</th>\n", " <th>s_business_type</th>\n", " <th>t_location</th>\n", " <th>t_business_type</th>\n", " <th>weight</th>\n", " <th>s_bal</th>\n", " <th>t_bal</th>\n", " <th>cluster</th>\n", " <th>source</th>\n", " <th>target</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>9007.0</td>\n", " <td>56.660892</td>\n", " <td>11737.726002</td>\n", " <td>13</td>\n", " <td>0xC1697C1326fD192438515fE2F7E4cCb0C705C5d2</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>Home</td>\n", " <td>Farming/Labour</td>\n", " <td>100.0</td>\n", " <td>11737.726002</td>\n", " <td>902.500000</td>\n", " <td>12</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>G.E</td>\n", " 
<td>Farming/Labour</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>2.0</td>\n", " <td>7297.262576</td>\n", " <td>11737.726002</td>\n", " <td>48</td>\n", " <td>0xD95954e3fCd2f09A6Be5931D24f731eFa63BF435</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>Home</td>\n", " <td>Farming/Labour</td>\n", " <td>23.0</td>\n", " <td>11737.726002</td>\n", " <td>902.500000</td>\n", " <td>12</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " <td>0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>Test</td>\n", " <td>Health</td>\n", " <td>GE Nairobi</td>\n", " <td>Farming/Labour</td>\n", " <td>12.0</td>\n", " <td>448.000000</td>\n", " <td>11737.726002</td>\n", " <td>13</td>\n", " <td>0x4AfD04b9eD17759B362c8C929207Fe7ad81C39d3</td>\n", " <td>0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " s_location s_business_type t_location t_business_type weight \\\n", "0 GE Nairobi Farming/Labour GE Nairobi Farming/Labour 9007.0 \n", "1 GE Nairobi Farming/Labour Home Farming/Labour 100.0 \n", "2 G.E Farming/Labour GE Nairobi Farming/Labour 2.0 \n", "3 GE Nairobi Farming/Labour Home Farming/Labour 23.0 \n", "4 Test Health GE Nairobi Farming/Labour 12.0 \n", "\n", " s_bal t_bal cluster \\\n", "0 56.660892 11737.726002 13 \n", "1 11737.726002 902.500000 12 \n", "2 7297.262576 11737.726002 48 \n", "3 11737.726002 902.500000 12 \n", "4 448.000000 11737.726002 13 \n", "\n", " source \\\n", "0 0xC1697C1326fD192438515fE2F7E4cCb0C705C5d2 \n", "1 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 \n", "2 0xD95954e3fCd2f09A6Be5931D24f731eFa63BF435 \n", "3 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 \n", "4 0x4AfD04b9eD17759B362c8C929207Fe7ad81C39d3 \n", "\n", " target \n", "0 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 \n", "1 
0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72 \n", "2 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 \n", "3 0x4AB73CfaC1732a9DcD74BdB4C9605f21832D7C72 \n", "4 0xBAB77A20a757e8438DfaBF01D5F36DD12d862B31 " ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "combined.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Calculate and plot two PCA components of the data" ] }, { "cell_type": "code", "execution_count": 165, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "<Figure size 432x288 with 1 Axes>" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Project the encoded features onto the first two principal components\n", "pca = PCA(n_components=2)\n", "principalComponents = pca.fit_transform(updated.values)\n", "df = pd.DataFrame(principalComponents)\n", "\n", "df['label'] = kmeans.labels_\n", "colors = plt.cm.Spectral(np.linspace(0, 1, len(df.label.unique())))\n", "\n", "for color, label in zip(colors, df.label.unique()):\n", "    tempdf = df[df.label == label]\n", "    # pass the RGBA row via color= so matplotlib reads it as a single color\n", "    plt.scatter(tempdf[0], tempdf[1], color=color)\n", "\n", "# the centroids live in the original feature space, so project them with the same PCA\n", "centers = pca.transform(kmeans.cluster_centers_)\n", "plt.scatter(centers[:, 0], centers[:, 1], c='r', s=500, alpha=0.5)\n", "plt.grid(True)\n", "plt.text(200000, 260000, 'Cluster centers are the red dots', horizontalalignment='center',verticalalignment='center')\n", "plt.savefig('pca.png')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Descriptive statistics \n", "\n", "Calculate the relevant statistics (median, quartiles, mean, and standard deviation) for building the probability distributions used in the subpopulation model."
] }, { "cell_type": "code", "execution_count": 17, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>weight</th>\n", " <th>s_bal</th>\n", " <th>t_bal</th>\n", " </tr>\n", " <tr>\n", " <th>cluster</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>217.737536</td>\n", " <td>332.674357</td>\n", " <td>3122.710726</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>588.111940</td>\n", " <td>793.864819</td>\n", " <td>251651.998315</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>957.820312</td>\n", " <td>7089.179214</td>\n", " <td>21755.181601</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>349.925309</td>\n", " <td>516.042937</td>\n", " <td>64166.491418</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>455.317844</td>\n", " <td>64995.360511</td>\n", " <td>751.148124</td>\n", " </tr>\n", " <tr>\n", " <th>5</th>\n", " <td>2443.890625</td>\n", " <td>251651.998315</td>\n", " <td>1746.597767</td>\n", " </tr>\n", " <tr>\n", " <th>6</th>\n", " <td>586.917409</td>\n", " <td>1022.866799</td>\n", " <td>38404.821609</td>\n", " </tr>\n", " <tr>\n", " <th>7</th>\n", " <td>1533.448040</td>\n", " <td>23214.200804</td>\n", " <td>1469.670862</td>\n", " </tr>\n", " <tr>\n", " <th>8</th>\n", " <td>408.100000</td>\n", " <td>895.817336</td>\n", " <td>100579.182676</td>\n", " </tr>\n", " <tr>\n", " <th>9</th>\n", " <td>1468.701299</td>\n", " <td>64148.612782</td>\n", " <td>22314.167052</td>\n", " </tr>\n", " <tr>\n", " <th>10</th>\n", " <td>271.261258</td>\n", " <td>573.611940</td>\n", " 
<td>1152.574038</td>\n", " </tr>\n", " <tr>\n", " <th>11</th>\n", " <td>365.099390</td>\n", " <td>406.400941</td>\n", " <td>14410.774177</td>\n", " </tr>\n", " <tr>\n", " <th>12</th>\n", " <td>790.907135</td>\n", " <td>11235.023756</td>\n", " <td>956.054517</td>\n", " </tr>\n", " <tr>\n", " <th>13</th>\n", " <td>433.822322</td>\n", " <td>537.032863</td>\n", " <td>9837.111302</td>\n", " </tr>\n", " <tr>\n", " <th>14</th>\n", " <td>8074.000000</td>\n", " <td>251651.998315</td>\n", " <td>121082.429001</td>\n", " </tr>\n", " <tr>\n", " <th>15</th>\n", " <td>1187.972973</td>\n", " <td>9402.471887</td>\n", " <td>55225.482860</td>\n", " </tr>\n", " <tr>\n", " <th>16</th>\n", " <td>15562.500000</td>\n", " <td>121082.429001</td>\n", " <td>50899.750788</td>\n", " </tr>\n", " <tr>\n", " <th>17</th>\n", " <td>389.501989</td>\n", " <td>401.109603</td>\n", " <td>25436.848041</td>\n", " </tr>\n", " <tr>\n", " <th>18</th>\n", " <td>4298.095238</td>\n", " <td>34136.137502</td>\n", " <td>62137.100395</td>\n", " </tr>\n", " <tr>\n", " <th>19</th>\n", " <td>80000.000000</td>\n", " <td>63145.960000</td>\n", " <td>121082.429001</td>\n", " </tr>\n", " <tr>\n", " <th>20</th>\n", " <td>8978.181818</td>\n", " <td>44220.000308</td>\n", " <td>121082.429001</td>\n", " </tr>\n", " <tr>\n", " <th>21</th>\n", " <td>1039.047619</td>\n", " <td>100579.182676</td>\n", " <td>1313.687026</td>\n", " </tr>\n", " <tr>\n", " <th>22</th>\n", " <td>302.688709</td>\n", " <td>2335.376506</td>\n", " <td>576.133882</td>\n", " </tr>\n", " <tr>\n", " <th>23</th>\n", " <td>374.248656</td>\n", " <td>529.628969</td>\n", " <td>31342.309668</td>\n", " </tr>\n", " <tr>\n", " <th>24</th>\n", " <td>1631.845550</td>\n", " <td>38576.683107</td>\n", " <td>1335.521733</td>\n", " </tr>\n", " <tr>\n", " <th>25</th>\n", " <td>1962.083333</td>\n", " <td>63573.309892</td>\n", " <td>43986.716465</td>\n", " </tr>\n", " <tr>\n", " <th>26</th>\n", " <td>285.848711</td>\n", " <td>848.769339</td>\n", " <td>18260.218613</td>\n", " 
</tr>\n", " <tr>\n", " <th>27</th>\n", " <td>37712.500000</td>\n", " <td>15293.127761</td>\n", " <td>6721.732034</td>\n", " </tr>\n", " <tr>\n", " <th>28</th>\n", " <td>1941.818182</td>\n", " <td>45110.659482</td>\n", " <td>251651.998315</td>\n", " </tr>\n", " <tr>\n", " <th>29</th>\n", " <td>468.716263</td>\n", " <td>17535.443217</td>\n", " <td>791.779968</td>\n", " </tr>\n", " <tr>\n", " <th>30</th>\n", " <td>3240.503052</td>\n", " <td>862.117155</td>\n", " <td>993.258934</td>\n", " </tr>\n", " <tr>\n", " <th>31</th>\n", " <td>1236.973333</td>\n", " <td>15340.314634</td>\n", " <td>39194.641356</td>\n", " </tr>\n", " <tr>\n", " <th>32</th>\n", " <td>1408.134556</td>\n", " <td>15404.220436</td>\n", " <td>68353.725751</td>\n", " </tr>\n", " <tr>\n", " <th>33</th>\n", " <td>317.344987</td>\n", " <td>336.022102</td>\n", " <td>6084.374342</td>\n", " </tr>\n", " <tr>\n", " <th>34</th>\n", " <td>404.778700</td>\n", " <td>594.851729</td>\n", " <td>44619.093968</td>\n", " </tr>\n", " <tr>\n", " <th>35</th>\n", " <td>482.002628</td>\n", " <td>4479.433066</td>\n", " <td>4557.371287</td>\n", " </tr>\n", " <tr>\n", " <th>36</th>\n", " <td>3593.148148</td>\n", " <td>3569.248851</td>\n", " <td>121082.429001</td>\n", " </tr>\n", " <tr>\n", " <th>37</th>\n", " <td>9354.545455</td>\n", " <td>121082.429001</td>\n", " <td>2014.247057</td>\n", " </tr>\n", " <tr>\n", " <th>38</th>\n", " <td>14338.571429</td>\n", " <td>251651.998315</td>\n", " <td>60154.599923</td>\n", " </tr>\n", " <tr>\n", " <th>39</th>\n", " <td>503.700787</td>\n", " <td>928.528786</td>\n", " <td>71742.725832</td>\n", " </tr>\n", " <tr>\n", " <th>40</th>\n", " <td>175.678483</td>\n", " <td>294.939972</td>\n", " <td>271.114703</td>\n", " </tr>\n", " <tr>\n", " <th>41</th>\n", " <td>1302.072607</td>\n", " <td>12999.924780</td>\n", " <td>7072.339809</td>\n", " </tr>\n", " <tr>\n", " <th>42</th>\n", " <td>387.122709</td>\n", " <td>697.305678</td>\n", " <td>21694.464157</td>\n", " </tr>\n", " <tr>\n", " <th>43</th>\n", " 
<td>437.794249</td>\n", " <td>568.978270</td>\n", " <td>55748.221328</td>\n", " </tr>\n", " <tr>\n", " <th>44</th>\n", " <td>11678.857143</td>\n", " <td>4067.108100</td>\n", " <td>4226.813826</td>\n", " </tr>\n", " <tr>\n", " <th>45</th>\n", " <td>548.573207</td>\n", " <td>6283.897324</td>\n", " <td>679.334775</td>\n", " </tr>\n", " <tr>\n", " <th>46</th>\n", " <td>13100.037736</td>\n", " <td>4238.100340</td>\n", " <td>28746.781097</td>\n", " </tr>\n", " <tr>\n", " <th>47</th>\n", " <td>32377.777778</td>\n", " <td>59578.441515</td>\n", " <td>14515.326330</td>\n", " </tr>\n", " <tr>\n", " <th>48</th>\n", " <td>772.342169</td>\n", " <td>4772.491359</td>\n", " <td>12047.885243</td>\n", " </tr>\n", " <tr>\n", " <th>49</th>\n", " <td>3661.085106</td>\n", " <td>30063.054365</td>\n", " <td>19776.065227</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " weight s_bal t_bal\n", "cluster \n", "0 217.737536 332.674357 3122.710726\n", "1 588.111940 793.864819 251651.998315\n", "2 957.820312 7089.179214 21755.181601\n", "3 349.925309 516.042937 64166.491418\n", "4 455.317844 64995.360511 751.148124\n", "5 2443.890625 251651.998315 1746.597767\n", "6 586.917409 1022.866799 38404.821609\n", "7 1533.448040 23214.200804 1469.670862\n", "8 408.100000 895.817336 100579.182676\n", "9 1468.701299 64148.612782 22314.167052\n", "10 271.261258 573.611940 1152.574038\n", "11 365.099390 406.400941 14410.774177\n", "12 790.907135 11235.023756 956.054517\n", "13 433.822322 537.032863 9837.111302\n", "14 8074.000000 251651.998315 121082.429001\n", "15 1187.972973 9402.471887 55225.482860\n", "16 15562.500000 121082.429001 50899.750788\n", "17 389.501989 401.109603 25436.848041\n", "18 4298.095238 34136.137502 62137.100395\n", "19 80000.000000 63145.960000 121082.429001\n", "20 8978.181818 44220.000308 121082.429001\n", "21 1039.047619 100579.182676 1313.687026\n", "22 302.688709 2335.376506 576.133882\n", "23 374.248656 529.628969 31342.309668\n", "24 1631.845550 
38576.683107 1335.521733\n", "25 1962.083333 63573.309892 43986.716465\n", "26 285.848711 848.769339 18260.218613\n", "27 37712.500000 15293.127761 6721.732034\n", "28 1941.818182 45110.659482 251651.998315\n", "29 468.716263 17535.443217 791.779968\n", "30 3240.503052 862.117155 993.258934\n", "31 1236.973333 15340.314634 39194.641356\n", "32 1408.134556 15404.220436 68353.725751\n", "33 317.344987 336.022102 6084.374342\n", "34 404.778700 594.851729 44619.093968\n", "35 482.002628 4479.433066 4557.371287\n", "36 3593.148148 3569.248851 121082.429001\n", "37 9354.545455 121082.429001 2014.247057\n", "38 14338.571429 251651.998315 60154.599923\n", "39 503.700787 928.528786 71742.725832\n", "40 175.678483 294.939972 271.114703\n", "41 1302.072607 12999.924780 7072.339809\n", "42 387.122709 697.305678 21694.464157\n", "43 437.794249 568.978270 55748.221328\n", "44 11678.857143 4067.108100 4226.813826\n", "45 548.573207 6283.897324 679.334775\n", "46 13100.037736 4238.100340 28746.781097\n", "47 32377.777778 59578.441515 14515.326330\n", "48 772.342169 4772.491359 12047.885243\n", "49 3661.085106 30063.054365 19776.065227" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "combined.groupby('cluster').mean()" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "# compute median, Q1,Q3, mean, and sigma\n", "clustersMedianSourceBalance = []\n", "clusters1stQSourceBalance = []\n", "clusters3rdQSourceBalance = []\n", "clustersMu = []\n", "clustersSigma = []\n", "for i in range(0,len(combined.cluster.unique())):\n", " temp = combined[combined['cluster']==i]\n", " clustersMu.append(round(temp.weight.mean(),2))\n", " clustersSigma.append(round(temp.weight.std(),2))\n", " clustersMedianSourceBalance.append(round(temp.weight.median(),2))\n", " clusters1stQSourceBalance.append(round(temp.s_bal.quantile(0.25),2))\n", " clusters3rdQSourceBalance.append(round(temp.s_bal.quantile(0.75),2))\n", " \n" 
] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "clusters = []\n", "for i in range(0,len(combined.cluster.unique())):\n", " clusters.append(str(i))\n", " \n", " \n", "mixingAgents = clusters.copy()\n", "mixingAgents.append('external')" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "UtilityTypesOrdered = {}\n", "for i in range(0,len(combined.cluster.unique())):\n", "    # value_counts sorts by frequency, so the keys run from most to least common business type\n", "    UtilityTypesOrdered[str(i)] = combined[combined['cluster']==i].t_business_type.value_counts(normalize=True).to_dict()\n", " \n", "UtilityTypesOrdered['external'] = {'Food/Water':1,\n", " 'Fuel/Energy':2,\n", " 'Health':3,\n", " 'Education':4,\n", " 'Savings Group':5,\n", " 'Shop':6}\n" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "utilityTypesProbability = {}\n", "for i in range(0,len(combined.cluster.unique())):\n", " utilityTypesProbability[str(i)] = combined[combined['cluster']==i].t_business_type.value_counts(normalize=True).to_dict()\n", " \n", " \n", "utilityTypesProbability['external'] = {'Food/Water':0.6,\n", " 'Fuel/Energy':0.10,\n", " 'Health':0.03,\n", " 'Education':0.015,\n", " 'Savings Group':0.065,\n", " 'Shop':0.19}\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create initialization file (copy from here) \n", "\n", "clusters = ['0',\n", " '1',\n", " '2',\n", " '3',\n", " '4',\n", " '5',\n", " '6',\n", " '7',\n", " '8',\n", " '9',\n", " '10',\n", " '11',\n", " '12',\n", " '13',\n", " '14',\n", " '15',\n", " '16',\n", " '17',\n", " '18',\n", " '19',\n", " '20',\n", " '21',\n", " '22',\n", " '23',\n", " '24',\n", " '25',\n", " '26',\n", " '27',\n", " '28',\n", " '29',\n", " '30',\n", " '31',\n", " '32',\n", " '33',\n", " '34',\n", " '35',\n", " '36',\n", " '37',\n", "
'38',\n", " '39',\n", " '40',\n", " '41',\n", " '42',\n", " '43',\n", " '44',\n", " '45',\n", " '46',\n", " '47',\n", " '48',\n", " '49']\n", "\n", "mixingAgents = ['0',\n", " '1',\n", " '2',\n", " '3',\n", " '4',\n", " '5',\n", " '6',\n", " '7',\n", " '8',\n", " '9',\n", " '10',\n", " '11',\n", " '12',\n", " '13',\n", " '14',\n", " '15',\n", " '16',\n", " '17',\n", " '18',\n", " '19',\n", " '20',\n", " '21',\n", " '22',\n", " '23',\n", " '24',\n", " '25',\n", " '26',\n", " '27',\n", " '28',\n", " '29',\n", " '30',\n", " '31',\n", " '32',\n", " '33',\n", " '34',\n", " '35',\n", " '36',\n", " '37',\n", " '38',\n", " '39',\n", " '40',\n", " '41',\n", " '42',\n", " '43',\n", " '44',\n", " '45',\n", " '46',\n", " '47',\n", " '48',\n", " '49',\n", " 'external']\n", "\n", "\n", "clustersMedianSourceBalance = [150.0,\n", " 340.0,\n", " 250.0,\n", " 20.0,\n", " 330.0,\n", " 320.0,\n", " 240.0,\n", " 300.0,\n", " 300.0,\n", " 50.0,\n", " 900.0,\n", " 120.0,\n", " 400.0,\n", " 180.0,\n", " 300.0,\n", " 6000.0,\n", " 132.5,\n", " 130.0,\n", " 160.0,\n", " 5000.0,\n", " 150.0,\n", " 10000.0,\n", " 200.0,\n", " 10000.0,\n", " 200.0,\n", " 200.0,\n", " 35000.0,\n", " 20000.0,\n", " 100.0,\n", " 500.0,\n", " 425.0,\n", " 13320.0,\n", " 500.0,\n", " 500.0,\n", " 1000.0,\n", " 390.0,\n", " 150.0,\n", " 250.0,\n", " 45000.0,\n", " 36300.0,\n", " 960.0,\n", " 120.0,\n", " 200.0,\n", " 100.0,\n", " 220.0,\n", " 600.0,\n", " 62000.0,\n", " 500.0,\n", " 900.0,\n", " 486.0]\n", "\n", "clusters1stQSourceBalance = [56.0,\n", " 118.46,\n", " 105.0,\n", " 64767.51,\n", " 251652.0,\n", " 124.5,\n", " 4139.28,\n", " 146.1,\n", " 1002.5,\n", " 17145.78,\n", " 52676.2,\n", " 100.0,\n", " 121082.43,\n", " 112.0,\n", " 28849.43,\n", " 27619.22,\n", " 66.36,\n", " 251652.0,\n", " 148.0,\n", " 38653.54,\n", " 67.22,\n", " 121082.43,\n", " 6429.46,\n", " 555.04,\n", " 104.48,\n", " 96.43,\n", " 52676.2,\n", " 251652.0,\n", " 64.73,\n", " 36824.5,\n", " 15182.03,\n", " 485.94,\n", " 21660.89,\n", " 
11210.0,\n", " 100579.18,\n", " 100.46,\n", " 2845.01,\n", " 3338.98,\n", " 1274.91,\n", " 6724.88,\n", " 38653.54,\n", " 114.5,\n", " 68.0,\n", " 100.0,\n", " 20.93,\n", " 14050.3,\n", " 63145.96,\n", " 9276.23,\n", " 63234.8,\n", " 64767.51]\n", "\n", "clusters3rdQSourceBalance = [403.96,\n", " 506.6,\n", " 592.96,\n", " 64767.51,\n", " 251652.0,\n", " 1501.41,\n", " 7214.9,\n", " 869.82,\n", " 1557.01,\n", " 18304.36,\n", " 55142.93,\n", " 419.96,\n", " 121082.43,\n", " 816.3,\n", " 38653.54,\n", " 37106.89,\n", " 770.65,\n", " 251652.0,\n", " 838.46,\n", " 38653.54,\n", " 315.0,\n", " 121082.43,\n", " 9074.79,\n", " 5726.66,\n", " 602.02,\n", " 437.96,\n", " 63234.8,\n", " 251652.0,\n", " 425.0,\n", " 40953.15,\n", " 17145.78,\n", " 6349.27,\n", " 25695.83,\n", " 13156.46,\n", " 100579.18,\n", " 819.33,\n", " 4158.5,\n", " 5597.38,\n", " 2823.81,\n", " 20030.91,\n", " 51710.52,\n", " 537.94,\n", " 542.92,\n", " 415.43,\n", " 895.66,\n", " 18304.36,\n", " 63145.96,\n", " 14050.3,\n", " 64767.51,\n", " 64767.51]\n", "\n", "clustersMu = [329.98,\n", " 588.11,\n", " 469.93,\n", " 492.32,\n", " 2443.89,\n", " 565.21,\n", " 1120.5,\n", " 408.1,\n", " 550.09,\n", " 503.42,\n", " 2478.89,\n", " 349.93,\n", " 9354.55,\n", " 453.69,\n", " 4298.1,\n", " 7508.1,\n", " 376.86,\n", " 8074.0,\n", " 333.75,\n", " 7691.43,\n", " 362.68,\n", " 15562.5,\n", " 672.28,\n", " 10809.6,\n", " 274.98,\n", " 405.46,\n", " 34555.56,\n", " 14338.57,\n", " 255.48,\n", " 1229.44,\n", " 1470.23,\n", " 14590.61,\n", " 1527.75,\n", " 770.73,\n", " 1039.05,\n", " 503.7,\n", " 362.11,\n", " 499.51,\n", " 45000.0,\n", " 37504.55,\n", " 1941.82,\n", " 262.96,\n", " 702.23,\n", " 168.57,\n", " 2000.58,\n", " 1383.32,\n", " 65333.33,\n", " 1454.43,\n", " 1483.11,\n", " 1853.03]\n", "\n", "clustersSigma = [583.23,\n", " 1501.26,\n", " 966.32,\n", " 1452.2,\n", " 6789.39,\n", " 847.29,\n", " 2228.12,\n", " 483.5,\n", " 852.2,\n", " 1170.38,\n", " 3256.26,\n", " 1174.55,\n", " 16235.99,\n", " 
841.35,\n", " 7696.91,\n", " 6814.68,\n", " 785.21,\n", " 10886.9,\n", " 712.65,\n", " 8713.11,\n", " 708.54,\n", " 18542.24,\n", " 1164.0,\n", " 3682.08,\n", " 340.99,\n", " 624.76,\n", " 8171.77,\n", " 15060.34,\n", " 461.52,\n", " 1774.39,\n", " 4617.97,\n", " 4770.82,\n", " 2641.75,\n", " 1133.41,\n", " 767.87,\n", " 437.68,\n", " 652.72,\n", " 761.07,\n", " 7071.07,\n", " 5274.96,\n", " 2716.8,\n", " 572.43,\n", " 1553.21,\n", " 210.61,\n", " 4477.94,\n", " 1798.73,\n", " 31134.12,\n", " 2147.9,\n", " 1900.27,\n", " 2909.68]\n", "\n", "\n", "# nested dictionary\n", "UtilityTypesOrdered = {'0': {'Food/Water': 0.4119323241317899,\n", " 'Farming/Labour': 0.26090828138913624,\n", " 'Shop': 0.17916295636687443,\n", " 'Savings Group': 0.07266251113089937,\n", " 'Fuel/Energy': 0.034194122885129116,\n", " 'Transport': 0.02617987533392698,\n", " 'Health': 0.006767586821015138,\n", " 'Education': 0.004096170970614425,\n", " 'None': 0.004096170970614425},\n", " '1': {'Food/Water': 1.0},\n", " '2': {'Savings Group': 0.87890625,\n", " 'Health': 0.08984375,\n", " 'Food/Water': 0.03125},\n", " '3': {'Savings Group': 0.4905964535196131,\n", " 'Farming/Labour': 0.3610961848468565,\n", " 'Food/Water': 0.14830736163353037},\n", " '4': {'Farming/Labour': 0.2843866171003718,\n", " 'Shop': 0.25650557620817843,\n", " 'Fuel/Energy': 0.17843866171003717,\n", " 'Food/Water': 0.16171003717472118,\n", " 'None': 0.10966542750929369,\n", " 'Savings Group': 0.0055762081784386614,\n", " 'Transport': 0.0037174721189591076},\n", " '5': {'Farming/Labour': 0.421875,\n", " 'Food/Water': 0.421875,\n", " 'Shop': 0.0625,\n", " 'Savings Group': 0.03125,\n", " 'Fuel/Energy': 0.03125,\n", " 'Transport': 0.03125},\n", " '6': {'Savings Group': 0.6008097165991902,\n", " 'Food/Water': 0.35870445344129553,\n", " 'Shop': 0.04048582995951417},\n", " '7': {'Farming/Labour': 0.4346590909090909,\n", " 'Food/Water': 0.2869318181818182,\n", " 'Shop': 0.1278409090909091,\n", " 'Fuel/Energy': 
0.07670454545454546,\n", " 'Savings Group': 0.03977272727272727,\n", " 'Education': 0.017045454545454544,\n", " 'None': 0.011363636363636364,\n", " 'Transport': 0.002840909090909091,\n", " 'Health': 0.002840909090909091},\n", " '8': {'Savings Group': 1.0},\n", " '9': {'Savings Group': 0.7142857142857143,\n", " 'Food/Water': 0.18181818181818182,\n", " 'Farming/Labour': 0.07792207792207792,\n", " 'Education': 0.025974025974025976},\n", " '10': {'Food/Water': 0.3499875508340941,\n", " 'Farming/Labour': 0.3162088140094614,\n", " 'Shop': 0.21047389824881732,\n", " 'Transport': 0.03950535314133953,\n", " 'None': 0.03386173126400531,\n", " 'Fuel/Energy': 0.022491493069964313,\n", " 'Education': 0.01709685451074778,\n", " 'Savings Group': 0.006473566271059839,\n", " 'Environment': 0.002157855423686613,\n", " 'Health': 0.0016598887874512407,\n", " 'Chama': 8.299443937256204e-05},\n", " '11': {'Savings Group': 0.4873417721518987,\n", " 'Food/Water': 0.3377445339470656,\n", " 'Education': 0.09723820483314154,\n", " 'Farming/Labour': 0.06271576524741082,\n", " 'Shop': 0.014959723820483314},\n", " '12': {'Food/Water': 0.34994337485843713,\n", " 'Shop': 0.2332955832389581,\n", " 'Farming/Labour': 0.19592298980747452,\n", " 'Fuel/Energy': 0.057757644394110984,\n", " 'Savings Group': 0.053227633069082674,\n", " 'Education': 0.05096262740656852,\n", " 'None': 0.026047565118912798,\n", " 'Transport': 0.020385050962627407,\n", " 'Health': 0.011325028312570781,\n", " 'Environment': 0.0011325028312570782},\n", " '13': {'Savings Group': 0.3712871287128713,\n", " 'Food/Water': 0.247974797479748,\n", " 'Shop': 0.19801980198019803,\n", " 'Fuel/Energy': 0.08235823582358236,\n", " 'Health': 0.07605760576057606,\n", " 'Farming/Labour': 0.024302430243024302},\n", " '14': {'Savings Group': 1.0},\n", " '15': {'Savings Group': 1.0},\n", " '16': {'Savings Group': 0.5, 'Food/Water': 0.5},\n", " '17': {'Savings Group': 0.7335701598579041,\n", " 'Shop': 0.17584369449378331,\n", " 'Food/Water': 
0.0905861456483126},\n", " '18': {'Savings Group': 0.6984126984126984,\n", " 'Food/Water': 0.23809523809523808,\n", " 'Farming/Labour': 0.06349206349206349},\n", " '19': {'Savings Group': 1.0},\n", " '20': {'Savings Group': 1.0},\n", " '21': {'Farming/Labour': 0.47619047619047616,\n", " 'Food/Water': 0.3333333333333333,\n", " 'Shop': 0.09523809523809523,\n", " 'Fuel/Energy': 0.047619047619047616,\n", " 'Transport': 0.047619047619047616},\n", " '22': {'Food/Water': 0.33040588654165676,\n", " 'Farming/Labour': 0.3209114645145977,\n", " 'Shop': 0.164016140517446,\n", " 'None': 0.06147638262520769,\n", " 'Fuel/Energy': 0.05008307619273677,\n", " 'Transport': 0.028957987182530263,\n", " 'Savings Group': 0.023973415618324233,\n", " 'Education': 0.014478993591265131,\n", " 'Health': 0.0035604082601471635,\n", " 'Environment': 0.0011868027533823878,\n", " 'Staff': 0.00047472110135295516,\n", " 'Chama': 0.00023736055067647758,\n", " 'Game': 0.00023736055067647758},\n", " '23': {'Savings Group': 0.8323424494649228,\n", " 'Farming/Labour': 0.16765755053507728},\n", " '24': {'Farming/Labour': 0.38481675392670156,\n", " 'Food/Water': 0.3717277486910995,\n", " 'Shop': 0.1387434554973822,\n", " 'Fuel/Energy': 0.05235602094240838,\n", " 'Transport': 0.02356020942408377,\n", " 'Savings Group': 0.01832460732984293,\n", " 'Education': 0.007853403141361256,\n", " 'Staff': 0.002617801047120419},\n", " '25': {'Savings Group': 0.7916666666666666,\n", " 'Food/Water': 0.20833333333333334},\n", " '26': {'Savings Group': 0.7442348008385744, 'Food/Water': 0.2557651991614256},\n", " '27': {'Food/Water': 0.3333333333333333,\n", " 'Farming/Labour': 0.25,\n", " 'Health': 0.25,\n", " 'Savings Group': 0.08333333333333333,\n", " 'Fuel/Energy': 0.08333333333333333},\n", " '28': {'Food/Water': 1.0},\n", " '29': {'Food/Water': 0.27335640138408307,\n", " 'Farming/Labour': 0.23529411764705882,\n", " 'Shop': 0.21972318339100347,\n", " 'Fuel/Energy': 0.21280276816608998,\n", " 'None': 
0.03806228373702422,\n", " 'Education': 0.006920415224913495,\n", " 'Transport': 0.006920415224913495,\n", " 'Savings Group': 0.005190311418685121,\n", " 'Staff': 0.0017301038062283738},\n", " '30': {'Food/Water': 0.36228287841191065,\n", " 'Shop': 0.2679900744416873,\n", " 'Farming/Labour': 0.21712158808933002,\n", " 'Savings Group': 0.08436724565756824,\n", " 'Education': 0.02481389578163772,\n", " 'Fuel/Energy': 0.018610421836228287,\n", " 'Transport': 0.017369727047146403,\n", " 'None': 0.0037220843672456576,\n", " 'Health': 0.0024813895781637717,\n", " 'Environment': 0.0012406947890818859},\n", " '31': {'Savings Group': 0.8,\n", " 'Food/Water': 0.13333333333333333,\n", " 'Shop': 0.06666666666666667},\n", " '32': {'Savings Group': 0.7444444444444445,\n", " 'Farming/Labour': 0.2,\n", " 'Food/Water': 0.05555555555555555},\n", " '33': {'Food/Water': 0.33343474292668085,\n", " 'Farming/Labour': 0.28414968055978096,\n", " 'Savings Group': 0.18892607240644965,\n", " 'Shop': 0.1146942500760572,\n", " 'Fuel/Energy': 0.06936416184971098,\n", " 'None': 0.006693033160937024,\n", " 'Education': 0.0027380590203833284},\n", " '34': {'Savings Group': 1.0},\n", " '35': {'Food/Water': 0.3829787234042553,\n", " 'Farming/Labour': 0.2390488110137672,\n", " 'Shop': 0.1902377972465582,\n", " 'Savings Group': 0.07259073842302878,\n", " 'Transport': 0.060075093867334166,\n", " 'Health': 0.030037546933667083,\n", " 'Fuel/Energy': 0.016270337922403004,\n", " 'None': 0.0050062578222778474,\n", " 'Education': 0.0037546933667083854},\n", " '36': {'Savings Group': 1.0},\n", " '37': {'Farming/Labour': 0.5454545454545454,\n", " 'Food/Water': 0.36363636363636365,\n", " 'Savings Group': 0.045454545454545456,\n", " 'Shop': 0.045454545454545456},\n", " '38': {'Savings Group': 1.0},\n", " '39': {'Savings Group': 1.0},\n", " '40': {'Farming/Labour': 0.3595236417447678,\n", " 'Food/Water': 0.3165386512578395,\n", " 'Shop': 0.18842928616728913,\n", " 'Fuel/Energy': 0.05108871820167712,\n", " 'None': 
0.0360439715312522,\n", " 'Transport': 0.022443802409978154,\n", " 'Education': 0.01039391163413431,\n", " 'Savings Group': 0.00842083010358678,\n", " 'Health': 0.004545134240011275,\n", " 'Staff': 0.0011627087590726517,\n", " 'Environment': 0.0010570079627933197,\n", " 'System': 0.00035233598759777326},\n", " '41': {'Food/Water': 0.33003300330033003,\n", " 'Farming/Labour': 0.2739273927392739,\n", " 'Shop': 0.1782178217821782,\n", " 'Savings Group': 0.13861386138613863,\n", " 'Health': 0.0429042904290429,\n", " 'Fuel/Energy': 0.0165016501650165,\n", " 'Transport': 0.0165016501650165,\n", " 'Education': 0.0033003300330033004},\n", " '42': {'Savings Group': 0.8661740558292282, 'Health': 0.13382594417077176},\n", " '43': {'Savings Group': 1.0},\n", " '44': {'Food/Water': 0.4805194805194805,\n", " 'Shop': 0.14285714285714285,\n", " 'Savings Group': 0.14285714285714285,\n", " 'Farming/Labour': 0.13636363636363635,\n", " 'Health': 0.06493506493506493,\n", " 'Transport': 0.012987012987012988,\n", " 'Environment': 0.012987012987012988,\n", " 'Fuel/Energy': 0.006493506493506494},\n", " '45': {'Food/Water': 0.35471100554235946,\n", " 'Farming/Labour': 0.2414885193982581,\n", " 'Shop': 0.23198733174980204,\n", " 'Education': 0.03800475059382423,\n", " 'None': 0.035629453681710214,\n", " 'Transport': 0.035629453681710214,\n", " 'Fuel/Energy': 0.028503562945368172,\n", " 'Savings Group': 0.02454473475851148,\n", " 'Health': 0.006334125098970704,\n", " 'Environment': 0.001583531274742676,\n", " 'Staff': 0.000791765637371338,\n", " 'System': 0.000791765637371338},\n", " '46': {'Savings Group': 0.6981132075471698,\n", " 'Health': 0.18867924528301888,\n", " 'Food/Water': 0.09433962264150944,\n", " 'Shop': 0.018867924528301886},\n", " '47': {'Savings Group': 0.5555555555555556,\n", " 'Farming/Labour': 0.2222222222222222,\n", " 'Food/Water': 0.2222222222222222},\n", " '48': {'Food/Water': 0.38795180722891565,\n", " 'Savings Group': 0.38313253012048193,\n", " 'Health': 
0.10120481927710843,\n", " 'Shop': 0.09879518072289156,\n", " 'Fuel/Energy': 0.016867469879518072,\n", " 'Farming/Labour': 0.012048192771084338},\n", " '49': {'Food/Water': 0.3829787234042553,\n", " 'Savings Group': 0.3829787234042553,\n", " 'Education': 0.19148936170212766,\n", " 'Fuel/Energy': 0.0425531914893617},\n", " 'external': {'Food/Water': 1,\n", " 'Fuel/Energy': 2,\n", " 'Health': 3,\n", " 'Education': 4,\n", " 'Savings Group': 5,\n", " 'Shop': 6}}\n", " \n", "# nested dictionary \n", "utilityTypesProbability = {'0': {'Food/Water': 0.4119323241317899,\n", " 'Farming/Labour': 0.26090828138913624,\n", " 'Shop': 0.17916295636687443,\n", " 'Savings Group': 0.07266251113089937,\n", " 'Fuel/Energy': 0.034194122885129116,\n", " 'Transport': 0.02617987533392698,\n", " 'Health': 0.006767586821015138,\n", " 'Education': 0.004096170970614425,\n", " 'None': 0.004096170970614425},\n", " '1': {'Food/Water': 1.0},\n", " '2': {'Savings Group': 0.87890625,\n", " 'Health': 0.08984375,\n", " 'Food/Water': 0.03125},\n", " '3': {'Savings Group': 0.4905964535196131,\n", " 'Farming/Labour': 0.3610961848468565,\n", " 'Food/Water': 0.14830736163353037},\n", " '4': {'Farming/Labour': 0.2843866171003718,\n", " 'Shop': 0.25650557620817843,\n", " 'Fuel/Energy': 0.17843866171003717,\n", " 'Food/Water': 0.16171003717472118,\n", " 'None': 0.10966542750929369,\n", " 'Savings Group': 0.0055762081784386614,\n", " 'Transport': 0.0037174721189591076},\n", " '5': {'Farming/Labour': 0.421875,\n", " 'Food/Water': 0.421875,\n", " 'Shop': 0.0625,\n", " 'Savings Group': 0.03125,\n", " 'Fuel/Energy': 0.03125,\n", " 'Transport': 0.03125},\n", " '6': {'Savings Group': 0.6008097165991902,\n", " 'Food/Water': 0.35870445344129553,\n", " 'Shop': 0.04048582995951417},\n", " '7': {'Farming/Labour': 0.4346590909090909,\n", " 'Food/Water': 0.2869318181818182,\n", " 'Shop': 0.1278409090909091,\n", " 'Fuel/Energy': 0.07670454545454546,\n", " 'Savings Group': 0.03977272727272727,\n", " 'Education': 
0.017045454545454544,\n", " 'None': 0.011363636363636364,\n", " 'Transport': 0.002840909090909091,\n", " 'Health': 0.002840909090909091},\n", " '8': {'Savings Group': 1.0},\n", " '9': {'Savings Group': 0.7142857142857143,\n", " 'Food/Water': 0.18181818181818182,\n", " 'Farming/Labour': 0.07792207792207792,\n", " 'Education': 0.025974025974025976},\n", " '10': {'Food/Water': 0.3499875508340941,\n", " 'Farming/Labour': 0.3162088140094614,\n", " 'Shop': 0.21047389824881732,\n", " 'Transport': 0.03950535314133953,\n", " 'None': 0.03386173126400531,\n", " 'Fuel/Energy': 0.022491493069964313,\n", " 'Education': 0.01709685451074778,\n", " 'Savings Group': 0.006473566271059839,\n", " 'Environment': 0.002157855423686613,\n", " 'Health': 0.0016598887874512407,\n", " 'Chama': 8.299443937256204e-05},\n", " '11': {'Savings Group': 0.4873417721518987,\n", " 'Food/Water': 0.3377445339470656,\n", " 'Education': 0.09723820483314154,\n", " 'Farming/Labour': 0.06271576524741082,\n", " 'Shop': 0.014959723820483314},\n", " '12': {'Food/Water': 0.34994337485843713,\n", " 'Shop': 0.2332955832389581,\n", " 'Farming/Labour': 0.19592298980747452,\n", " 'Fuel/Energy': 0.057757644394110984,\n", " 'Savings Group': 0.053227633069082674,\n", " 'Education': 0.05096262740656852,\n", " 'None': 0.026047565118912798,\n", " 'Transport': 0.020385050962627407,\n", " 'Health': 0.011325028312570781,\n", " 'Environment': 0.0011325028312570782},\n", " '13': {'Savings Group': 0.3712871287128713,\n", " 'Food/Water': 0.247974797479748,\n", " 'Shop': 0.19801980198019803,\n", " 'Fuel/Energy': 0.08235823582358236,\n", " 'Health': 0.07605760576057606,\n", " 'Farming/Labour': 0.024302430243024302},\n", " '14': {'Savings Group': 1.0},\n", " '15': {'Savings Group': 1.0},\n", " '16': {'Savings Group': 0.5, 'Food/Water': 0.5},\n", " '17': {'Savings Group': 0.7335701598579041,\n", " 'Shop': 0.17584369449378331,\n", " 'Food/Water': 0.0905861456483126},\n", " '18': {'Savings Group': 0.6984126984126984,\n", " 'Food/Water': 
0.23809523809523808,\n", " 'Farming/Labour': 0.06349206349206349},\n", " '19': {'Savings Group': 1.0},\n", " '20': {'Savings Group': 1.0},\n", " '21': {'Farming/Labour': 0.47619047619047616,\n", " 'Food/Water': 0.3333333333333333,\n", " 'Shop': 0.09523809523809523,\n", " 'Fuel/Energy': 0.047619047619047616,\n", " 'Transport': 0.047619047619047616},\n", " '22': {'Food/Water': 0.33040588654165676,\n", " 'Farming/Labour': 0.3209114645145977,\n", " 'Shop': 0.164016140517446,\n", " 'None': 0.06147638262520769,\n", " 'Fuel/Energy': 0.05008307619273677,\n", " 'Transport': 0.028957987182530263,\n", " 'Savings Group': 0.023973415618324233,\n", " 'Education': 0.014478993591265131,\n", " 'Health': 0.0035604082601471635,\n", " 'Environment': 0.0011868027533823878,\n", " 'Staff': 0.00047472110135295516,\n", " 'Chama': 0.00023736055067647758,\n", " 'Game': 0.00023736055067647758},\n", " '23': {'Savings Group': 0.8323424494649228,\n", " 'Farming/Labour': 0.16765755053507728},\n", " '24': {'Farming/Labour': 0.38481675392670156,\n", " 'Food/Water': 0.3717277486910995,\n", " 'Shop': 0.1387434554973822,\n", " 'Fuel/Energy': 0.05235602094240838,\n", " 'Transport': 0.02356020942408377,\n", " 'Savings Group': 0.01832460732984293,\n", " 'Education': 0.007853403141361256,\n", " 'Staff': 0.002617801047120419},\n", " '25': {'Savings Group': 0.7916666666666666,\n", " 'Food/Water': 0.20833333333333334},\n", " '26': {'Savings Group': 0.7442348008385744, 'Food/Water': 0.2557651991614256},\n", " '27': {'Food/Water': 0.3333333333333333,\n", " 'Farming/Labour': 0.25,\n", " 'Health': 0.25,\n", " 'Savings Group': 0.08333333333333333,\n", " 'Fuel/Energy': 0.08333333333333333},\n", " '28': {'Food/Water': 1.0},\n", " '29': {'Food/Water': 0.27335640138408307,\n", " 'Farming/Labour': 0.23529411764705882,\n", " 'Shop': 0.21972318339100347,\n", " 'Fuel/Energy': 0.21280276816608998,\n", " 'None': 0.03806228373702422,\n", " 'Education': 0.006920415224913495,\n", " 'Transport': 0.006920415224913495,\n", " 
'Savings Group': 0.005190311418685121,\n", " 'Staff': 0.0017301038062283738},\n", " '30': {'Food/Water': 0.36228287841191065,\n", " 'Shop': 0.2679900744416873,\n", " 'Farming/Labour': 0.21712158808933002,\n", " 'Savings Group': 0.08436724565756824,\n", " 'Education': 0.02481389578163772,\n", " 'Fuel/Energy': 0.018610421836228287,\n", " 'Transport': 0.017369727047146403,\n", " 'None': 0.0037220843672456576,\n", " 'Health': 0.0024813895781637717,\n", " 'Environment': 0.0012406947890818859},\n", " '31': {'Savings Group': 0.8,\n", " 'Food/Water': 0.13333333333333333,\n", " 'Shop': 0.06666666666666667},\n", " '32': {'Savings Group': 0.7444444444444445,\n", " 'Farming/Labour': 0.2,\n", " 'Food/Water': 0.05555555555555555},\n", " '33': {'Food/Water': 0.33343474292668085,\n", " 'Farming/Labour': 0.28414968055978096,\n", " 'Savings Group': 0.18892607240644965,\n", " 'Shop': 0.1146942500760572,\n", " 'Fuel/Energy': 0.06936416184971098,\n", " 'None': 0.006693033160937024,\n", " 'Education': 0.0027380590203833284},\n", " '34': {'Savings Group': 1.0},\n", " '35': {'Food/Water': 0.3829787234042553,\n", " 'Farming/Labour': 0.2390488110137672,\n", " 'Shop': 0.1902377972465582,\n", " 'Savings Group': 0.07259073842302878,\n", " 'Transport': 0.060075093867334166,\n", " 'Health': 0.030037546933667083,\n", " 'Fuel/Energy': 0.016270337922403004,\n", " 'None': 0.0050062578222778474,\n", " 'Education': 0.0037546933667083854},\n", " '36': {'Savings Group': 1.0},\n", " '37': {'Farming/Labour': 0.5454545454545454,\n", " 'Food/Water': 0.36363636363636365,\n", " 'Savings Group': 0.045454545454545456,\n", " 'Shop': 0.045454545454545456},\n", " '38': {'Savings Group': 1.0},\n", " '39': {'Savings Group': 1.0},\n", " '40': {'Farming/Labour': 0.3595236417447678,\n", " 'Food/Water': 0.3165386512578395,\n", " 'Shop': 0.18842928616728913,\n", " 'Fuel/Energy': 0.05108871820167712,\n", " 'None': 0.0360439715312522,\n", " 'Transport': 0.022443802409978154,\n", " 'Education': 0.01039391163413431,\n", " 
'Savings Group': 0.00842083010358678,\n", " 'Health': 0.004545134240011275,\n", " 'Staff': 0.0011627087590726517,\n", " 'Environment': 0.0010570079627933197,\n", " 'System': 0.00035233598759777326},\n", " '41': {'Food/Water': 0.33003300330033003,\n", " 'Farming/Labour': 0.2739273927392739,\n", " 'Shop': 0.1782178217821782,\n", " 'Savings Group': 0.13861386138613863,\n", " 'Health': 0.0429042904290429,\n", " 'Fuel/Energy': 0.0165016501650165,\n", " 'Transport': 0.0165016501650165,\n", " 'Education': 0.0033003300330033004},\n", " '42': {'Savings Group': 0.8661740558292282, 'Health': 0.13382594417077176},\n", " '43': {'Savings Group': 1.0},\n", " '44': {'Food/Water': 0.4805194805194805,\n", " 'Shop': 0.14285714285714285,\n", " 'Savings Group': 0.14285714285714285,\n", " 'Farming/Labour': 0.13636363636363635,\n", " 'Health': 0.06493506493506493,\n", " 'Transport': 0.012987012987012988,\n", " 'Environment': 0.012987012987012988,\n", " 'Fuel/Energy': 0.006493506493506494},\n", " '45': {'Food/Water': 0.35471100554235946,\n", " 'Farming/Labour': 0.2414885193982581,\n", " 'Shop': 0.23198733174980204,\n", " 'Education': 0.03800475059382423,\n", " 'None': 0.035629453681710214,\n", " 'Transport': 0.035629453681710214,\n", " 'Fuel/Energy': 0.028503562945368172,\n", " 'Savings Group': 0.02454473475851148,\n", " 'Health': 0.006334125098970704,\n", " 'Environment': 0.001583531274742676,\n", " 'Staff': 0.000791765637371338,\n", " 'System': 0.000791765637371338},\n", " '46': {'Savings Group': 0.6981132075471698,\n", " 'Health': 0.18867924528301888,\n", " 'Food/Water': 0.09433962264150944,\n", " 'Shop': 0.018867924528301886},\n", " '47': {'Savings Group': 0.5555555555555556,\n", " 'Farming/Labour': 0.2222222222222222,\n", " 'Food/Water': 0.2222222222222222},\n", " '48': {'Food/Water': 0.38795180722891565,\n", " 'Savings Group': 0.38313253012048193,\n", " 'Health': 0.10120481927710843,\n", " 'Shop': 0.09879518072289156,\n", " 'Fuel/Energy': 0.016867469879518072,\n", " 
'Farming/Labour': 0.012048192771084338},\n", " '49': {'Food/Water': 0.3829787234042553,\n", " 'Savings Group': 0.3829787234042553,\n", " 'Education': 0.19148936170212766,\n", " 'Fuel/Energy': 0.0425531914893617},\n", " 'external': {'Food/Water': 0.6,\n", " 'Fuel/Energy': 0.1,\n", " 'Health': 0.03,\n", " 'Education': 0.015,\n", " 'Savings Group': 0.065,\n", " 'Shop': 0.19}}\n", "\n", "# agent:[centrality,allocationValue]\n", "agentAllocation = {'0': [1, 1],\n", " '1': [1, 1],\n", " '2': [1, 1],\n", " '3': [1, 1],\n", " '4': [1, 1],\n", " '5': [1, 1],\n", " '6': [1, 1],\n", " '7': [1, 1],\n", " '8': [1, 1],\n", " '9': [1, 1],\n", " '10': [1, 1],\n", " '11': [1, 1],\n", " '12': [1, 1],\n", " '13': [1, 1],\n", " '14': [1, 1],\n", " '15': [1, 1],\n", " '16': [1, 1],\n", " '17': [1, 1],\n", " '18': [1, 1],\n", " '19': [1, 1],\n", " '20': [1, 1],\n", " '21': [1, 1],\n", " '22': [1, 1],\n", " '23': [1, 1],\n", " '24': [1, 1],\n", " '25': [1, 1],\n", " '26': [1, 1],\n", " '27': [1, 1],\n", " '28': [1, 1],\n", " '29': [1, 1],\n", " '30': [1, 1],\n", " '31': [1, 1],\n", " '32': [1, 1],\n", " '33': [1, 1],\n", " '34': [1, 1],\n", " '35': [1, 1],\n", " '36': [1, 1],\n", " '37': [1, 1],\n", " '38': [1, 1],\n", " '39': [1, 1],\n", " '40': [1, 1],\n", " '41': [1, 1],\n", " '42': [1, 1],\n", " '43': [1, 1],\n", " '44': [1, 1],\n", " '45': [1, 1],\n", " '46': [1, 1],\n", " '47': [1, 1],\n", " '48': [1, 1],\n", " '49': [1, 1]}\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.5" } }, "nbformat": 4, "nbformat_minor": 2 }