{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Project: Vitória No-show Medical Appointment\n", "\n", "\n", "## Table of Contents\n", "\n", "\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Student Tags\n", "\n", "Author: Anderson Hitoshi Uyekita \n", "Course: Data Science - Foundation I \n", "COD: ND110 \n", "Date: 13/12/2018 \n", "Dataset: No-show appointments \n", "Version: 1.0\n", "\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction \n", "\n", "I chose this problem because I am from Brazil, and drawing any conclusions from this data will be a big challenge for me. I am an Electrical Engineer, I have never been to Vitória, and I do not know anything about the public health care system.\n", "\n", "It is hard to pose any kind of question beforehand without a clue about the variables in this dataset. By now I have already loaded the data and printed some summaries, and my approach for this project is to aggregate new variables (all of this new information is public and free to access) to enhance my analysis.\n", "\n", "These new variables include Average Income per Month, Number of Inhabitants in each Neighborhood (divided into male and female), Neighborhood, and Regional Administration.\n", "\n", "I hope that by the end of this document I will have posed good questions and, most importantly, \"answered\" them properly.\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Objectives\n", "\n", "Although I do not have a question yet, I see this project as an opportunity to practice all the content learned \"in class\". 
So, my objectives for this project are:\n", "\n", "* Practising Data Wrangling\n", " * Gathering;\n", " * Assessing;\n", " * Cleaning;\n", "* Practising EDA;\n", "* Practising Python coding;\n", " * Also in Jupyter Notebook;\n", "* Producing reproducible research;\n", "* Having fun.\n", "\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Synopsis\n", "\n", "Throughout this document I pose three questions:\n", "\n", ">**1. Are patients with many appointments more likely to show up than patients with fewer than two appointments?**\n", ">>It seems to be true, but I lack the information to affirm it. What I can say is that patients with a higher number of appointments tend to have a better show-up rate.\n", "\n", ">**2. Are patients with many appointments on average older than those with few appointments?**\n", ">>I also identify a positive correlation: patients with a higher number of appointments tend to have an average age higher than that of the population.\n", "\n", ">**3. Do patients with disabilities tend to have better show-up rates? Does the number of disabilities raise the show-up rate?**\n", ">>Due to the small number of patients with disabilities, it is difficult to answer this question. I have found a better show-up rate in the group of patients with disabilities; however, patients with many disabilities tend to have worse show-up rates.\n", "\n", "I have answered all these questions using the original data and additional data collected on the web. 
My report is linear: I start with the Data Wrangling (Gather and Assess) and then move on to the EDA. During the EDA I stopped for a while to gather new data on the internet, and then came back to finish the EDA and to pose and answer the questions.\n", "\n", "At the end of the document, I have inserted an appendix to keep the Additional Data I have found separate from the main document.\n", "\n", "\n", "_Obs.: I really do not know if I am doing well, and for this reason I have decided to submit my report to receive feedback. I am not sure if I am on the \"right path\". I know I have not used all the resources I have gathered on the web, and I do want to use them to pose a new question, but first I need preliminary feedback._\n", "\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reproducibility\n", "\n", "I have written this report in Jupyter Notebook so that anyone can reproduce each step. Although I have created some datasets to enhance my analysis, which makes everything harder to reproduce, I have stored them on Github, so they are available for anyone to download and use.\n", "\n", "### Work environment\n", "\n", "All this research was performed using:\n", "\n", "* Dell Notebook Inspiron 7348;\n", "* Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz 2.40GHz;\n", "* 8.00 GB of RAM;\n", "* Windows 10 Pro 64-bit.\n", "\n", "### Software\n", "\n", "To be honest, I did not use the Python IDLE in this project, because I wrote everything directly in Jupyter Notebook, running in the Opera browser.\n", "\n", "* Opera\n", "* Atom\n", "\n", "I have used Atom only to push to the Github repository.\n", "\n", "### Packages\n", "\n", "I kindly ask you to install each of these packages before you run the next steps.\n", "\n", "* Jupyter Notebook\n", "* Pandas\n", "* Numpy\n", "* Matplotlib\n", "* Seaborn\n", "\n", "### Repository\n", "\n", "You can access all files of this report in this repository:\n", "\n", "* 
https://github.com/AndersonUyekita/udacity_data_science_foundation_01\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Wrangling \n", "\n", "This file is available on Kaggle, where it is maintained by [@JoniHoppen][data_1], and is also available for download from GDrive.\n", "\n", "* Kaggle (uploaded by JoniHoppen);\n", "* GDrive.\n", "\n", "### Brief about Vitória City\n", "\n", "This dataset records medical appointments from March to June of 2016, held in public hospitals in [Vitória](https://en.wikipedia.org/wiki/Vitória,_Espírito_Santo), a medium-sized city (almost 320,000 inhabitants) located in the southeast region of Brazil.\n", "\n", "\n", " \n", " \n", " \n", " \n", "

\n", "
Figure 1a - Brazil Map with Espírito Santo State Highlighted.
\n", "
Figure 1b - Vitória City, Capital of Espírito Santo State.

\n", "\n", "(**) Both pictures extracted from Wikipedia.\n", "\n", "Vitória is the capital of Espírito Santo State and has almost 320,000 inhabitants (also according to Wikipedia) spread over 9 Regional Administration areas, which have a total of 80 neighborhoods.\n", "\n", "The public health system is composed of Basic Units of Health (_Unidade Básica de Saúde_, UBS for the sake of this document) spread all over the city. Figure 2 shows how these UBS are distributed in the city.\n", "\n", "\n", "\n", "
Figure 2 - UBS in Vitória City.
\n", "\n", "I have extracted this illustration from the document [_Relatório de Gestão 2008_][data_2] from the _Sistema Único de Saúde_ of the Vitória Council Health Secretary. Besides this document, I have also relied on these two:\n", "\n", "* [Equipamento de Saúde][data_3]: This document shows each UBS address;\n", "* [Município de Vitória - Territórios de Saúde][data_4]: Summarizes the number of each kind of UBS.\n", "\n", "After a brief compilation of the information from these three sources, I have counted 38 (See details here) UBS (or similar units) in Vitória City.\n", "\n", "* Basic Units of Health with Family Strategy (US Família): 19 units;\n", "* Community Agent Health Program (US PACS): 8 units;\n", "* Polyclinic, Specialized or Reference Center: 11 units.\n", "\n", "For this report, I will assume that any appointment could take place in any of the 38 premises listed above.\n", "\n", "**Observation:** This is not important to the project, but it could be important to someone who wants to learn more about this dataset. The Regional Administration Areas are slightly different from the Health Territory Areas: there are 9 Regional Administration Areas and 6 Health Territory Areas, which means some neighbourhoods were relocated to other regions. This becomes much more explicit if you compare these two files ([Regional Administrative Area][data_5] and [Health Territory Area][data_4]).\n", "\n", "If you are still curious about the location of each UBS, you can access the [GeoWeb][data_6], a web-based tool with a thematic map.\n", "\n", "### Data Structure\n", "\n", "Unfortunately, this dataset lacks a codebook. Despite this hurdle, many of the problems could be solved by reading the Kaggle Discussions Channel. This dataset has:\n", "\n", "* 110,527 rows (observations);\n", "* 14 columns (variables or features).\n", "\n", "Each of these variables can be interpreted using Table 1.\n", "\n", "
Table 1 - Variables Types Description
\n", "\n", "|Variable|Type|Description|\n", "|:-:|:-:|:-:|\n", "|PatientId|int|Internal ID to track the patient|\n", "|AppointmentID|int|Internal ID to track the appointment|\n", "|Gender|cat|F: Female and M: Male|\n", "|ScheduledDay|date|When the appointment was scheduled (booked)|\n", "|AppointmentDay|date|When the appointment occurs|\n", "|Age|int|Person Age, if Age is negative the person is pregnant|\n", "|Neighbourhood|str|In Brazil, each city is divided into Regional Administrations (_Regiões Administrativas_) and then into neighborhoods (_bairros_). Read more in Appendix A.|\n", "|Scholarship|bool|0: No and 1: Yes|\n", "|Hipertension|bool|0: No and 1: Yes|\n", "|Diabetes|bool|0: No and 1: Yes|\n", "|Alcoholism|bool|0: No and 1: Yes|\n", "|Handcap|int|Number of disabilities of this person (0,1,2,3,4)|\n", "|SMS_received|int|Number of SMS sent to the patient|\n", "|No-show|cat|No: Show-up and Yes: No-show|\n", "\n", "### Additional Data\n", "\n", "I have found some good information available on the internet, and I decided to aggregate it into my analysis. At the end of this document, in the Appendix chapter, I have added some details about where I got this information. As a result, I have written four datasets:\n", "\n", "* Vitória Regional Administration Data;\n", "* Vitória Neighborhood Data;\n", "* Vitória Age Data;\n", "* Vitória UBS Dataset.\n", "\n", "\n", "[data_1]: https://www.kaggle.com/joniarroba\n", "[data_2]: http://www.vitoria.es.gov.br/arquivos/20100519_saude_relator_gestao_2008.pdf\n", "[data_3]: http://sistemas7.vitoria.es.gov.br/GeoWebApi/Downloads/pdf/saude/Equipamentos_de_Saude.pdf\n", "[data_4]: http://sistemas7.vitoria.es.gov.br/GeoWebApi/Downloads/pdf/saude/Regioes_Territoriais_de_Saude.pdf\n", "[data_5]: http://sistemas7.vitoria.es.gov.br/GeoWebApi/Downloads/pdf/politicos/Regionais_Administrativas.pdf\n", "[data_6]: http://geoweb.vitoria.es.gov.br/\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Assess\n", "\n", "Let's start by analysing the number of rows, columns, null values, and duplicated rows, as well as the data types. 
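\n", "\n", "As a minimal sketch of these checks, assuming a hypothetical toy dataframe named `toy` in place of the real data (every name below is illustrative, not part of the notebook), the counts could be gathered like this:\n", "\n", "
```python
import pandas as pd

# Hypothetical toy data; the real notebook inspects df_vit instead.
toy = pd.DataFrame({
    'Age': [62, 56, 62, None],
    'Gender': ['F', 'M', 'F', 'F'],
})

# Collect the basic assessment figures in one place.
summary = {
    'rows': toy.shape[0],
    'columns': toy.shape[1],
    'null_values': int(toy.isnull().sum().sum()),
    'duplicated_rows': int(toy.duplicated().sum()),
}

print(summary)
print(toy.dtypes)  # data type of each column
```
\n", "\n", "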
I am also interested in the first and last rows, to assess whether there is any problem with the header or a non-standard value at the end of the file.\n", "\n", "### Requirements\n", "\n", "These are all the libraries I will use in this report:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Importing Libraries.\n", "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's load the `noshowappointments-kagglev2-may-2016.csv` file. According to the instructions:\n", "\n", ">_This dataset collects information from **100k medical appointments** in Brazil and is focused on the question of whether or not patients show up for their appointment. A number of characteristics about the patient are included in each row._\n", "\n", "So the dataset should have roughly 100,000 medical appointments." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Rows: 110527\n", "Columns: 14\n" ] } ], "source": [ "# Loading the dataset.\n", "df_vit = pd.read_csv('noshowappointments-kagglev2-may-2016.csv')\n", "\n", "# Print the dimensions\n", "print(\"Rows: {}\\nColumns: {}\".format(*df_vit.shape))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As I expected, the dataset has roughly 110k rows and 14 columns.\n", "\n", "Now, I want to see the first and the last 5 rows, just to ensure that everything is OK. " ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " PatientId AppointmentID Gender ScheduledDay \\\n", "0 2.987250e+13 5642903 F 2016-04-29T18:38:08Z \n", "1 5.589978e+14 5642503 M 2016-04-29T16:08:27Z \n", "2 4.262962e+12 5642549 F 2016-04-29T16:19:04Z \n", "3 8.679512e+11 5642828 F 2016-04-29T17:29:31Z \n", "4 8.841186e+12 5642494 F 2016-04-29T16:07:23Z \n", "\n", " AppointmentDay Age Neighbourhood Scholarship Hipertension \\\n", "0 2016-04-29T00:00:00Z 62 JARDIM DA PENHA 0 1 \n", "1 2016-04-29T00:00:00Z 56 JARDIM DA PENHA 0 0 \n", "2 2016-04-29T00:00:00Z 62 MATA DA PRAIA 0 0 \n", "3 2016-04-29T00:00:00Z 8 PONTAL DE CAMBURI 0 0 \n", "4 2016-04-29T00:00:00Z 56 JARDIM DA PENHA 0 1 \n", "\n", " Diabetes Alcoholism Handcap SMS_received No-show \n", "0 0 0 0 0 No \n", "1 0 0 0 0 No \n", "2 0 0 0 0 No \n", "3 0 0 0 0 No \n", "4 1 0 0 0 No " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_vit.head(5) # Print the first 5 rows." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " PatientId AppointmentID Gender ScheduledDay \\\n", "110522 2.572134e+12 5651768 F 2016-05-03T09:15:35Z \n", "110523 3.596266e+12 5650093 F 2016-05-03T07:27:33Z \n", "110524 1.557663e+13 5630692 F 2016-04-27T16:03:52Z \n", "110525 9.213493e+13 5630323 F 2016-04-27T15:09:23Z \n", "110526 3.775115e+14 5629448 F 2016-04-27T13:30:56Z \n", "\n", " AppointmentDay Age Neighbourhood Scholarship Hipertension \\\n", "110522 2016-06-07T00:00:00Z 56 MARIA ORTIZ 0 0 \n", "110523 2016-06-07T00:00:00Z 51 MARIA ORTIZ 0 0 \n", "110524 2016-06-07T00:00:00Z 21 MARIA ORTIZ 0 0 \n", "110525 2016-06-07T00:00:00Z 38 MARIA ORTIZ 0 0 \n", "110526 2016-06-07T00:00:00Z 54 MARIA ORTIZ 0 0 \n", "\n", " Diabetes Alcoholism Handcap SMS_received No-show \n", "110522 0 0 0 1 No \n", "110523 0 0 0 1 No \n", "110524 0 0 0 1 No \n", "110525 0 0 0 1 No \n", "110526 0 0 0 1 No " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_vit.tail(5) # Print the last 5 rows." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After checking the first and last rows and confirming that nothing is wrong with them, I will count the duplicated entries." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of duplicated rows: 0\n" ] } ], "source": [ "# Count the rows that are exact duplicates of an earlier row.\n", "dupe = df_vit.duplicated().sum()\n", "\n", "print(\"Number of duplicated rows: {}\".format(dupe))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The dataframe has no duplicated entries (identical rows). Now, let's look for NA values or any other strange values using the `.info()` method. The results of `.info()` also let me check for inconsistencies in the data type of each variable."
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "RangeIndex: 110527 entries, 0 to 110526\n", "Data columns (total 14 columns):\n", "PatientId 110527 non-null float64\n", "AppointmentID 110527 non-null int64\n", "Gender 110527 non-null object\n", "ScheduledDay 110527 non-null object\n", "AppointmentDay 110527 non-null object\n", "Age 110527 non-null int64\n", "Neighbourhood 110527 non-null object\n", "Scholarship 110527 non-null int64\n", "Hipertension 110527 non-null int64\n", "Diabetes 110527 non-null int64\n", "Alcoholism 110527 non-null int64\n", "Handcap 110527 non-null int64\n", "SMS_received 110527 non-null int64\n", "No-show 110527 non-null object\n", "dtypes: float64(1), int64(8), object(5)\n", "memory usage: 11.8+ MB\n" ] } ], "source": [ "# Check whether any column has NA values and inspect the data types.\n", "df_vit.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Based on the `.info()` results, no column has null values, but I have found some data type problems. I will only comment on what I think is wrong.\n", "\n", "* `PatientId`: identifies the patient, so it must be int;\n", "* `ScheduledDay` and `AppointmentDay`: these are dates, so I will need to convert them;\n", "* `Scholarship`, `Hipertension`, `Diabetes`, `No-show`, and `Alcoholism`: boolean variables, which I also need to convert later.\n", "\n", "Let's see the column names; it may be necessary to convert upper case to lower case."
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['PatientId', 'AppointmentID', 'Gender', 'ScheduledDay',\n", " 'AppointmentDay', 'Age', 'Neighbourhood', 'Scholarship', 'Hipertension',\n", " 'Diabetes', 'Alcoholism', 'Handcap', 'SMS_received', 'No-show'],\n", " dtype='object')" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Print the column names.\n", "df_vit.columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Fortunately, I did not find any spaces in the variable names. However, plenty of variables have an uppercase letter in the middle of the name. To make everything uniform, I will later apply `.lower()`.\n", "\n", "The `.nunique()` method prints the number of unique values for each variable. " ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "PatientId 62299\n", "AppointmentID 110527\n", "Gender 2\n", "ScheduledDay 103549\n", "AppointmentDay 27\n", "Age 104\n", "Neighbourhood 81\n", "Scholarship 2\n", "Hipertension 2\n", "Diabetes 2\n", "Alcoholism 2\n", "Handcap 5\n", "SMS_received 2\n", "No-show 2\n", "dtype: int64" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Print the number of unique values for each variable.\n", "df_vit.nunique()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `.nunique()` results are very clarifying: there are fewer unique `PatientId` values than appointments, so some patients have more than one appointment."
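, "\n", "\n", "One hedged way to quantify this is to count the appointments per patient and then take the share of patients (as opposed to the share of appointments) with more than one appointment. The sketch below runs on a hypothetical toy dataframe named `toy`; the data and the variable names are assumptions for illustration only.\n", "\n", "
```python
import pandas as pd

# Hypothetical toy data: five appointments made by three patients.
toy = pd.DataFrame({'PatientId': [101, 101, 102, 103, 103]})

# Number of appointments booked by each patient.
appointments_per_patient = toy['PatientId'].value_counts()

# Share of patients (not of appointments) with more than one appointment.
pct_returning = round(100 * (appointments_per_patient > 1).mean(), 2)

print(pct_returning)
```
"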
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Repeat appointments: 43.63%\n", "First appointments: 56.37%\n" ] } ], "source": [ "# Share of appointments that are repeat visits versus a patient's first visit.\n", "n_appointments = df_vit.shape[0]\n", "n_patients = df_vit['PatientId'].nunique()\n", "unique_patientid = round(100*(n_appointments-n_patients)/n_appointments,2),round(100*n_patients/n_appointments,2)\n", "\n", "print(\"Repeat appointments: {}%\\nFirst appointments: {}%\".format(*unique_patientid))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's take a look at the `Age` variable. I want to know the `.max()`, `.min()`, `.mean()`, etc., and `.describe()` shows a very good summary." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "count 110527.000000\n", "mean 37.088874\n", "std 23.110205\n", "min -1.000000\n", "25% 18.000000\n", "50% 37.000000\n", "75% 55.000000\n", "max 115.000000\n", "Name: Age, dtype: float64" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Shows the mean, max, min etc.\n", "df_vit['Age'].describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The minimum age of -1 is quite weird. So, I will print all the observations with an age below zero." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " PatientId AppointmentID Gender ScheduledDay \\\n", "99832 4.659432e+14 5775010 F 2016-06-06T08:58:13Z \n", "\n", " AppointmentDay Age Neighbourhood Scholarship Hipertension \\\n", "99832 2016-06-06T00:00:00Z -1 ROMÃO 0 0 \n", "\n", " Diabetes Alcoholism Handcap SMS_received No-show \n", "99832 0 0 0 0 No " ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Cases of age less than zero.\n", "df_vit[df_vit.Age < 0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Only one observation has an age below zero. For this reason, I will remove this observation in the next chapter (Data Cleaning).\n", "\n", "I would also like to know in which months the appointments were scheduled." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "nov: 1\n", "dec: 61\n", "jan: 60\n", "feb: 281\n", "mar: 3614\n", "apr: 25339\n", "may: 67421\n", "jun: 13750\n" ] } ], "source": [ "# Count the appointments scheduled in each month (ScheduledDay is still a string here).\n", "nov = sum(df_vit['ScheduledDay'].str.contains('2015-11'))\n", "dec = sum(df_vit['ScheduledDay'].str.contains('2015-12'))\n", "jan = sum(df_vit['ScheduledDay'].str.contains('2016-01'))\n", "feb = sum(df_vit['ScheduledDay'].str.contains('2016-02'))\n", "mar = sum(df_vit['ScheduledDay'].str.contains('2016-03'))\n", "apr = sum(df_vit['ScheduledDay'].str.contains('2016-04'))\n", "may = sum(df_vit['ScheduledDay'].str.contains('2016-05'))\n", "jun = sum(df_vit['ScheduledDay'].str.contains('2016-06'))\n", "\n", "print(\"nov: {}\\ndec: {}\\njan: {}\\nfeb: {}\\nmar: {}\\napr: {}\\nmay: {}\\njun: {}\".format(nov,dec,jan,feb,mar,apr,may,jun))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The scheduling distribution is very left-skewed: most of the recorded appointments were scheduled in May, April, and June.\n", "\n", "Moving on to the `Handcap` variable, which has 5 levels:\n", "\n", "* 0: No disabilities;\n", "* 1: One disability;\n", "* 2: Two disabilities, and so on."
] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 108286\n", "1 2042\n", "2 183\n", "3 13\n", "4 3\n", "Name: Handcap, dtype: int64" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Number of disabilities each person has (blind, mute, etc.)\n", "df_vit['Handcap'].value_counts() # This is a cumulative count; I do not know what kind of disabilities each patient has." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As the counts show, most of the patients do not have disabilities.\n", "\n", "Let's see the patient gender. " ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "F 71840\n", "M 38687\n", "Name: Gender, dtype: int64" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Gender\n", "df_vit.Gender.value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Women are predominant in these medical appointments.\n", "\n", "The next variable is `SMS_received`. Adopting the interpretation of the creator/maintainer of the dataset, this variable is the number of messages sent to the patient." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 75045\n", "1 35482\n", "Name: SMS_received, dtype: int64" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_vit.SMS_received.value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This variable looks like a boolean, but let's treat it as a quantity because the maintainer said it is the number of messages sent to the patient.\n", "\n", "The next 4 variables are all boolean, so I will bind the results into a single table." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Scholarship Alcoholism Hipertension Diabetes\n", "0 99666 107167 88726 102584\n", "1 10861 3360 21801 7943" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Just to save some lines in the report\n", "b_family = df_vit['Scholarship'].value_counts() # Counts the values of Scholarship\n", "b_alcohol = df_vit['Alcoholism'].value_counts() # Counts the values of Alcoholism\n", "b_hiper = df_vit['Hipertension'].value_counts() # Counts the values of Hipertension\n", "b_diab = df_vit['Diabetes'].value_counts() # Counts the values of Diabetes\n", "\n", "pd.concat([b_family,b_alcohol,b_hiper,b_diab],axis = 1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After this extensive Data Assessing, let's start the Data Cleaning.\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Cleaning\n", "\n", "In this chapter I will edit the `df_vit` dataframe, modifying some variable types, removing undesirable observations, etc.\n", "\n", "Based on the long Data Assessing, this dataframe is quite clean. I just need to convert some variables and remove a non-standard negative age. \n", "\n", "### Renaming Column Names\n", "\n", "Renaming the columns will make the study much easier and faster. The uppercase letters in the middle of the variable names are annoying because you must remember the exact form to type."
] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['patientid', 'appointmentid', 'gender', 'scheduledday',\n", " 'appointmentday', 'age', 'neighbourhood', 'scholarship', 'hipertension',\n", " 'diabetes', 'alcoholism', 'handcap', 'sms_received', 'no_show'],\n", " dtype='object')" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Rename the columns: strip spaces, lower-case everything, and change dashes to underscores.\n", "df_vit.rename(columns=lambda x: x.strip().lower().replace(\"-\", \"_\"), inplace=True)\n", "\n", "# Print the column names to confirm the modification.\n", "df_vit.columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This minor change will save me time in the future by helping me avoid typos.\n", "\n", "### Converting Data Types\n", "\n", "Now, let's convert `patientid`, `scheduledday`, and `appointmentday`." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "# Convert float to int\n", "df_vit.patientid = df_vit.patientid.astype(np.int64) # I am coercing to int64 because, on this setup, the default converts to int32." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The conversion of `scheduledday` and `appointmentday` will be performed with the `numpy` package." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " patientid appointmentid gender scheduledday appointmentday \\\n", "0 29872499824296 5642903 F 2016-04-29 18:38:08 2016-04-29 \n", "1 558997776694438 5642503 M 2016-04-29 16:08:27 2016-04-29 \n", "2 4262962299951 5642549 F 2016-04-29 16:19:04 2016-04-29 \n", "3 867951213174 5642828 F 2016-04-29 17:29:31 2016-04-29 \n", "4 8841186448183 5642494 F 2016-04-29 16:07:23 2016-04-29 \n", "\n", " age neighbourhood scholarship hipertension diabetes alcoholism \\\n", "0 62 JARDIM DA PENHA 0 1 0 0 \n", "1 56 JARDIM DA PENHA 0 0 0 0 \n", "2 62 MATA DA PRAIA 0 0 0 0 \n", "3 8 PONTAL DE CAMBURI 0 0 0 0 \n", "4 56 JARDIM DA PENHA 0 1 1 0 \n", "\n", " handcap sms_received no_show \n", "0 0 0 No \n", "1 0 0 No \n", "2 0 0 No \n", "3 0 0 No \n", "4 0 0 No " ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Converting the ScheduledDay to date.\n", "df_vit.scheduledday = df_vit.scheduledday.apply(np.datetime64)\n", "\n", "# Converting the AppointmentDay to date.\n", "df_vit.appointmentday = df_vit.appointmentday.apply(np.datetime64)\n", "\n", "# Print the first 5 rows.\n", "df_vit.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Both `datetime` variables are converted, but with minor differences.\n", "\n", "* `scheduledday`: it is OK;\n", "* `appointmentday`: it loses the hours, minutes, and seconds, probably because all the cases have the same time, `00:00:00`.\n", "\n", "Now, let's convert the categorical variables:\n", "\n", "* `gender`: male and female categories;\n", "* `no_show`: yes or no."
] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "# Converting the no_show variable to categorical\n", "df_vit['no_show'] = pd.Categorical(df_vit['no_show'])\n", "\n", "# Converting the gender variable to categorical\n", "df_vit['gender'] = pd.Categorical(df_vit['gender'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, I will convert the following 0/1 variables to boolean." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "# Converting int64 variables to bool.\n", "df_vit.scholarship = df_vit.scholarship.apply(bool)\n", "df_vit.hipertension = df_vit.hipertension.apply(bool)\n", "df_vit.diabetes = df_vit.diabetes.apply(bool)\n", "df_vit.alcoholism = df_vit.alcoholism.apply(bool)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's print the `.info()` once again to confirm the conversion." ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 110527 entries, 0 to 110526\n", "Data columns (total 14 columns):\n", "patientid 110527 non-null int64\n", "appointmentid 110527 non-null int64\n", "gender 110527 non-null category\n", "scheduledday 110527 non-null datetime64[ns]\n", "appointmentday 110527 non-null datetime64[ns]\n", "age 110527 non-null int64\n", "neighbourhood 110527 non-null object\n", "scholarship 110527 non-null bool\n", "hipertension 110527 non-null bool\n", "diabetes 110527 non-null bool\n", "alcoholism 110527 non-null bool\n", "handcap 110527 non-null int64\n", "sms_received 110527 non-null int64\n", "no_show 110527 non-null category\n", "dtypes: bool(4), category(2), datetime64[ns](2), int64(5), object(1)\n", "memory usage: 7.4+ MB\n" ] } ], "source": [ "df_vit.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data Filtering\n", "\n", "I will remove all observations with age below zero because a negative age is not valid. 
I know there is only one observation with this characteristic, so this deletion is unlikely to affect the analysis." ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of rows: 110526\n" ] } ], "source": [ "# Keep only the rows with age >= 0 (the column name is now the lowercase age).\n", "df_vit = df_vit[df_vit.age >= 0]\n", "\n", "# Print the number of rows\n", "print(\"Number of rows: {}\".format(df_vit.shape[0]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As I expected, the number of rows has decreased by one.\n", "\n", "Here, I finished the Data Wrangling.\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exploratory Data Analysis \n", "\n", "Sadly, I did not come up with a good and interesting question, but as a kickstart to my Exploratory Analysis, I will look for a relationship between how many appointments a patient has and how often that patient no-shows.\n", "\n", "> My Question: Do patients with many appointments show up more frequently?\n", "\n", "First, I want to know the percentage of no-shows across all appointments." 
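, "\n", "\n", "As a reference for the calculation that follows, this is the rate I am after, written in my own notation (the symbols are an assumption of mine, not part of the dataset). Keep in mind that `no_show` equal to `Yes` means the patient did **not** attend:\n", "\n", "$$r_{\\text{no-show}} = 100 \\times \\frac{N_{\\text{Yes}}}{N_{\\text{Yes}} + N_{\\text{No}}}$$\n", "\n", "where $N_{\\text{Yes}}$ and $N_{\\text{No}}$ are the numbers of appointments with `no_show` equal to `Yes` and `No`."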
] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "No 88207\n", "Yes 22319\n", "Name: no_show, dtype: int64\n", "Appointments with Show-up: 88207\n", "Appointments with No-show: 22319\n", "\n", "No-show Percentage: 20.0%\n" ] } ], "source": [ "no_show_p = df_vit['no_show'].value_counts() # First show-up and second no-show.\n", "\n", "print(no_show_p)\n", "\n", "print(\"Appointments with Show-up: {}\\nAppointments with No-show: {}\\n\".format(*no_show_p)) # No-show: Yes : 22319\n", "# The stray second argument of format() was removed; the rate is rounded to the nearest integer.\n", "print(\"No-show Percentage: {}%\".format(round(100*no_show_p['Yes']/sum(no_show_p)))) # Show-up: No : 88207" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The no-show rate is about 20% across all appointments, but I know that some patients have more than one appointment. Let's dig deeper into this by plotting a histogram." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Histogram of the number of appointments per patient id.\n", "df_vit['patientid'].value_counts().hist(bins = [1, 2, 3, 4, 5, 6, 7, 8, 9], # Bin edges from 1 to 9 (8 bins)\n", " figsize= [15,4], # Stretch the image horizontally\n", " cumulative = False); # Not cumulative\n", "\n", "plt.title('Graphic 1 - Vitória City Appointments per Patient ID Histogram') # Add title\n", "plt.xlabel('Appointments per Patient ID') # X axis Label\n", "plt.ylabel('Frequency'); # Y axis Label" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The histogram does **not** make it clear that only a few patients have several appointments, so I will list the `patientid`s with the most appointments to analyse this in detail." ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "822145925426128 88\n", "99637671331 84\n", "26886125921145 70\n", "33534783483176 65\n", "258424392677 62\n", "Name: patientid, dtype: int64" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# These are the patient ids with the most appointments.\n", "df_vit['patientid'].value_counts().head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, some patients have a high number of appointments. For this reason, I would like to test the [80/20][1] rule, so I will sum the appointments of the 20% of `patientid`s with the highest frequency.\n", "\n", "[1]: https://en.wikipedia.org/wiki/Pareto_principle" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of appointments : 48767\n", "Number of appointments (%): 44%\n" ] } ], "source": [ "# Number of unique patient id \n", "unq_pat = df_vit['patientid'].nunique()\n", "\n", "# Summation of the appointments of the 20% of patients with the highest number of appointments\n", "res = sum(df_vit['patientid'].value_counts().head(int(unq_pat * 
0.20))) # I use int() to ensure an integer number of rows.\n", "\n", "# Print (the stray second argument of format() was removed; the share is rounded to the nearest integer)\n", "print('Number of appointments : {}'.format(res))\n", "print('Number of appointments (%): {}%'.format(round(100*res/df_vit.shape[0])))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The 20% of patients with the highest frequency (`patientid` with many appointments) are responsible for 44% of the total appointments. This is far from 80%, but it still shows a kind of \"asymmetry\".\n", "\n", "### Question 1\n", "\n", ">**Are patients with many appointments more likely to show up than patients with fewer than two appointments?**\n", "\n", "Let's divide the dataset into two groups, using a threshold of at least 3 appointments to classify each `patientid`.\n", "\n", "* First group (`df_gp1`): patients with a higher number of appointments (3 or more), and;\n", "* Second group (`df_gp2`): patients with a lower number of appointments (fewer than 3)." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False 51814\n", "True 10484\n", "Name: patientid, dtype: int64" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Counts the number of appointment for each patient id. 
# I have decided to split in several lines\n", "n_appoin_per_pat_id = df_vit['patientid'].value_counts() # because in the future this is much easier for \n", " # me to revise this code, and to understand.\n", "# Creates a \"vector\" of boolean (True or False) for each patient id.\n", "is_greater_3 = n_appoin_per_pat_id >= 3\n", "is_not_great_3 = n_appoin_per_pat_id < 3\n", "\n", "# Quantity of patients with 3 or more appointments.\n", "is_greater_3.value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are 10,484 patients with 3 or more appointments and 51,814 patients with fewer than 3 appointments.\n", "\n", "_Obs.: I have read [this](https://stackoverflow.com/questions/26640145/python-pandas-how-to-get-the-row-names-from-index-of-a-dataframe) thread on Stack Overflow to understand how to extract the \"row names.\"_" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[822145925426128,\n", " 99637671331,\n", " 26886125921145,\n", " 33534783483176,\n", " 258424392677,\n", " 75797461494159,\n", " 871374938638855,\n", " 6264198675331,\n", " 66844879846766,\n", " 872278549442]" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Creating the lists of patientid for each group.\n", "more_3 = list(n_appoin_per_pat_id[is_greater_3].index) # List of patient ids with 3 or more appointments\n", "less_3 = list(n_appoin_per_pat_id[is_not_great_3].index) # List of patient ids with fewer than 3 appointments\n", "\n", "# First 10 patient ids with 3 or more appointments.\n", "more_3[:10] # Slicing an Example" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`more_3` and `less_3` are two lists of `patientid`; they will be used to select the rows of `df_vit`." 
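, "\n", "\n", "A quick piece of arithmetic on the counts printed above (the notation is my own, not something from the dataset): writing $P_{\\ge 3}$ for the patients in `more_3` and $P_{\\lt 3}$ for the patients in `less_3`, the two lists should split the unique patients without overlap, so\n", "\n", "$$|P_{\\ge 3}| + |P_{\\lt 3}| = 10484 + 51814 = 62298,$$\n", "\n", "and the patients with 3 or more appointments are roughly $100 \\times 10484 / 62298 \\approx 16.8\\%$ of all patients."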
] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n" ] } ], "source": [ "# Testing for any loss of information.\n", "print(df_vit['patientid'].nunique() == len(more_3) + len(less_3)) # The total must be equal to these two subsets." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I have inserted a test to ensure that the two lists together contain the same number of unique patients as the original data frame. If the result is `True`, everything is OK!" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Dataframe with higher number appointments: 10484\n", "Dataframe with lower number appointments: 51814\n" ] } ], "source": [ "# Print the number of patient id in each list.\n", "print(\"Dataframe with higher number appointments: {}\\nDataframe with lower number appointments: {}\".format(len(more_3),len(less_3)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These two lists (`more_3` and `less_3`) will be used to \"filter\" the rows and decide which ones belong to `df_gp1` or `df_gp2`.\n", "\n", "_Obs.: I have read [this](https://stackoverflow.com/questions/12065885/filter-dataframe-rows-if-value-in-column-is-in-a-set-list-of-values) Stack Overflow thread to learn how to use `.isin()`._" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "df_gp1\n", "Observations: 44817\n", "Columns: 14\n", "\n", "df_gp2\n", "Observations: 65709\n", "Columns: 14\n" ] } ], "source": [ "# Subsetting the Dataframe with the patients with 3 or more appointments\n", "df_gp1 = df_vit[df_vit['patientid'].isin(more_3)]\n", "\n", "# Subsetting the Dataframe with the patients with fewer than 3 appointments\n", "df_gp2 = df_vit[df_vit['patientid'].isin(less_3)]\n", "\n", "# Printing the dimension 
of each dataframe\n", "print(\"df_gp1\\nObservations: {}\\nColumns: {}\\n\\ndf_gp2\\nObservations: {}\\nColumns: {}\".format(*df_gp1.shape,*df_gp2.shape))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Both data frames are ready for the no-show rate calculation." ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "No-show: 21.09%\n" ] } ], "source": [ "# Assign the result of value_counts() to a variable, I need it to make calculations\n", "df_gp1_res_no_show = df_gp1.no_show.value_counts()\n", "\n", "# Percentage calculation ('Yes' means the patient did not show up)\n", "df_gp1_res_no_show_p = 100 * df_gp1_res_no_show['Yes']/df_gp1_res_no_show.sum()\n", "\n", "# Print results and rounding.\n", "print(\"No-show: {}%\".format(round(df_gp1_res_no_show_p,2)))" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "No-show: 19.58%\n" ] } ], "source": [ "# Assign the result of value_counts() to a variable, I need it to make calculations\n", "df_gp2_res_no_show = df_gp2.no_show.value_counts()\n", "\n", "# Percentage calculation ('Yes' means the patient did not show up)\n", "df_gp2_res_no_show_p = 100 * df_gp2_res_no_show['Yes']/df_gp2_res_no_show.sum()\n", "\n", "print(\"No-show: {}%\".format(round(df_gp2_res_no_show_p,2)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Although the difference between these two results is slight, patients from the group `df_gp1` have a higher no-show percentage than those from `df_gp2`. This goes against my expectations (and, at least for this threshold, answers my question in the negative).\n", "\n", "I will make a sensitivity analysis, varying the threshold from 3 appointments to 4, 5 and so on, so I will create a function to encapsulate these lines of code." 
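, "\n", "\n", "Before generalising, here is the calculation written out in my own notation (the symbols are assumptions of mine, not part of the dataset). For a threshold $n$, with $a_p$ the number of appointments of patient $p$, the two groups are\n", "\n", "$$G_{\\ge n} = \\lbrace p : a_p \\ge n \\rbrace, \\qquad G_{\\lt n} = \\lbrace p : a_p \\lt n \\rbrace,$$\n", "\n", "and the no-show rate of a group $G$ is\n", "\n", "$$r(G) = 100 \\times \\frac{\\text{no-show appointments of patients in } G}{\\text{all appointments of patients in } G}.$$\n", "\n", "As a sanity check for $n = 3$, using only figures printed above, the appointment-weighted average of the two group rates should reproduce the overall rate:\n", "\n", "$$\\frac{44817 \\times 21.09 + 65709 \\times 19.58}{110526} \\approx 20.19\\% \\qquad \\text{and} \\qquad 100 \\times \\frac{22319}{110526} \\approx 20.19\\%.$$"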
] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [], "source": [ "# I encapsulated the steps above in one function so I do not have to repeat them for each threshold.\n", "def attendance(data, n, complete = False):\n", "    \"\"\"\n", "    Calculate the no-show rate of two groups of patients, split by a threshold n.\n", "\n", "    Parameters\n", "    ----------\n", "    data : dataframe\n", "        This dataset must be the imported 'noshowappointments-kagglev2-may-2016.csv'\n", "        file, but a treatment is required before using this function:\n", "        the column names should be renamed to lower case.\n", "    n : int\n", "        The threshold number of appointments used to classify each patient id.\n", "    complete : bool\n", "        If False, return only the percentages; if True, also return the data frames.\n", "\n", "    Returns\n", "    -------\n", "    list\n", "        [m_no_show_p, l_no_show_p]: the first element is the no-show percentage of\n", "        patients with n or more appointments, the second element is the percentage\n", "        of the group with fewer than n appointments. If complete is True, df_more\n", "        and df_less (the rows of each group, in this order) are appended.\n", "    \"\"\"\n", "    # Number of appointments per patient id (computed once and reused).\n", "    n_appoin = data['patientid'].value_counts()\n", "    \n", "    # Extracting the patient ids of each group as lists.\n", "    more = list(n_appoin[n_appoin >= n].index)\n", "    less = list(n_appoin[n_appoin < n].index)\n", "    \n", "    # Subsetting the data into two dataframes.\n", "    df_more = data[data['patientid'].isin(more)]\n", "    df_less = data[data['patientid'].isin(less)]\n", "    \n", "    # No-show rate of the first dataframe (patients with n or more appointments).\n", "    # The 'Yes' label is used instead of a position, which depends on the sort order of value_counts().\n", "    m_no_show = df_more.no_show.value_counts()\n", "    m_no_show_p = 100 * m_no_show['Yes']/m_no_show.sum()\n", "    \n", "    # No-show rate of the second dataframe (patients with fewer than n appointments).\n", "    l_no_show = df_less.no_show.value_counts()\n", "    l_no_show_p = 100 * l_no_show['Yes']/l_no_show.sum()\n", "    \n", "    if not complete:\n", "        return [round(m_no_show_p,2), round(l_no_show_p,2)] # Return results in percentage and in a list.\n", "    else:\n", "        return [round(m_no_show_p,2), round(l_no_show_p,2), df_more, df_less]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The function `attendance` calculates, for each threshold, the no-show percentage of the two groups." ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "# Variable Initialization\n", "res_no_show_2_88 = []\n", "\n", "# Loop to generate the percentages for each index value (thresholds 2 to 87).\n", "for index in range(2,88):\n", "    res_no_show_2_88.append(attendance(df_vit, index))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The variable `res_no_show_2_88` is a list in which each element is a pair of no-show percentages. 
For this reason, I need to unpack this list before plotting it." ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [], "source": [ "# Unpacking the res_no_show_2_88 variable.\n", "# Despite its name, show_up_2_88 holds the no-show rate of the second group (fewer than n appointments).\n", "no_show_2_88, show_up_2_88 = zip(*res_no_show_2_88)\n", "\n", "# X axis variable (thresholds 2 to 87, matching range(2,88) above)\n", "threshold = np.arange(2,88)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I have created two variables (`no_show_2_88` and `show_up_2_88`) to be plotted and a third variable with the threshold values (`threshold`)." ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Defining the size of the Plot.\n", "plt.figure(figsize= [12,4])\n", "\n", "# Red dashes for Group 1 and blue squares for Group 2\n", "plt.plot(threshold, no_show_2_88, 'r--',threshold,show_up_2_88, 'bs')\n", "\n", "plt.xlabel('Number of appointments by Patient ID') # X label\n", "plt.ylabel('No-show [%]') # Y label\n", "plt.title('Graphic 2 - No-show rate by Patient ID') # Graphic Title\n", "plt.text(25, 5.5, r'Knee') # Text \n", "plt.grid(True) # Grid\n", "plt.legend(('Group1', 'Group2')) # Legend\n", "\n", "plt.show() # Plot the graphic" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The graphic above shows the behaviour of the no-show rate when the threshold varies from 2 to 87. Keep in mind that I defined the threshold as the number of appointments of each patient.\n", "\n", "#### Analyzing the red line (Group 1):\n", "\n", "* Roughly, from 7 to 25 there is a negative linear relationship between the number of appointments and the no-show rate;\n", "    * These patients tend not to miss an appointment;\n", "* Patients with more than 25 appointments have a very low no-show rate.\n", "\n", "#### Analyzing the blue squares (Group 2):\n", "\n", "* The rate does not show any change as the threshold varies.\n", "\n", "### Answer Question 1\n", "\n", ">As I expected, a higher appointment frequency for a given `patientid` leads to more show-ups (or fewer no-shows). This is probably a consequence of the severity of the illness (chemotherapy, radiotherapy, dialysis, etc.), which obliges the patient to go to the hospital several times.\n", "\n", "I have recognised a shortage of information, which is why I will stop this analysis here to collect new data. 
Later, I will come back to a deeper analysis because I think there is an opportunity for more exploratory analysis.\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Merging\n", "\n", "I have gathered some additional data from the [Vitória Council Website](http://legado.vitoria.es.gov.br/regionais/home.asp); let's load it to make a new analysis. I have not yet figured out the second question, and joining this additional data could lead me to a new insight." ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " id_ra reg_adm nbh pop_2000 pop_2010 male female \\\n", "0 1 Centro Centro 9240 9838 4416 5422 \n", "1 1 Centro Do Moscoso 854 795 369 426 \n", "2 1 Centro Fonte Grande 1459 1231 592 639 \n", "3 1 Centro Ilha do Príncipe 2810 2613 1194 1419 \n", "4 1 Centro Parque Moscoso 1708 1773 791 982 \n", "\n", " avg_inc_mon \n", "0 1791,15 \n", "1 647,05 \n", "2 706,03 \n", "3 623,21 \n", "4 1754,79 " ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Loading the Vitoria Auxiliary data.\n", "vit_aux = pd.read_csv('02-Datasets/vit_aux.csv', sep=\";\");\n", "\n", "# Print the first 5 rows.\n", "vit_aux.head()" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
\n", "
" ], "text/plain": [ " patientid appointmentid gender scheduledday appointmentday \\\n", "0 29872499824296 5642903 F 2016-04-29 18:38:08 2016-04-29 \n", "1 558997776694438 5642503 M 2016-04-29 16:08:27 2016-04-29 \n", "\n", " age neighbourhood scholarship hipertension diabetes alcoholism \\\n", "0 62 JARDIM DA PENHA False True False False \n", "1 56 JARDIM DA PENHA False False False False \n", "\n", " handcap sms_received no_show \n", "0 0 0 No \n", "1 0 0 No " ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_vit.head(2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Comparing these two datasets, I realized the only variable in common is the neighborhood (or neighbourhood in British English). Sadly, neither the variable name nor the content is equivalent, which means I need to edit the column in both data frames. Again, I will use a tailored function to deal with this problem." ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": [ "def lower_ahu(data):\n", "    \"\"\"\n", "    Convert any uppercase letter to lowercase, convert spaces (and similar\n", "    separators) into underscores, change a comma into a dot, and change any\n", "    non-standard letter such as á,é,í,ó,ú,ã,õ,ç etc. to a \"regular\" letter\n", "    without the accent.\n", "\n", "    Parameters\n", "    ----------\n", "    data : list\n", "        The input is expected to be a list of strings.\n", "\n", "    Returns\n", "    -------\n", "    list\n", "        The converted version of the list, without any accented word.\n", "    \"\"\"\n", "    # This helps to reduce the number of \"types\" of letters.\n", "    data = [x.lower() for x in data]\n", "    \n", "    # My list of non-standard characters and their replacements.\n", "    non_standard = [['ã','a'],['á','a'],['à','a'],['â','a'],\n", "                    ['é','e'],['è','e'],['ê','e'],\n", "                    ['í','i'],['ì','i'],\n", "                    ['õ','o'],['ó','o'],['ò','o'],['ô','o'],\n", "                    ['ú','u'],['ù','u'],\n", "                    ['ç','c'],\n", "                    [\" \",'_'],[\"-\",'_'],[\"'\",'_'],[\"`\",'_'],[\"´\",'_'],[\",\",'.']]\n", "    \n", "    for wrong,correct in non_standard:\n", "        data = [x.replace(wrong,correct) for x in data] # Where the magic happens! \n", "    \n", "    return data # Return the data without accents." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After defining the function, let's use it." ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Converting the content of the nbh variable to lowercase and removing Portuguese accents.\n", "vit_aux['nbh'] = lower_ahu(vit_aux['nbh'])\n", "vit_aux['reg_adm'] = lower_ahu(vit_aux['reg_adm'])\n", "\n", "# Converting the content of the neighbourhood variable to lowercase and removing Portuguese accents.\n", "df_vit['neighbourhood'] = lower_ahu(df_vit['neighbourhood'])\n", "\n", "# Let's compare both sets of categories.\n", "# == compares the contents; `is` would compare object identity and always give False here.\n", "sorted(list(vit_aux.nbh.unique())) == sorted(list(df_vit.neighbourhood.unique()))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Unfortunately, the result is `False`. Let's dig in to find out why, starting with the number of neighborhoods in each data frame." 
] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Auxiliary Dataframe: 80\n", "Vitória Hospital Appointment: 81\n" ] } ], "source": [ "# Number of neighborhoods in each dataframe.\n", "nbh_comparison = len(vit_aux.nbh.unique()),len(df_vit.neighbourhood.unique())\n", "\n", "print(\"Auxiliary Dataframe: {}\\nVitória Hospital Appointment: {}\".format(*nbh_comparison))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, there is one neighborhood missing from the `vit_aux` data frame. After a few minutes comparing the two lists, I found that this neighborhood is `ilhas_oceanicas_de_trindade`. Curiously, this island is 1,200 kilometers from the coast, which means it is an overseas territory, a place where only 32 Marines live and no one else." ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
\n", "
" ], "text/plain": [ " patientid appointmentid gender scheduledday appointmentday \\\n", "48754 534986855114 5583947 F 2016-04-14 12:25:43 2016-05-13 \n", "48765 7256429752481 5583948 F 2016-04-14 12:26:13 2016-05-13 \n", "\n", " age neighbourhood scholarship hipertension diabetes \\\n", "48754 51 ilhas_oceanicas_de_trindade False False False \n", "48765 58 ilhas_oceanicas_de_trindade False False False \n", "\n", " alcoholism handcap sms_received no_show \n", "48754 False 0 0 Yes \n", "48765 False 0 0 Yes " ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Print all observations from ilhas_oceanicas_de_trindade.\n", "df_vit[df_vit['neighbourhood'] == 'ilhas_oceanicas_de_trindade']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, there are only 2 observations, from two different patients, and both no-showed. Because these two observations are inconsistent with the rest of the data, I will remove them; I am convinced that doing so will not affect my results." ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Rows: 110524\n", "Columns: 14\n" ] } ], "source": [ "# Creating a boolean vector that is False only for the observations with ilhas_oceanicas_de_trindade as neighbourhood.\n", "rm_trindade = df_vit['neighbourhood'] != 'ilhas_oceanicas_de_trindade'\n", "\n", "# Subsetting using the vector rm_trindade to remove these observations.\n", "df_vit = df_vit[rm_trindade]\n", "\n", "# Print dimensions\n", "print(\"Rows: {}\\nColumns: {}\".format(*df_vit.shape))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, I can compare the neighborhood columns of these two data frames." 
] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Comparing the categories of these two columns. If True, I can join them.\n", "sorted(list(df_vit.neighbourhood.unique())) == sorted(list(vit_aux.nbh.unique()))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is OK, so I will now rename the column so that both data frames use the same name (`neighbourhood` will be replaced by `nbh`)." ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [], "source": [ "# Renaming the neighbourhood column by name, which does not depend on its position.\n", "df_vit = df_vit.rename(columns={'neighbourhood': 'nbh'})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before any merge, I will convert `avg_inc_mon` to `float64`. You can see that `avg_inc_mon` is originally imported as `str/obj`." 
] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 80 entries, 0 to 79\n", "Data columns (total 8 columns):\n", "id_ra 80 non-null int64\n", "reg_adm 80 non-null object\n", "nbh 80 non-null object\n", "pop_2000 80 non-null int64\n", "pop_2010 80 non-null int64\n", "male 80 non-null int64\n", "female 80 non-null int64\n", "avg_inc_mon 80 non-null object\n", "dtypes: int64(5), object(3)\n", "memory usage: 5.1+ KB\n" ] } ], "source": [ "vit_aux.info()" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [], "source": [ "# Converting the Brazilian number format to the USA one - changing , to . (lower_ahu also maps ',' to '.')\n", "vit_aux['avg_inc_mon'] = lower_ahu(vit_aux['avg_inc_mon'])\n", "\n", "# Converting str/obj to float\n", "vit_aux['avg_inc_mon'] = vit_aux['avg_inc_mon'].astype(float)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Everything is set up and ready for the merge.\n", "\n", "_Obs.: I have read on the pandas website how to merge; [this][merge9] is the page I have read._\n", "\n", "[merge9]: https://pandas.pydata.org/pandas-docs/stable/merging.html#database-style-dataframe-joining-merging" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " patientid appointmentid gender scheduledday appointmentday \\\n", "0 29872499824296 5642903 F 2016-04-29 18:38:08 2016-04-29 \n", "1 558997776694438 5642503 M 2016-04-29 16:08:27 2016-04-29 \n", "2 4262962299951 5642549 F 2016-04-29 16:19:04 2016-04-29 \n", "3 867951213174 5642828 F 2016-04-29 17:29:31 2016-04-29 \n", "4 8841186448183 5642494 F 2016-04-29 16:07:23 2016-04-29 \n", "\n", " age nbh scholarship hipertension diabetes ... \\\n", "0 62 jardim_da_penha False True False ... \n", "1 56 jardim_da_penha False False False ... \n", "2 62 mata_da_praia False False False ... \n", "3 8 pontal_de_camburi False False False ... \n", "4 56 jardim_da_penha False True True ... \n", "\n", " handcap sms_received no_show id_ra reg_adm pop_2000 pop_2010 \\\n", "0 0 0 No 9 jardim_da_penha 24623 30571 \n", "1 0 0 No 9 jardim_da_penha 24623 30571 \n", "2 0 0 No 9 jardim_da_penha 9317 10594 \n", "3 0 0 No 9 jardim_da_penha 992 889 \n", "4 0 0 No 9 jardim_da_penha 24623 30571 \n", "\n", " male female avg_inc_mon \n", "0 13702 16869 2510.89 \n", "1 13702 16869 2510.89 \n", "2 5005 5589 4119.31 \n", "3 452 437 1498.52 \n", "4 13702 16869 2510.89 \n", "\n", "[5 rows x 21 columns]" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Merging the two datasets on the neighbourhood column.\n", "df_vit = pd.merge(df_vit, vit_aux, how='left', on='nbh')\n", "\n", "# Print the first 5 rows\n", "df_vit.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now `df_vit` has a few more variables:\n", "\n", "* Regional Administration name;\n", "* Population of the neighbourhood in 2000 and 2010;\n", "* Number of male residents;\n", "* Number of female residents, and;\n", "* Monthly Average Income in BRL.\n", "\n", "Let's check `.describe()` and `.info()` to see whether any problems arose during the merging process." 
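, "\n", "\n",
"A minimal sketch of how the merge itself could be checked, assuming hypothetical toy data frames (`appointments` and `neighbourhoods` are illustrative stand-ins, not the real `df_vit` and `vit_aux`). The `validate` and `indicator` parameters of `pd.merge` are used here as one possible safeguard; the merge cell above does not use them.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical toy frames standing in for the appointments and the auxiliary table.\n",
"appointments = pd.DataFrame({\n",
"    'patientid': [1, 2, 3],\n",
"    'nbh': ['jardim_da_penha', 'mata_da_praia', 'unknown_nbh'],\n",
"})\n",
"neighbourhoods = pd.DataFrame({\n",
"    'nbh': ['jardim_da_penha', 'mata_da_praia'],\n",
"    'avg_inc_mon': [2510.89, 4119.31],\n",
"})\n",
"\n",
"# validate='m:1' raises an error if a neighbourhood appears twice on the right side;\n",
"# indicator=True adds a '_merge' column telling where each row came from.\n",
"merged = pd.merge(appointments, neighbourhoods, how='left', on='nbh',\n",
"                  validate='m:1', indicator=True)\n",
"\n",
"# Rows flagged 'left_only' found no matching neighbourhood.\n",
"unmatched = merged[merged['_merge'] == 'left_only']\n",
"print(len(merged), len(unmatched))\n",
"```\n",
"\n",
"On real data, an empty `unmatched` frame would suggest that every appointment found its neighbourhood."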
] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " patientid appointmentid age handcap \\\n", "count 1.105240e+05 1.105240e+05 110524.000000 110524.000000 \n", "mean 1.474960e+14 5.675306e+06 37.088904 0.022249 \n", "std 2.560959e+14 7.129503e+04 23.110111 0.161545 \n", "min 3.921700e+04 5.030230e+06 0.000000 0.000000 \n", "25% 4.172693e+12 5.640287e+06 18.000000 0.000000 \n", "50% 3.173184e+13 5.680574e+06 37.000000 0.000000 \n", "75% 9.439068e+13 5.725523e+06 55.000000 0.000000 \n", "max 9.999816e+14 5.790484e+06 115.000000 4.000000 \n", "\n", " sms_received id_ra pop_2000 pop_2010 \\\n", "count 110524.000000 110524.000000 110524.000000 110524.000000 \n", "mean 0.321034 4.600087 6682.462234 8183.312665 \n", "std 0.466876 2.309988 6619.251330 10209.917257 \n", "min 0.000000 1.000000 0.000000 0.000000 \n", "25% 0.000000 3.000000 2486.000000 2565.000000 \n", "50% 0.000000 4.000000 4087.000000 4402.000000 \n", "75% 1.000000 7.000000 7585.000000 7913.000000 \n", "max 1.000000 9.000000 24623.000000 39157.000000 \n", "\n", " male female avg_inc_mon \n", "count 110524.000000 110524.000000 110524.000000 \n", "mean 3825.968857 4357.343808 1063.303482 \n", "std 4705.474679 5506.939496 793.033265 \n", "min 0.000000 0.000000 0.000000 \n", "25% 1198.000000 1327.000000 558.110000 \n", "50% 2102.000000 2290.000000 741.850000 \n", "75% 3771.000000 4142.000000 1300.280000 \n", "max 18230.000000 20927.000000 10843.740000 " ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_vit.describe()" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Int64Index: 110524 entries, 0 to 110523\n", "Data columns (total 21 columns):\n", "patientid 110524 non-null int64\n", "appointmentid 110524 non-null int64\n", "gender 110524 non-null category\n", "scheduledday 110524 non-null datetime64[ns]\n", "appointmentday 110524 non-null datetime64[ns]\n", "age 110524 non-null int64\n", "nbh 110524 
non-null object\n", "scholarship 110524 non-null bool\n", "hipertension 110524 non-null bool\n", "diabetes 110524 non-null bool\n", "alcoholism 110524 non-null bool\n", "handcap 110524 non-null int64\n", "sms_received 110524 non-null int64\n", "no_show 110524 non-null category\n", "id_ra 110524 non-null int64\n", "reg_adm 110524 non-null object\n", "pop_2000 110524 non-null int64\n", "pop_2010 110524 non-null int64\n", "male 110524 non-null int64\n", "female 110524 non-null int64\n", "avg_inc_mon 110524 non-null float64\n", "dtypes: bool(4), category(2), datetime64[ns](2), float64(1), int64(10), object(2)\n", "memory usage: 14.1+ MB\n" ] } ], "source": [ "df_vit.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Everything looks fine, but before asking any question I would like to point out two assumptions:\n", "\n", "* The Monthly Average Income is based on the residents of the neighbourhood, and I assume every patient goes to the UBS closest to their residence, and;\n", "* Male and female residents are assumed to have the same Monthly Average Income.\n", "\n", "So let's start some analysis with these new features." ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Monthly Average Income Histogram (one entry per appointment)\n", "df_vit['avg_inc_mon'].hist(bins = 11, figsize = [12,4]);\n", "plt.xlim(0,5000)\n", "\n", "plt.title('Graphic 3 - Histogram of Monthly Average Income') # Add title\n", "plt.xlabel('Monthly Average Income [BRL]') # X axis Label\n", "plt.ylabel('Frequency'); # Y axis Label" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Concretely, the majority of the people served by the public health system in Vitória live in neighbourhoods with an average income of around 1000 BRL/month.\n", "\n", "However, some `patientid` values have multiple entries, meaning the patient had several appointments, so I must remove the duplicated `patientid` rows before any per-patient analysis, for instance before calculating `.mean()` and `.count()`.\n", "\n", "Let's remove the duplicated `patientid` rows from `df_vit`. " ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(62296, 21)" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Flagging every repeated occurrence of a patientid (the first occurrence is not flagged)\n", "dupe = df_vit['patientid'].duplicated()\n", "\n", "# Inverting the mask: True now marks the first row of each patient.\n", "dupe = ~dupe\n", "\n", "# Selecting the unique patients, no duplicates.\n", "df_clean = df_vit[dupe]\n", "\n", "# Dimensions of the cleaned data set.\n", "df_clean.shape # The number of rows equals len(df_vit['patientid'].unique())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I want to calculate the average age and the average income over all distinct patients." 
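, "\n", "\n",
"A minimal sketch of the per-patient averages, assuming hypothetical toy data (`toy` is illustrative, not the real `df_vit`) and using `drop_duplicates` as one possible way to keep a single row per `patientid`.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical toy data: patient 1 has two appointments.\n",
"toy = pd.DataFrame({\n",
"    'patientid': [1, 1, 2, 3],\n",
"    'age': [60, 60, 30, 45],\n",
"    'avg_inc_mon': [1000.0, 1000.0, 500.0, 1500.0],\n",
"})\n",
"\n",
"# Keep the first appointment of each patient, so each patient counts once.\n",
"unique_patients = toy.drop_duplicates(subset='patientid', keep='first')\n",
"\n",
"mean_age = round(unique_patients['age'].mean(), 2)\n",
"mean_income = round(unique_patients['avg_inc_mon'].mean(), 2)\n",
"print(mean_age, mean_income)\n",
"```\n",
"\n",
"In this toy example the appointment-level mean age is 48.75, while the per-patient mean is 45.0, because patient 1 is counted only once."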
] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "36.7 years\n", "1068.71 BRL\n" ] } ], "source": [ "# Average age and average income calculation\n", "print(\"{} years\".format(round(df_clean['age'].mean(),2)))\n", "print(\"{} BRL\".format(round(df_clean['avg_inc_mon'].mean(),2)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I will wrap the removal of duplicated `patientid` rows in a function." ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [], "source": [ "# Defining a function to eliminate the rows with a duplicated patientid.\n", "def no_dupe_pat(data_dupe):\n", "    \"\"\"\n", "    DESCRIPTION:\n", "        This function removes all duplicated rows of a given patientid,\n", "        keeping the first row of each patient.\n", "\n", "    INPUT:\n", "        data_dupe (data frame): Any data frame with at least the patientid column.\n", "\n", "    OUTPUT:\n", "        data frame: A data frame with no duplicated patientid.\n", "    \"\"\"\n", "    # Flagging every repeated occurrence of a patientid\n", "    dupe_less = data_dupe['patientid'].duplicated()\n", "\n", "    # Inverting the mask, selecting the unique patients and returning them.\n", "    return data_dupe[~dupe_less]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With this function, I can calculate more accurate values of the average age, average income, etc.\n", "\n", "I will come back to the first question to go deeper and to answer some new questions. I will reproduce Graphic 2 as a starting point." ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Defining the size of the Plot.\n", "plt.figure(figsize= [12,4])\n", "\n", "# Red dashes for the first series, blue squares for the second.\n", "plt.plot(threshold, no_show_2_88, 'r--',threshold,show_up_2_88, 'bs')\n", "\n", "plt.xlabel('Number of appointments by Patient ID') # X label\n", "plt.ylabel('No-show [%]') # Y label\n", "plt.title('Graphic 2 - No-show rate by Patient ID') # Graphic Title\n", "plt.text(25, 5.5, r'Knee') # Text \n", "plt.grid(True) # Grid\n", "plt.legend(('Group1', 'Group2')) # Legend\n", "\n", "plt.show() # Plot the graphic" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I am going to split the dataset approximately at the knee of the graphic (I guess 25 is a good approximation). " ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [], "source": [ "# Calculating the percentage of no-show and data frames.\n", "no_show_more_25,no_show_less_25,df_more_25,df_less_25 = attendance(df_vit, 25, complete= True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After munging the data, I came up with a question.\n", "\n", "### Question 2\n", "\n", ">**Are patients with many appointments on average older than those with few appointments?**" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Average Age for Patients with more or equal to 25 appointments: 46.99 years\n", "Average Age for Patients with less than 25 appointments: 36.96 years\n" ] } ], "source": [ "# Create a list of the average age of each data frame. 
\n", "avg_age_25 = [round(df_more_25['age'].mean(),2), round(df_less_25['age'].mean(),2)]\n", "\n", "# Print\n", "print(\"Average Age for Patients with more or equal to 25 appointments: {} years\".format(avg_age_25[0]))\n", "print(\"Average Age for Patients with less than 25 appointments: {} years\".format(avg_age_25[1]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Well, let's take a look at the distribution.\n", "\n", "Before that, a note on my function `no_dupe_pat()`: it removes the duplicated rows of each `patientid`, that is, if a `patientid` has 88 appointments, 87 rows will be excluded." ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Remove duplicate entries in patientid and plot the histogram of ages - only patients with 25 or more appointments.\n", "no_dupe_pat(df_more_25).age.hist(figsize =[12,4]);\n", "\n", "plt.title('Graphic 4 - Histogram of Age for Patients with more than or equal to 25 appointments') # Add title\n", "plt.xlabel('Age [years]') # X axis Label\n", "plt.ylabel('Frequency'); # Y axis Label" ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Remove duplicate entries in patientid and plot the histogram of ages - only patients with fewer than 25 appointments.\n", "no_dupe_pat(df_less_25).age.hist(figsize =[12,4]);\n", "\n", "plt.title('Graphic 5 - Histogram of Age for Patients with less than 25 appointments') # Add title\n", "plt.xlabel('Age [years]') # X axis Label\n", "plt.ylabel('Frequency'); # Y axis Label" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Based on these two histograms:\n", "\n", "* There are few patients with 25 or more appointments (only 28 patients, to be exact);\n", " * The majority of these patients are more than 40 years old;\n", "* In the group with fewer than 25 appointments, the frequency is roughly the same from 0 to 60 years. After 60 years the frequency goes down steadily.\n", "\n", "A single threshold does not tell much about this question, so let's vary the number of appointments and plot a graph." ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Initializing variable\n", "list_avg_age = []\n", "\n", "# Loop to calculate the average age while varying the cumulative number of appointments per patientid\n", "for index in range(2,88):\n", "    no_show_more, no_show_less, df_more, df_less = attendance(df_vit, index, complete= True) \n", "    list_avg_age.append([round(df_more['age'].mean(),2), round(df_less['age'].mean(),2)]) # Calculate the .mean()\n", "\n", "# Unpacking the list_avg_age variable.\n", "avg_age_more, avg_age_less = zip(*list_avg_age)\n", "\n", "# Values for the x axis.\n", "threshold = np.arange(2,88)\n", "\n", "# Defining the size of the Plot.\n", "plt.figure(figsize= [12,4])\n", "\n", "# Red dashes for df_more, blue squares for df_less.\n", "plt.plot(threshold, avg_age_more, 'r--',threshold,avg_age_less, 'bs')\n", "\n", "plt.xlabel('Number of appointments by Patient ID') # X label\n", "plt.ylabel('Average Age [years]') # Y label\n", "plt.title('Graphic 6 - Average Age by Patient ID') # Graphic Title\n", "plt.grid(True) # Grid\n", "plt.legend(('Group1', 'Group2')) # Legend\n", "\n", "plt.show() # Plot the graphic" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Based on the graphic, there is a positive relationship between the number of appointments per patient and the average age. However, this could be a trap: the sample (the number of observations in the data frame with at least \"x\" appointments) becomes too small as I raise the number of appointments per `patientid`. Table 2 shows how the number of observations drops in the data frame called `df_more`.\n", "\n", "
Table 2 - Number of Patients for each Number of Cumulate Appointments.
\n", "\n", "|k|Observations of df_more|Observations of df_less|\n", "|:-:|:-:|:-:|\n", "|2|24,379|37,917|\n", "|3|10,484|51,812|\n", "|4|4,984|57,312|\n", "|5|2,617|59,679|\n", "|10|333|61,963|\n", "|25|28|62,268|\n", "|50|13|62,283|\n", "|88|1|62,295|\n", "\n", "### Answer Question 2\n", "\n", ">The average age seems to increase with the number of appointments, and the show-up rate also seems to increase with the number of appointments.\n", "\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Question 3\n", "\n", ">**Do patients with disabilities tend to have better show-up rates? Does a higher number of disabilities raise the show-up rate?**\n", "\n", "Let's start by analysing the number of patients with disabilities." ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 108283\n", "1 2042\n", "2 183\n", "3 13\n", "4 3\n", "Name: handcap, dtype: int64" ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Number of appointments by number of disabilities (handcap).\n", "df_vit.handcap.value_counts()" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The total of appointments with patients with disabilities: 2241\n" ] } ], "source": [ "# Total number of appointments with disabled patients (all appointments minus those with handcap == 0).\n", "tot_dis = sum(df_vit.handcap.value_counts()) - df_vit.handcap.value_counts()[0]\n", "\n", "print(\"The total of appointments with patients with disabilities: {}\".format(tot_dis))" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The total of patients with disabilities: 1133\n" ] } ], "source": [ "# Total number of disabled patients (unique patients minus those with handcap == 0).\n", "tot_pat_dis = sum(df_clean.handcap.value_counts()) - df_clean.handcap.value_counts()[0]\n", "\n", "print(\"The total of patients with 
disabilities: {}\".format(tot_pat_dis))" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAdMAAAHfCAYAAAALPDLWAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzs3XeYVNXhxvHvmd2lI0UQBYSL9A6CigpiT3Q0avgZSIjdaIyJmmh00pQoMRNTNImxoWhiLIkmljCxJQqCioWIogI2RrCAKLDsUmbLnN8f9y6My7JtZvdMeT/PM8+Wmbn7Ttl559x75x5jrUVERESaL+Q6gIiISK5TmYqIiKRJZSoiIpImlamIiEiaVKYiIiJpUpmKiIikSWWaRYwxs4wxf63n/DeNMYe3YqQmMcZMMcaszPAyPWOMNcYUZ3K56TDGnGKMWWOMKTfGjG/mMh4zxpyRZo6Gni9xY8zRzVx2s6/bxL+T8edMNjLG/NgYc7vrHNJyVKb1MMbMMMa8aIzZYoz5NPj+O8YY4yKPtXaktXZ+U64TvFiV1zpZY8y05mQwxpxpjKkOlrPZGLPUGHNCkG+htXZoc5bbmoLbPyiNRfwG+K61tpO19tXdLH9LcB99boz5rzFmeuplrLXHWWv/nEYGZ4wxdxljKoLbt8EY85QxZlgjr/uF+z6Tz5nGvgEwxgwwxiSNMTdl4u82hrX2WmvtuY25bENvklpaBv4/CpLKdDeMMZcCvwd+DewN9AK+DRwKtNnNdYpaLWAjBS9WnWpOwAlAOfB4Got9IVhWV+AO4O/GmO4ZiNuiMji67Q+82cBlxgb30VDgLuBGY8xVGfr72eC64Pb1BT7Fv4254nRgIzDDGNPWdRjJE9ZanWqdgC7AFmBaA5e7C7gZ+Hdw+aOBMPAqsBlYA8xKubwHWOA84GPgE+DSlPNnAX8H/gKU4b9gT0w5Pw4cHXxfBPwYeC+47BJg30bctjuBO9O4b84EFqX83DG4TROBw4EPU87rDfwDWA+sAi6qZ7ntgd8CHwClwKLgdzX32RnAauAz4Ccp1zsQeAHYFNyfNwJtUs63wIXAO0GGZ4PfbcF/UzG9jiwh4KdBlk+Dx6ML0Da4Ts3139vNbbHAoFq/+z9gO7Bn8PN84Nzg+0HAguB2fwb8LeV6vw+eR5uDx3hKrefLg8DfgufA//BLvK7nSwiIBM+Xz4PnWfeUy54W3N7PgZ+kXnc3z/vZKT+HgfKGHo+67vumPGeo5/8DuBtIAtuCZV9ez3PtPeACYB3wf3U8dhcB7wePxa+BUMpz/zngj8FjtQI4qlb2R4ENwLvAt2pl/2ut14FdntPAl4EKoDK4Ha+lPF9mA88Hv/8XsCdwT/DceBnwUv7eMOCpIMtK4Gu1Hr8/AbHgfnwRGFjPY9QDmBc8phuAhTX3iU4pzx3XAbLxFDyhq4DiBi53V/BPdSj+i1W74MVhdPDzmOAf9uTg8jX/RPfhl9Do4EWj5gVvFv4L7vH4ZflLYHHK34unXPaHwDL8kY8BxhK8UNeTt0Pwz3N4GvfNmQRlChQDFwfL7ELKC2Nw+5cAV+KP5PfDf4H60m6W+6fgBaNPcNsPwS+vmvtsDn65jgUSwPDgehOASUEWD1gOXJKyXBu8qHQH2qf8blA9t/Fs/BfD/YBOwD+Bu2sts77r11WmJcFz6rjg5/nsLNP78Aus5jk0OeV638R/0SwGLgXWAu1Sni+V+EVdAlyGX0AldTxfLgEW448k2wK3AvcF543Af+E8LDjvd0HWBss0uH/uBRY24fEYlPJzo58zNOH/o57HZkrw/OmGX4qP1vHYPRM8X/oBb
6c8TmcG98v3g/t7Ov7/f/fg/AXATcFjOA7/f/uolOy1y3R3z+kdl03JNR//OTkQ/3/trSDb0cF9/ReCN8n4ry1rgLOC8/bHL+yRKY/fBvw3PsX4hXx/PY/RL4FbgttcEtyHJpOvuflwch4gG0/4L2Bra/3uefx3ZtuAw4Lf3QX8pYFl3QBcH3xf8080LOX864A7gu9nAf9JOW8EsC3l5x0vFvjvNk9q4u06Df/Fttn/CCkvKJuCf9DFKZkOZ+cL40HA6lrX/RF1jIrxX0S3kTKqSjmv5j7rm/K7l4AZu8l3CfBQys8WOLLWZRoqw/8C30n5eSh+aRU38vp1no9fhDOD7+ez80X6L8BtqbexnmVvrLmfgudLapmE8EeDU+p4vizni6OofWpuE355pb6YdsQfHdVXptuD58Ba/NHYwCY8Hrsr03qfMzTh/6Oe++924OHg+4OD+2CvWvm+nPLzd4D/pjz3Pybl/yd4Lp4G7AtUA51TzvslcFdK9tplWudzmt2Xaeoamd8Cj6X8fCKwNPh+OsGbm5TzbwWuSnn8bk8573hgRT2P0dXAI9TznNfJapvpbnwO9EjdxmatPcRa2zU4L/V+W5N6RWPMQcaYZ4wx640xpfjbWXvUWn7qdT7AXz1UY23K91uBdrvZ1rcv/uqqpjgDv/xtXWcaY/ql7qhUz3IWW2u7Wmt7WGsnWWv/U8dl+gO9jTGbak74q6V71XHZHvjv5uu7PbXvl05B5iHGmHnGmLXGmM3AtdR/fzdGb/zHpcYH+KVTV/ZGMcaUAD3xRwS1XY6/duGlYI/ts1Oud6kxZrkxpjS4D7vwxdu347ZZa5PAh3zx+VSjP/BQymOxHP/Fv1dw+dTlbMF/ntfnN8FzYG9r7Veste8FeRvzeOxOY54zjf3/2IUxpj1wKv5IDGvtC/irWb9R66L1/X9+VOv/p+b83sAGa21ZrfP61BOpzud0PdalfL+tjp9rrt8fOKjW/TgTf9+P5vztX+OPip80xrxvjIk0kLMgqUzr9gL+apeTGnHZ2sV0L/479X2ttV3wV4/U3vt335Tv++G/222qNfirfBrFGLMv/ijgL7u7jLV2tf3izkrpWAOsCl5wa06drbXH13HZz/BHOo2+PSluxt92Ndhauwf+i2/t+7vONw/1+Bj/BalGP/zR+Lq6L94oJwXLeKn2Gdbatdbab1lrewPnAzcZYwYZY6YAVwBfA7oFb+ZK+eLt2/FcMsaE8Ffj1vV8WoO/ijn18Whnrf0IfzSbupwO+KuWm6Mxj8fuNOU5U5eGHudTgD3w79+1xpi1+GV3eq3L1ff/2afW3vw1538MdDfGdK513keNzJ6qqc/X2tYAC2rdj52stRc0Z2HW2jJr7aXW2v3wR8A/MMYclWbGvKMyrYO1dhPwc/x/uv8zxnQyxoSMMePwV4HVpzP+O9TtxpgD2fVdL8DPjDEdjDEj8bdr/K0ZMW8HrjHGDDa+McaY+l4ATwOerxlBtIKXgM3GmCuMMe2NMUXGmFHGmANqXzAYUc0FfmeM6R1c9uBG7mnZGX8HjPLg4xmNecFYh789bnfuA74ffISiE/7o6m/W2qpGLPsLjDHdjTEz8bcJ/8pau8uIzxhzqjGmb/DjRvwX02r821aFv+2t2BhzJX4ZpJpgjPlqMDq7BP9N4OI6otwC/MIY0z/4mz2NMTVvFh8ETjDGTDbGtMFfrdfc14aGHo/67vtGP2d2o6HH9Qz859lo/G2a4/D3dxhnjBmdcrkfGmO6BW9AL+aL/597ARcZY0qMMacCw4F/W2vX4G8K+qUxpp0xZgxwDsEouInWAV7w5qg55gFDjDGnBTlLjDEHGGOGN+Hv77gfjTEnBG/uDP5jWx2cJIXKdDestdcBP8BfBfcp/hPsVvyRwvP1XPU7wNXGmDL8bVF/r+MyC/BXm/wXf3XZk82I+Ltg2U/iP8HvwN+ZYXdOB1rtc43W2mr8d7Hj8LfTfob/BqDLbq5yGf4OVS/jr
wr9FY17fl6G/4alDH+Hjsa8MZkF/DlYBfa1Os6fi7936LNB9u3A9xqx3FSvBavK3wXOBb5vrb1yN5c9AHgxuPyjwMXW2lXAE8Bj+DuafBDkqL3K+hH8bWQb8d8wfdVaW1nH3/h9sOwng+fmYvxtlFhr38Tf4/le/FHqRvzVxc3R0OMxi93c9814ztT2S+CnwbIvSz3DGNMHOAq4IVgTUHNagv8xsTNSLv4I/o5QS/H3eL0j5bwXgcFBtl/g7w1c8wbp6/jbQz8GHsLfRvlUI7OneiD4+rkx5n9NvXKwqvlYYEaQZS3+/1NjPwY0iy8+RoOB/+DvpPYCcJNt4ufdC4HZzeYzaQHGGI+de1s2eZQjIi3LGGPxV1G/W8d5Z+LvNDa51YNJ1tPIVEREJE0qUxERkTRpNa+IiEiaNDIVERFJk8pUREQkTSpTERGRNKlMRURE0qQyFRERSZPKVEREJE0qUxERkTQ1auoiERFJ35IlS/YqLi6+HRiFBjPZJAm8UVVVde6ECRM+bc4CVKYiIq2kuLj49r333nt4z549N4ZCIR0xJ0skk0mzfv36EWvXrr0d+EpzlqF3RiIirWdUz549N6tIs0soFLI9e/YsxV9j0LxlZDCPiIjUL6QizU7B49LsTlSZiogUkFNPPdXr3r372MGDB4/c3WV+8IMf9N5rr73GDBs2bET//v1HHXvssQOXLFnSLpM5/vCHP+x5+umn96vrvPHjxw8DWLlyZZuanM8++2yHM888c1+AefPmdX7qqac6ZjJPurTNVETEES8Sm5DJ5cWj4SUNXebss8/+7OKLL/70rLPOGlDf5b797W+vu/rqq9cBzJkzp9uXvvSloa+//vqbvXv3bvG5mF999dUVtX932GGHbT3ssMO2Ajz99NOdO3XqVH3MMcdsaeksjaWRqYhIATnuuOPKe/bs2aRC/Na3vrVxypQppXfccUd3gIULF3Y44IADho4cOXL45MmTB3/wwQclALNnz95r4MCBI4cMGTLihBNO2A/gmWee6TB+/Phhw4cPHzF+/Phhr732Wtua5X700UclU6ZMGex53qhLL710n5rfd+jQYXztDPPmzet8xBFHDFq5cmWbv/zlLz1vueWWXsOGDRvx+OOPd+rTp8/oRCJhADZs2BBK/bm1aGQqIiINGj9+/NYVK1a0SyQS5qKLLuoXi8Xe7d27d9WcOXO6XXbZZX0eeOCB+B/+8Ie9P/jgg2Xt27e3n332WRHA2LFjt7/00ksrSkpKePjhhztffvnlfZ944on3AF5//fWOy5Yte7NTp07J8ePHjzjppJNKa0afuzN06NCK008/fX2nTp2qa0bOBx98cNnf//73LqeddtqmuXPndj/++OM3tm3btlW3TatMRUSkQTVzX7/++utt33nnnfZHHnnkEIBkMknPnj0rAYYOHbrtlFNOGfCVr3xl08yZMzcBbNiwoWj69OkD4vF4O2OMrays3DFinDx58ua99967GiAcDm+cP39+p4bKtC7nnXfe+l/96ld7n3baaZv++te/9pgzZ048/VvcNFrNKyIiDVq6dGmH4cOHb7fWmkGDBm1bsWLFWytWrHjr7bfffuu55557B+CZZ55558ILL1y/ZMmSjmPHjh1RWVnJFVdc0Wfq1Kll77zzzpv/+te/3q2oqNjRO8Z8cU1s7Z8b69hjj93y4Ycfto3FYp2qq6vNAQccsD2d29ocKlMREanXXXfd1XXhwoVdzj777A1jxozZvmHDhuL//Oc/HQESiYR55ZVX2lVXV/Pee++1OfHEE8tuuummD8vKyopKS0uLNm/eXNS3b98KgFtvvbVH6nIXLVq0x7p164rKy8vNv//9765Tp04tb0yezp07V5eVlRWl/m7GjBmfn3XWWft985vf/CxTt7spVKYiIgXkxBNPHDB58uRhq1ataturV68x119/fY+6Llezg0///v1H3XPPPXs+8cQTK3v37l3Vrl07e//9978XiUT6Dh06d
MTIkSNHLFiwoFNVVZX5xje+MWDIkCEjRo0aNeL8889f16NHj+orrrhi7axZs/ruv//+w6qrq7/wNyZOnFg+ffr0AaNGjRp54oknbmzsKt5p06ZtisViXWt2QAI455xzPt+8eXPxOeecsyHtO6kZTM16cBERaVmvvfZafOzYsU5GTvnuzjvv7PbII490ffjhh1c1dxmvvfZaj7Fjx3rNua52QBIRkZx2xhln7PvMM890mTdv3juuMqhMRUQkp/35z39eA6xxmUHbTEVERNKkMhUREUmTylRERCRNKlMREZE0qUxFRArEu+++W3LQQQcN2W+//UYOGjRo5DXXXLNXXZfLtSnYrrvuup433njjnpnM11Tam1dExJVZXTI6BRuzSuudgq2kpITf/va3H06ePHnrxo0bQ+PHjx9x/PHHb54wYcIuh9/LpSnYLr/88vUtnakhGpmKiBSI/v37V06ePHkrQLdu3ZIDBw7ctnr16jYNXS/bp2D7wQ9+0PvKK6/sBfDmm2+2nTJlyuCRI0cOnzBhwtBXX321HcDcuXO7DR48eOTQoUNHTJw4cWh69+SuNDIVESlAK1eubPPWW291aOzxcLN5CrYnn3xyj5rzzz333P633XbbB6NHj048/fTTHS+44IJ+ixcvfjsaje7z5JNPvj1gwIDKmmyZpDIVESkwpaWloa9+9asDo9Homu7duycbc51snoIt9Xa9+uqrnU499dSBNb+rqKgw4B8HeObMmd60adM2zpw5c2Nz/8buqExFRApIIpEw4XB44KmnnrrhjDPO2NTY6y1durTDhAkTttZMwbZ06dJdtms+88wz7zz22GOdH3744a7XXXdd73feeeeNminYnnrqqfdWrlzZ5sgjj9yxijVTU7DVqK6upnPnzlUrVqx4q/Z599577+qnn36646OPPtpl3LhxI5cuXfpmTZFngraZiogUiGQyyYwZM/oPGTJk+6xZs9Y19nq5MAUbQPfu3ZN9+/atmDt3brea2/vCCy+0B39b6pFHHrnlhhtu+Lhbt25V77//foPbiptCZSoiUiCeeuqpTg8//PCeixYt6jxs2LARw4YNG/G3v/2tS12XzbUp2Grcd99979955509hg4dOmLw4MEj//GPf3QF+P73v993yJAhIwYPHjxy0qRJZZMmTdrWzLuxTpqCTUSklWgKtuyWzhRsGpmKiIikSWUqIiKSJpWpiIhImlSmIiIiaVKZioiIpEllKiIikiaVqYhIgdi6dasZPXr08KFDh44YNGjQyO9///u967rctGnTvD59+oweOnToCM/zRp1yyineqlWrSjKZJfXg9Kni8XjJl7/85f1g58HtAe65554uP/7xj/cGuPvuu7umTgl3ySWX9H744Yc7ZzJfU+lwgiIijoz+8+iMTsG27Ixl9U7B1q5dO7to0aKVXbp0SSYSCXPAAQcM/e9//1t61FFHbal92dmzZ3941llnbUwmk1xzzTV7HXHEEUNXrFjxZrt27Vr04ASe51U+/vjj79f+/cyZM0uBUoCHH364a1VVVWnN1HE33HDDxy2ZqTE0MhURKRChUIguXbokwT8AfFVVlWnoeLihUIirrrrq0x49elQ++OCDXQD++c9/7jFu3LhhI0aMGH7cccftV1paGgL4zne+06dmCrbzzjuvL8C9997bZcyYMcOGDx8+4pBDDhmyZs2aHYO4119/vcOkSZOG9O/ff9Rvf/vbHvDFCcFT1Uwm/tRTT3X8z3/+0/WnP/1p32HDho148803206bNs278847u0HTpofLJI1MRUQKSFVVFaNGjRqxevXqtmecccanRx555C6j0rqMGTNm6/Lly9t98sknxddee+0+zz777Nt77LFH8ic/+cne11xzTa8f/vCHn/773//u9v77778RCoWomebsmGOOKZ8xY8aKUCjE7373ux5XX3313nPmzPkQYPny5e2XLFmyvKysrGj8+PEjpk2bVtpQjmOOOWbL0UcfvemEE
04oPeuss74w+0tTp4fLJJWpSCvwIrEQ0BnoUMep/W5+n3pee6AKKAPKg1NZQ1/j0fD2VrmBkjOKi4tZsWLFW5999llROBwe+PLLL7c74IADGnye1Bx6dv78+R3fe++9dgceeOAwgMrKSjNhwoTy7t27V7dt2zY5Y8aM/uFwuHT69OmlAKtWrWpz8skn912/fn1JRUVFaN99903ULPO4447b1KlTJ9upU6eqgw8+ePPChQs7Hnjggc2egq2p08NlkspUJAO8SMwA+wADAC84pX7fD8joDhyNzFXFzoL9BPgAiAdfa07xeDRc1trZxK0ePXpUT548uexf//pXl8aU6bJlyzocffTRa621TJ48efO//vWvVbUvs3Tp0uWPPvroHvfff3+3m2++ea/Fixe//d3vfrffxRdfvHbmzJml8+bN63z11Vfv2Okp01OwNXV6uJKSzP1LqkxFGsmLxHqxsyBrl2Y/oK2jaPUpBroGp32BA+u6kBeJbQDeBVamnN4G3olHwxmdXUPc+fjjj4vbtGlje/ToUV1eXm7mz5+/x2WXXba2vuskk0muvfbavdavX18ybdq0zRs2bCi69NJL+73xxhttR40alSgrKwutWrWqpH///pXl5eWh6dOnlx5++OHlQ4YMGQ1QVlZW1K9fv0qAu+66a8/UZT/22GNdf/GLX3yyefPm0OLFiztff/31HyUSiQYbtVOnTtWbN2/eZZ+f1Onhjj766C2JRMIsW7as7fjx47fXTA937LHHlvfu3bt7aWlpUY8ePTI2n6nKVKQOXiTWH5gATAy+TgD2rPdKua07ftHWLlvrRWKrgRXAS8DzwAvxaLjBbVuSfdasWVNy5plnDqiursZaa0466aQNX//61+t8LH/605/2jUaj+2zfvj00fvz4LU8//fTKdu3a2d69e1fdeuut8RkzZuxXUVFhAK666qqPunTpkjzhhBMG1ZTh7Nmz1wD85Cc/+fjrX//6wF69elVMnDhxy+rVq3e86Rw/fvyWo446avDHH3/c5rLLLvvE87zKlStXNjjP6MyZMzdccMEF3i233NLrwQcffK/m9zXTw1100UX9ysrKiqqrq80FF1ywbvTo0YlvfOMbA8rKyoqstaZmerh0789UmoJNCp4XiXUDDgEOZmd59qj3SoUtCSwHnsMv1+fj0fA7biPlBk3Blt3SmYJNI1MpOF4kNhCYDBwanIYD6W2sKSwhYGRwOg/Ai8TWAy8QlCvwsnZ+kkKiMpW850ViewInBKcpwC5HXZG09QS+EpwAKrxI7FV2luv8eDSsEZnkLZWp5CUvEtsPOAk4GX/0mfHPlUm92gAHBafvA9VeJLYI+Cfwz3g0/KHLcCKZpjKVvOFFYhPZWaCjHMeRLyoCpganG7xI7GX8Yv1HPBp+12kykQxQmUrO8iKxEuAI/AL9CtDXbSJpJMPOPYejXiS2jJ0j1tedJhNpJpWp5BQvEtsDOB6/QI8DurhNJBkwOjhd5UVi7wIPAf8AXopHw/q4geQEHehecoIXiU31IrH7gPXAfcAMVKT5aBDwQ2AxsNqLxP7oRWKHOs6Ud6qqqhg+fPiImunNatMUbE2nkalkLS8S6wKcDnwbGOE4jrS+vsB3ge96kdhrwJ+Ae+LRcLOP3Zptlg8bntEp2IavWF7vFGw1Zs+e3WvQoEHbysvLd7tjnqZgaxqNTCXreJHYBC8Sux34GPgDKlKBscBtwEdeJPa74LPC0gzvvfdeyRNPPNHlW9/6VqM+qqQp2BpHI1PJCl4k1gF/1e0F+EchEqlLV/yP2lziRWKPAzcCj2nbauNdeOGF+1533XUflpaWNunjYpqCrX4qU3HKi8SG46/GPR3/hVKkMQz+DmjHAe95kdjNwNx4NLyx/qsVtvvuu69Ljx49qqZMmbJ13rx5TdrGqCnY6qcylVYXfKTlq/ij0KmO40juGwj8BrjGi8TuBW6MR8NLHWfKSosWLer01FNPde3Tp0+XRCIR2rJlS
\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Subsetting to plot a graphic.\n", "dis_gp = df_vit.handcap.value_counts();\n", "\n", "# Creating the pie chart; the slice skips the first (most frequent) value, patients without disabilities.\n", "dis_gp[1:].plot(kind='pie', figsize=(8,8), title='Graphic 7 - Pie chart of Disabled Patient Appointments');\n", "plt.xlabel('Number of Disabilities by Patient ID'); # X label\n", "plt.ylabel(''); # Y label\n", "plt.legend(('1 Disability', '2 Disabilities', '3 Disabilities', '4 Disabilities')); # Legend" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, almost 91% of the patients with disabilities have exactly one disability, 8.1% have two disabilities, 0.6% have three disabilities, and 0.01% have four disabilities.\n", "\n", "Now, I will calculate the no-show rate." ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 Disability: 17.92% no-show\n", "2 Disabilities: 20.22% no-show\n", "3 Disabilities: 23.08% no-show\n", "4 Disabilities: 33.33% no-show\n" ] } ], "source": [ "# Variable initialization\n", "no_show_dis_all = []\n", "no_show_dis_all_p = []\n", "\n", "# Loop to calculate the number of show-ups and no-shows per disability level.\n", "for index in range(1,5):\n", "    no_show_dis_all.append(list(df_vit[df_vit.handcap == index].no_show.value_counts()))\n", "\n", "# Unpacking no_show_dis_all. value_counts() sorts by frequency, so the first\n", "# count of each pair is the larger group (show-ups) and the second is the no-shows.\n", "show_up_dis, no_show_dis = zip(*no_show_dis_all)\n", "\n", "# Loop to calculate the no-show percentage.\n", "for index in range(0,4):\n", "    no_show_dis_all_p.append(round(100 * no_show_dis[index]/(no_show_dis[index] + show_up_dis[index]),2))\n", "\n", "# Print the percentage\n", "print(\"1 Disability: {}% no-show\\n2 Disabilities: {}% no-show\\n3 Disabilities: {}% no-show\\n4 Disabilities: {}% no-show\".format(*no_show_dis_all_p))" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Rate of No-show for any level of
handcap: 18.16%\n" ] } ], "source": [ "# No-show and show-up counts for patients with at least 1 disability.\n", "dis_more_one = df_vit[df_vit.handcap >= 1].no_show.value_counts()\n", "\n", "# Percentage\n", "dis_more_one_p = 100 * dis_more_one[1]/(dis_more_one[0]+dis_more_one[1])\n", "\n", "# No-show rate for the whole sample of patients with at least 1 disability.\n", "print(\"Rate of No-show for any level of handcap: {}%\".format(round(dis_more_one_p,2)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Analysing each disability group separately (1, 2, 3, and 4), the no-show rate increases as the number of disabilities increases. Keep in mind how few patients some groups contain: 3 patients with 4 disabilities, 6 with 3 disabilities, 9 with 2 disabilities, and 1,025 with 1 disability.\n", "\n", "### Answer Question 3\n", "\n", ">Taking all patients with disabilities together, the no-show rate (18.16%) is slightly lower than that of the whole dataset (20%), so on average a patient with a disability could have a better show-up rate.\n", "\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conclusions \n", "\n", "Throughout this document, I have posed three questions. The questions and answers below restate those given in the Synopsis:\n", "\n", "**1. Are patients with many appointments more likely to show up than patients with fewer than two appointments?**\n", ">This seems to be true, but I lack the information to affirm it. What I can say is that patients with a higher number of appointments tend to have a better show-up rate.\n", "\n", "**2. Are patients with many appointments on average older than those with few appointments?**\n", ">I also identified a positive correlation: patients with a higher number of appointments tend to have an average age higher than that of the population.\n", "\n", "**3. Do patients with disabilities tend to have better show-up rates?
Does the number of disabilities raise the show-up rates?**\n", ">Due to the small number of patients with disabilities, it is difficult to answer this question. I found a better show-up rate in the group of patients with disabilities as a whole; however, patients with many disabilities tend to have worse show-up rates.\n", "\n", "\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Appendix A \n", "\n", "This appendix looks in more depth at the additional data available on the internet. Most of this information is public and disclosed on the Vitória Council Website.\n", "\n", "### Regional Administration Dataset \n", "\n", "The city of Vitória is divided into 9 (nine) Regional Administrations, according to the Vitória City Council [Website](http://www.vitoria.es.gov.br/prefeitura/gerencias-regionais-veja-os-enderecos). Do not expect to find any information in English there; the most important things to gather from this page are the address of each Regional Administration building and the official name of each Regional Administration.\n", "\n", "
\n", "\n", "
Table A1 - Regional Administration Names and Addresses.
\n", "\n", "|Regional Administration|Address|General Information|\n", "|:-------------------------:|:--------------------:|:--------------------:|\n", "|Central 1 - Centro |Praça Américo Poli Monjardim, Forte São João|[See more][1]|\n", "|Central 2 - Santo Antônio |Avenida Santo Antônio, 1400, Santo Antônio|[See more][2]|\n", "|Central 3 - Jucutuquara |Rua Santa Rita de Cássia, S/N, De Lourdes|[See more][3]|\n", "|Central 4 - Maruípe |Rua Marechal Floriano, 709, Maruípe|[See more][4]|\n", "|Central 5 - Praia do Canto |Avenida Rio Branco, 80, Santa Lúcia|[See more][5]|\n", "|Central 6 - Goiabeiras |Rua Desembargador Cassiano Castelo, 65, Goiabeiras|[See more][6]|\n", "|Central 7 - São Pedro |Avenida Beira Mar, 360, São Pedro|[See more][7]|\n", "|Central 8 - Jardim Camburi |Avenida Santos Evangelista, 15, Jardim Camburi|[See more][8]|\n", "|Central 9 - Jardim da Penha|Praça Philogomiro Lannes, S/N, Jardim da Penha|[See more][9]|\n", "\n", "[1]: http://legado.vitoria.es.gov.br/regionais/dados_regiao/regiao_1/regiao1.asp\n", "\n", "[2]: http://legado.vitoria.es.gov.br/regionais/dados_regiao/regiao_2/regiao2.asp\n", "\n", "[3]: http://legado.vitoria.es.gov.br/regionais/dados_regiao/regiao_3/regiao3.asp\n", "\n", "[4]: http://legado.vitoria.es.gov.br/regionais/dados_regiao/regiao_4/regiao4.asp\n", "\n", "[5]: http://legado.vitoria.es.gov.br/regionais/dados_regiao/regiao_5/regiao5.asp\n", "\n", "[6]: http://legado.vitoria.es.gov.br/regionais/dados_regiao/regiao_6/regiao6.asp\n", "\n", "[7]: http://legado.vitoria.es.gov.br/regionais/dados_regiao/regiao_7/regiao7.asp\n", "\n", "[8]: http://legado.vitoria.es.gov.br/regionais/dados_regiao/regiao_8/regiao8.asp\n", "\n", "[9]: http://legado.vitoria.es.gov.br/regionais/dados_regiao/regiao_9/regiao9.asp\n", "\n", "Digging deeper into the Vitória Council Website, I found a map of each Regional Administration (Table A1, column General Information), along with general information such as:\n", "\n", "* Number of neighborhoods;\n", "* Area (m²);\n", "* Population (in 2010);\n", "* Demographic density (inhabitants/km²);\n", "* Number of homes (in 2010);\n", "* Average Income (in 2010);\n", "* Economic Activities (in 2012).\n", "\n", "I have recorded this information in a single dataset (I do not know how to \"web scrape\" these pages, and learning how to do it would probably take too much of my study time; I am new to Python). This dataset, called `vit_reg_adm.csv`, is available in my GitHub repository. Table A2 describes its variables. \n", "\n", "
Table A2 - Data types of the vit_reg_adm.csv file.
\n", "\n", "|Variable|Type|Description|\n", "|:-:|:-:|:-:|\n", "|id_ra|int|ID of the Regional Administration (following the sequence established in Table A1)|\n", "|reg_adm|str|Regional Administration name|\n", "|nbh_qty|int|Number of neighborhoods in each Regional Administration|\n", "|area_m2|float|Regional Administration area in m²|\n", "|pop_2010|int|Population in 2010|\n", "|den_km2|int|Demographic density (inhabitants/km²)|\n", "|home_2010|int|Number of homes in 2010|\n", "|avg_inc_2010|float|Monthly average income (in BRL)|\n", "|eco_act_2012|int|Number of commercial stores in 2012|\n", "\n", "Keep in mind that in Brazil \".\" is the thousands separator, whereas \",\" is the decimal separator." ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "data": { "text/plain": [ " id_ra reg_adm nbh_qty area_m2 pop_2010 den_km2 home_2010 \\\n", "0 1 Centro 8 2072047 19611 13140 6952 \n", "1 2 Santo Antônio 12 4425494 35261 2179 10796 \n", "2 3 Jucutuquara 14 4793706 34141 6418 10580 \n", "3 4 Maruípe 12 5684216 54402 7122 17009 \n", "4 5 Praia do Canto 9 5334352 34236 9570 12133 \n", "\n", " avg_inc_2010 eco_act_2012 \n", "0 1425,82 27796 \n", "1 649,84 4738 \n", "2 1217,69 18621 \n", "3 806,72 10903 \n", "4 3844,97 48150 " ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Loading the Vitoria Regional Administration data.\n", "vit_reg_adm = pd.read_csv('02-Datasets/vit_reg_adm.csv', sep=\";\");\n", "\n", "# Print\n", "vit_reg_adm.head()" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(9, 9)" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Number of Regional Administrations and variables\n", "vit_reg_adm.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Neighborhood Dataset \n", "\n", "Vitória City has 80 (eighty) neighborhoods, as you can check on the [Vitória Council Website][1]. That page gives access to the boundaries of every neighborhood; each `pdf` also shows the roads and streets and a short summary of the neighborhood. Figure A1 shows an example of the information in each `pdf`.\n", "\n", "\n", "
Figure A1 - Vila Rubim Boundaries and Some Descriptive Analysis at the bottom left.
\n", "\n", "Based on the links in the General Information column of Table A1 and the list of all [neighborhoods][1], I created an auxiliary dataset to record additional data, called `vit_aux.csv`.\n", "\n", "In this auxiliary dataset I followed the neighborhood classification of the Vitória Council Website (see Table A1: each row has its own URL, where you can see a picture of each Regional Administration). In some cases I found the same neighborhood listed under different Regional Administrations. Table A3 describes the type of each variable/feature.\n", "\n", "
Table A3 - Data types of the vit_aux.csv file.
\n", "\n", "|Variable|Type|Description|\n", "|:-:|:-:|:-:|\n", "|id_ra|int|ID of the Regional Administration (following the sequence established in Table A1)|\n", "|reg_adm|str|Regional Administration name|\n", "|nbh|str|Neighborhood name|\n", "|pop_2000|int|Population in the 2000 Census|\n", "|pop_2010|int|Population in the 2010 Census|\n", "|male|int|Number of males in the given neighborhood|\n", "|female|int|Number of females in the given neighborhood|\n", "|avg_inc_mon|float|Average income per month (in BRL)|\n", "\n", "Keep in mind that in Brazil \".\" is the thousands separator, whereas \",\" is the decimal separator.\n", "\n", "This [link][2] shows a large, detailed map of Vitória and its Regional Administration areas.\n", "\n", "\n", "[1]: http://legado.vitoria.es.gov.br/regionais/geral/bairros.asp\n", "[2]: http://legado.vitoria.es.gov.br/regionais/bairros/Mapa_bairros/LIMITE_BAIRROS.pdf" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "data": { "text/plain": [ " id_ra reg_adm nbh pop_2000 pop_2010 male female \\\n", "0 1 Centro Centro 9240 9838 4416 5422 \n", "1 1 Centro Do Moscoso 854 795 369 426 \n", "2 1 Centro Fonte Grande 1459 1231 592 639 \n", "3 1 Centro Ilha do Príncipe 2810 2613 1194 1419 \n", "4 1 Centro Parque Moscoso 1708 1773 791 982 \n", "\n", " avg_inc_mon \n", "0 1791,15 \n", "1 647,05 \n", "2 706,03 \n", "3 623,21 \n", "4 1754,79 " ] }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Loading the Vitoria Auxiliary data.\n", "vit_aux = pd.read_csv('02-Datasets/vit_aux.csv', sep=\";\");\n", "\n", "# Print\n", "vit_aux.head()" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(80, 8)" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Number of neighborhoods and variables\n", "vit_aux.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Age Dataset \n", "\n", "I have found a table on the Vitória Council Website [(click here to visit the original)][a] with 20 age categories (5-year bins), broken down by gender. \n", "
\n", "\n", "
Table A4 - First rows of the vit_age.csv file.
\n", "\n", "|age|male|female|\n", "|:-:|:-:|:-:|\n", "|0 to 04 years|9932|9666|\n", "|05 to 09 years|10165|9727|\n", "|10 to 14 years|11944|11686|\n", "|15 to 19 years|12450|12932|\n", "|20 to 24 years|15297|15970|\n", "|25 to 29 years|15511|16621|\n", "\n", "This dataset is important for understanding the population distribution by age and how it affects the appointments.\n", "\n", "[a]: http://legado.vitoria.es.gov.br/regionais/dados_socioeconomicos/populacao/2000_2010/tab2.asp" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [ { "data": { "text/plain": [ " age male female\n", "0 0 to 04 years 9932 9666\n", "1 05 to 09 years 10165 9727\n", "2 10 to 14 years 11944 11686\n", "3 15 to 19 years 12450 12932\n", "4 20 to 24 years 15297 15970" ] }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Loading the Vitoria Age data.\n", "vit_age = pd.read_csv('02-Datasets/vit_age.csv', sep=\";\");\n", "\n", "# Print\n", "vit_age.head()" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Horizontal bar chart of the male population in each 5-year age group.\n", "y = np.arange(len(vit_age['male'])) # One bar position per age group\n", "x = vit_age['male'] # Bar lengths: male inhabitants\n", "objects = vit_age['age'] # Tick labels: age group names\n", "\n", "plt.barh(y, x, align='center', alpha=0.5)\n", "plt.yticks(y, objects)\n", "plt.xlabel('Inhabitants')\n", "plt.title('Male Distribution by 5-Year Age Group')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Vitória UBS Dataset \n", "\n", "Vitória City has 38 UBS.\n", "\n", "
Table A5 - Number of UBS in each Health Territory Area.
\n", "\n", "|Health Territory Area|Number of Neighborhoods Covered|UBS Family|US PACS|Reference Center|Specialized Center|Polyclinic|Total|\n", "|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|\n", "|Centro |11|3|2|2|1|0|8|\n", "|Forte São João|20|2|2|4|0|0|8|\n", "|Santo Antônio |9 |3|0|0|0|0|3|\n", "|Penha |14|1|4|0|0|0|5|\n", "|Maruípe |18|6|0|0|0|0|6|\n", "|São Pedro |10|4|0|2|1|1|8|\n", "\n", "I have not created a separate file for this information.\n" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## References \n", "\n", "Throughout the report I have inserted many links to different sources, so this chapter is short. I do not think it is necessary to repeat those links here; I list only the most important URLs for this research.\n", "\n", "* [Kaggle](https://www.kaggle.com/joniarroba/noshowappointments);\n", "* [IBGE](https://sidra.ibge.gov.br/home/pmc/brasil);\n", "* [Vitória Council Website](http://legado.vitoria.es.gov.br/regionais/home.asp)." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.1" } }, "nbformat": 4, "nbformat_minor": 2 }