{
"cells": [
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import altair as alt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Transforming Data\n",
"\n",
"The bread and butter of a data workflow is cleaning and preparation, taking raw datasets and transforming them into a useful form.\n",
"\n",
"Today, we'll use some prescription cost data from the NHS to make a chart of the top 50 most used Cardiovascular substances.\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introducing Tools: Pandas\n",
"\n",
"The first tool we'll use today is `Pandas`, a Python library used to work with datasets. It provides access to `DataFrames` - tables we analyse with code.\n",
"\n",
"Python already has a few built in data structures, for example lists and dictionaries:\n",
"\n",
""
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [],
"source": [
"london = {\n",
" \"name\": \"London\",\n",
" \"population\": 8308369,\n",
" \"area\": 1572\n",
"} # This is an example of a dictionary\n",
"\n",
"locations = [\n",
" {\n",
" \"name\": \"London\",\n",
" \"population\": 8_982_000,\n",
" \"area\": 606\n",
" },\n",
" {\n",
" \"name\": \"Newport\",\n",
" \"population\": 128_060,\n",
" \"area\": 32.52\n",
" },\n",
" {\n",
" \"name\": \"Darlington\",\n",
" \"population\": 93_015,\n",
" \"area\": 7.62\n",
" },\n",
"\n",
"]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"
\n",
"
\n",
"
\n",
"
\n",
"Which we can turn into Pandas `DataFrames`"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" name | \n",
" population | \n",
" area | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" London | \n",
" 8982000 | \n",
" 606.00 | \n",
"
\n",
" \n",
" 1 | \n",
" Newport | \n",
" 128060 | \n",
" 32.52 | \n",
"
\n",
" \n",
" 2 | \n",
" Darlington | \n",
" 93015 | \n",
" 7.62 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" name population area\n",
"0 London 8982000 606.00\n",
"1 Newport 128060 32.52\n",
"2 Darlington 93015 7.62"
]
},
"execution_count": 49,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.DataFrame(locations)\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
"\n",
"and manipulate in different ways.\n",
"\n",
"For example, we can add a density column:"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" name | \n",
" population | \n",
" area | \n",
" density | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" London | \n",
" 8982000 | \n",
" 606.00 | \n",
" 14821.782178 | \n",
"
\n",
" \n",
" 1 | \n",
" Newport | \n",
" 128060 | \n",
" 32.52 | \n",
" 3937.884379 | \n",
"
\n",
" \n",
" 2 | \n",
" Darlington | \n",
" 93015 | \n",
" 7.62 | \n",
" 12206.692913 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" name population area density\n",
"0 London 8982000 606.00 14821.782178\n",
"1 Newport 128060 32.52 3937.884379\n",
"2 Darlington 93015 7.62 12206.692913"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['density'] = df['population'] / df['area']\n",
"df"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" name | \n",
" population | \n",
" area | \n",
" density | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" London | \n",
" 8982000 | \n",
" 606.00 | \n",
" 14821.782178 | \n",
"
\n",
" \n",
" 2 | \n",
" Darlington | \n",
" 93015 | \n",
" 7.62 | \n",
" 12206.692913 | \n",
"
\n",
" \n",
" 1 | \n",
" Newport | \n",
" 128060 | \n",
" 32.52 | \n",
" 3937.884379 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" name population area density\n",
"0 London 8982000 606.00 14821.782178\n",
"2 Darlington 93015 7.62 12206.692913\n",
"1 Newport 128060 32.52 3937.884379"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sorted_df = df.sort_values(by=\"density\", ascending=False)\n",
"sorted_df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Using Your Own Data: Prescription Cost Analysis\n",
"\n",
"Let's use this dataset of prescription costs to make a scatter chart of the most prescribed substances in the category 'Cardiovascular System'/\n",
"\n",
"We will:\n",
"\n",
"1. Load in the data, from an `.xlsx` document with lots of sheets.\n",
"2. Rename and restructure our fields.\n",
"3. Filter our data.\n",
"4. Export it to a csv for use with Vega-lite.\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Loading our Data\n",
"\n",
"I have uploaded the `xlsx` spreedsheet to GitHub but you could easily use a file from your computer.\n",
"\n",
"We want data just from the sheet 'Chemical_Substances'."
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": [
"df = pd.read_excel(\"https://github.com/EconomicsObservatory/courses/raw/main/5/sample_data/pca_summary_tables_2023_24_v001.xlsx\", sheet_name=\"Chemical_Substances\", skiprows=3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's take a look at the DataFrame."
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Financial Year | \n",
" BNF Chemical Substance Code | \n",
" BNF Chemical Substance Name | \n",
" BNF Paragraph Code | \n",
" BNF Paragraph Name | \n",
" BNF Section Code | \n",
" BNF Section Name | \n",
" BNF Chapter Code | \n",
" BNF Chapter Name | \n",
" Total Items | \n",
" Total Cost (£) | \n",
" Cost Per Item (£) | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 2023/2024 | \n",
" 0101010C0 | \n",
" Aluminium hydroxide | \n",
" 10101 | \n",
" Antacids and simeticone | \n",
" 101 | \n",
" Dyspepsia and gastro-oesophageal reflux disease | \n",
" 1 | \n",
" Gastro-Intestinal System | \n",
" 1 | \n",
" 13.71 | \n",
" 13.710000 | \n",
"
\n",
" \n",
" 1 | \n",
" 2023/2024 | \n",
" 0101010F0 | \n",
" Magnesium carbonate | \n",
" 10101 | \n",
" Antacids and simeticone | \n",
" 101 | \n",
" Dyspepsia and gastro-oesophageal reflux disease | \n",
" 1 | \n",
" Gastro-Intestinal System | \n",
" 398 | \n",
" 59544.37 | \n",
" 149.608970 | \n",
"
\n",
" \n",
" 2 | \n",
" 2023/2024 | \n",
" 0101010G0 | \n",
" Co-magaldrox (Magnesium/aluminium hydroxide) | \n",
" 10101 | \n",
" Antacids and simeticone | \n",
" 101 | \n",
" Dyspepsia and gastro-oesophageal reflux disease | \n",
" 1 | \n",
" Gastro-Intestinal System | \n",
" 28198 | \n",
" 144222.89 | \n",
" 5.114650 | \n",
"
\n",
" \n",
" 3 | \n",
" 2023/2024 | \n",
" 0101010I0 | \n",
" Magnesium oxide | \n",
" 10101 | \n",
" Antacids and simeticone | \n",
" 101 | \n",
" Dyspepsia and gastro-oesophageal reflux disease | \n",
" 1 | \n",
" Gastro-Intestinal System | \n",
" 4439 | \n",
" 421785.42 | \n",
" 95.018117 | \n",
"
\n",
" \n",
" 4 | \n",
" 2023/2024 | \n",
" 0101010J0 | \n",
" Magnesium trisilicate | \n",
" 10101 | \n",
" Antacids and simeticone | \n",
" 101 | \n",
" Dyspepsia and gastro-oesophageal reflux disease | \n",
" 1 | \n",
" Gastro-Intestinal System | \n",
" 1068 | \n",
" 6383.25 | \n",
" 5.976826 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Financial Year BNF Chemical Substance Code \\\n",
"0 2023/2024 0101010C0 \n",
"1 2023/2024 0101010F0 \n",
"2 2023/2024 0101010G0 \n",
"3 2023/2024 0101010I0 \n",
"4 2023/2024 0101010J0 \n",
"\n",
" BNF Chemical Substance Name BNF Paragraph Code \\\n",
"0 Aluminium hydroxide 10101 \n",
"1 Magnesium carbonate 10101 \n",
"2 Co-magaldrox (Magnesium/aluminium hydroxide) 10101 \n",
"3 Magnesium oxide 10101 \n",
"4 Magnesium trisilicate 10101 \n",
"\n",
" BNF Paragraph Name BNF Section Code \\\n",
"0 Antacids and simeticone 101 \n",
"1 Antacids and simeticone 101 \n",
"2 Antacids and simeticone 101 \n",
"3 Antacids and simeticone 101 \n",
"4 Antacids and simeticone 101 \n",
"\n",
" BNF Section Name BNF Chapter Code \\\n",
"0 Dyspepsia and gastro-oesophageal reflux disease 1 \n",
"1 Dyspepsia and gastro-oesophageal reflux disease 1 \n",
"2 Dyspepsia and gastro-oesophageal reflux disease 1 \n",
"3 Dyspepsia and gastro-oesophageal reflux disease 1 \n",
"4 Dyspepsia and gastro-oesophageal reflux disease 1 \n",
"\n",
" BNF Chapter Name Total Items Total Cost (£) Cost Per Item (£) \n",
"0 Gastro-Intestinal System 1 13.71 13.710000 \n",
"1 Gastro-Intestinal System 398 59544.37 149.608970 \n",
"2 Gastro-Intestinal System 28198 144222.89 5.114650 \n",
"3 Gastro-Intestinal System 4439 421785.42 95.018117 \n",
"4 Gastro-Intestinal System 1068 6383.25 5.976826 "
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head(5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" \n",
"\n",
"## Renaming and Restructuring\n",
"\n",
"We have lots of columns in our data, not all of which we need. Let's rename the columns and only keep the ones we want."
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [],
"source": [
"df = df.rename(columns={ # specifying that we want to rename columns\n",
" \"BNF Chemical Substance Name\": \"substance_name\",\n",
" \"BNF Paragraph Name\": \"paragraph_name\",\n",
" \"BNF Section Name\": \"section_name\",\n",
" \"BNF Chapter Name\": \"chapter_name\",\n",
" \"Total Items\": \"total_items\",\n",
" \"Total Cost (£)\": \"total_cost\",\n",
" \"Cost Per Item (£)\": \"cost_per_item\",\n",
"})\n",
"\n",
"df = df[['subtance_name', 'paragraph_name', 'section_name', 'chapter_name', 'total_items', 'total_cost', 'cost_per_item']] # specifying just the columns we want to keep"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"## Filtering our Data\n",
"\n",
"We only want data for the 'Cardiovascular System' chapter. Let's filter using `df.query`."
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" subtance_name | \n",
" paragraph_name | \n",
" section_name | \n",
" chapter_name | \n",
" total_items | \n",
" total_cost | \n",
" cost_per_item | \n",
"
\n",
" \n",
" \n",
" \n",
" 111 | \n",
" Digoxin | \n",
" Cardiac glycosides | \n",
" Positive inotropic drugs | \n",
" Cardiovascular System | \n",
" 2561506 | \n",
" 8415584.68 | \n",
" 3.285405 | \n",
"
\n",
" \n",
" 112 | \n",
" Enoximone | \n",
" Phosphodiesterase Type-3 inhibitors | \n",
" Positive inotropic drugs | \n",
" Cardiovascular System | \n",
" 1 | \n",
" 15.02 | \n",
" 15.020000 | \n",
"
\n",
" \n",
" 113 | \n",
" Bendroflumethiazide | \n",
" Thiazides and related diuretics | \n",
" Diuretics | \n",
" Cardiovascular System | \n",
" 5895469 | \n",
" 4962387.04 | \n",
" 0.841729 | \n",
"
\n",
" \n",
" 114 | \n",
" Chlorothiazide | \n",
" Thiazides and related diuretics | \n",
" Diuretics | \n",
" Cardiovascular System | \n",
" 2314 | \n",
" 106645.49 | \n",
" 46.087074 | \n",
"
\n",
" \n",
" 115 | \n",
" Chlortalidone | \n",
" Thiazides and related diuretics | \n",
" Diuretics | \n",
" Cardiovascular System | \n",
" 15546 | \n",
" 1154916.51 | \n",
" 74.290268 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 286 | \n",
" Gemfibrozil | \n",
" Lipid-regulating drugs | \n",
" Lipid-regulating drugs | \n",
" Cardiovascular System | \n",
" 6040 | \n",
" 198960.84 | \n",
" 32.940536 | \n",
"
\n",
" \n",
" 287 | \n",
" Nicotinic acid | \n",
" Lipid-regulating drugs | \n",
" Lipid-regulating drugs | \n",
" Cardiovascular System | \n",
" 4 | \n",
" 133.32 | \n",
" 33.330000 | \n",
"
\n",
" \n",
" 288 | \n",
" Pravastatin sodium | \n",
" Lipid-regulating drugs | \n",
" Lipid-regulating drugs | \n",
" Cardiovascular System | \n",
" 2233790 | \n",
" 4518365.47 | \n",
" 2.022735 | \n",
"
\n",
" \n",
" 289 | \n",
" Simvastatin | \n",
" Lipid-regulating drugs | \n",
" Lipid-regulating drugs | \n",
" Cardiovascular System | \n",
" 13666725 | \n",
" 14597488.57 | \n",
" 1.068104 | \n",
"
\n",
" \n",
" 290 | \n",
" Sodium tetradecyl sulphate | \n",
" Local sclerosants | \n",
" Local sclerosants | \n",
" Cardiovascular System | \n",
" 2 | \n",
" 9.08 | \n",
" 4.540000 | \n",
"
\n",
" \n",
"
\n",
"
180 rows × 7 columns
\n",
"
"
],
"text/plain": [
" subtance_name paragraph_name \\\n",
"111 Digoxin Cardiac glycosides \n",
"112 Enoximone Phosphodiesterase Type-3 inhibitors \n",
"113 Bendroflumethiazide Thiazides and related diuretics \n",
"114 Chlorothiazide Thiazides and related diuretics \n",
"115 Chlortalidone Thiazides and related diuretics \n",
".. ... ... \n",
"286 Gemfibrozil Lipid-regulating drugs \n",
"287 Nicotinic acid Lipid-regulating drugs \n",
"288 Pravastatin sodium Lipid-regulating drugs \n",
"289 Simvastatin Lipid-regulating drugs \n",
"290 Sodium tetradecyl sulphate Local sclerosants \n",
"\n",
" section_name chapter_name total_items \\\n",
"111 Positive inotropic drugs Cardiovascular System 2561506 \n",
"112 Positive inotropic drugs Cardiovascular System 1 \n",
"113 Diuretics Cardiovascular System 5895469 \n",
"114 Diuretics Cardiovascular System 2314 \n",
"115 Diuretics Cardiovascular System 15546 \n",
".. ... ... ... \n",
"286 Lipid-regulating drugs Cardiovascular System 6040 \n",
"287 Lipid-regulating drugs Cardiovascular System 4 \n",
"288 Lipid-regulating drugs Cardiovascular System 2233790 \n",
"289 Lipid-regulating drugs Cardiovascular System 13666725 \n",
"290 Local sclerosants Cardiovascular System 2 \n",
"\n",
" total_cost cost_per_item \n",
"111 8415584.68 3.285405 \n",
"112 15.02 15.020000 \n",
"113 4962387.04 0.841729 \n",
"114 106645.49 46.087074 \n",
"115 1154916.51 74.290268 \n",
".. ... ... \n",
"286 198960.84 32.940536 \n",
"287 133.32 33.330000 \n",
"288 4518365.47 2.022735 \n",
"289 14597488.57 1.068104 \n",
"290 9.08 4.540000 \n",
"\n",
"[180 rows x 7 columns]"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = df.query(\"chapter_name == 'Cardiovascular System'\")\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We have 180 rows. Let's just keep the top 50 with the highest item count, which we can do by sorting and taking the top 50 rows:"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" subtance_name | \n",
" paragraph_name | \n",
" section_name | \n",
" chapter_name | \n",
" total_items | \n",
" total_cost | \n",
" cost_per_item | \n",
"
\n",
" \n",
" \n",
" \n",
" 279 | \n",
" Atorvastatin | \n",
" Lipid-regulating drugs | \n",
" Lipid-regulating drugs | \n",
" Cardiovascular System | \n",
" 65475356 | \n",
" 1.517661e+08 | \n",
" 2.317912 | \n",
"
\n",
" \n",
" 217 | \n",
" Amlodipine | \n",
" Calcium-channel blockers | \n",
" Nitrates, calcium-channel blockers & other ant... | \n",
" Cardiovascular System | \n",
" 37421399 | \n",
" 3.695931e+07 | \n",
" 0.987652 | \n",
"
\n",
" \n",
" 191 | \n",
" Ramipril | \n",
" Renin-angiotensin system drugs | \n",
" Hypertension and heart failure | \n",
" Cardiovascular System | \n",
" 33447735 | \n",
" 4.859265e+07 | \n",
" 1.452794 | \n",
"
\n",
" \n",
" 151 | \n",
" Bisoprolol fumarate | \n",
" Beta-adrenoceptor blocking drugs | \n",
" Beta-adrenoceptor blocking drugs | \n",
" Cardiovascular System | \n",
" 28799455 | \n",
" 2.899027e+07 | \n",
" 1.006626 | \n",
"
\n",
" \n",
" 259 | \n",
" Aspirin | \n",
" Antiplatelet drugs | \n",
" Antiplatelet drugs | \n",
" Cardiovascular System | \n",
" 20199764 | \n",
" 1.858923e+07 | \n",
" 0.920270 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 173 | \n",
" Indoramin | \n",
" Alpha-adrenoceptor blocking drugs | \n",
" Hypertension and heart failure | \n",
" Cardiovascular System | \n",
" 1 | \n",
" 4.017000e+01 | \n",
" 40.170000 | \n",
"
\n",
" \n",
" 112 | \n",
" Enoximone | \n",
" Phosphodiesterase Type-3 inhibitors | \n",
" Positive inotropic drugs | \n",
" Cardiovascular System | \n",
" 1 | \n",
" 1.502000e+01 | \n",
" 15.020000 | \n",
"
\n",
" \n",
" 131 | \n",
" Co-triamterzide(Triamterene/hydrochlorothiazide) | \n",
" Potassium sparing diuretics and compounds | \n",
" Diuretics | \n",
" Cardiovascular System | \n",
" 1 | \n",
" 9.500000e-01 | \n",
" 0.950000 | \n",
"
\n",
" \n",
" 240 | \n",
" Ephedrine hydrochloride | \n",
" Vasoconstrictor sympathomimetics | \n",
" Sympathomimetics and other vasoconstrictor drugs | \n",
" Cardiovascular System | \n",
" 1 | \n",
" 1.909000e+01 | \n",
" 19.090000 | \n",
"
\n",
" \n",
" 181 | \n",
" Co-zidocapt (Hydrochlorothiazide/captopril) | \n",
" Renin-angiotensin system drugs | \n",
" Hypertension and heart failure | \n",
" Cardiovascular System | \n",
" 1 | \n",
" 1.400000e+01 | \n",
" 14.000000 | \n",
"
\n",
" \n",
"
\n",
"
180 rows × 7 columns
\n",
"
"
],
"text/plain": [
" subtance_name \\\n",
"279 Atorvastatin \n",
"217 Amlodipine \n",
"191 Ramipril \n",
"151 Bisoprolol fumarate \n",
"259 Aspirin \n",
".. ... \n",
"173 Indoramin \n",
"112 Enoximone \n",
"131 Co-triamterzide(Triamterene/hydrochlorothiazide) \n",
"240 Ephedrine hydrochloride \n",
"181 Co-zidocapt (Hydrochlorothiazide/captopril) \n",
"\n",
" paragraph_name \\\n",
"279 Lipid-regulating drugs \n",
"217 Calcium-channel blockers \n",
"191 Renin-angiotensin system drugs \n",
"151 Beta-adrenoceptor blocking drugs \n",
"259 Antiplatelet drugs \n",
".. ... \n",
"173 Alpha-adrenoceptor blocking drugs \n",
"112 Phosphodiesterase Type-3 inhibitors \n",
"131 Potassium sparing diuretics and compounds \n",
"240 Vasoconstrictor sympathomimetics \n",
"181 Renin-angiotensin system drugs \n",
"\n",
" section_name chapter_name \\\n",
"279 Lipid-regulating drugs Cardiovascular System \n",
"217 Nitrates, calcium-channel blockers & other ant... Cardiovascular System \n",
"191 Hypertension and heart failure Cardiovascular System \n",
"151 Beta-adrenoceptor blocking drugs Cardiovascular System \n",
"259 Antiplatelet drugs Cardiovascular System \n",
".. ... ... \n",
"173 Hypertension and heart failure Cardiovascular System \n",
"112 Positive inotropic drugs Cardiovascular System \n",
"131 Diuretics Cardiovascular System \n",
"240 Sympathomimetics and other vasoconstrictor drugs Cardiovascular System \n",
"181 Hypertension and heart failure Cardiovascular System \n",
"\n",
" total_items total_cost cost_per_item \n",
"279 65475356 1.517661e+08 2.317912 \n",
"217 37421399 3.695931e+07 0.987652 \n",
"191 33447735 4.859265e+07 1.452794 \n",
"151 28799455 2.899027e+07 1.006626 \n",
"259 20199764 1.858923e+07 0.920270 \n",
".. ... ... ... \n",
"173 1 4.017000e+01 40.170000 \n",
"112 1 1.502000e+01 15.020000 \n",
"131 1 9.500000e-01 0.950000 \n",
"240 1 1.909000e+01 19.090000 \n",
"181 1 1.400000e+01 14.000000 \n",
"\n",
"[180 rows x 7 columns]"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = df.sort_values(\"total_items\", ascending=False)\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" \n",
"Our data is sorted - let's just take the top 50 rows."
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [],
"source": [
"df = df.head(50)"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" subtance_name | \n",
" paragraph_name | \n",
" section_name | \n",
" chapter_name | \n",
" total_items | \n",
" total_cost | \n",
" cost_per_item | \n",
"
\n",
" \n",
" \n",
" \n",
" 279 | \n",
" Atorvastatin | \n",
" Lipid-regulating drugs | \n",
" Lipid-regulating drugs | \n",
" Cardiovascular System | \n",
" 65475356 | \n",
" 1.517661e+08 | \n",
" 2.317912 | \n",
"
\n",
" \n",
" 217 | \n",
" Amlodipine | \n",
" Calcium-channel blockers | \n",
" Nitrates, calcium-channel blockers & other ant... | \n",
" Cardiovascular System | \n",
" 37421399 | \n",
" 3.695931e+07 | \n",
" 0.987652 | \n",
"
\n",
" \n",
" 191 | \n",
" Ramipril | \n",
" Renin-angiotensin system drugs | \n",
" Hypertension and heart failure | \n",
" Cardiovascular System | \n",
" 33447735 | \n",
" 4.859265e+07 | \n",
" 1.452794 | \n",
"
\n",
" \n",
" 151 | \n",
" Bisoprolol fumarate | \n",
" Beta-adrenoceptor blocking drugs | \n",
" Beta-adrenoceptor blocking drugs | \n",
" Cardiovascular System | \n",
" 28799455 | \n",
" 2.899027e+07 | \n",
" 1.006626 | \n",
"
\n",
" \n",
" 259 | \n",
" Aspirin | \n",
" Antiplatelet drugs | \n",
" Antiplatelet drugs | \n",
" Cardiovascular System | \n",
" 20199764 | \n",
" 1.858923e+07 | \n",
" 0.920270 | \n",
"
\n",
" \n",
" 289 | \n",
" Simvastatin | \n",
" Lipid-regulating drugs | \n",
" Lipid-regulating drugs | \n",
" Cardiovascular System | \n",
" 13666725 | \n",
" 1.459749e+07 | \n",
" 1.068104 | \n",
"
\n",
" \n",
" 205 | \n",
" Losartan potassium | \n",
" Renin-angiotensin system drugs | \n",
" Hypertension and heart failure | \n",
" Cardiovascular System | \n",
" 11912296 | \n",
" 1.589276e+07 | \n",
" 1.334147 | \n",
"
\n",
" \n",
" 260 | \n",
" Clopidogrel | \n",
" Antiplatelet drugs | \n",
" Antiplatelet drugs | \n",
" Cardiovascular System | \n",
" 11352719 | \n",
" 1.459731e+07 | \n",
" 1.285799 | \n",
"
\n",
" \n",
" 121 | \n",
" Furosemide | \n",
" Loop diuretics | \n",
" Diuretics | \n",
" Cardiovascular System | \n",
" 11071618 | \n",
" 1.083850e+07 | \n",
" 0.978945 | \n",
"
\n",
" \n",
" 258 | \n",
" Apixaban | \n",
" Oral anticoagulants | \n",
" Anticoagulants and protamine | \n",
" Cardiovascular System | \n",
" 8470475 | \n",
" 2.388879e+08 | \n",
" 28.202419 | \n",
"
\n",
" \n",
" 203 | \n",
" Candesartan cilexetil | \n",
" Renin-angiotensin system drugs | \n",
" Hypertension and heart failure | \n",
" Cardiovascular System | \n",
" 7909086 | \n",
" 1.863590e+07 | \n",
" 2.356265 | \n",
"
\n",
" \n",
" 186 | \n",
" Lisinopril | \n",
" Renin-angiotensin system drugs | \n",
" Hypertension and heart failure | \n",
" Cardiovascular System | \n",
" 7695206 | \n",
" 1.051503e+07 | \n",
" 1.366439 | \n",
"
\n",
" \n",
" 156 | \n",
" Propranolol hydrochloride | \n",
" Beta-adrenoceptor blocking drugs | \n",
" Beta-adrenoceptor blocking drugs | \n",
" Cardiovascular System | \n",
" 7475845 | \n",
" 2.274102e+07 | \n",
" 3.041933 | \n",
"
\n",
" \n",
" 172 | \n",
" Doxazosin mesilate | \n",
" Alpha-adrenoceptor blocking drugs | \n",
" Hypertension and heart failure | \n",
" Cardiovascular System | \n",
" 7212082 | \n",
" 1.553900e+07 | \n",
" 2.154579 | \n",
"
\n",
" \n",
" 113 | \n",
" Bendroflumethiazide | \n",
" Thiazides and related diuretics | \n",
" Diuretics | \n",
" Cardiovascular System | \n",
" 5895469 | \n",
" 4.962387e+06 | \n",
" 0.841729 | \n",
"
\n",
" \n",
" 117 | \n",
" Indapamide | \n",
" Thiazides and related diuretics | \n",
" Diuretics | \n",
" Cardiovascular System | \n",
" 5668511 | \n",
" 1.131126e+07 | \n",
" 1.995456 | \n",
"
\n",
" \n",
" 268 | \n",
" Rosuvastatin calcium | \n",
" Lipid-regulating drugs | \n",
" Lipid-regulating drugs | \n",
" Cardiovascular System | \n",
" 4840746 | \n",
" 1.040777e+07 | \n",
" 2.150035 | \n",
"
\n",
" \n",
" 249 | \n",
" Edoxaban | \n",
" Oral anticoagulants | \n",
" Anticoagulants and protamine | \n",
" Cardiovascular System | \n",
" 4524522 | \n",
" 2.281015e+08 | \n",
" 50.414491 | \n",
"
\n",
" \n",
" 216 | \n",
" Isosorbide mononitrate | \n",
" Nitrates | \n",
" Nitrates, calcium-channel blockers & other ant... | \n",
" Cardiovascular System | \n",
" 4449173 | \n",
" 1.668716e+07 | \n",
" 3.750620 | \n",
"
\n",
" \n",
" 150 | \n",
" Atenolol | \n",
" Beta-adrenoceptor blocking drugs | \n",
" Beta-adrenoceptor blocking drugs | \n",
" Cardiovascular System | \n",
" 4447011 | \n",
" 6.378309e+06 | \n",
" 1.434291 | \n",
"
\n",
" \n",
" 257 | \n",
" Rivaroxaban | \n",
" Oral anticoagulants | \n",
" Anticoagulants and protamine | \n",
" Cardiovascular System | \n",
" 4275906 | \n",
" 2.256078e+08 | \n",
" 52.762567 | \n",
"
\n",
" \n",
" 220 | \n",
" Felodipine | \n",
" Calcium-channel blockers | \n",
" Nitrates, calcium-channel blockers & other ant... | \n",
" Cardiovascular System | \n",
" 3899144 | \n",
" 2.209864e+07 | \n",
" 5.667563 | \n",
"
\n",
" \n",
" 187 | \n",
" Perindopril erbumine | \n",
" Renin-angiotensin system drugs | \n",
" Hypertension and heart failure | \n",
" Cardiovascular System | \n",
" 3703573 | \n",
" 8.244155e+06 | \n",
" 2.226000 | \n",
"
\n",
" \n",
" 222 | \n",
" Lercanidipine hydrochloride | \n",
" Calcium-channel blockers | \n",
" Nitrates, calcium-channel blockers & other ant... | \n",
" Cardiovascular System | \n",
" 3637908 | \n",
" 2.180692e+07 | \n",
" 5.994357 | \n",
"
\n",
" \n",
" 254 | \n",
" Warfarin sodium | \n",
" Oral anticoagulants | \n",
" Anticoagulants and protamine | \n",
" Cardiovascular System | \n",
" 3502636 | \n",
" 7.181262e+06 | \n",
" 2.050245 | \n",
"
\n",
" \n",
" 124 | \n",
" Spironolactone | \n",
" Potassium-sparing diuretics and aldosterone an... | \n",
" Diuretics | \n",
" Cardiovascular System | \n",
" 3495687 | \n",
" 1.302949e+07 | \n",
" 3.727303 | \n",
"
\n",
" \n",
" 283 | \n",
" Ezetimibe | \n",
" Lipid-regulating drugs | \n",
" Lipid-regulating drugs | \n",
" Cardiovascular System | \n",
" 3243395 | \n",
" 2.758251e+07 | \n",
" 8.504209 | \n",
"
\n",
" \n",
" 111 | \n",
" Digoxin | \n",
" Cardiac glycosides | \n",
" Positive inotropic drugs | \n",
" Cardiovascular System | \n",
" 2561506 | \n",
" 8.415585e+06 | \n",
" 3.285405 | \n",
"
\n",
" \n",
" 288 | \n",
" Pravastatin sodium | \n",
" Lipid-regulating drugs | \n",
" Lipid-regulating drugs | \n",
" Cardiovascular System | \n",
" 2233790 | \n",
" 4.518365e+06 | \n",
" 2.022735 | \n",
"
\n",
" \n",
" 219 | \n",
" Diltiazem hydrochloride | \n",
" Calcium-channel blockers | \n",
" Nitrates, calcium-channel blockers & other ant... | \n",
" Cardiovascular System | \n",
" 2119889 | \n",
" 1.825240e+07 | \n",
" 8.610073 | \n",
"
\n",
" \n",
" 120 | \n",
" Bumetanide | \n",
" Loop diuretics | \n",
" Diuretics | \n",
" Cardiovascular System | \n",
" 1900963 | \n",
" 8.161129e+06 | \n",
" 4.293155 | \n",
"
\n",
" \n",
" 214 | \n",
" Glyceryl trinitrate | \n",
" Nitrates | \n",
" Nitrates, calcium-channel blockers & other ant... | \n",
" Cardiovascular System | \n",
" 1666073 | \n",
" 4.859361e+06 | \n",
" 2.916655 | \n",
"
\n",
" \n",
" 204 | \n",
" Irbesartan | \n",
" Renin-angiotensin system drugs | \n",
" Hypertension and heart failure | \n",
" Cardiovascular System | \n",
" 1605469 | \n",
" 2.701446e+06 | \n",
" 1.682652 | \n",
"
\n",
" \n",
" 125 | \n",
" Eplerenone | \n",
" Potassium-sparing diuretics and aldosterone an... | \n",
" Diuretics | \n",
" Cardiovascular System | \n",
" 1474357 | \n",
" 9.841154e+06 | \n",
" 6.674879 | \n",
"
\n",
" \n",
" 225 | \n",
" Nifedipine | \n",
" Calcium-channel blockers | \n",
" Nitrates, calcium-channel blockers & other ant... | \n",
" Cardiovascular System | \n",
" 1275775 | \n",
" 1.620652e+07 | \n",
" 12.703277 | \n",
"
\n",
" \n",
" 201 | \n",
" Sacubitril/valsartan | \n",
" Renin-angiotensin system drugs | \n",
" Hypertension and heart failure | \n",
" Cardiovascular System | \n",
" 1249037 | \n",
" 1.087769e+08 | \n",
" 87.088650 | \n",
"
\n",
" \n",
" 183 | \n",
" Enalapril maleate | \n",
" Renin-angiotensin system drugs | \n",
" Hypertension and heart failure | \n",
" Cardiovascular System | \n",
" 1225636 | \n",
" 3.127985e+06 | \n",
" 2.552132 | \n",
"
\n",
" \n",
" 228 | \n",
" Nicorandil | \n",
" Other antianginal drugs | \n",
" Nitrates, calcium-channel blockers & other ant... | \n",
" Cardiovascular System | \n",
" 980841 | \n",
" 1.810236e+06 | \n",
" 1.845595 | \n",
"
\n",
" \n",
" 230 | \n",
" Ranolazine | \n",
" Other antianginal drugs | \n",
" Nitrates, calcium-channel blockers & other ant... | \n",
" Cardiovascular System | \n",
" 928513 | \n",
" 3.887056e+07 | \n",
" 41.863237 | \n",
"
\n",
" \n",
" 147 | \n",
" Carvedilol | \n",
" Beta-adrenoceptor blocking drugs | \n",
" Beta-adrenoceptor blocking drugs | \n",
" Cardiovascular System | \n",
" 683567 | \n",
" 1.536766e+06 | \n",
" 2.248157 | \n",
"
\n",
" \n",
" 148 | \n",
" Nebivolol | \n",
" Beta-adrenoceptor blocking drugs | \n",
" Beta-adrenoceptor blocking drugs | \n",
" Cardiovascular System | \n",
" 643256 | \n",
" 6.564158e+06 | \n",
" 10.204581 | \n",
"
\n",
" \n",
" 285 | \n",
" Fenofibrate | \n",
" Lipid-regulating drugs | \n",
" Lipid-regulating drugs | \n",
" Cardiovascular System | \n",
" 592177 | \n",
" 2.576592e+06 | \n",
" 4.351050 | \n",
"
\n",
" \n",
" 221 | \n",
" Lacidipine | \n",
" Calcium-channel blockers | \n",
" Nitrates, calcium-channel blockers & other ant... | \n",
" Cardiovascular System | \n",
" 585484 | \n",
" 2.772281e+06 | \n",
" 4.735024 | \n",
"
\n",
" \n",
" 265 | \n",
" Tranexamic acid | \n",
" Antifibrinolytic drugs and haemostatics | \n",
" Antifibrinolytic drugs and haemostatics | \n",
" Cardiovascular System | \n",
" 584162 | \n",
" 3.454837e+06 | \n",
" 5.914176 | \n",
"
\n",
" \n",
" 229 | \n",
" Ivabradine | \n",
" Other antianginal drugs | \n",
" Nitrates, calcium-channel blockers & other ant... | \n",
" Cardiovascular System | \n",
" 577052 | \n",
" 1.386686e+07 | \n",
" 24.030516 | \n",
"
\n",
" \n",
" 226 | \n",
" Verapamil hydrochloride | \n",
" Calcium-channel blockers | \n",
" Nitrates, calcium-channel blockers & other ant... | \n",
" Cardiovascular System | \n",
" 574629 | \n",
" 4.675124e+06 | \n",
" 8.135899 | \n",
"
\n",
" \n",
" 264 | \n",
" Ticagrelor | \n",
" Antiplatelet drugs | \n",
" Antiplatelet drugs | \n",
" Cardiovascular System | \n",
" 514604 | \n",
" 2.817548e+07 | \n",
" 54.751768 | \n",
"
\n",
" \n",
" 138 | \n",
" Flecainide acetate | \n",
" Drugs for arrhythmias | \n",
" Anti-arrhythmic drugs | \n",
" Cardiovascular System | \n",
" 460073 | \n",
" 1.705849e+06 | \n",
" 3.707778 | \n",
"
\n",
" \n",
" 135 | \n",
" Amiodarone hydrochloride | \n",
" Drugs for arrhythmias | \n",
" Anti-arrhythmic drugs | \n",
" Cardiovascular System | \n",
" 413624 | \n",
" 7.689201e+05 | \n",
" 1.858983 | \n",
"
\n",
" \n",
" 157 | \n",
" Sotalol hydrochloride | \n",
" Beta-adrenoceptor blocking drugs | \n",
" Beta-adrenoceptor blocking drugs | \n",
" Cardiovascular System | \n",
" 405826 | \n",
" 1.000734e+06 | \n",
" 2.465919 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" subtance_name \\\n",
"279 Atorvastatin \n",
"217 Amlodipine \n",
"191 Ramipril \n",
"151 Bisoprolol fumarate \n",
"259 Aspirin \n",
"289 Simvastatin \n",
"205 Losartan potassium \n",
"260 Clopidogrel \n",
"121 Furosemide \n",
"258 Apixaban \n",
"203 Candesartan cilexetil \n",
"186 Lisinopril \n",
"156 Propranolol hydrochloride \n",
"172 Doxazosin mesilate \n",
"113 Bendroflumethiazide \n",
"117 Indapamide \n",
"268 Rosuvastatin calcium \n",
"249 Edoxaban \n",
"216 Isosorbide mononitrate \n",
"150 Atenolol \n",
"257 Rivaroxaban \n",
"220 Felodipine \n",
"187 Perindopril erbumine \n",
"222 Lercanidipine hydrochloride \n",
"254 Warfarin sodium \n",
"124 Spironolactone \n",
"283 Ezetimibe \n",
"111 Digoxin \n",
"288 Pravastatin sodium \n",
"219 Diltiazem hydrochloride \n",
"120 Bumetanide \n",
"214 Glyceryl trinitrate \n",
"204 Irbesartan \n",
"125 Eplerenone \n",
"225 Nifedipine \n",
"201 Sacubitril/valsartan \n",
"183 Enalapril maleate \n",
"228 Nicorandil \n",
"230 Ranolazine \n",
"147 Carvedilol \n",
"148 Nebivolol \n",
"285 Fenofibrate \n",
"221 Lacidipine \n",
"265 Tranexamic acid \n",
"229 Ivabradine \n",
"226 Verapamil hydrochloride \n",
"264 Ticagrelor \n",
"138 Flecainide acetate \n",
"135 Amiodarone hydrochloride \n",
"157 Sotalol hydrochloride \n",
"\n",
" paragraph_name \\\n",
"279 Lipid-regulating drugs \n",
"217 Calcium-channel blockers \n",
"191 Renin-angiotensin system drugs \n",
"151 Beta-adrenoceptor blocking drugs \n",
"259 Antiplatelet drugs \n",
"289 Lipid-regulating drugs \n",
"205 Renin-angiotensin system drugs \n",
"260 Antiplatelet drugs \n",
"121 Loop diuretics \n",
"258 Oral anticoagulants \n",
"203 Renin-angiotensin system drugs \n",
"186 Renin-angiotensin system drugs \n",
"156 Beta-adrenoceptor blocking drugs \n",
"172 Alpha-adrenoceptor blocking drugs \n",
"113 Thiazides and related diuretics \n",
"117 Thiazides and related diuretics \n",
"268 Lipid-regulating drugs \n",
"249 Oral anticoagulants \n",
"216 Nitrates \n",
"150 Beta-adrenoceptor blocking drugs \n",
"257 Oral anticoagulants \n",
"220 Calcium-channel blockers \n",
"187 Renin-angiotensin system drugs \n",
"222 Calcium-channel blockers \n",
"254 Oral anticoagulants \n",
"124 Potassium-sparing diuretics and aldosterone an... \n",
"283 Lipid-regulating drugs \n",
"111 Cardiac glycosides \n",
"288 Lipid-regulating drugs \n",
"219 Calcium-channel blockers \n",
"120 Loop diuretics \n",
"214 Nitrates \n",
"204 Renin-angiotensin system drugs \n",
"125 Potassium-sparing diuretics and aldosterone an... \n",
"225 Calcium-channel blockers \n",
"201 Renin-angiotensin system drugs \n",
"183 Renin-angiotensin system drugs \n",
"228 Other antianginal drugs \n",
"230 Other antianginal drugs \n",
"147 Beta-adrenoceptor blocking drugs \n",
"148 Beta-adrenoceptor blocking drugs \n",
"285 Lipid-regulating drugs \n",
"221 Calcium-channel blockers \n",
"265 Antifibrinolytic drugs and haemostatics \n",
"229 Other antianginal drugs \n",
"226 Calcium-channel blockers \n",
"264 Antiplatelet drugs \n",
"138 Drugs for arrhythmias \n",
"135 Drugs for arrhythmias \n",
"157 Beta-adrenoceptor blocking drugs \n",
"\n",
" section_name chapter_name \\\n",
"279 Lipid-regulating drugs Cardiovascular System \n",
"217 Nitrates, calcium-channel blockers & other ant... Cardiovascular System \n",
"191 Hypertension and heart failure Cardiovascular System \n",
"151 Beta-adrenoceptor blocking drugs Cardiovascular System \n",
"259 Antiplatelet drugs Cardiovascular System \n",
"289 Lipid-regulating drugs Cardiovascular System \n",
"205 Hypertension and heart failure Cardiovascular System \n",
"260 Antiplatelet drugs Cardiovascular System \n",
"121 Diuretics Cardiovascular System \n",
"258 Anticoagulants and protamine Cardiovascular System \n",
"203 Hypertension and heart failure Cardiovascular System \n",
"186 Hypertension and heart failure Cardiovascular System \n",
"156 Beta-adrenoceptor blocking drugs Cardiovascular System \n",
"172 Hypertension and heart failure Cardiovascular System \n",
"113 Diuretics Cardiovascular System \n",
"117 Diuretics Cardiovascular System \n",
"268 Lipid-regulating drugs Cardiovascular System \n",
"249 Anticoagulants and protamine Cardiovascular System \n",
"216 Nitrates, calcium-channel blockers & other ant... Cardiovascular System \n",
"150 Beta-adrenoceptor blocking drugs Cardiovascular System \n",
"257 Anticoagulants and protamine Cardiovascular System \n",
"220 Nitrates, calcium-channel blockers & other ant... Cardiovascular System \n",
"187 Hypertension and heart failure Cardiovascular System \n",
"222 Nitrates, calcium-channel blockers & other ant... Cardiovascular System \n",
"254 Anticoagulants and protamine Cardiovascular System \n",
"124 Diuretics Cardiovascular System \n",
"283 Lipid-regulating drugs Cardiovascular System \n",
"111 Positive inotropic drugs Cardiovascular System \n",
"288 Lipid-regulating drugs Cardiovascular System \n",
"219 Nitrates, calcium-channel blockers & other ant... Cardiovascular System \n",
"120 Diuretics Cardiovascular System \n",
"214 Nitrates, calcium-channel blockers & other ant... Cardiovascular System \n",
"204 Hypertension and heart failure Cardiovascular System \n",
"125 Diuretics Cardiovascular System \n",
"225 Nitrates, calcium-channel blockers & other ant... Cardiovascular System \n",
"201 Hypertension and heart failure Cardiovascular System \n",
"183 Hypertension and heart failure Cardiovascular System \n",
"228 Nitrates, calcium-channel blockers & other ant... Cardiovascular System \n",
"230 Nitrates, calcium-channel blockers & other ant... Cardiovascular System \n",
"147 Beta-adrenoceptor blocking drugs Cardiovascular System \n",
"148 Beta-adrenoceptor blocking drugs Cardiovascular System \n",
"285 Lipid-regulating drugs Cardiovascular System \n",
"221 Nitrates, calcium-channel blockers & other ant... Cardiovascular System \n",
"265 Antifibrinolytic drugs and haemostatics Cardiovascular System \n",
"229 Nitrates, calcium-channel blockers & other ant... Cardiovascular System \n",
"226 Nitrates, calcium-channel blockers & other ant... Cardiovascular System \n",
"264 Antiplatelet drugs Cardiovascular System \n",
"138 Anti-arrhythmic drugs Cardiovascular System \n",
"135 Anti-arrhythmic drugs Cardiovascular System \n",
"157 Beta-adrenoceptor blocking drugs Cardiovascular System \n",
"\n",
" total_items total_cost cost_per_item \n",
"279 65475356 1.517661e+08 2.317912 \n",
"217 37421399 3.695931e+07 0.987652 \n",
"191 33447735 4.859265e+07 1.452794 \n",
"151 28799455 2.899027e+07 1.006626 \n",
"259 20199764 1.858923e+07 0.920270 \n",
"289 13666725 1.459749e+07 1.068104 \n",
"205 11912296 1.589276e+07 1.334147 \n",
"260 11352719 1.459731e+07 1.285799 \n",
"121 11071618 1.083850e+07 0.978945 \n",
"258 8470475 2.388879e+08 28.202419 \n",
"203 7909086 1.863590e+07 2.356265 \n",
"186 7695206 1.051503e+07 1.366439 \n",
"156 7475845 2.274102e+07 3.041933 \n",
"172 7212082 1.553900e+07 2.154579 \n",
"113 5895469 4.962387e+06 0.841729 \n",
"117 5668511 1.131126e+07 1.995456 \n",
"268 4840746 1.040777e+07 2.150035 \n",
"249 4524522 2.281015e+08 50.414491 \n",
"216 4449173 1.668716e+07 3.750620 \n",
"150 4447011 6.378309e+06 1.434291 \n",
"257 4275906 2.256078e+08 52.762567 \n",
"220 3899144 2.209864e+07 5.667563 \n",
"187 3703573 8.244155e+06 2.226000 \n",
"222 3637908 2.180692e+07 5.994357 \n",
"254 3502636 7.181262e+06 2.050245 \n",
"124 3495687 1.302949e+07 3.727303 \n",
"283 3243395 2.758251e+07 8.504209 \n",
"111 2561506 8.415585e+06 3.285405 \n",
"288 2233790 4.518365e+06 2.022735 \n",
"219 2119889 1.825240e+07 8.610073 \n",
"120 1900963 8.161129e+06 4.293155 \n",
"214 1666073 4.859361e+06 2.916655 \n",
"204 1605469 2.701446e+06 1.682652 \n",
"125 1474357 9.841154e+06 6.674879 \n",
"225 1275775 1.620652e+07 12.703277 \n",
"201 1249037 1.087769e+08 87.088650 \n",
"183 1225636 3.127985e+06 2.552132 \n",
"228 980841 1.810236e+06 1.845595 \n",
"230 928513 3.887056e+07 41.863237 \n",
"147 683567 1.536766e+06 2.248157 \n",
"148 643256 6.564158e+06 10.204581 \n",
"285 592177 2.576592e+06 4.351050 \n",
"221 585484 2.772281e+06 4.735024 \n",
"265 584162 3.454837e+06 5.914176 \n",
"229 577052 1.386686e+07 24.030516 \n",
"226 574629 4.675124e+06 8.135899 \n",
"264 514604 2.817548e+07 54.751768 \n",
"138 460073 1.705849e+06 3.707778 \n",
"135 413624 7.689201e+05 1.858983 \n",
"157 405826 1.000734e+06 2.465919 "
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" \n",
"\n",
"# Exporting the Data\n",
"\n",
"Finally let's save the cleaned and filtered data as a `CSV` to use with Vega-lite."
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [],
"source": [
"df.to_csv(\"sample_data/cardiovascular_system_substances.csv\", index=False)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}