{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## AI for Medicine Course 1 Week 1 lecture exercises"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Data Exploration\n",
"In the first assignment of this course, you will work with chest X-ray images from the public [ChestX-ray8 dataset](https://arxiv.org/abs/1705.02315). In this notebook, you'll explore this dataset and get familiar with some of the techniques you'll use in the first graded assignment.\n",
"\n",
"\n",
"Before writing code for any machine learning project, the first step is to explore your data. A standard Python package for analyzing and manipulating data is [pandas](https://pandas.pydata.org/docs/#).\n",
"\n",
"In the next two code cells, you'll import `pandas` and `numpy` (a package for numerical computation), then use `pandas` to read a CSV file into a DataFrame and print the first few rows of data."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# Import necessary packages\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"%matplotlib inline\n",
"import os\n",
"import seaborn as sns\n",
"sns.set()"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"There are 1000 rows and 16 columns in this data frame\n"
]
},
{
"data": {
"text/html": [
"|   | Image | Atelectasis | Cardiomegaly | Consolidation | Edema | Effusion | Emphysema | Fibrosis | Hernia | Infiltration | Mass | Nodule | PatientId | Pleural_Thickening | Pneumonia | Pneumothorax |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 00008270_015.png | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8270 | 0 | 0 | 0 |\n",
"| 1 | 00029855_001.png | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 29855 | 0 | 0 | 0 |\n",
"| 2 | 00001297_000.png | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1297 | 1 | 0 | 0 |\n",
"| 3 | 00012359_002.png | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 12359 | 0 | 0 | 0 |\n",
"| 4 | 00017951_001.png | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 17951 | 0 | 0 | 0 |\n", "