{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# EXPLORATION OF TCIA QIN-HEADNECK DATA COLLECTION\n", "\n", "This is a Jupyter Notebook that demonstrates how Python can be used to explore the content of a publicly available DICOM dataset stored on The Cancer Imaging Archive (TCIA) and described here: https://wiki.cancerimagingarchive.net/display/Public/QIN-HEADNECK. \n", "\n", "This notebook was created as part of the preparations for the [DICOM4MICCAI tutorial](http://qiicr.org/dicom4miccai) at the [MICCAI 2017 conference](https://miccai2017.org) on Sept 10, 2017. \n", "\n", "The tutorial was organized by the [Quantitative Image Informatics for Cancer Research (QIICR)](http://qiicr.org) project, funded by the [Informatics Technology for Cancer Research (ITCR)](https://itcr.nci.nih.gov/) program of the National Cancer Institute, award U24 CA180918.\n", "\n", "More pointers related to the material covered in this notebook:\n", "\n", "* DICOM4MICCAI gitbook: https://qiicr.gitbooks.io/dicom4miccai-handson\n", "* dcmqi, conversion between DICOM and quantitative image analysis results: https://github.com/QIICR/dcmqi\n", "* QIICR project GitHub organization: https://github.com/QIICR\n", "* QIICR home page: http://qiicr.org\n", "\n", "## Feedback\n", "\n", "Questions, comments, suggestions, and corrections are welcome!\n", "\n", "Please email `andrey.fedorov@gmail.com`, or [join the discussion on gitter](https://gitter.im/QIICR/dcmqi)!"
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Table of Contents\n", "\n", "* Introduction and prerequisites\n", " * Dataset overview\n", " * Conversion of the DICOM dataset into tabular form\n", " * Python tools\n", "* Exploring the DICOM-stored measurements\n", " * Reading measurements from DICOM SR derived tables\n", " * Linking individual measurements with the images\n", "* Further reading" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction and prerequisites\n", "\n", "The goal of this tutorial is to demonstrate how Python can be used to work with the data produced by quantitative image analysis and stored using the DICOM format. \n", "\n", "You don't need to know much about DICOM to follow along, but you will need to learn more if you want to use DICOM in your work. You will find pointers in the Further reading section." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## DICOM Dataset overview\n", "\n", "The dataset used in this tutorial is discussed in detail in this publication:\n", "\n", "> Fedorov A., Clunie D., Ulrich E., Bauer C., Wahle A., Brown B., Onken M., Riesmeier J., Pieper S., Kikinis R., Buatti J., Beichel RR. _DICOM for quantitative imaging biomarker development: a standards based approach to sharing clinical data and structured PET/CT analysis results in head and neck cancer research_. PeerJ 4:e2057, 2016. 
DOI: [10.7717/peerj.2057](https://dx.doi.org/10.7717/peerj.2057)\n", "\n", "Here is a bird's-eye view of the QIN-HEADNECK dataset: \n", "* 156 subjects with head and neck cancer\n", "* each subject had one or more PET/CT studies (each study is expected to include a CT and a PET DICOM imaging series) for disease staging and treatment response assessment\n", "* images for a subset of 59 subjects were analyzed as follows:\n", " * the primary tumor and the involved lymph nodes were segmented by each of two readers, on two occasions, using [3D Slicer](http://slicer.org), both manually and with the interactive automated segmentation tool described in [(Beichel et al. 2016)](http://onlinelibrary.wiley.com/doi/10.1118/1.4948679/full)\n", " * the reference regions used for PET normalization (cerebellum, liver, and aortic arch) were segmented using automatic tools\n", " * all segmentations were saved as DICOM Segmentation objects (DICOM SEG)\n", " * all PET images were normalized using body weight-based Standardized Uptake Value (SUV)\n", " * quantitative measurements were calculated from the PET images, after applying SUV normalization, for all the regions defined by the segmentations; the SUV normalization factor for each DICOM series was saved into a DICOM Real-World Value Mapping object (DICOM RWVM)\n", " * all resulting measurements were saved as DICOM Structured Report objects following [DICOM SR Template 1500](http://dicom.nema.org/medical/dicom/current/output/chtml/part16/chapter_A.html#sect_TID_1500)\n", " \n", "![](assets/headneck-diagram.jpg)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conversion of the DICOM dataset into tabular form\n", "\n", "The DICOM dataset was converted into a collection of tables using this converter script: https://github.com/QIICR/dcm2tables. 
The script extracts data elements from the DICOM files and stores them as a collection of tab-delimited text files that follow [this schema](https://app.quickdatabasediagrams.com/#/schema/_71V1H1AXEqqKWDnvx4VXw).\n", "\n", "You can download the collection of the extracted tables here: https://github.com/fedorov/dicom4miccai-handson/releases/download/miccai2017/QIN-HEADNECK-Tables.tgz. Uncompress the file, note the location of the resulting directory, and set the value of the variable below to that location." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "tablesPath = '/home/jovyan/data/QIN-HEADNECK-Tables'\n", "# set this to the location of the tables if running locally\n", "#tablesPath = '/Users/fedorov/github/dcm2tables/Tables'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will discuss the contents of the relevant tables generated by this script later in this notebook, as they are used." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Python tools\n", "\n", "In this demonstration we will use the following Python packages:\n", "* [pandas](https://pandas.pydata.org/) for working with the tabular data\n", "* [numpy](http://www.numpy.org/) for numerical operations\n", "* [matplotlib](https://matplotlib.org/index.html), [seaborn](https://seaborn.pydata.org/) and [bokeh](http://bokeh.pydata.org/en/latest/) for plotting\n", "\n", "**NOTE: there appears to be an issue using the (as of writing) latest 0.12.7 version of bokeh for some of the plotting operations in this notebook. If you are using a local installation of bokeh, make sure you are using bokeh 0.12.6!**\n", "\n", "If you are working with this notebook on your own system, you will need to install those packages before they can be imported.\n", "\n", "Run the cell below to confirm that all prerequisite packages are installed properly."
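Once `tablesPath` points at the uncompressed directory, each table is a plain tab-delimited text file that pandas can read directly. The sketch below uses an in-memory stand-in so it runs without the download; the column names are illustrative assumptions, not taken from the actual schema, and the table file names in the real release are not spelled out here, so `<TableName>` is left as a placeholder.

```python
import io
import pandas as pd

# In-memory stand-in for one of the tab-delimited tables produced by
# dcm2tables. The column names below are illustrative assumptions; the
# real columns follow the schema linked above.
sample_table = io.StringIO(
    "SOPInstanceUID\tSeriesInstanceUID\tModality\n"
    "1.2.3.1\t1.2.3\tPT\n"
    "1.2.3.2\t1.2.3\tCT\n"
)

# The same call works on a real file, e.g.
# pd.read_csv(os.path.join(tablesPath, "<TableName>.tsv"), sep="\t")
df = pd.read_csv(sample_table, sep="\t")

print(df["Modality"].tolist())  # → ['PT', 'CT']
```

Keeping each table as a pandas `DataFrame` makes the joins between images, segmentations, and measurements used later in the notebook straightforward.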
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/javascript": [ "\n", "(function(global) {\n", " function now() {\n", " return new Date();\n", " }\n", "\n", " var force = true;\n", "\n", " if (typeof (window._bokeh_onload_callbacks) === \"undefined\" || force === true) {\n", " window._bokeh_onload_callbacks = [];\n", " window._bokeh_is_loading = undefined;\n", " }\n", "\n", "\n", " \n", " if (typeof (window._bokeh_timeout) === \"undefined\" || force === true) {\n", " window._bokeh_timeout = Date.now() + 5000;\n", " window._bokeh_failed_load = false;\n", " }\n", "\n", " var NB_LOAD_WARNING = {'data': {'text/html':\n", " \"\\n\"+\n", " \"BokehJS does not appear to have successfully loaded. If loading BokehJS from CDN, this \\n\"+\n", " \"may be due to a slow or bad network connection. Possible fixes:\\n\"+\n", " \"
\\n\"+\n", " \"\\n\"+\n",
" \"from bokeh.resources import INLINE\\n\"+\n",
" \"output_notebook(resources=INLINE)\\n\"+\n",
" \"
\\n\"+\n",
" \"\\n\"+\n", " \"BokehJS does not appear to have successfully loaded. If loading BokehJS from CDN, this \\n\"+\n", " \"may be due to a slow or bad network connection. Possible fixes:\\n\"+\n", " \"
\\n\"+\n", " \"\\n\"+\n",
" \"from bokeh.resources import INLINE\\n\"+\n",
" \"output_notebook(resources=INLINE)\\n\"+\n",
" \"
\\n\"+\n",
" \"