{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Tce3stUlHN0L"
      },
      "source": [
        "##### Copyright 2019 The TensorFlow IO Authors."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "cellView": "form",
        "id": "tuOe1ymfHZPu"
      },
      "outputs": [],
      "source": [
        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qFdPvlXBOdUN"
      },
      "source": [
        "# Decode DICOM files for medical imaging"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MfBg1C5NB3X0"
      },
      "source": [
        "<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://www.tensorflow.org/io/tutorials/dicom\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/io/blob/master/docs/tutorials/dicom.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://github.com/tensorflow/io/blob/master/docs/tutorials/dicom.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
        "  </td>\n",
        "      <td>\n",
        "    <a href=\"https://storage.googleapis.com/tensorflow_docs/io/docs/tutorials/dicom.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a>\n",
        "  </td>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "xHxb-dlhMIzW"
      },
      "source": [
        "## Overview\n",
        "\n",
        "This tutorial shows how to use `tfio.image.decode_dicom_image` in TensorFlow IO to decode DICOM files with TensorFlow."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MUXex9ctTuDB"
      },
      "source": [
        "## Setup and Usage"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "4YsfgDMZW5g6"
      },
      "source": [
        "#### Download DICOM image\n",
        "\n",
        "The DICOM image used in this tutorial is from the [NIH Chest X-ray dataset](https://cloud.google.com/healthcare/docs/resources/public-datasets/nih-chest).\n",
        "\n",
        "The NIH Chest X-ray dataset consists of 100,000 de-identified images of chest x-rays in PNG format, provided by NIH Clinical Center and could be downloaded through [this link](https://nihcc.app.box.com/v/ChestXray-NIHCC).\n",
        "\n",
        "Google Cloud also provides a DICOM version of the images, available in [Cloud Storage](https://cloud.google.com/healthcare/docs/resources/public-datasets/nih-chest).\n",
        "\n",
        "In this tutorial, you will download a sample file of the dataset from the [GitHub repo](https://github.com/tensorflow/io/raw/master/docs/tutorials/dicom/dicom_00000001_000.dcm)\n",
        "\n",
        "\n",
        "Note: For more information about the dataset, please find the following reference:\n",
        "\n",
        "- Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Mohammadhadi Bagheri, Ronald Summers, ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases, IEEE CVPR, pp. 3462-3471, 2017\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Tu01THzWcE-J"
      },
      "outputs": [],
      "source": [
        "!curl -OL https://github.com/tensorflow/io/raw/master/docs/tutorials/dicom/dicom_00000001_000.dcm\n",
        "!ls -l dicom_00000001_000.dcm"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "upgCc3gXybsA"
      },
      "source": [
        "### Install required Packages, and restart runtime"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "NwL3fEMQuZrk"
      },
      "outputs": [],
      "source": [
        "try:\n",
        "  # Use the Colab's preinstalled TensorFlow 2.x\n",
        "  %tensorflow_version 2.x \n",
        "except:\n",
        "  pass"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "uUDYyMZRfkX4"
      },
      "outputs": [],
      "source": [
        "!pip install tensorflow-io"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "yZmI7l_GykcW"
      },
      "source": [
        "### Decode DICOM image"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "YUj0878jPyz7"
      },
      "outputs": [],
      "source": [
        "import matplotlib.pyplot as plt\n",
        "import numpy as np\n",
        "\n",
        "import tensorflow as tf"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "zK7IEukfuUuF"
      },
      "outputs": [],
      "source": [
        "import tensorflow_io as tfio\n",
        "\n",
        "image_bytes = tf.io.read_file('dicom_00000001_000.dcm')\n",
        "\n",
        "image = tfio.image.decode_dicom_image(image_bytes, dtype=tf.uint16)\n",
        "\n",
        "skipped = tfio.image.decode_dicom_image(image_bytes, on_error='skip', dtype=tf.uint8)\n",
        "\n",
        "lossy_image = tfio.image.decode_dicom_image(image_bytes, scale='auto', on_error='lossy', dtype=tf.uint8)\n",
        "\n",
        "\n",
        "fig, axes = plt.subplots(1,2, figsize=(10,10))\n",
        "axes[0].imshow(np.squeeze(image.numpy()), cmap='gray')\n",
        "axes[0].set_title('image')\n",
        "axes[1].imshow(np.squeeze(lossy_image.numpy()), cmap='gray')\n",
        "axes[1].set_title('lossy image');"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "VbkKcNZunw3N"
      },
      "source": [
        "### Decode DICOM Metadata and working with Tags"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "D7tuwYksn8e7"
      },
      "source": [
        "`decode_dicom_data` decodes tag information. `dicom_tags` contains useful information as the patient's age and sex, so you can use DICOM tags such as `dicom_tags.PatientsAge` and `dicom_tags.PatientsSex`. tensorflow_io borrow the same tag notation from the pydicom dicom package."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "OqHkXwF0oI3L"
      },
      "outputs": [],
      "source": [
        "tag_id = tfio.image.dicom_tags.PatientsAge\n",
        "tag_value = tfio.image.decode_dicom_data(image_bytes,tag_id)\n",
        "print(tag_value)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "J2wZ-7OcoPPs"
      },
      "outputs": [],
      "source": [
        "print(f\"PatientsAge : {tag_value.numpy().decode('UTF-8')}\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Ce6ymbskoTOe"
      },
      "outputs": [],
      "source": [
        "tag_id = tfio.image.dicom_tags.PatientsSex\n",
        "tag_value = tfio.image.decode_dicom_data(image_bytes,tag_id)\n",
        "print(f\"PatientsSex : {tag_value.numpy().decode('UTF-8')}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "WodUv8O1VKmr"
      },
      "source": [
        "## Documentation\n",
        "This package has two operations which wrap `DCMTK` functions. `decode_dicom_image` decodes the pixel data from DICOM files, and `decode_dicom_data` decodes tag information. `tags` contains useful DICOM tags such as `tags.PatientsName`. The tag notation is borrowed from the [`pydicom`](https://pydicom.github.io/) dicom package.\n",
        "\n",
        "### Getting DICOM Image Data\n",
        "```python\n",
        "io.dicom.decode_dicom_image(\n",
        "    contents,\n",
        "    color_dim=False,\n",
        "    on_error='skip',\n",
        "    scale='preserve',\n",
        "    dtype=tf.uint16,\n",
        "    name=None\n",
        ")\n",
        "```\n",
        "\n",
        " - **`contents`**: A Tensor of type string. 0-D. The byte string encoded DICOM file\n",
        " - **`color_dim`**: An optional `bool`. Defaults to `False`. If `True`, a third channel will be appended to all images forming a 3-D tensor. A 1024 x 1024 grayscale image will be 1024 x 1024 x 1\n",
        " - **`on_error`**: Defaults to `skip`. This attribute establishes the behavior in case an error occurs on opening the image or if the output type cannot accomodate all the possible input values. For example if the user sets the output dtype to `tf.uint8`, but a dicom image stores a `tf.uint16` type. `strict` throws an error. `skip` returns a 1-D empty tensor.  `lossy` continues with the operation scaling the value via the `scale` attribute. \n",
        " - **`scale`**:  Defaults to `preserve`. This attribute establishes what to do with the scale of the input values. `auto` will autoscale the input values, if the output type is integer, `auto` will use the maximum output scale for example a `uint8` which stores values from [0, 255] can be linearly stretched to fill a `uint16` that is [0,65535]. If the output is float, `auto` will scale to [0,1]. `preserve` keeps the values as they are, an input value greater than the maximum possible output will be clipped. \n",
        " - **`dtype`**: An optional `tf.DType` from: `tf.uint8, tf.uint16, tf.uint32, tf.uint64, tf.float16, tf.float32, tf.float64`. Defaults to `tf.uint16`. \n",
        " - **`name`**: A name for the operation (optional).\n",
        " \n",
        " **Returns**\n",
        "   A `Tensor` of type `dtype` and the shape is determined by the DICOM file. \n",
        "\n",
        "### Getting DICOM Tag Data\n",
        "```python\n",
        "io.dicom.decode_dicom_data(\n",
        "    contents,\n",
        "    tags=None,\n",
        "    name=None\n",
        ")\n",
        "```\n",
        "\n",
        " - **`contents`**: A Tensor of type string. 0-D. The byte string encoded DICOM file\n",
        " - **`tags`**: A Tensor of type `tf.uint32` of any dimension. These `uint32` numbers map directly to DICOM tags\n",
        " - **`name`**: A name for the operation (optional).\n",
        "\n",
        " **Returns**\n",
        "   A `Tensor` of type `tf.string` and same shape as `tags`.  If a dicom tag is a list of strings, they are combined into one string and seperated by a double backslash `\\\\`. There is a bug in [DCMTK](https://support.dcmtk.org/docs/) if the tag is a list of numbers, only the zeroth element will be returned as a string.\n",
        "\n",
        "\n",
        "\n",
        "### Bibtex\n",
        "\n",
        "If this package helped, please kindly cite the below:\n",
        "\n",
        "```\n",
        "@misc{marcelo_lerendegui_2019_3337331,\n",
        "  author       = {Marcelo Lerendegui and\n",
        "                  Ouwen Huang},\n",
        "  title        = {Tensorflow Dicom Decoder},\n",
        "  month        = jul,\n",
        "  year         = 2019,\n",
        "  doi          = {10.5281/zenodo.3337331},\n",
        "  url          = {https://doi.org/10.5281/zenodo.3337331}\n",
        "}\n",
        "```\n",
        "\n",
        "### License\n",
        "\n",
        "Copyright 2019 Marcelo Lerendegui, Ouwen Huang, Gradient Health Inc.\n",
        "\n",
        "Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "you may not use this file except in compliance with the License.\n",
        "You may obtain a copy of the License at\n",
        "\n",
        "   http://www.apache.org/licenses/LICENSE-2.0\n",
        "\n",
        "Unless required by applicable law or agreed to in writing, software\n",
        "distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "See the License for the specific language governing permissions and\n",
        "limitations under the License."
      ]
    }
  ],
  "metadata": {
    "colab": {
      "collapsed_sections": [
        "Tce3stUlHN0L"
      ],
      "name": "dicom.ipynb",
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}