{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Packaging the AI with Docker"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Docker is a software that is used to create containers. These containers are independent units, usually with an installed software that can be run inside the container. Thus it is akin to a virtual maschine, however docker goes one step further and is more independent than a virtual maschine as it creates a standardized brige between an operating system and the software installed in the container. As benefits, the software contained in a container can be run as long as the docker software is installed in the operating system. This frees up users from the complexities and nightmares usually associated with installing dependencies, and compabilities of different operating systems, settings, and many more varieties within each user's computer. "
]
},
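{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before building anything it is worth checking that Docker itself is installed and allowed to run containers. The two commands below are a minimal sanity check (assuming a standard Docker installation); hello-world is Docker's own test image and simply prints a confirmation message. \n",
"```console\n",
"docker --version \n",
"docker run hello-world \n",
"```"
]
},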
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Making a Docker Image for Training"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First we need to clone the github repository containing the model using: \n",
"```console\n",
"git clone https://github.com/igemsoftware2019/iGemMarburg2019.git \n",
"cd iGemMarburg2019/AI
\n",
"```\n",
"Now we build the docker image using the included process.dockerfile below. The purpose is to build all the required dependencies and run the train.sh script, which does all the steps elaborated in the AI documentation file. The default training steps is 100 and can be changed by changing the NUM\\_STEPS parameter in the train.sh file. Images must be labeled (the label needs to be colony), have corresponding .xml files, and be put in their corresponding folders (either in <path to test images> or <path to train images>). Subsequently, the following command needs to be executed: \n",
"```console\n",
"docker build -f training.dockerfile -t . \n",
"```"
]
},
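{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sketch, the labeled data could be organized as follows; the file names and extensions are only illustrative, what matters is that every image has a matching .xml annotation file and lives in the train or test folder that is later mounted into the container. \n",
"```console\n",
"<path to train images>/\n",
"    colony_001.jpg\n",
"    colony_001.xml\n",
"    colony_002.jpg\n",
"    colony_002.xml\n",
"<path to test images>/\n",
"    colony_101.jpg\n",
"    colony_101.xml\n",
"```"
]
},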
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# training.dockerfile\n",
"# Copyright 2019, iGEM Marburg 2019\n",
"# This program is free software: you can redistribute it and/or modify\n",
"# it under the terms of the GNU General Public License as published by\n",
"# the Free Software Foundation, either version 3 of the License, or\n",
"# (at your option) any later version. This program is distributed in the hope that it will be useful,\n",
"# but WITHOUT ANY WARRANTY; without even the implied warranty of\n",
"# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n",
"# GNU General Public License for more details.\n",
"# You should have received a copy of the GNU General Public License\n",
"# along with this program. If not, see .\n",
"\n",
"FROM tensorflow/tensorflow:1.14.0-py3\n",
"\n",
"RUN pip install --user Cython contextlib2 pillow lxml matplotlib pandas && pip install --user pycocotools\n",
"\n",
"COPY models /tf/models\n",
"COPY train.sh /train.sh\n",
"RUN chmod +x /train.sh\n",
"\n",
"RUN export PYTHONPATH=$PYTHONPATH:/tf/models/research:/tf/models/research/object_detection:/tf/models/research/slim\n",
"RUN cd /tf/models/research && python setup.py build && python setup.py install\n",
"RUN cd /tf/models/research/slim && python setup.py build && python setup.py install\n",
"\n",
"ENTRYPOINT /train.sh"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#train.sh\n",
"#!/usr/bin/env bash\n",
"\n",
"# Copyright 2019, iGEM Marburg 2019\n",
"# This program is free software: you can redistribute it and/or modify\n",
"# it under the terms of the GNU General Public License as published by\n",
"# the Free Software Foundation, either version 3 of the License, or\n",
"# (at your option) any later version. This program is distributed in the hope that it will be useful,\n",
"# but WITHOUT ANY WARRANTY; without even the implied warranty of\n",
"# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n",
"# GNU General Public License for more details.\n",
"# You should have received a copy of the GNU General Public License\n",
"# along with this program. If not, see .\n",
"\n",
"mkdir /tf/trained_model\n",
"\n",
"cd /tf/models/research/object_detection && python xml_to_csv.py\n",
"\n",
"cd /tf/models/research && python generate_tfrecord.py \\\n",
" --csv_input=object_detection/images/train_labels.csv \\\n",
" --image_dir=object_detection/images/train \\\n",
" --output_path=mscoco_train.record\n",
"cd /tf/models/research && python generate_tfrecord.py \\\n",
" --csv_input=object_detection/images/test_labels.csv \\\n",
" --image_dir=object_detection/images/test \\\n",
" --output_path=mscoco_val.record\n",
"\n",
"NUM_STEPS=${NUM_STEPS:-100}\n",
"\n",
"cd /tf/models/research\n",
"\n",
"python model_main.py \\\n",
" --logtostderr \\\n",
" --model_dir=/tf/trained_model \\\n",
" --num_train_steps=${NUM_STEPS} \\\n",
" --train_dir=object_detection/training/ \\\n",
" --pipeline_config_path=object_detection/training/faster_rcnn_resnet101_coco.config\n",
"\n",
"suffix=$(ls /tf/trained_model | grep \".index\" | tail -1 | cut -d '.' -f 2 | cut -d '-' -f 2)\n",
"\n",
"cd /tf/models/research/object_detection\n",
"python export_inference_graph.py \\\n",
" --input_type=image_tensor \\\n",
" --pipeline_config_path=training/faster_rcnn_resnet101_coco.config \\\n",
" --trained_checkpoint_prefix=/tf/trained_model/model.ckpt-${suffix} \\\n",
" --output_directory=inference_graph"
]
},
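{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since train.sh reads NUM\\_STEPS from the environment with a default of 100, the number of training steps can also be overridden when starting the container instead of editing the script. The line below is a minimal sketch: <image name> stands for the tag chosen during docker build, the value 1000 is only an example, and the volume mounts from the next section still have to be added so that training can find the images. \n",
"```console\n",
"docker run --rm -e NUM_STEPS=1000 <image name> \n",
"```"
]
},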
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Running a Docker Container"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Subsequently the image can be run as a container using the following command. \n",
"```console\n",
"docker run \\ \n",
" --rm \\ \n",
" -v :/tf/models/research/object_detection/images/train \\ \n",
" -v :/tf/models/research/object_detection/images/test \\ \n",
" -v