{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Image Classification\n", "Written by Philipp Rudiger
\n", "Created: November 30, 2018
\n", "Last updated: August 3, 2021" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Satellite images often need to be classified (assigned to a fixed set of types) or to be used for detection of various features of interest. Here we will look at the classification case, using labelled satellite images from various categories from the [UCMerced LandUse dataset](http://weegee.vision.ucmerced.edu/datasets/landuse.html). scikit-learn is useful for general numeric data types, but it doesn't have significant support for working with images. Luckily, there are various deep-learning and convolutional-network libraries that do support images well, including Keras (backed by TensorFlow) as we will use here." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import intake\n", "\n", "import numpy as np\n", "import holoviews as hv\n", "import pandas as pd\n", "import random\n", "\n", "from holoviews import opts\n", "hv.extension('bokeh')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Get the classes and files\n", "\n", "All of the labeled image classification data is in a public bucket on s3 with a corresponding intake catalog. This catalog provides several ways to access the data. You can access one image by landuse and id, access all the images for a given landuse, or access all the images. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cat = intake.open_catalog('https://s3.amazonaws.com/earth-data/UCMerced_LandUse/catalog.yml')\n", "list(cat)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first time you run the cell below it will download all the images which takes about 3 minutes on my machine. After that, the images are cached and it'll take about 100 ms." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%time da = cat.UCMerced_LandUse_all().to_dask()\n", "da" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can see what's going on more easily if we convert the data to a dataset with each data variable representing a different landuse. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds = da.to_dataset(dim='landuse')\n", "ds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Split files into train and test sets" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In order to accurately test the performance of the classifier we are building we will split the data into training and test sets with an 80/20 split. We randomly sample the images in each category and assign them either to the training or test set:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "train_set = np.random.choice(ds.id, 80, False)\n", "test_set = np.setdiff1d(ds.id, train_set)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Define function to sample from train or test set" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "landuses = da.landuse.data\n", "landuse_list = list(landuses)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next we define a function that randomly samples an image, either from the training or test set and ant" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def get_sample(landuse=None, set='training'):\n", " landuse = landuse or np.random.choice(landuses)\n", " i = random.choice(train_set if set == 'training' else train_set)\n", " return ds[landuse].sel(id=i)\n", "\n", "def plot(data):\n", " options = opts.RGB(xaxis=None, yaxis=None)\n", " title = '{}: {}'.format(data.name, data.id.item())\n", " plot = hv.RGB(data.data.compute())\n", " return plot.options(options).relabel(title)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can inspect the data on one of these samples to see that the data is loaded as an `xarray.DataArray`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = get_sample()\n", "data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can plot this array as a holoviews RGB image so we can visualize it:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plot(data)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hv.Layout(list(map(plot, map(get_sample, np.random.choice(landuses, 4))))).cols(2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define the model\n", "\n", "Now it's time to define a model. The code snippet below defines a convolutional neural network containing a stack of 3 convolution layers with a ReLU activation and followed by max-pooling layers. This is very similar to the architectures that Yann LeCun advocated in the 1990s for image classification (with the exception of ReLU)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from keras.models import Sequential\n", "from keras.layers import Conv2D, MaxPooling2D\n", "from keras.layers import Activation, Dropout, Flatten, Dense\n", "\n", "size = (150, 150)\n", "\n", "model = Sequential()\n", "model.add(Conv2D(32, (3, 3), input_shape=(*size, 3)))\n", "model.add(Activation('relu'))\n", "model.add(MaxPooling2D(pool_size=(2, 2)))\n", "\n", "model.add(Conv2D(32, (3, 3)))\n", "model.add(Activation('relu'))\n", "model.add(MaxPooling2D(pool_size=(2, 2)))\n", "\n", "model.add(Conv2D(64, (3, 3)))\n", "model.add(Activation('relu'))\n", "model.add(MaxPooling2D(pool_size=(2, 2)))\n", "\n", "model.add(Flatten()) # this converts our 3D feature maps to 1D feature vectors\n", "model.add(Dense(64))\n", "model.add(Activation('relu'))\n", "model.add(Dropout(0.5))\n", "model.add(Dense(21))\n", "model.add(Activation('softmax'))\n", "\n", "model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Declare the data\n", "\n", "The dataset of images we have is relatively small so to avoid overfitting we will want to apply some augmentation to it. The code below defines a generator that randomly samples from either the training or test set and generates a randomly rotated and cropped 150x150 window onto the 256x256 images along with a label array:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def get_array(array, size=(150, 150)):\n", " # Randomly flip\n", " if random.getrandbits(1): # equivalent to random.choice([True, False]) but much faster\n", " array = array.transpose(1, 0, 2)\n", " if random.getrandbits(1):\n", " array = array[::-1]\n", " if random.getrandbits(1):\n", " array = array[:, ::-1]\n", " # Randomly crop\n", " sh, sw = size\n", " h, w = array.shape[:2]\n", " b = np.random.randint(h-sh)\n", " l = np.random.randint(w-sw)\n", " array = array[b:b+sh, l:l+sw]\n", " return array/255.\n", "\n", "# set up an mapping to an identity matrix to use for one-hot encoding\n", "one_hot_mapping = dict(zip(landuses, np.eye(21)))\n", "\n", "def gen_samples(set='training', labels=None):\n", " \"Generates random arrays along with landuse labels\"\n", " while True:\n", " choice = get_sample(set=set)\n", " if labels is not None:\n", " labels.append(choice.name)\n", " one_hot = one_hot_mapping[choice.name]\n", " data = choice.data.compute()\n", " yield get_array(data)[np.newaxis, :], one_hot[np.newaxis, :] " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run the model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before we start running the model let's set up a keras Callback to build a dashboard to monitor the accuracy and loss during training:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import time\n", "from keras.callbacks import Callback\n", "\n", "class MonitorCallback(Callback):\n", " \"\"\"\n", " Builds a streaming dashboard to monitor the accuracy\n", " and loss during training using HoloViews streams.\n", " \"\"\"\n", " \n", " _format = '%s - Epoch: %d - Elapsed time: %.2fs'\n", " \n", " def __init__(self, metrics=['acc', 'loss']):\n", " super().__init__()\n", " sample = {'Epoch': np.array([])}\n", " for metric in metrics:\n", " sample[metric] = np.array([])\n", " self.buffer = hv.streams.Buffer(sample)\n", " dmaps = []\n", " for metric in metrics:\n", " def cb(data, metric=metric):\n", " return hv.Curve(\n", " data, 'Epoch', metric, label=self._format\n", " % (metric, self.epoch, self.elapsed_time))\n", " dmap = hv.DynamicMap(cb, streams=[self.buffer])\n", " dmaps.append(dmap)\n", " self.layout = hv.Layout(dmaps)\n", " self.metrics = metrics\n", " self.start_time = None\n", " self.epoch = 0\n", "\n", " def on_train_begin(self, logs={}):\n", " self.start_time = time.time()\n", " \n", " @property\n", " def elapsed_time(self):\n", " if self.start_time is None:\n", " return 0\n", " else:\n", " return time.time() - self.start_time\n", "\n", " def on_epoch_end(self, epoch, logs=None):\n", " self.epoch += 1\n", " data = {'Epoch': [self.epoch]}\n", " for metric in self.metrics:\n", " data[metric] = [logs.get(metric)]\n", " self.buffer.send(data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can create an instance of the callback and display the dashboard, which will at first appear blank." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "monitor = MonitorCallback()\n", "monitor.layout.options('Curve', width=450)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next we will fit the model with our training data generator, as the model is running it will update the dashboard above:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%time\n", "history = model.fit_generator(gen_samples('training'), steps_per_epoch=50, epochs=500, verbose=False, callbacks=[monitor])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Evaluate the model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First we will have a look at the monitoring output but smooth it slightly so we can make out the overall trend:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from holoviews.operation.timeseries import rolling\n", "monitor.layout.options('Curve', width=400).map(rolling, hv.Curve)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let us test the predictions on the test set, first visually:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def set_title_color(color, *args):\n", " \"\"\"Helper function to set title color\"\"\"\n", " args[0].handles['plot'].title.text_color = color" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def get_prediction(cls):\n", " sample = get_sample(cls, 'test')\n", " array = get_array(sample.data.compute())[np.newaxis, ...]\n", " p = model.predict(array).argmax()\n", " p = landuses[p]\n", " return (plot(sample)\n", " .relabel('Predicted: %s - Actual: %s' % (p, cls))\n", " .options(hooks=[lambda *args: set_title_color('red' if p!=cls else 'blue', *args)]))\n", "\n", "options = dict(fontsize={'title': '8pt'}, width=250, height=250)\n", "hv.Layout([get_prediction(landuse).options(**options) for landuse in landuses]).cols(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And now numerically by running 1,000 predictions on the test set:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ntesting = 1000\n", "labels = []\n", "test_gen = gen_samples('test', labels)\n", "prediction = model.predict_generator(test_gen, steps=ntesting)\n", "y_pred = landuses[prediction.argmax(axis=1)]\n", "y_true = np.array(labels[:ntesting])\n", "\n", "accuracy = (y_pred==y_true).sum()/ntesting\n", "\n", "print(f'Accuracy on test set {accuracy}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next we can see how well the classifier performs on the different categories. To see how it performs we will make 50 predictions on each category and record both the accuracy and the predictions:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def predict(cls, iterations=50):\n", " accurate, predictions = [], []\n", " for i in range(iterations):\n", " sample = get_sample(cls, 'test')\n", " array = get_array(sample.data.compute())[np.newaxis, ...]\n", " p = model.predict(array).argmax()\n", " p = landuses[p]\n", " predictions.append(p)\n", " accurate.append(p == cls)\n", " return np.sum(accurate)/float(iterations), predictions\n", "\n", "accuracies = [(c, *predict(c)) for c in landuses]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now break down the accuracy by landuse category:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df = pd.DataFrame(accuracies, columns=['landuse', 'accuracy', 'predictions'])\n", "\n", "hv.Bars(df, 'landuse', 'accuracy', label='Accuracy by Landuse Category').options(\n", " width=700, xrotation=45, color='landuse', \n", " cmap='Category20', show_legend=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another interesting way of viewing this data is to look at which categories the classifier got confused on. We will count how many times the classifier classified one category as another category and visualize the result as a Chord graph where each edge is colored by the actual category. By clicking on a node we can reveal which other categories incorrectly identified an image as being of that category:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pdf = pd.DataFrame([(p, l) for (_, l, _, ps) in df.itertuples() for p in ps], columns=['Prediction', 'Actual'])\n", "graph = pdf.groupby(['Prediction', 'Actual']).size().to_frame().reset_index()\n", "confusion = graph.rename(columns={0: 'Count'})\n", "\n", "hv.Chord(confusion).relabel('Confusion Graph').options(\n", " node_color='index', cmap='Category20', edge_color='Actual', labels='index',\n", " width=600, height=600)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Clicking on buildings, for instance, reveals a lot of confusion about overpasses, mediumresidential, and intersections, all of which do share visual features in common. Conversely, number of buildings were misidentified as parklots, which is also reasonable. As we saw in the bar chart above, forests on the other hand, have lots of edges leading back to itself, demonstrating the high accuracy observed for that category of images." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "confusion.Count /= 50\n", "hv.HeatMap(confusion, label='Confusion Matrix').sort().options(\n", " xrotation=45, width=500, height=500, cmap='blues', tools=['hover'], invert_yaxis=True, zlim=(0,1))" ] } ], "metadata": { "language_info": { "name": "python", "pygments_lexer": "ipython3" } }, "nbformat": 4, "nbformat_minor": 2 }