{ "cells": [ { "attachments": {}, "cell_type": "markdown", "id": "2a261517", "metadata": {}, "source": [ "# Single Image Super Resolution with OpenVINO™\n", "\n", "Super Resolution is the process of enhancing the quality of an image by increasing the pixel count using deep learning. This notebook shows the Single Image Super Resolution (SISR) which takes just one low resolution image. A model called [single-image-super-resolution-1032](https://docs.openvino.ai/2024/omz_models_model_single_image_super_resolution_1032.html), which is available in Open Model Zoo, is used in this tutorial. It is based on the research paper cited below.\n", "\n", "Y. Liu et al., [\"An Attention-Based Approach for Single Image Super Resolution,\"](https://arxiv.org/abs/1807.06779) 2018 24th International Conference on Pattern Recognition (ICPR), 2018, pp. 2777-2784, doi: 10.1109/ICPR.2018.8545760.\n", "\n", "\n", "#### Table of contents:\n", "\n", "- [Preparation](#Preparation)\n", " - [Install requirements](#Install-requirements)\n", " - [Imports](#Imports)\n", " - [Settings](#Settings)\n", " - [Select inference device](#Select-inference-device)\n", " - [Functions](#Functions)\n", "- [Load the Superresolution Model](#Load-the-Superresolution-Model)\n", "- [Load and Show the Input Image](#Load-and-Show-the-Input-Image)\n", "- [Superresolution on a Crop of the Image](#Superresolution-on-a-Crop-of-the-Image)\n", " - [Crop the Input Image once.](#Crop-the-Input-Image-once.)\n", " - [Reshape/Resize Crop for Model Input](#Reshape/Resize-Crop-for-Model-Input)\n", " - [Do Inference](#Do-Inference)\n", " - [Show and Save Results](#Show-and-Save-Results)\n", " - [Save Superresolution and Bicubic Image Crop](#Save-Superresolution-and-Bicubic-Image-Crop)\n", " - [Write Animated GIF with Bicubic/Superresolution Comparison](#Write-Animated-GIF-with-Bicubic/Superresolution-Comparison)\n", " - [Create a Video with Sliding Bicubic/Superresolution Comparison](#Create-a-Video-with-Sliding-Bicubic/Superresolution-Comparison)\n", "- [Superresolution on full input image](#Superresolution-on-full-input-image)\n", " - [Compute patches](#Compute-patches)\n", " - [Do Inference](#Do-Inference)\n", " - [Save superresolution image and the bicubic image](#Save-superresolution-image-and-the-bicubic-image)\n", "\n" ] }, { "attachments": {}, "cell_type": "markdown", "id": "ac2e7027", "metadata": {}, "source": [ "## Preparation\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "### Install requirements\n", "[back to top ⬆️](#Table-of-contents:)\n" ] }, { "cell_type": "code", "execution_count": 1, "id": "67a93878", "metadata": {}, "outputs": [], "source": [ "import platform\n", "\n", "%pip install -q \"openvino>=2023.1.0\"\n", "%pip install -q opencv-python\n", "%pip install -q pillow tqdm\n", "\n", "if platform.system() != \"Windows\":\n", " %pip install -q \"matplotlib>=3.4\"\n", "else:\n", " %pip install -q \"matplotlib>=3.4,<3.7\"" ] }, { "attachments": {}, "cell_type": "markdown", "id": "95b5ce15", "metadata": {}, "source": [ "### Imports\n", "[back to top ⬆️](#Table-of-contents:)\n" ] }, { "cell_type": "code", "execution_count": 2, "id": "9d4b04ac", "metadata": { "tags": [] }, "outputs": [], "source": [ "import os\n", "import time\n", "from pathlib import Path\n", "\n", "import cv2\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "from IPython.display import HTML, FileLink\n", "from IPython.display import Image as DisplayImage\n", "from IPython.display import Pretty, ProgressBar, clear_output, display\n", "from PIL import Image\n", "import openvino as ov" ] }, { "cell_type": "code", "execution_count": 3, "id": "60abfe8c", "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Define a download file helper function\n", "def download_file(url: str, path: Path) -> None:\n", " \"\"\"Download file.\"\"\"\n", " import requests\n", "\n", " path.parent.mkdir(parents=True, exist_ok=True)\n", "\n", " r = requests.get(url)\n", " with path.open(\"wb\") as f:\n", " f.write(r.content)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "fb55a28c", "metadata": {}, "source": [ "### Settings\n", "[back to top ⬆️](#Table-of-contents:)\n" ] }, { "attachments": {}, "cell_type": "markdown", "id": "8a5d8a21-200d-44cd-bf34-91081b8832d4", "metadata": {}, "source": [ "#### Select inference device\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "select device from dropdown list for running inference using OpenVINO" ] }, { "cell_type": "code", "execution_count": 5, "id": "b162fa07-29c0-4bf6-ac48-8d2616ee335a", "metadata": { "jupyter": { "source_hidden": true }, "tags": [ "hide-input" ] }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "94055c24cdc24427b5a90c3cf584e459", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Dropdown(description='Device:', index=2, options=('CPU', 'GPU', 'AUTO'), value='AUTO')" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import ipywidgets as widgets\n", "\n", "core = ov.Core()\n", "device = widgets.Dropdown(\n", " options=core.available_devices + [\"AUTO\"],\n", " value=\"AUTO\",\n", " description=\"Device:\",\n", " disabled=False,\n", ")\n", "\n", "device" ] }, { "cell_type": "code", "execution_count": 6, "id": "322df115", "metadata": { "tags": [] }, "outputs": [], "source": [ "# 1032: 4x superresolution, 1033: 3x superresolution\n", "model_name = \"single-image-super-resolution-1032\"\n", "\n", "base_model_dir = Path(\"./model\").expanduser()\n", "\n", "model_xml_name = f\"{model_name}.xml\"\n", "model_bin_name = f\"{model_name}.bin\"\n", "\n", "model_xml_path = base_model_dir / model_xml_name\n", "model_bin_path = base_model_dir / model_bin_name\n", "\n", "if not model_xml_path.exists():\n", " base_url = f\"https://storage.openvinotoolkit.org/repositories/open_model_zoo/2023.0/models_bin/1/{model_name}/FP16/\"\n", " model_xml_url = base_url + model_xml_name\n", " model_bin_url = base_url + model_bin_name\n", "\n", " download_file(model_xml_url, model_xml_path)\n", " download_file(model_bin_url, model_bin_path)\n", "else:\n", " print(f\"{model_name} already downloaded to {base_model_dir}\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "ac90276e", "metadata": {}, "source": [ "### Functions\n", "[back to top ⬆️](#Table-of-contents:)\n" ] }, { "cell_type": "code", "execution_count": 7, "id": "dd1426c7", "metadata": { "tags": [] }, "outputs": [], "source": [ "def write_text_on_image(image: np.ndarray, text: str) -> np.ndarray:\n", " \"\"\"\n", " Write the specified text in the top left corner of the image\n", " as white text with a black border.\n", "\n", " :param image: image as numpy arry with HWC shape, RGB or BGR\n", " :param text: text to write\n", " :return: image with written text, as numpy array\n", " \"\"\"\n", " font = cv2.FONT_HERSHEY_PLAIN\n", " org = (20, 20)\n", " font_scale = 4\n", " font_color = (255, 255, 255)\n", " line_type = 1\n", " font_thickness = 2\n", " text_color_bg = (0, 0, 0)\n", " x, y = org\n", "\n", " image = cv2.UMat(image)\n", " (text_w, text_h), _ = cv2.getTextSize(text, font, font_scale, font_thickness)\n", " result_im = cv2.rectangle(image, org, (x + text_w, y + text_h), text_color_bg, -1)\n", "\n", " textim = cv2.putText(\n", " result_im,\n", " text,\n", " (x, y + text_h + font_scale - 1),\n", " font,\n", " font_scale,\n", " font_color,\n", " font_thickness,\n", " line_type,\n", " )\n", " return textim.get()\n", "\n", "\n", "def convert_result_to_image(result) -> np.ndarray:\n", " \"\"\"\n", " Convert network result of floating point numbers to image with integer\n", " values from 0-255. Values outside this range are clipped to 0 and 255.\n", "\n", " :param result: a single superresolution network result in N,C,H,W shape\n", " \"\"\"\n", " result = result.squeeze(0).transpose(1, 2, 0)\n", " result *= 255\n", " result[result < 0] = 0\n", " result[result > 255] = 255\n", " result = result.astype(np.uint8)\n", " return result\n", "\n", "\n", "def to_rgb(image_data) -> np.ndarray:\n", " \"\"\"\n", " Convert image_data from BGR to RGB\n", " \"\"\"\n", " return cv2.cvtColor(image_data, cv2.COLOR_BGR2RGB)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "944a341f", "metadata": {}, "source": [ "## Load the Superresolution Model\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "The Super Resolution model expects two inputs: the input image and a bicubic interpolation of the input image to the target size of 1920x1080. It returns the super resolution version of the image in 1920x1800 (for the default superresolution model (1032)).\n", "\n", "Load the model in OpenVINO Runtime with `core.read_model`, compile it for the specified device with `core.compile_model`, and get information about the network inputs and outputs." ] }, { "cell_type": "code", "execution_count": 9, "id": "c090e75e-12ef-4cdf-b542-9a0f35e83b4f", "metadata": { "tags": [] }, "outputs": [], "source": [ "core = ov.Core()\n", "model = core.read_model(model=model_xml_path)\n", "compiled_model = core.compile_model(model=model, device_name=device.value)\n", "\n", "# Network inputs and outputs are dictionaries. Get the keys for the\n", "# dictionaries.\n", "original_image_key, bicubic_image_key = compiled_model.inputs\n", "output_key = compiled_model.output(0)\n", "\n", "# Get the expected input and target shape. The `.dims[2:]` returns the height\n", "# and width. The `resize` function of OpenCV expects the shape as (width, height),\n", "# so reverse the shape with `[::-1]` and convert it to a tuple.\n", "input_height, input_width = list(original_image_key.shape)[2:]\n", "target_height, target_width = list(bicubic_image_key.shape)[2:]\n", "\n", "upsample_factor = int(target_height / input_height)\n", "\n", "print(f\"The network expects inputs with a width of {input_width}, \" f\"height of {input_height}\")\n", "print(f\"The network returns images with a width of {target_width}, \" f\"height of {target_height}\")\n", "\n", "print(f\"The image sides are upsampled by a factor of {upsample_factor}. \" f\"The new image is {upsample_factor**2} times as large as the \" \"original image\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "2aa08460", "metadata": {}, "source": [ "## Load and Show the Input Image\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "> **NOTE**: For the best results, use raw images (like `TIFF`, `BMP` or `PNG`). Compressed images (like `JPEG`) may appear distorted after processing with the super resolution model." ] }, { "cell_type": "code", "execution_count": 10, "id": "6cbbbd13", "metadata": { "tags": [] }, "outputs": [], "source": [ "IMAGE_PATH = Path(\"./data/tower.jpg\")\n", "OUTPUT_PATH = Path(\"output/\")\n", "\n", "os.makedirs(str(OUTPUT_PATH), exist_ok=True)\n", "\n", "download_file(\n", " \"https://storage.openvinotoolkit.org/repositories/openvino_notebooks/data/data/image/tower.jpg\",\n", " IMAGE_PATH,\n", ")\n", "full_image = cv2.imread(str(IMAGE_PATH))\n", "\n", "# Uncomment these lines to load a raw image as BGR.\n", "# import rawpy\n", "# with rawpy.imread(IMAGE_PATH) as raw:\n", "# full_image = raw.postprocess()[:,:,(2,1,0)]\n", "\n", "plt.imshow(to_rgb(full_image))\n", "print(f\"Showing full image with width {full_image.shape[1]} \" f\"and height {full_image.shape[0]}\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "fc5d16dd", "metadata": {}, "source": [ "## Superresolution on a Crop of the Image\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "### Crop the Input Image once.\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "Crop the network input size. Give the X (width) and Y (height) coordinates for the top left corner of the crop. Set the `CROP_FACTOR` variable to 2 to make a crop that is larger than the network input size (this only works with the `single-image-super-resolution-1032` model). The crop will be downsampled before propagating to the network. This is useful for very high resolution images, where a crop of the network input size is too small to show enough information. It can also improve the result. Keep in mind that with a `CROP_FACTOR` or 2 the net upsampling factor will be halved. If the superresolution network increases the side lengths of the image by a factor of 4, it upsamples a 480x270 crop to 1920x1080. With a `CROP_FACTOR` of 2, a 960x540 crop is upsampled to the same 1920x1080: the side lengths are twice as large as the crop size." ] }, { "cell_type": "code", "execution_count": 11, "id": "f86ec4d4", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Set `CROP_FACTOR` to 2 to crop with twice the input width and height\n", "# This only works with the 1032 (4x) superresolution model!\n", "# Set it to 1 to crop the image with the exact input size.\n", "CROP_FACTOR = 2\n", "adjusted_upsample_factor = upsample_factor // CROP_FACTOR\n", "\n", "image_id = \"flag\" # A tag to recognize the saved images.\n", "starty = 3200\n", "startx = 0\n", "\n", "# Perform the crop.\n", "image_crop = full_image[\n", " starty : starty + input_height * CROP_FACTOR,\n", " startx : startx + input_width * CROP_FACTOR,\n", "]\n", "\n", "# Show the cropped image.\n", "print(f\"Showing image crop with width {image_crop.shape[1]} and \" f\"height {image_crop.shape[0]}.\")\n", "plt.imshow(to_rgb(image_crop));" ] }, { "attachments": {}, "cell_type": "markdown", "id": "d5d8917b", "metadata": {}, "source": [ "### Reshape/Resize Crop for Model Input\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "The input image is resized to a network input size, and reshaped to (N,C,H,W) (N=number of images, C=number of channels, H=height, W=width). The image is also resized to the network output size, with bicubic interpolation. This bicubic image is the second input to the network." ] }, { "cell_type": "code", "execution_count": 12, "id": "dd2968e8", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Resize the image to the target shape with bicubic interpolation.\n", "bicubic_image = cv2.resize(src=image_crop, dsize=(target_width, target_height), interpolation=cv2.INTER_CUBIC)\n", "\n", "# If required, resize the image to the input image shape.\n", "if CROP_FACTOR > 1:\n", " image_crop = cv2.resize(src=image_crop, dsize=(input_width, input_height))\n", "\n", "# Reshape the images from (H,W,C) to (N,C,H,W).\n", "input_image_original = np.expand_dims(image_crop.transpose(2, 0, 1), axis=0)\n", "input_image_bicubic = np.expand_dims(bicubic_image.transpose(2, 0, 1), axis=0)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "d2e217a6", "metadata": {}, "source": [ "### Do Inference\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "Do inference and convert the inference result to an `RGB` image." ] }, { "cell_type": "code", "execution_count": 13, "id": "7b649dd4", "metadata": { "tags": [] }, "outputs": [], "source": [ "result = compiled_model(\n", " {\n", " original_image_key.any_name: input_image_original,\n", " bicubic_image_key.any_name: input_image_bicubic,\n", " }\n", ")[output_key]\n", "\n", "# Get inference result as numpy array and reshape to image shape and data type\n", "result_image = convert_result_to_image(result)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "d16b4928", "metadata": {}, "source": [ "### Show and Save Results\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "Show the bicubic image and the enhanced superresolution image." ] }, { "cell_type": "code", "execution_count": 14, "id": "3349cbc2", "metadata": { "tags": [] }, "outputs": [], "source": [ "fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(30, 15))\n", "ax[0].imshow(to_rgb(bicubic_image))\n", "ax[1].imshow(to_rgb(result_image))\n", "ax[0].set_title(\"Bicubic\")\n", "ax[1].set_title(\"Superresolution\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "2a643428", "metadata": { "tags": [] }, "source": [ "#### Save Superresolution and Bicubic Image Crop\n", "[back to top ⬆️](#Table-of-contents:)\n" ] }, { "cell_type": "code", "execution_count": 15, "id": "d7efcad7", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Add a text with \"SUPER\" or \"BICUBIC\" to the superresolution or bicubic image.\n", "image_super = write_text_on_image(image=result_image, text=\"SUPER\")\n", "image_bicubic = write_text_on_image(image=bicubic_image, text=\"BICUBIC\")\n", "\n", "# Store the image and the results.\n", "crop_image_path = Path(f\"{OUTPUT_PATH.stem}/{image_id}_{adjusted_upsample_factor}x_crop.png\")\n", "superres_image_path = Path(f\"{OUTPUT_PATH.stem}/{image_id}_{adjusted_upsample_factor}x_crop_superres.png\")\n", "bicubic_image_path = Path(f\"{OUTPUT_PATH.stem}/{image_id}_{adjusted_upsample_factor}x_crop_bicubic.png\")\n", "cv2.imwrite(\n", " filename=str(crop_image_path),\n", " img=image_crop,\n", " params=[cv2.IMWRITE_PNG_COMPRESSION, 0],\n", ")\n", "cv2.imwrite(\n", " filename=str(superres_image_path),\n", " img=image_super,\n", " params=[cv2.IMWRITE_PNG_COMPRESSION, 0],\n", ")\n", "cv2.imwrite(\n", " filename=str(bicubic_image_path),\n", " img=image_bicubic,\n", " params=[cv2.IMWRITE_PNG_COMPRESSION, 0],\n", ")\n", "print(f\"Images written to directory: {OUTPUT_PATH}\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "d217c577", "metadata": {}, "source": [ "#### Write Animated GIF with Bicubic/Superresolution Comparison\n", "[back to top ⬆️](#Table-of-contents:)\n" ] }, { "cell_type": "code", "execution_count": 16, "id": "a2f19391", "metadata": { "tags": [] }, "outputs": [], "source": [ "print(image_bicubic.shape)\n", "print(image_super.shape)\n", "\n", "result_pil = Image.fromarray(to_rgb(image_super))\n", "bicubic_pil = Image.fromarray(to_rgb(image_bicubic))\n", "gif_image_path = Path(f\"{OUTPUT_PATH.stem}/{image_id}_comparison_{adjusted_upsample_factor}x.gif\")\n", "\n", "result_pil.save(\n", " fp=str(gif_image_path),\n", " format=\"GIF\",\n", " append_images=[bicubic_pil],\n", " save_all=True,\n", " duration=1000,\n", " loop=0,\n", ")\n", "\n", "# The `DisplayImage(str(gif_image_path))` function does not work in Colab.\n", "DisplayImage(data=open(gif_image_path, \"rb\").read(), width=1920 // 2)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "b3e41243", "metadata": {}, "source": [ "#### Create a Video with Sliding Bicubic/Superresolution Comparison\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "This may take a while. For the video, the superresolution and bicubic image are resized by a factor of 2 to improve processing speed. This gives an indication of the superresolution effect. The video is saved as an `.avi` file. You can click on the link to download the video, or open it directly from the `output/` directory, and play it locally.\n", "> Note: If you run the example in Google Colab, download video files using the `Files` tool. " ] }, { "cell_type": "code", "execution_count": 17, "id": "3b3b1db5", "metadata": { "tags": [] }, "outputs": [], "source": [ "FOURCC = cv2.VideoWriter_fourcc(*\"MJPG\")\n", "\n", "result_video_path = Path(f\"{OUTPUT_PATH.stem}/{image_id}_crop_comparison_{adjusted_upsample_factor}x.avi\")\n", "video_target_height, video_target_width = (\n", " result_image.shape[0] // 2,\n", " result_image.shape[1] // 2,\n", ")\n", "\n", "out_video = cv2.VideoWriter(\n", " filename=str(result_video_path),\n", " fourcc=FOURCC,\n", " fps=90,\n", " frameSize=(video_target_width, video_target_height),\n", ")\n", "\n", "resized_result_image = cv2.resize(src=result_image, dsize=(video_target_width, video_target_height))\n", "resized_bicubic_image = cv2.resize(src=bicubic_image, dsize=(video_target_width, video_target_height))\n", "\n", "progress_bar = ProgressBar(total=video_target_width)\n", "progress_bar.display()\n", "\n", "for i in range(video_target_width):\n", " # Create a frame where the left part (until i pixels width) contains the\n", " # superresolution image, and the right part (from i pixels width) contains\n", " # the bicubic image.\n", " comparison_frame = np.hstack(\n", " (\n", " resized_result_image[:, :i, :],\n", " resized_bicubic_image[:, i:, :],\n", " )\n", " )\n", " # Create a small black border line between the superresolution\n", " # and bicubic part of the image.\n", " comparison_frame[:, i - 1 : i + 1, :] = 0\n", " out_video.write(image=comparison_frame)\n", " progress_bar.progress = i\n", " progress_bar.update()\n", "out_video.release()\n", "clear_output()\n", "\n", "video_link = FileLink(result_video_path)\n", "video_link.html_link_str = \"%s\"\n", "display(HTML(f\"The video has been saved to {video_link._repr_html_()}\"))" ] }, { "attachments": {}, "cell_type": "markdown", "id": "12d53209", "metadata": {}, "source": [ "## Superresolution on full input image\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "Superresolution on the full image is done by dividing the image into patches of equal size, doing superresolution on each path, and then stitching the resulting patches together again. For this demo, patches near the border of the image are ignored.\n", "\n", "Adjust the `CROPLINES` setting in the next cell if you see boundary effects." ] }, { "attachments": {}, "cell_type": "markdown", "id": "ad91374e", "metadata": {}, "source": [ "### Compute patches\n", "[back to top ⬆️](#Table-of-contents:)\n" ] }, { "cell_type": "code", "execution_count": 18, "id": "fc35dc09", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Set the number of lines to crop from the network result to prevent\n", "# boundary effects. The value of `CROPLINES` should be an integer >= 1.\n", "CROPLINES = 10\n", "# See Superresolution on one crop of the image for description of `CROP_FACTOR`.\n", "CROP_FACTOR = 2\n", "\n", "full_image_height, full_image_width = full_image.shape[:2]\n", "\n", "# Compute x and y coordinates of left top of image tiles.\n", "x_coords = list(range(0, full_image_width, input_width * CROP_FACTOR - CROPLINES * 2))\n", "while full_image_width - x_coords[-1] < input_width * CROP_FACTOR:\n", " x_coords.pop(-1)\n", "y_coords = list(range(0, full_image_height, input_height * CROP_FACTOR - CROPLINES * 2))\n", "while full_image_height - y_coords[-1] < input_height * CROP_FACTOR:\n", " y_coords.pop(-1)\n", "\n", "# Compute the width and height to crop the full image. The full image is\n", "# cropped at the border to tiles of the input size.\n", "crop_width = x_coords[-1] + input_width * CROP_FACTOR\n", "crop_height = y_coords[-1] + input_height * CROP_FACTOR\n", "\n", "# Compute the width and height of the target superresolution image.\n", "new_width = x_coords[-1] * (upsample_factor // CROP_FACTOR) + target_width - CROPLINES * 2 * (upsample_factor // CROP_FACTOR)\n", "new_height = y_coords[-1] * (upsample_factor // CROP_FACTOR) + target_height - CROPLINES * 2 * (upsample_factor // CROP_FACTOR)\n", "print(f\"The output image will have a width of {new_width} \" f\"and a height of {new_height}\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "0cabe8da", "metadata": {}, "source": [ "### Do Inference\n", "[back to top ⬆️](#Table-of-contents:)\n", "\n", "The code below reads one patch of the image at a time. Each patch is reshaped to the network input shape and upsampled with bicubic interpolation to the target shape. Both the original and the bicubic images are propagated through the network. The network result is a numpy array with floating point values, with a shape of `(1,3,1920,1080)`. This array is converted to an 8-bit image with the `(1080,1920,3)` shape and written to a `full_superresolution_image`. The bicubic image is written to a `full_bicubic_image` for comparison. A progress bar shows the progress of the process. Inference time is measured, as well as total time to process each patch." ] }, { "cell_type": "code", "execution_count": 19, "id": "84306b5f", "metadata": { "tags": [] }, "outputs": [], "source": [ "start_time = time.perf_counter()\n", "patch_nr = 0\n", "num_patches = len(x_coords) * len(y_coords)\n", "progress_bar = ProgressBar(total=num_patches)\n", "progress_bar.display()\n", "\n", "# Crop image to fit tiles of the input size.\n", "full_image_crop = full_image.copy()[:crop_height, :crop_width, :]\n", "\n", "# Create an empty array of the target size.\n", "full_superresolution_image = np.empty((new_height, new_width, 3), dtype=np.uint8)\n", "\n", "# Create a bicubic upsampled image of the target size for comparison.\n", "full_bicubic_image = cv2.resize(\n", " src=full_image_crop[CROPLINES:-CROPLINES, CROPLINES:-CROPLINES, :],\n", " dsize=(new_width, new_height),\n", " interpolation=cv2.INTER_CUBIC,\n", ")\n", "\n", "total_inference_duration = 0\n", "for y in y_coords:\n", " for x in x_coords:\n", " patch_nr += 1\n", "\n", " # Crop the input image.\n", " image_crop = full_image_crop[\n", " y : y + input_height * CROP_FACTOR,\n", " x : x + input_width * CROP_FACTOR,\n", " ]\n", "\n", " # Resize the images to the target shape with bicubic interpolation\n", " bicubic_image = cv2.resize(\n", " src=image_crop,\n", " dsize=(target_width, target_height),\n", " interpolation=cv2.INTER_CUBIC,\n", " )\n", "\n", " if CROP_FACTOR > 1:\n", " image_crop = cv2.resize(src=image_crop, dsize=(input_width, input_height))\n", "\n", " input_image_original = np.expand_dims(image_crop.transpose(2, 0, 1), axis=0)\n", "\n", " input_image_bicubic = np.expand_dims(bicubic_image.transpose(2, 0, 1), axis=0)\n", "\n", " # Do inference.\n", " inference_start_time = time.perf_counter()\n", "\n", " result = compiled_model(\n", " {\n", " original_image_key.any_name: input_image_original,\n", " bicubic_image_key.any_name: input_image_bicubic,\n", " }\n", " )[output_key]\n", "\n", " inference_stop_time = time.perf_counter()\n", " inference_duration = inference_stop_time - inference_start_time\n", " total_inference_duration += inference_duration\n", "\n", " # Reshape an inference result to the image shape and the data type.\n", " result_image = convert_result_to_image(result)\n", "\n", " # Add the inference result of this patch to the full superresolution\n", " # image.\n", " adjusted_upsample_factor = upsample_factor // CROP_FACTOR\n", " new_y = y * adjusted_upsample_factor\n", " new_x = x * adjusted_upsample_factor\n", " full_superresolution_image[\n", " new_y : new_y + target_height - CROPLINES * adjusted_upsample_factor * 2,\n", " new_x : new_x + target_width - CROPLINES * adjusted_upsample_factor * 2,\n", " ] = result_image[\n", " CROPLINES * adjusted_upsample_factor : -CROPLINES * adjusted_upsample_factor,\n", " CROPLINES * adjusted_upsample_factor : -CROPLINES * adjusted_upsample_factor,\n", " :,\n", " ]\n", "\n", " progress_bar.progress = patch_nr\n", " progress_bar.update()\n", "\n", " if patch_nr % 10 == 0:\n", " clear_output(wait=True)\n", " progress_bar.display()\n", " display(\n", " Pretty(f\"Processed patch {patch_nr}/{num_patches}. \" f\"Inference time: {inference_duration:.2f} seconds \" f\"({1/inference_duration:.2f} FPS)\")\n", " )\n", "\n", "end_time = time.perf_counter()\n", "duration = end_time - start_time\n", "clear_output(wait=True)\n", "print(\n", " f\"Processed {num_patches} patches in {duration:.2f} seconds. \"\n", " f\"Total patches per second (including processing): \"\n", " f\"{num_patches/duration:.2f}.\\nInference patches per second: \"\n", " f\"{num_patches/total_inference_duration:.2f} \"\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "61760bac", "metadata": {}, "source": [ "### Save superresolution image and the bicubic image\n", "[back to top ⬆️](#Table-of-contents:)\n" ] }, { "cell_type": "code", "execution_count": 20, "id": "1c98569a", "metadata": { "tags": [] }, "outputs": [], "source": [ "full_superresolution_image_path = Path(f\"{OUTPUT_PATH.stem}/full_superres_{adjusted_upsample_factor}x.jpg\")\n", "full_bicubic_image_path = Path(f\"{OUTPUT_PATH.stem}/full_bicubic_{adjusted_upsample_factor}x.jpg\")\n", "\n", "cv2.imwrite(str(full_superresolution_image_path), full_superresolution_image)\n", "cv2.imwrite(str(full_bicubic_image_path), full_bicubic_image);" ] }, { "cell_type": "code", "execution_count": 21, "id": "a262e821", "metadata": { "tags": [] }, "outputs": [], "source": [ "bicubic_link = FileLink(full_bicubic_image_path)\n", "image_link = FileLink(full_superresolution_image_path)\n", "bicubic_link.html_link_str = \"%s\"\n", "image_link.html_link_str = \"%s\"\n", "display(\n", " HTML(\n", " \"The images are saved in the images directory. You can also download \"\n", " \"them by clicking on these links:\"\n", " f\"