{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Scanned Text Recognition\n", "\n", "- functionality (mostly done)\n", "  - resolution detection, deep language modeling\n", "  - font identification, reading order detection\n", "  - upsampling, downsampling, better noise removal\n", "  - character and word bounding boxes\n", "- training / data\n", "  - larger, more diverse training sets\n", "  - large-scale self-supervised training (research)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Scanned Text Recognition (more)\n", "\n", "- other work\n", "  - purely convolutional OCR (better suited to current accelerators)\n", "  - replace line normalization and layout extraction with deep models\n", "  - replace data augmentation with deep models (such as GANs)\n", "  - better semantic segmentation (text, image, table, graph, figure, ...)\n", "  - non-CTC models and/or automatic decoding\n", "  - benchmarking of attention-based models" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Camera Captured Recognition\n", "\n", "- functionality\n", "  - page boundary detection models\n", "  - DL dewarping\n", "  - DL depth estimation (from RGB, from stereo)\n", "- training / data\n", "  - large collection of photographically captured images\n", "  - automatic generation of photographically distorted images (ray tracing)\n", "  - automatic DL-based data augmentation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Scene Text Recognition\n", "\n", "- functionality\n", "  - DL text detection / extraction (reimplement standard convolutional models)\n", "- training / data\n", "  - good datasets exist; train on them\n", "  - benchmark CTC vs attention-based models" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" },
"language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.14" } }, "nbformat": 4, "nbformat_minor": 2 }