{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": { "pycharm": { "name": "#%%\n" }, "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/html": [ "<script>\n", " function code_toggle() {\n", " if (code_shown){\n", " $('div.input').hide('500');\n", " $('#toggleButton').val('Show Code')\n", " } else {\n", " $('div.input').show('500');\n", " $('#toggleButton').val('Hide Code')\n", " }\n", " code_shown = !code_shown\n", " }\n", "\n", " $( document ).ready(function(){\n", " code_shown=false;\n", " $('div.input').hide()\n", " });\n", "</script>\n", "<form action=\"javascript:code_toggle()\"><input type=\"submit\" id=\"toggleButton\" value=\"Show Code\"></form>\n" ], "text/plain": [ "<IPython.core.display.HTML object>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%html\n", "<script>\n", " function code_toggle() {\n", " if (code_shown){\n", " $('div.input').hide('500');\n", " $('#toggleButton').val('Show Code')\n", " } else {\n", " $('div.input').show('500');\n", " $('#toggleButton').val('Hide Code')\n", " }\n", " code_shown = !code_shown\n", " }\n", "\n", " $( document ).ready(function(){\n", " code_shown=false;\n", " $('div.input').hide()\n", " });\n", "</script>\n", "<form action=\"javascript:code_toggle()\"><input type=\"submit\" id=\"toggleButton\" value=\"Show Code\"></form>" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "pycharm": { "name": "#%%\n" }, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "%load_ext tikzmagic" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Interpretability\n", "\n", "or: opening the black box of NLP models\n", "\n", "<img src=\"https://blackboxnlp.github.io/logo.svg\" width=400px>" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "* Visualisation\n", "* Probing\n", "* Adversarial evaluation" ] }, { "cell_type": "markdown", 
"metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Visualisation\n", "\n", "<center>\n", " <img src=\"https://d3i71xaburhd42.cloudfront.net/fafb602db42240f5fb1e1b113fa0ed8647b45adc/8-Figure5-1.png\" width=50%>\n", "</center>\n", "\n", "<div style=\"text-align: right;\">\n", " (from <a href=\"https://www.aclweb.org/anthology/2020.acl-tutorials.1.pdf\">Belinkov, Gehrmann and Pavlick, 2020</a>)\n", "</div>" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Inspecting weights\n", "\n", "Easy with [n-gram feature-based linear text classifiers](doc_classify_slides_short.ipynb):\n", "\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Not so simple for neural models.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Visualising embeddings\n", "\n", "t-SNE for dimensionality reduction ([van der Maaten and Hinton, 2008](https://lvdmaaten.github.io/publications/papers/JMLR_2008.pdf))\n", "\n", "word2vec:\n", "<img src=\"../img/word_representations.svg\" width=70%>" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Visualising embeddings\n", "\n", "t-SNE for dimensionality reduction ([van der Maaten and Hinton, 2008](https://lvdmaaten.github.io/publications/papers/JMLR_2008.pdf))\n", "\n", "BERT:\n", " <img src=\"https://home.ttic.edu/~kgimpel/viz-bert/viz-bert-voc.png\" width=70%/>" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Visualising attention\n", "\n", "<img src=\"dl-applications-figures/reordering.png\" width=60%>" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Visualising self-attention\n", "\n", "<center>\n", " <img 
src=\"https://d3i71xaburhd42.cloudfront.net/0de0a44b859a3719d11834479112314b4caba669/2-Figure2-1.png\" width=100%>\n", "</center>\n", "\n", "<div style=\"text-align: right;\">\n", " (from <a href=\"https://aclanthology.org/P19-3007\">Vig, 2019</a>)\n", "</div>" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Visualising self-attention\n", "\n", "<center>\n", " <img src=\"https://d3i71xaburhd42.cloudfront.net/0de0a44b859a3719d11834479112314b4caba669/5-Figure6-1.png\" width=100%>\n", "</center>\n", "\n", "<div style=\"text-align: right;\">\n", " (from <a href=\"https://aclanthology.org/P19-3007\">Vig, 2019</a>)\n", "</div>" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Caveats\n", "\n", "- Attention is not explanation (<a href=\"https://www.aclweb.org/anthology/N19-1357.pdf\">Jain and Wallace, 2019</a>)\n", "- Attention is not not explanation (<a href=\"https://www.aclweb.org/anthology/D19-1002.pdf\">Wiegreffe and Pinter, 2019</a>)\n", "\n", "<center>\n", " <img src=\"https://d3i71xaburhd42.cloudfront.net/ce177672b00ddf46e4906157a7e997ca9338b8b9/3-Table1-1.png\" width=80%>\n", "</center>" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Close inspection\n", "\n", "<center>\n", " <a href=\"slides/cs224n-2020-lecture20-interpretability-26-32.pdf\">\n", " <img src=\"https://www.prevuemeetings.com/wp-content/uploads/2016/11/site-inspection.jpg\" width=30%>\n", " </a>\n", "</center>\n", "\n", "<div style=\"text-align: right;\">\n", " (from John Hewitt; <a href=\"http://web.stanford.edu/class/cs224n/slides/cs224n-2020-lecture20-interpretability.pdf\">slides</a>)\n", "</div>" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Probing" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- To find out 
whether a representation R encodes a property P\n", " - train a model to predict P from R\n", " - or cast the problem directly in terms of R's training objective" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Linguistic knowledge in BERT\n", "\n", "Train a simple linear classifier on top of BERT\n", "\n", "<img src=\"../img/liu_probing.png\" width=70%/>\n", "\n", "(from [Liu et al., 2019](https://aclanthology.org/N19-1112/))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "<img src=\"https://d3i71xaburhd42.cloudfront.net/e0ba34397ccd04721d465af842c4388752cf6017/5-Table1-1.png\" width=100%/>\n", "\n", "(from [Liu et al., 2019](https://aclanthology.org/N19-1112/))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### World knowledge in BERT\n", "\n", "Probing by masked language modelling (no extra parameters!)\n", "\n", "<center>\n", " <img src=\"https://d3i71xaburhd42.cloudfront.net/d0086b86103a620a86bc918746df0aa642e2a8a3/1-Figure1-1.png\" width=70%>\n", "</center>\n", "\n", "<div style=\"text-align: right;\">\n", " (from <a href=\"https://aclanthology.org/D19-1250/\">Petroni et al., 2019</a>)\n", "</div>" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Language models as linguistic test subjects\n", "\n", "<center>\n", " <a href=\"slides/cs224n-2020-lecture20-interpretability-14-21.pdf\">\n", " <img src=\"https://elearningindustry.com/wp-content/uploads/2019/06/do-this-not-that-when-writing-multiple-choice-questions.jpg\" width=50%>\n", " </a>\n", "</center>\n", "\n", "<div style=\"text-align: right;\">\n", " (from John Hewitt; <a href=\"http://web.stanford.edu/class/cs224n/slides/cs224n-2020-lecture20-interpretability.pdf\">slides</a>)\n", "</div>" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" 
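} }, "source": [ "### Probing: a minimal sketch\n", "\n", "To make the recipe concrete, the toy code below fabricates \"representations\" in which a property is linearly encoded (the sign of coordinate 0), trains a logistic-regression probe from scratch to predict it, and reads high probe accuracy as evidence that the representation encodes the property. All data, dimensions and hyperparameters here are invented for illustration; a real probe would be trained on frozen model activations (e.g. BERT layers) instead." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "import math, random\n", "\n", "random.seed(0)\n", "\n", "# Toy \"representations\": vectors in which the probed property (a 0/1 label)\n", "# is linearly encoded as the sign of coordinate 0. Everything is synthetic --\n", "# a sketch of the probing recipe, not of any particular pretrained model.\n", "def make_data(n, dim=20):\n", "    xs, ys = [], []\n", "    for _ in range(n):\n", "        v = [random.gauss(0, 1) for _ in range(dim)]\n", "        xs.append(v)\n", "        ys.append(1 if v[0] > 0 else 0)\n", "    return xs, ys\n", "\n", "def sigmoid(z):\n", "    z = max(-30.0, min(30.0, z))  # clamp for numerical safety\n", "    return 1.0 / (1.0 + math.exp(-z))\n", "\n", "def train_probe(xs, ys, epochs=20, lr=0.1):\n", "    # plain SGD on the logistic loss\n", "    dim = len(xs[0])\n", "    w, b = [0.0] * dim, 0.0\n", "    for _ in range(epochs):\n", "        for x, y in zip(xs, ys):\n", "            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)\n", "            g = p - y  # gradient of the loss w.r.t. the logit\n", "            for i in range(dim):\n", "                w[i] -= lr * g * x[i]\n", "            b -= lr * g\n", "    return w, b\n", "\n", "def accuracy(w, b, xs, ys):\n", "    preds = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) > 0.5 for x in xs]\n", "    return sum(p == (y == 1) for p, y in zip(preds, ys)) / len(ys)\n", "\n", "train_x, train_y = make_data(500)\n", "test_x, test_y = make_data(200)\n", "w, b = train_probe(train_x, train_y)\n", "acc = accuracy(w, b, test_x, test_y)\n", "print(f\"probe accuracy: {acc:.2f}\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide"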
} }, "source": [ "# Adversarial evaluation" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "BERT struggles with quantifiers:\n", "\n", "<center>\n", " <img src=\"https://d3i71xaburhd42.cloudfront.net/8fe49d025e8b83816f7169daa74becaac6184f9e/1-Table1-1.png\" width=70%>\n", "</center>\n", "\n", "<div style=\"text-align: right;\">\n", " (from <a href=\"https://aclanthology.org/2022.naacl-main.359/\">Cui et al., 2022</a>)\n", "</div>" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "BERT struggles with negation:\n", "\n", "<center>\n", " <img src=\"https://d3i71xaburhd42.cloudfront.net/f3b1ad7986eea54d381f65abf4a0da5887129339/2-Table1-1.png\" width=100%>\n", "</center>\n", "\n", "<div style=\"text-align: right;\">\n", " (from <a href=\"https://aclanthology.org/2021.conll-1.19/\">Hartmann et al., 2021</a>)\n", "</div>" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "<center>\n", " <a href=\"slides/acl_2020_interpretability_tutorial-94-115.pdf\">\n", " <img src=\"https://blog.acolyer.org/wp-content/uploads/2017/09/adversarial-reading-fig-1.jpeg?w=480\" width=50%>\n", " </a>\n", "</center>\n", "\n", "(from [Belinkov, Gehrmann and Pavlick, 2020](https://www.aclweb.org/anthology/2020.acl-tutorials.1.pdf); [slides](https://sebastiangehrmann.com/assets/files/acl_2020_interpretability_tutorial.pdf))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Further reading\n", "\n", "- [Belinkov and Glass, 2019. Analysis Methods in Neural Language Processing: A Survey](https://www.aclweb.org/anthology/Q19-1004.pdf)\n", "- [Hewitt, 2020. Designing and Interpreting Probes](https://nlp.stanford.edu//~johnhew//interpreting-probes.html)\n", "- [Lawrence, 2020. 
Interpretability and Analysis of Models for NLP @ ACL 2020](https://medium.com/@lawrence.carolin/interpretability-and-analysis-of-models-for-nlp-e6b977ac1dc6)\n", "- [Søgaard, 2021. Explainable Natural Language Processing. Morgan & Claypool](https://link.springer.com/book/10.1007/978-3-031-02180-0)\n", "- Jesse Vig's blog posts:\n", " - [Deconstructing BERT: Distilling 6 Patterns from 100 Million Parameters](https://towardsdatascience.com/deconstructing-bert-distilling-6-patterns-from-100-million-parameters-b49113672f77)\n", " - [Deconstructing BERT, Part 2: Visualizing the Inner Workings of Attention](https://towardsdatascience.com/deconstructing-bert-part-2-visualizing-the-inner-workings-of-attention-60a16d86b5c1)" ] } ], "metadata": { "celltoolbar": "Slideshow", "hide_input": false, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.2" } }, "nbformat": 4, "nbformat_minor": 4 }