In [1]:
%%html
<script>
  function code_toggle() {
    if (code_shown){
      $('div.input').hide('500');
      $('#toggleButton').val('Show Code')
    } else {
      $('div.input').show('500');
      $('#toggleButton').val('Hide Code')
    }
    code_shown = !code_shown
  }

  $( document ).ready(function(){
    code_shown=false;
    $('div.input').hide()
  });
</script>
<form action="javascript:code_toggle()"><input type="submit" id="toggleButton" value="Show Code"></form>
<style>
.rendered_html td {
    font-size: xx-large;
    text-align: left !important;
}
.rendered_html th {
    font-size: xx-large;
    text-align: left !important;
}
</style>

In [2]:
%%capture
%load_ext autoreload
%autoreload 2
import sys
sys.path.append("../statnlpbook/")

#util.execute_notebook('relation_extraction.ipynb')

<!---
Latex Macros
-->
$$
\newcommand{\Xs}{\mathcal{X}}
\newcommand{\Ys}{\mathcal{Y}}
\newcommand{\y}{\mathbf{y}}
\newcommand{\balpha}{\boldsymbol{\alpha}}
\newcommand{\bbeta}{\boldsymbol{\beta}}
\newcommand{\aligns}{\mathbf{a}}
\newcommand{\align}{a}
\newcommand{\source}{\mathbf{s}}
\newcommand{\target}{\mathbf{t}}
\newcommand{\ssource}{s}
\newcommand{\starget}{t}
\newcommand{\repr}{\mathbf{f}}
\newcommand{\repry}{\mathbf{g}}
\newcommand{\x}{\mathbf{x}}
\newcommand{\prob}{p}
\newcommand{\a}{\alpha}
\newcommand{\b}{\beta}
\newcommand{\vocab}{V}
\newcommand{\params}{\boldsymbol{\theta}}
\newcommand{\param}{\theta}
\DeclareMathOperator{\perplexity}{PP}
\DeclareMathOperator{\argmax}{argmax}
\DeclareMathOperator{\argmin}{argmin}
\newcommand{\train}{\mathcal{D}}
\newcommand{\counts}[2]{\#_{#1}(#2) }
\newcommand{\length}[1]{\text{length}(#1) }
\newcommand{\indi}{\mathbb{I}}
$$

In [3]:
%load_ext tikzmagic

<img src="https://imgs.xkcd.com/comics/easy_or_hard.png" width=80%/>

# Question Answering

* Flavours of question answering
* Information retrieval
* Machine reading comprehension
* Question answering from structured knowledge
* Executable semantic parsing
* Relation extraction via reading comprehension

Question:

>Which university did Turing go to?

Answer:

>Princeton

[Passage](https://en.wikipedia.org/wiki/Alan_Turing):

> In 1938, he obtained his PhD from the Department of Mathematics at Princeton University.

Knowledge base:

https://www.wikidata.org/wiki/Q7251#P69

<center><img src="../img/quiz_time.png"></center>

## [ucph.page.link/qa](https://ucph.page.link/qa)

([Responses](https://docs.google.com/forms/d/17j4Msqa_L_so14KGZEdGClZaAbRsvn0rQedeze8kaRM/edit#responses))

## Flavours of question answering (QA)

Factoid questions:

* Information retrieval (IR)-based QA on **unstructured** data (text)
* Knowledge-based QA on **structured** data (DB/KB)

Non-factoid questions:

* "How" questions
> How do I delete my Instagram account?
* "Why" questions
> Why is the sky blue?

* Math problems
<img src="../img/geometry.png" width=50%/>
<div style="text-align: right;">
    (from <a href="https://aclanthology.org/2021.acl-long.528/">Lu et al., 2021</a>)
</div>

### Formats

* Extractive (span selection)
* Cloze
* Boolean
* Multi-choice
* Abstractive

### QA on collections of documents

* [TriviaQA](https://nlp.cs.washington.edu/triviaqa/)
* [SearchQA](https://github.com/nyu-dl/dl4ir-searchqA)
* [MS MARCO](https://microsoft.github.io/msmarco/)
* [AmazonQA](https://github.com/amazonqa/amazonqa)
* [TrecQA](https://trec.nist.gov/data/qa.html)
* [WebQA](https://webqna.github.io/)

> Which politician won the Nobel Peace Prize in 2009?

### Multi-hop QA

* [HotPotQA](https://hotpotqa.github.io/)
* [QAngaroo](https://qangaroo.cs.ucl.ac.uk/)
* [ComplexWebQuestions](https://www.tau-nlp.sites.tau.ac.il/compwebq)
* [HybridQA](https://hybridqa.github.io/)

> What is the middle name of the player with the second most National Football League career rushing yards?

### QA on structured knowledge

* [FreebaseQA](https://github.com/kelvin-jiang/FreebaseQA)
* [Event-QA](https://github.com/tarcisiosouza/Event-QA)
* [WikiTableQuestions](https://github.com/ppasupat/WikiTableQuestions)
* [SimpleQuestions](https://github.com/davidgolub/SimpleQA)
* [WikiSQL](https://github.com/salesforce/WikiSQL)
* [RuBQ](https://github.com/vladislavneon/RuBQ)

> The 1999 film '10 Things I Hate About You' is based on which Shakespeare play?

### Extractive QA

* [SQuAD](https://rajpurkar.github.io/SQuAD-explorer/)
* [Natural Questions](https://ai.google.com/research/NaturalQuestions/)
* [NewsQA](https://www.microsoft.com/en-us/research/project/newsqa-dataset/)
* [TyDi QA](https://github.com/google-research-datasets/tydiqa)

<center>
    <img src="https://rajpurkar.github.io/mlx/qa-and-squad/example-squad.png" width="70%">
</center>

<div style="text-align: right;">
    (from <a href="https://www.aclweb.org/anthology/D16-1264">Rajpurkar et al., 2016</a>)
</div>

## Information retrieval (IR)-based QA

General approach:
1. Retrieve relevant **passage**(s)
2. Machine reading comprehension: extract the **answer**, which can be
    * A text span from the passage
    * Yes/no
    * `NULL` (unanswerable)


### Information retrieval

* Obtain relevant documents/passages given a query
* Example: web search
* Learn more in [NDAK20002U Neural Information Retrieval (NIR)](https://kurser.ku.dk/course/ndak20002u/)
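
As a rough illustration of the retrieval step, the sketch below ranks passages by a simple TF-IDF overlap with the query. It is a toy scorer under stated assumptions, not how a web search engine works: `tokenize` and `rank_passages` are hypothetical helper names, and the second and third passages are made up for the example.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (w.strip(".,?!").lower() for w in text.split())
    return [t for t in tokens if t]

def rank_passages(query, passages):
    """Return passage indices sorted by a simple TF-IDF overlap score (best first)."""
    docs = [Counter(tokenize(p)) for p in passages]
    n = len(docs)
    # Document frequency: in how many passages does each term occur?
    df = Counter(term for doc in docs for term in doc)
    scores = []
    for doc in docs:
        # Sum term frequency times inverse document frequency over query terms.
        score = sum(doc[t] * math.log(n / df[t]) for t in tokenize(query) if t in doc)
        scores.append(score)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)

passages = [
    "In 1938, he obtained his PhD from the Department of Mathematics at Princeton University.",
    "The sky appears blue because of Rayleigh scattering.",   # illustrative distractor
    "London is the capital of the United Kingdom.",           # illustrative distractor
]
ranking = rank_passages("Which university did Turing go to?", passages)
print(passages[ranking[0]])
```

Only the word "university" links the question to the first passage here, which hints at why lexical matching alone is brittle and why a reading comprehension step follows retrieval.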

### Machine reading comprehension (MRC)

* Input: (Passage, Question)
* Output: Answer span

<center>
    <img src="qa_figures/squad.png" width="38%">
</center>

<div style="text-align: right;">
    (from <a href="https://www.aclweb.org/anthology/D16-1264">Rajpurkar et al., 2016</a>)
</div>

### [demo](https://demo.allennlp.org/reading-comprehension/transformer-qa)

### MRC modeling

How to model span selection?

* As sequence labeling (for each token, is it part of the answer?)
  * With IOB encoding
  
<center>
    <img src="../img/slqa.jpeg" width=50%/>
</center>
<div style="text-align: right">
    (from <a href="https://doi.org/10.1093/bioinformatics/btac397">Yoon et al., 2022</a>)
</div>
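
A minimal sketch of the sequence labeling view, assuming a single-type IOB scheme (`B` for the first answer token, `I` for the following ones, `O` elsewhere). The helper names are hypothetical, and the token sequence is the short example used in this section.

```python
def span_to_iob(tokens, start, end):
    """Tag tokens[start..end] (inclusive) as the answer in IOB encoding."""
    tags = ["O"] * len(tokens)
    tags[start] = "B"
    for i in range(start + 1, end + 1):
        tags[i] = "I"
    return tags

def iob_to_spans(tags):
    """Recover all (start, end) answer spans, inclusive, from IOB tags."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            # A new span begins; close the previous one if it is still open.
            if start is not None:
                spans.append((start, i - 1))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i - 1))
            start = None
    if start is not None:
        spans.append((start, len(tags) - 1))
    return spans

tokens = ["termed", "the", "interferon", "signature", ".", "Here"]
tags = span_to_iob(tokens, 2, 3)
print(list(zip(tokens, tags)))
print(iob_to_spans(tags))
```

Because decoding returns a list of spans, this formulation can represent several answer spans, or none at all, in one passage.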

* Or as span selection (find start and end of the answer span)

||||start|end||||
|---|---|---|---|---|---|---|---|
|...|termed|the|interferon|signature|.|Here|...|

What are the advantages and disadvantages of each?

### MRC evaluation

Metrics for binary (yes/no) QA:
* **Accuracy**
* **F-score**

Metrics for ranking:
* **MRR** (mean reciprocal rank)
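
A minimal sketch of MRR, assuming each question comes with a ranked list of candidate answers and a set of gold answers. The reciprocal rank of a question is 1 divided by the rank of the first correct candidate, or 0 if none is correct. The candidate lists below are made up for illustration.

```python
def mean_reciprocal_rank(ranked_lists, gold_sets):
    """Average, over questions, of 1 / rank of the first correct candidate."""
    total = 0.0
    for candidates, gold in zip(ranked_lists, gold_sets):
        for rank, candidate in enumerate(candidates, start=1):
            if candidate in gold:
                total += 1.0 / rank
                break  # only the first correct candidate counts
    return total / len(ranked_lists)

ranked_lists = [
    ["Oxford", "Princeton", "Harvard"],  # correct answer at rank 2
    ["Barack Obama", "Al Gore"],         # correct answer at rank 1
]
gold_sets = [{"Princeton"}, {"Barack Obama"}]
mrr = mean_reciprocal_rank(ranked_lists, gold_sets)
print(mrr)
```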

Test questions have $k$ gold answers by different human annotators (e.g., $k=3$ for SQuAD and TyDi QA, $k=5$ for NQ).

Metrics for span selection:
* **Exact match** (EM): the predicted text is exactly equal to any of the $k$ gold answers
* **Word-based F-score** (like in some [sequence labeling](sequence_labeling_slides.ipynb) tasks), aggregated over the $k$ gold answers (the SQuAD evaluation script takes the maximum)
    * Often ignoring punctuation and articles, i.e., `a, an, the`
    * As bag-of-words, not exact positions (because the same answer may appear multiple times)
    * Macro-averaged: calculate the F-score for each question and average the F-scores
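
The span metrics can be sketched as below. The normalization (lowercasing, dropping punctuation and articles) loosely follows the SQuAD evaluation script, but this is a simplified reimplementation and not the official code. How to aggregate over the gold answers is left as a parameter: `aggregate=max` mirrors the SQuAD script, and `statistics.mean` gives an average instead.

```python
import re
import string
from collections import Counter
from statistics import mean

def normalize(text):
    """Lowercase, remove punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold_answers):
    """1.0 if the normalized prediction equals any normalized gold answer."""
    return float(any(normalize(prediction) == normalize(g) for g in gold_answers))

def token_f1(prediction, gold):
    """Bag-of-words F1 between a prediction and a single gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

def f1_over_golds(prediction, gold_answers, aggregate=max):
    """Aggregate the per-gold F1 scores for one question."""
    return aggregate([token_f1(prediction, g) for g in gold_answers])

golds = ["Princeton", "Princeton University"]
em = exact_match("the Princeton University", golds)
f1_max = f1_over_golds("Princeton University", golds)
f1_mean = f1_over_golds("Princeton University", golds, aggregate=mean)
print(em, f1_max, f1_mean)
```

The corpus-level score is then the macro-average: compute these per-question values and average them over all questions.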

### SQuAD

<center>
    <a href="slides/cs224n-2020-lecture10-QA.pdf"><img src="../img/squad_leaderboard.png" width=37%></a>
</center>

<div style="text-align: right;">
    (from <a href="https://rajpurkar.github.io/SQuAD-explorer/">SQuAD leaderboard</a>)
</div>

### MRC Models

![model](https://d3i71xaburhd42.cloudfront.net/1b78ce27180c324f3831f5395a2fdf738e143e74/2-Figure1-1.png)

<div style="text-align: right;">
    (from <a href="https://aclanthology.org/2020.aacl-srw.21/">Li et al., 2020</a>)
</div>

### MRC with BERT

<center>
    <img src="../img/bert_qa.png" width=50%/>
</center>

<div style="text-align: right;">
    (from <a href="https://www.aclweb.org/anthology/N19-1423.pdf">Devlin et al., 2019</a>)
</div>

### SpanBERT

<img src="../img/SpanBERT.png" width=100%/>

(from [Joshi et al., 2020](https://aclanthology.org/2020.tacl-1.5))

### MRC skills

<img src="https://d3i71xaburhd42.cloudfront.net/0e6e8274d0dcbc1c3c1ccdbd87f3e5d53fdf62b4/19-Figure3-1.png" width=100%/>

(from [Rogers et al., 2022](https://dl.acm.org/doi/10.1145/3560260))

## Knowledge-based (KB) question answering

Information is already organized in tables, databases and knowledge bases!

1. (Executable) **semantic parsing**: translate the natural language question into a **program** (query) in SQL, SPARQL, or a logical form.
2. **Execute** the program on a database/knowledge-base and return the answer.
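
The two steps above can be sketched with an in-memory SQLite database. Everything here is a toy under stated assumptions: the `educated_at` table is a made-up one-row knowledge base, and `parse` is a hypothetical stand-in that handles a single hard-coded question pattern, where a real system would use a learned semantic parser.

```python
import sqlite3

# Toy knowledge base: one table of (person, university) facts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE educated_at (person TEXT, university TEXT)")
conn.execute("INSERT INTO educated_at VALUES ('Turing', 'Princeton')")

def parse(question):
    """Step 1 (stand-in for a learned parser): map a question to SQL plus arguments."""
    prefix, suffix = "Which university did ", " go to?"
    if question.startswith(prefix) and question.endswith(suffix):
        person = question[len(prefix):-len(suffix)]
        return "SELECT university FROM educated_at WHERE person = ?", (person,)
    raise ValueError("question not covered by this toy parser")

def answer(question):
    """Step 2: execute the program on the database and return the answers."""
    sql, args = parse(question)
    return [row[0] for row in conn.execute(sql, args)]

result = answer("Which university did Turing go to?")
print(result)
```

Passing the entity as a query parameter instead of formatting it into the SQL string keeps the generated program safe to execute.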

![wikidata](https://upload.wikimedia.org/wikipedia/commons/thumb/6/66/Wikidata-logo-en.svg/500px-Wikidata-logo-en.svg.png)

[Which university did Turing go to?](https://query.wikidata.org/#select%20distinct%20%3Fitem%20%3FitemLabel%20where%20%7B%0A%20%20%20%20%3Fitem%20wdt%3AP31%20wd%3AQ15936437.%0A%20%20%20%20wd%3AQ7251%20wdt%3AP69%20%3Fitem.%0A%20%20%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%22%20%7D%0A%7D%0AORDER%20BY%20DESC%28%3Fsitelinks%29)
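
The link above carries its SPARQL program in the URL fragment. As a small sketch, the snippet below decodes that fragment with the standard library so the query can be read directly. It does not contact the Wikidata endpoint. In the decoded query, `wd:Q7251` is the Wikidata item linked earlier for Turing and `wdt:P69` is the "educated at" property.

```python
from urllib.parse import unquote

# Percent-encoded query copied from the URL fragment of the link above.
fragment = (
    "select%20distinct%20%3Fitem%20%3FitemLabel%20where%20%7B%0A"
    "%20%20%20%20%3Fitem%20wdt%3AP31%20wd%3AQ15936437.%0A"
    "%20%20%20%20wd%3AQ7251%20wdt%3AP69%20%3Fitem.%0A"
    "%20%20%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam"
    "%20wikibase%3Alanguage%20%22en%22%20%7D%0A"
    "%7D%0AORDER%20BY%20DESC%28%3Fsitelinks%29"
)
query = unquote(fragment)
print(query)
```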

<img src="../img/mcwq.png" width=90%/>

(from [Cui et al., 2022](https://aclanthology.org/2022.tacl-1.55/))

### Semantic parsing as machine translation

Can be modeled as sequence-to-sequence (seq2seq), see [machine translation](nmt_slides_active.ipynb).

<img src="https://guide.allennlp.org/part3/semantic-parsing/database-interface.svg" width=80%/>

(from [AllenNLP](https://guide.allennlp.org/semantic-parsing-seq2seq))

### Executable semantic parsing to SPARQL

<center>
    <img src="https://d3i71xaburhd42.cloudfront.net/8b65582bcb84b30393c67a2bae71a9e84f45e87c/4-Figure1-1.png" width="100%">
</center>

<div style="text-align: right;">
    (from <a href="https://arxiv.org/pdf/1912.09713.pdf">Keysers et al., 2020</a>)
</div>

### Executable semantic parsing to SQL

<center>
    <img src="https://d3i71xaburhd42.cloudfront.net/37882abaec01eba1bf5bda8a36c904aaea0d5642/6-Table1-1.png" width="80%">
</center>

<div style="text-align: right;">
    (from <a href="https://arxiv.org/pdf/2010.05647.pdf">Oren et al., 2020</a>)
</div>

### Executable semantic parsing to SQL

<center>
    <img src="https://d3i71xaburhd42.cloudfront.net/23474a845ea4b67f38bde7c7f1c4c1bdba22c50c/1-Figure1-1.png" width="80%">
</center>

<div style="text-align: right;">
    (from <a href="https://www.aclweb.org/anthology/P18-1033">Finegan-Dollak et al., 2018</a>)
</div>

### Executable semantic parsing to logical form

<center>
    <img src="https://d3i71xaburhd42.cloudfront.net/b29447ba499507a259ae9d8f685d60cc1597d7d3/1-Figure1-1.png" width="50%">
</center>

<div style="text-align: right;">
    (from <a href="https://www.aclweb.org/anthology/D13-1160.pdf">Berant et al., 2013</a>)
</div>

## Relation extraction via reading comprehension

&nbsp;

<center>
    <a href="slides/zeroshot-relation-extraction-via-reading-comprehension-conll-2017.pdf">
    <img src="https://d3i71xaburhd42.cloudfront.net/fa025e5d117929361bcf798437957762eb5bb6d4/4-Figure2-1.png" width="100%">
    </a>
</center>

<div style="text-align: right;">
    (from <a href="https://www.aclweb.org/anthology/K17-1034">Levy et al., 2017</a>; <a href="https://levyomer.files.wordpress.com/2017/08/zeroshot-relation-extraction-via-reading-comprehension-conll-2017.pptx">slides</a>)
</div>
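
The idea can be sketched as follows: each relation is mapped to a question template, the template is filled with the entity, and a reading comprehension model either returns an answer span (the slot filler) or signals that the passage does not answer the question. The templates and relation names below are illustrative and not taken from the paper, and `toy_reader` is a hypothetical regex stand-in for a trained MRC model that ignores the question entirely.

```python
import re

# Illustrative question templates; {x} is the entity slot.
TEMPLATES = {
    "educated_at": "Where did {x} study?",
    "occupation": "What did {x} do for a living?",
}

def toy_reader(question, passage):
    """Stand-in for an MRC model: first '<Name> University' span, or None."""
    match = re.search(r"[A-Z]\w+ University", passage)
    return match.group(0) if match else None

def extract_relation(relation, entity, passage, reader=toy_reader):
    """Ask the templated question; return a triple, or None if unanswerable."""
    question = TEMPLATES[relation].format(x=entity)
    answer = reader(question, passage)
    return None if answer is None else (entity, relation, answer)

passage = ("In 1938, he obtained his PhD from the Department of "
           "Mathematics at Princeton University.")
triple = extract_relation("educated_at", "Turing", passage)
no_triple = extract_relation("educated_at", "Turing", "Why is the sky blue?")
print(triple, no_triple)
```

Supporting a new relation then only requires writing a new template, which is what makes the zero-shot setting possible when the reader itself generalizes to unseen questions.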

## Summary

* Relation extraction can be cast as question answering (and vice versa)
* Information retrieval-based question answering requires reading comprehension
* Knowledge-based question answering requires semantic parsing

## Background Material

* Jurafsky, Dan and Martin, James H. (2016). [Speech and Language Processing, Chapter 23 (Question Answering)](https://web.stanford.edu/~jurafsky/slp3/23.pdf)

## Further Reading

* [Question Answering. Blog post by Vered Shwartz](http://veredshwartz.blogspot.com/2016/11/question-answering.html)
* Rogers et al., 2022. [QA Dataset Explosion: A Taxonomy of NLP Resources for Question Answering and Reading Comprehension](https://dl.acm.org/doi/10.1145/3560260)
* Levy et al., 2017. [Zero-Shot Relation Extraction via Reading Comprehension](https://www.aclweb.org/anthology/K17-1034)
* Abdou et al., 2019. [X-WikiRE: A large, multilingual resource for relation extraction as machine comprehension](https://aclanthology.org/D19-6130/)
* [AllenNLP Guide: Semantic Parsing: Intro and Seq2Seq Model](https://guide.allennlp.org/semantic-parsing-seq2seq)