{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "toc": true
   },
   "source": [
    "<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n",
    "<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#原理\" data-toc-modified-id=\"原理-1\"><span class=\"toc-item-num\">1&nbsp;&nbsp;</span>原理</a></span></li><li><span><a href=\"#参考-Code\" data-toc-modified-id=\"参考-Code-2\"><span class=\"toc-item-num\">2&nbsp;&nbsp;</span>参考 Code</a></span></li><li><span><a href=\"#过程\" data-toc-modified-id=\"过程-3\"><span class=\"toc-item-num\">3&nbsp;&nbsp;</span>过程</a></span></li><li><span><a href=\"#小结\" data-toc-modified-id=\"小结-4\"><span class=\"toc-item-num\">4&nbsp;&nbsp;</span>小结</a></span></li><li><span><a href=\"#参考\" data-toc-modified-id=\"参考-5\"><span class=\"toc-item-num\">5&nbsp;&nbsp;</span>参考</a></span></li></ul></div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 原理"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$Score = \\sum_{n=1}^N \\{ \\sum_{all\\ w_1...w_n\\ that\\ co-occur} Info(w_1...w_n) / \\sum_{all\\ w_1...w_n\\ in\\ sys\\ output} (1) · \\exp [\\beta \\log^2(min\\{\\frac{L_{sys}}{\\overline{L_{ref}}}, 1\\})] \\} $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$Info(w_1...w_n) = \\log_2 \\frac{the\\ \\#\\ of\\ occurrences\\ of\\ w_1...w_{n-1}}{the\\ \\#\\ of\\ occurrences\\ of\\ w_1...w_{n}}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "其中：\n",
    "- β 是一个常数（经验阈值），使得 Lhyp/Lref = 2/3 时，β 使得长度惩罚系数为 0.5\n",
    "- $\\overline{L_{ref}}$ 是参考译文的平均长度"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "关于 β 的计算如下：\n",
    "\n",
    "$e^{\\beta \\ln^2(2/3)} = 1/2$，两边同时取 ln 为底对数：\n",
    "\n",
    "$\\beta = \\frac{\\ln(1/2)}{\\ln^2(2/3)}$"
   ]
  },
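  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick numeric check of the expression for $\\beta$ (a hand calculation with rounded values, assuming natural logarithms throughout):\n",
    "\n",
    "$$\\beta = \\frac{\\ln(1/2)}{\\ln^2(2/3)} \\approx \\frac{-0.69315}{0.16440} \\approx -4.216$$\n",
    "\n",
    "With this value, a system output that is 10% shorter than the references is penalized only mildly:\n",
    "\n",
    "$$\\exp\\left[\\beta \\ln^2(0.9)\\right] \\approx \\exp(-4.216 \\times 0.01110) \\approx 0.954$$"
   ]
  },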
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 参考 Code"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [],
   "source": [
    "hypothese1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'which','ensures', 'that', 'the', 'military', 'always','obeys', 'the', 'commands', 'of', 'the', 'party']\n",
    "hypothese2 = ['It', 'is', 'to', 'insure', 'the', 'troops','forever', 'hearing', 'the', 'activity', 'guidebook','that', 'party', 'direct']\n",
    "reference1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'that','ensures', 'that', 'the', 'military', 'will', 'forever','heed', 'Party', 'commands']\n",
    "reference2 = ['It', 'is', 'the', 'guiding', 'principle', 'which','guarantees', 'the', 'military', 'forces', 'always','being', 'under', 'the', 'command', 'of', 'the','Party']\n",
    "reference3 = ['It', 'is', 'the', 'practical', 'guide', 'for', 'the','army', 'always', 'to', 'heed', 'the', 'directions','of', 'the', 'party']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 199,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "import fractions\n",
    "from collections import Counter\n",
    "\n",
    "import nltk\n",
    "from nltk.util import ngrams"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 计算 nist 分值\n",
    "def sentence_nist(references, hypothesis, n=5):\n",
    "    \"\"\"\n",
    "    Calculate NIST score from\n",
    "    George Doddington. 2002. \"Automatic evaluation of machine translation quality\n",
    "    using n-gram co-occurrence statistics.\" Proceedings of HLT.\n",
    "    Morgan Kaufmann Publishers Inc. http://dl.acm.org/citation.cfm?id=1289189.1289273\n",
    "\n",
    "    DARPA commissioned NIST to develop an MT evaluation facility based on the BLEU\n",
    "    score. The official script used by NIST to compute BLEU and NIST score is\n",
    "    mteval-14.pl. The main differences are:\n",
    "\n",
    "     - BLEU uses geometric mean of the ngram overlaps, NIST uses arithmetic mean.\n",
    "     - NIST has a different brevity penalty\n",
    "     - NIST score from mteval-14.pl has a self-contained tokenizer\n",
    "\n",
    "    Note: The mteval-14.pl includes a smoothing function for BLEU score that is NOT\n",
    "          used in the NIST score computation.\n",
    "\n",
    "    >>> hypothesis1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'which',\n",
    "    ...               'ensures', 'that', 'the', 'military', 'always',\n",
    "    ...               'obeys', 'the', 'commands', 'of', 'the', 'party']\n",
    "\n",
    "    >>> hypothesis2 = ['It', 'is', 'to', 'insure', 'the', 'troops',\n",
    "    ...               'forever', 'hearing', 'the', 'activity', 'guidebook',\n",
    "    ...               'that', 'party', 'direct']\n",
    "\n",
    "    >>> reference1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'that',\n",
    "    ...               'ensures', 'that', 'the', 'military', 'will', 'forever',\n",
    "    ...               'heed', 'Party', 'commands']\n",
    "\n",
    "    >>> reference2 = ['It', 'is', 'the', 'guiding', 'principle', 'which',\n",
    "    ...               'guarantees', 'the', 'military', 'forces', 'always',\n",
    "    ...               'being', 'under', 'the', 'command', 'of', 'the',\n",
    "    ...               'Party']\n",
    "\n",
    "    >>> reference3 = ['It', 'is', 'the', 'practical', 'guide', 'for', 'the',\n",
    "    ...               'army', 'always', 'to', 'heed', 'the', 'directions',\n",
    "    ...               'of', 'the', 'party']\n",
    "\n",
    "    >>> sentence_nist([reference1, reference2, reference3], hypothesis1) # doctest: +ELLIPSIS\n",
    "    3.3709...\n",
    "\n",
    "    >>> sentence_nist([reference1, reference2, reference3], hypothesis2) # doctest: +ELLIPSIS\n",
    "    1.4619...\n",
    "\n",
    "    :param references: reference sentences\n",
    "    :type references: list(list(str))\n",
    "    :param hypothesis: a hypothesis sentence\n",
    "    :type hypothesis: list(str)\n",
    "    :param n: highest n-gram order\n",
    "    :type n: int\n",
    "    \"\"\"\n",
    "    return corpus_nist([references], [hypothesis], n)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 203,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 计算长度惩罚因子\n",
    "# 经验值，使得在 hyp_len/ref_len = 2/3 时，长度惩罚因子值为 0.5\n",
    "def nist_length_penalty(ref_len, hyp_len):\n",
    "    \"\"\"\n",
    "    Calculates the NIST length penalty, from Eq. 3 in Doddington (2002)\n",
    "\n",
    "        penalty = exp( beta * log( min( len(hyp)/len(ref) , 1.0 )))\n",
    "\n",
    "    where,\n",
    "\n",
    "        `beta` is chosen to make the brevity penalty factor = 0.5 when the\n",
    "        no. of words in the system output (hyp) is 2/3 of the average\n",
    "        no. of words in the reference translation (ref)\n",
    "\n",
    "    The NIST penalty is different from BLEU's such that it minimize the impact\n",
    "    of the score of small variations in the length of a translation.\n",
    "    See Fig. 4 in  Doddington (2002)\n",
    "    \"\"\"\n",
    "    ratio = hyp_len / ref_len\n",
    "    if 0 < ratio < 1:\n",
    "        ratio_x, score_x = 2/3, 1/2 # 原 ratio_x 为 1.5\n",
    "        beta = math.log(score_x) / (math.log(ratio_x) ** 2)\n",
    "        return  math.exp(beta * math.log(ratio) ** 2)\n",
    "    else:  # ratio <= 0 or ratio >= 1\n",
    "        return max(min(ratio, 1.0), 0.0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 189,
   "metadata": {},
   "outputs": [],
   "source": [
    "def corpus_nist(list_of_references, hypotheses, n=5):\n",
    "    \"\"\"\n",
    "    Calculate a single corpus-level NIST score (aka. system-level BLEU) for all\n",
    "    the hypotheses and their respective references.\n",
    "\n",
    "    :param references: a corpus of lists of reference sentences, w.r.t. hypotheses\n",
    "    :type references: list(list(list(str))), [[[], []...]]\n",
    "    :param hypotheses: a list of hypothesis sentences\n",
    "    :type hypotheses: list(list(str)), [[]]\n",
    "    :param n: highest n-gram order\n",
    "    :type n: int\n",
    "    \"\"\"\n",
    "    # Before proceeding to compute NIST, perform sanity checks.\n",
    "    assert len(list_of_references) == len(\n",
    "        hypotheses\n",
    "    ), \"The number of hypotheses and their reference(s) should be the same\"\n",
    "\n",
    "    # Collect the ngram coounts from the reference sentences.\n",
    "    ngram_freq = Counter()\n",
    "    total_reference_words = 0\n",
    "    for (references) in list_of_references:  # For each source sent, there's a list of reference sents.\n",
    "        for reference in references:\n",
    "            # For each order of ngram, count the ngram occurrences.\n",
    "            for i in range(1, n + 1):\n",
    "                ngram_freq.update(ngrams(reference, i))\n",
    "            total_reference_words += len(reference)\n",
    "\n",
    "    # Compute the information weights based on the reference sentences.\n",
    "    # Eqn 2 in Doddington (2002):\n",
    "    # Info(w_1 ... w_n) = log_2 [ (# of occurrences of w_1 ... w_n-1) / (# of occurrences of w_1 ... w_n) ]\n",
    "    information_weights = {}\n",
    "    for _ngram in ngram_freq:  # w_1 ... w_n\n",
    "        _mgram = _ngram[:-1]  #  w_1 ... w_n-1\n",
    "        # From https://github.com/moses-smt/mosesdecoder/blob/master/scripts/generic/mteval-v13a.pl#L546\n",
    "        # it's computed as such:\n",
    "        # denominator = ngram_freq[_mgram] if _mgram and _mgram in ngram_freq \n",
    "        # else denominator = total_reference_words\n",
    "        # information_weights[_ngram] = -1 * math.log(ngram_freq[_ngram]/denominator) / math.log(2)\n",
    "        #\n",
    "        # Mathematically, it's equivalent to the our implementation:\n",
    "        if _mgram and _mgram in ngram_freq:\n",
    "            numerator = ngram_freq[_mgram]\n",
    "        else:\n",
    "            # 赋予单个词高权重\n",
    "            numerator = total_reference_words\n",
    "        information_weights[_ngram] = math.log(numerator / ngram_freq[_ngram], 2)\n",
    "    \n",
    "    # Micro-average.\n",
    "    # 计算 n = 「1-5」时，分子分母的值（分子、分母各 5 个值）\n",
    "    nist_precision_numerator_per_ngram = Counter()\n",
    "    nist_precision_denominator_per_ngram = Counter()\n",
    "    l_ref, l_sys = 0, 0\n",
    "    # For each order of ngram.\n",
    "    for i in range(1, n + 1):\n",
    "        # Iterate through each hypothesis and their corresponding references.\n",
    "        for references, hypothesis in zip(list_of_references, hypotheses):\n",
    "            hyp_len = len(hypothesis)\n",
    "\n",
    "            # Find reference with the best NIST score.\n",
    "            nist_score_per_ref = []\n",
    "            for reference in references:\n",
    "                _ref_len = len(reference)\n",
    "                # Counter of ngrams in hypothesis.\n",
    "                hyp_ngrams = (\n",
    "                    Counter(ngrams(hypothesis, i)) if len(hypothesis) >= i else Counter()\n",
    "                )\n",
    "                ref_ngrams = (\n",
    "                    Counter(ngrams(reference, i)) if len(reference) >= i else Counter()\n",
    "                )\n",
    "                # 取交集，取交集会取到较小者\n",
    "                ngram_overlaps = hyp_ngrams & ref_ngrams\n",
    "                # Precision part of the score in Eqn 3\n",
    "                _numerator = sum(\n",
    "                    information_weights[_ngram] * count for _ngram, count in ngram_overlaps.items()\n",
    "                )\n",
    "                _denominator = sum(hyp_ngrams.values())\n",
    "                _precision = 0 if _denominator == 0 else _numerator / _denominator\n",
    "                nist_score_per_ref.append(\n",
    "                    (_precision, _numerator, _denominator, _ref_len)\n",
    "                )\n",
    "            # Best reference.\n",
    "            # max(数组)，从第一个元素开始比较\n",
    "            # 优先级：比值 > 分子 > 分母 > 长度\n",
    "            precision, numerator, denominator, ref_len = max(nist_score_per_ref)\n",
    "            nist_precision_numerator_per_ngram[i] += numerator\n",
    "            nist_precision_denominator_per_ngram[i] += denominator\n",
    "            l_ref += ref_len\n",
    "            l_sys += hyp_len\n",
    "\n",
    "    # Final NIST micro-average mean aggregation.\n",
    "    nist_precision = 0\n",
    "    for i in nist_precision_numerator_per_ngram:\n",
    "        precision = (\n",
    "            nist_precision_numerator_per_ngram[i]\n",
    "            / nist_precision_denominator_per_ngram[i]\n",
    "        )\n",
    "        nist_precision += precision\n",
    "    # Eqn 3 in Doddington(2002)\n",
    "    return nist_precision * nist_length_penalty(l_ref, l_sys)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 过程"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 190,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Reference Ngram Freq\n",
    "ngram_freq= Counter()\n",
    "total_reference_words = 0\n",
    "for (references) in [[reference1, reference2, reference3]]:\n",
    "    for reference in references:\n",
    "        for i in range(1, 5 + 1):\n",
    "            ngram_freq.update(ngrams(reference, i))\n",
    "        total_reference_words += len(reference)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 191,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Information Weights\n",
    "information_weights = {}\n",
    "for _ngram in ngram_freq:\n",
    "    _mgram = _ngram[:-1] \n",
    "    if _mgram and _mgram in ngram_freq:\n",
    "        numerator = ngram_freq[_mgram]\n",
    "    else:\n",
    "        numerator = total_reference_words\n",
    "    information_weights[_ngram] = math.log(numerator / ngram_freq[_ngram], 2)"
   ]
  },
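  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the weights concrete, a few of them can be worked out by hand from the three references defined earlier, which contain 16 + 18 + 16 = 50 words in total. This is only an illustrative calculation; the counts are read off the reference sentences and the results are rounded to two decimals:\n",
    "\n",
    "$$\\begin{aligned} \\text{Info}(\\text{It}) &= \\log_2 \\frac{50}{3} \\approx 4.06 \\\\ \\text{Info}(\\text{the}) &= \\log_2 \\frac{50}{9} \\approx 2.47 \\\\ \\text{Info}(\\text{is a}) &= \\log_2 \\frac{3}{1} \\approx 1.58 \\\\ \\text{Info}(\\text{the military}) &= \\log_2 \\frac{9}{2} \\approx 2.17 \\end{aligned}$$\n",
    "\n",
    "Frequent words such as `the` carry little information, and an n-gram whose prefix always continues the same way (for example `It is`, where both `It` and `It is` occur 3 times) gets the weight $\\log_2(3/3) = 0$."
   ]
  },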
  {
   "cell_type": "code",
   "execution_count": 192,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3.3709935957649324"
      ]
     },
     "execution_count": 192,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "corpus_nist([[reference1, reference2, reference3]], [hypothese1])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 195,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3.3685866013071895"
      ]
     },
     "execution_count": 195,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 下面以 hypothese1 为例来说明：\n",
    "\n",
    "nist_score = (51.71/18 + 6.75/17 + 1.58/16 + 0/15 + 0/14) * nist_length_penalty(16*3+18+18, 18*5)\n",
    "nist_score"
   ]
  },
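  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The length-penalty arguments in the manual calculation follow from the way `corpus_nist` accumulates lengths once per n-gram order. As a sketch of the reasoning: for orders 1 to 3 the best-scoring reference is `reference1` (16 words), while for orders 4 and 5 every reference scores 0, so the tuple comparison in `max` falls through to the longest reference (18 words):\n",
    "\n",
    "$$L_{sys} = 5 \\times 18 = 90, \\qquad L_{ref} = 3 \\times 16 + 2 \\times 18 = 84, \\qquad \\frac{L_{sys}}{L_{ref}} = \\frac{90}{84} > 1 \\Rightarrow \\text{penalty} = 1$$\n",
    "\n",
    "$$\\text{Score} \\approx \\frac{51.71}{18} + \\frac{6.75}{17} + \\frac{1.58}{16} \\approx 2.873 + 0.397 + 0.099 \\approx 3.369$$\n",
    "\n",
    "The small gap to the 3.3710 returned by `corpus_nist` presumably comes from rounding the weights to two decimals."
   ]
  },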
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1-Gram\n",
    "\n",
    "| gram     | hyp    | ref1    | overlaps  | ref2    | overlaps  | ref3    | overlaps  |\n",
    "| -------- | ------ | ------- | --------- | ------- | --------- | ------- | --------- |\n",
    "| It       | 1      | 1       | 4.06\\*1   | 1       | 4.06\\*1   | 1       | 4.06\\*1   |\n",
    "| is       | 1      | 1       | 4.06\\*1   | 1       | 4.06\\*1   | 1       | 4.06\\*1   |\n",
    "| a        | 1      | 1       | 5.64\\*1   | 0       | 0         | 0       | 0         |\n",
    "| guide    | 1      | 1       | 4.64\\*1   | 0       | 0         | 1       | 4.64\\*1   |\n",
    "| to       | 1      | 1       | 4.64\\*1   | 0       | 0         | 1       | 4.64\\*1   |\n",
    "| action   | 1      | 1       | 5.64\\*1   | 0       | 0         | 0       | 0         |\n",
    "| which    | 1      | 0       | 0         | 1       | 5.64\\*1   | 0       | 0         |\n",
    "| ensures  | 1      | 1       | 5.64\\*1   | 0       | 0         | 0       | 0         |\n",
    "| that     | 1      | 2       | 4.64\\*1   | 0       | 0         | 0       | 0         |\n",
    "| the      | 3      | 1       | 2.47\\*1   | 4       | 2.47\\*3   | 4       | 2.47\\*3   |\n",
    "| military | 1      | 1       | 4.64\\*1   | 1       | 4.64\\*1   | 0       | 0         |\n",
    "| always   | 1      | 0       | 0         | 1       | 4.64\\*1   | 1       | 4.64\\*1   |\n",
    "| obeys    | 1      | 0       | 0         | 0       | 0         | 0       | 0         |\n",
    "| commands | 1      | 1       | 5.64\\*1   | 0       | 0         | 0       | 0         |\n",
    "| of       | 1      | 0       | 0         | 1       | 4.64\\*1   | 1       | 4.64\\*1   |\n",
    "| party    | 1      | 0       | 0         | 0       | 0         | 1       | 5.64\\*1   |\n",
    "| **SUM**  | **18** | **16**(len) | **51.71** | 18(len) | 35.09 | 16(len) | 39.73 |\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "2-Gram\n",
    "\n",
    "| gram            | hyp    | ref1    | overlaps | ref2    | overlaps | ref3    | overlaps |\n",
    "| --------------- | ------ | ------- | -------- | ------- | -------- | ------- | -------- |\n",
    "| It is           | 1      | 1       | 0\\*1     | 1       | 0\\*1     | 1       | 0\\*1     |\n",
    "| is a            | 1      | 1       | 1.58\\*1  | 0       | 0        | 0       | 0        |\n",
    "| a guide         | 1      | 1       | 0\\*1     | 0       | 0        | 0       | 0        |\n",
    "| guide to        | 1      | 1       | 1\\*1     | 0       | 0        | 0       | 0        |\n",
    "| to action       | 1      | 1       | 1\\*1     | 0       | 0        | 0       | 0        |\n",
    "| action which    | 1      | 0       | 0        | 0       | 0        | 0       | 0        |\n",
    "| which ensures   | 1      | 0       | 0        | 1       | 0        | 0       | 0        |\n",
    "| ensures that    | 1      | 1       | 0\\*1     | 0       | 0        | 0       | 0        |\n",
    "| that the        | 1      | 1       | 1\\*1     | 0       | 0        | 0       | 0        |\n",
    "| the military    | 1      | 1       | 2.17\\*1  | 1       | 2.17\\*1  | 0       | 0        |\n",
    "| military always | 1      | 0       | 0        | 0       | 0        | 0       | 0        |\n",
    "| always obeys    | 1      | 0       | 0        | 0       | 0        | 0       | 0        |\n",
    "| obeys the       | 1      | 0       | 0        | 0       | 0        | 0       | 0        |\n",
    "| the commands    | 1      | 0       | 0        | 0       | 0        | 0       | 0        |\n",
    "| commands of     | 1      | 0       | 0        | 0       | 0        | 0       | 0        |\n",
    "| of the          | 1      | 0       | 0        | 1       | 0\\*1      | 1       | 0\\*1     |\n",
    "| the party       | 1      | 0       | 0        | 0       | 0        | 1       | 3.17\\*1  |\n",
    "| **SUM**         | **17** | **16**(len) | **6.75** | 18(len) | 2.17 | 16(len) | 3.17 |"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "3-Gram\n",
    "\n",
    "| gram                  | hyp    | ref1        | overlaps | ref2    | overlaps | ref3    | overlaps |\n",
    "| --------------------- | ------ | ----------- | -------- | ------- | -------- | ------- | -------- |\n",
    "| It is a               | 1      | 1           | 1.58\\*1  | 0       | 0        | 0       | 0        |\n",
    "| is a guide            | 1      | 1           | 0\\*1     | 0       | 0        | 0       | 0        |\n",
    "| a guide to            | 1      | 1           | 0\\*1     | 0       | 0        | 0       | 0        |\n",
    "| guide to action       | 1      | 1           | 0\\*1     | 0       | 0        | 0       | 0        |\n",
    "| to action which       | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| action which ensures  | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| which ensures that    | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| ensures that the      | 1      | 1           | 0\\*1     | 0       | 0        | 0       | 0        |\n",
    "| that the military     | 1      | 1           | 0\\*1     | 0       | 0        | 0       | 0        |\n",
    "| the military always   | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| military always obeys | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| always obeys the      | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| obeys the commands    | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| the commands of       | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| commands of the       | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| of the party          | 1      | 0           | 0        | 0       | 0        | 1       | 1\\*1     |\n",
    "| **SUM**               | **16** | **16**(len) | **1.58** | 18(len) | 0    | 16(len) | 1   |\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "4-Gram\n",
    "\n",
    "| gram                      | hyp    | ref1        | overlaps | ref2    | overlaps | ref3    | overlaps |\n",
    "| ------------------------- | ------ | ----------- | -------- | ------- | -------- | ------- | -------- |\n",
    "| It is a guide             | 1      | 1           | 0\\*1     | 0       | 0        | 0       | 0        |\n",
    "| is a guide to             | 1      | 1           | 0\\*1     | 0       | 0        | 0       | 0        |\n",
    "| a guide to action         | 1      | 1           | 0\\*1     | 0       | 0        | 0       | 0        |\n",
    "| guide to action which     | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| to action which ensures   | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| action which ensures that | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| which ensures that the    | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| ensures that the military | 1      | 1           | 0\\*1     | 0       | 0        | 0       | 0        |\n",
    "| that the military always  | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| the military always obeys | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| military always obeys the | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| always obeys the commands | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| obeys the commands of     | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| the commands of the       | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| commands of the party     | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| **SUM**                   | **15** | 16(len) | 0    | **18**(len) | **0**    | 16(len) | 0    |"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "5-Gram\n",
    "\n",
    "| gram                               | hyp    | ref1        | overlaps | ref2    | overlaps | ref3    | overlaps |\n",
    "| ---------------------------------- | ------ | ----------- | -------- | ------- | -------- | ------- | -------- |\n",
    "| It is a guide to                   | 1      | 1           | 0\\*1     | 0       | 0        | 0       | 0        |\n",
    "| is a guide to action               | 1      | 1           | 0\\*1     | 0       | 0        | 0       | 0        |\n",
    "| a guide to action which            | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| guide to action which ensures      | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| to action which ensures that       | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| action which ensures that the      | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| which ensures that the military    | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| ensures that the military always   | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| that the military always obeys     | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| the military always obeys the      | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| military always obeys the commands | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| always obeys the commands of       | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| obeys the commands of the          | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| the commands of the party          | 1      | 0           | 0        | 0       | 0        | 0       | 0        |\n",
    "| **SUM**                            | **14** | 16(len) | 0    | **18**(len) | **0**    | 16(len) | 0   |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3.3709935957649324"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sentence_nist([reference1, reference2, reference3], hypothese1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 202,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.5045666840058485"
      ]
     },
     "execution_count": 202,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# bleu\n",
    "nltk.translate.bleu([reference1, reference2, reference3], hypothese1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 141,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1.4619035460750132"
      ]
     },
     "execution_count": 141,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sentence_nist([reference1, reference2, reference3], hypothese2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 201,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5.92086005993801e-155"
      ]
     },
     "execution_count": 201,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# bleu\n",
    "nltk.translate.bleu([reference1, reference2, reference3], hypothese2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 小结\n",
    "\n",
    "通过详细的过程我们可以非常容易地看出与 BLEU 的不同：\n",
    "\n",
    "- 对每个 Ngram，BLEU 分别取不同 gram 下各个参考译文计数的最大值，然后与假设译文该 gram 的值对比取小的；而 NIST 会对译文的每个 gram 乘以一个权重，该权重来自所有译文，使用 Info 公式计算\n",
    "- 对每个 Ngram，BLEU 计算 P 值，使用上一步取出的每个 gram 的较小值求和除以假设译文各 gram 计数和；而 NIST 将乘以权重后的 gram 计数求和除以 gram 长度，然后取所有参考译文中最大的\n",
    "- 对每个 Ngram，BLEU 使用加权求和，NIST 直接求和\n",
    "- 惩罚因子除计算方法不同外，BLEU 只考虑与假设译文最接近的参考译文的长度，而 NIST 则考虑了每个 Ngram"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 参考\n",
    "\n",
    "- [NIST](http://localhost:8888/notebooks/Yam/All4NLP/BLEU/NIST.ipynb)\n",
    "- [Automatic evaluation of machine translation quality using n-gram co-occurrence statistics](http://www.mt-archive.info/HLT-2002-Doddington.pdf)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
吐槽一下">
    ">A small gripe: the formula really is hard to read..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
   "autoclose": false,
   "autocomplete": true,
   "bibliofile": "biblio.bib",
   "cite_by": "apalike",
   "current_citInitial": 1,
   "eqLabelWithNumbers": true,
   "eqNumInitial": 1,
   "hotkeys": {
    "equation": "Ctrl-E",
    "itemize": "Ctrl-I"
   },
   "labels_anchors": false,
   "latex_user_defs": false,
   "report_style_numbering": false,
   "user_envs_cfg": false
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": true,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}