{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "toc": "true"
   },
   "source": [
    "# Table of Contents\n",
    " <p><div class=\"lev1 toc-item\"><a href=\"#Manual-implementation-of-some-hash-functions\" data-toc-modified-id=\"Manual-implementation-of-some-hash-functions-1\"><span class=\"toc-item-num\">1&nbsp;&nbsp;</span>Manual implementation of some hash functions</a></div><div class=\"lev2 toc-item\"><a href=\"#What-is-a-hash-function?\" data-toc-modified-id=\"What-is-a-hash-function?-11\"><span class=\"toc-item-num\">1.1&nbsp;&nbsp;</span>What is a hash function?</a></div><div class=\"lev2 toc-item\"><a href=\"#Common-API-for-the-different-classes\" data-toc-modified-id=\"Common-API-for-the-different-classes-12\"><span class=\"toc-item-num\">1.2&nbsp;&nbsp;</span>Common API for the different classes</a></div><div class=\"lev2 toc-item\"><a href=\"#Checking-the-the-hashlib-module-in-Python-standard-library\" data-toc-modified-id=\"Checking-the-the-hashlib-module-in-Python-standard-library-13\"><span class=\"toc-item-num\">1.3&nbsp;&nbsp;</span>Checking the <a href=\"https://docs.python.org/3/library/hashlib.html\" target=\"_blank\">the <code>hashlib</code> module in Python standard library</a></a></div><div class=\"lev2 toc-item\"><a href=\"#First-stupid-example:-a-stupid-hashing-function\" data-toc-modified-id=\"First-stupid-example:-a-stupid-hashing-function-14\"><span class=\"toc-item-num\">1.4&nbsp;&nbsp;</span>First stupid example: a stupid hashing function</a></div><div class=\"lev2 toc-item\"><a href=\"#First-real-example:-the-MD5-hashing-function\" data-toc-modified-id=\"First-real-example:-the-MD5-hashing-function-15\"><span class=\"toc-item-num\">1.5&nbsp;&nbsp;</span>First real example: the MD5 hashing function</a></div><div class=\"lev3 toc-item\"><a href=\"#Useful-functions-for-the-MD5-algorithm\" data-toc-modified-id=\"Useful-functions-for-the-MD5-algorithm-151\"><span class=\"toc-item-num\">1.5.1&nbsp;&nbsp;</span>Useful functions for the MD5 algorithm</a></div><div class=\"lev3 toc-item\"><a href=\"#The-MD5-class\" data-toc-modified-id=\"The-MD5-class-152\"><span class=\"toc-item-num\">1.5.2&nbsp;&nbsp;</span>The <code>MD5</code> class</a></div><div class=\"lev3 toc-item\"><a href=\"#First-check-on-MD5\" data-toc-modified-id=\"First-check-on-MD5-153\"><span class=\"toc-item-num\">1.5.3&nbsp;&nbsp;</span>First check on MD5</a></div><div class=\"lev3 toc-item\"><a href=\"#A-less-stupid-check-on-MD5\" data-toc-modified-id=\"A-less-stupid-check-on-MD5-154\"><span class=\"toc-item-num\">1.5.4&nbsp;&nbsp;</span>A less stupid check on MD5</a></div><div class=\"lev3 toc-item\"><a href=\"#Trying-1000-random-examples\" data-toc-modified-id=\"Trying-1000-random-examples-155\"><span class=\"toc-item-num\">1.5.5&nbsp;&nbsp;</span>Trying 1000 random examples</a></div><div class=\"lev2 toc-item\"><a href=\"#Second-real-example:-the-SHA-1-hashing-function\" data-toc-modified-id=\"Second-real-example:-the-SHA-1-hashing-function-16\"><span class=\"toc-item-num\">1.6&nbsp;&nbsp;</span>Second real example: the SHA-1 hashing function</a></div><div class=\"lev3 toc-item\"><a href=\"#Useful-functions-the-SHA-1-algorithm\" data-toc-modified-id=\"Useful-functions-the-SHA-1-algorithm-161\"><span class=\"toc-item-num\">1.6.1&nbsp;&nbsp;</span>Useful functions the SHA-1 algorithm</a></div><div class=\"lev3 toc-item\"><a href=\"#The-SHA1-class\" data-toc-modified-id=\"The-SHA1-class-162\"><span class=\"toc-item-num\">1.6.2&nbsp;&nbsp;</span>The <code>SHA1</code> class</a></div><div class=\"lev3 toc-item\"><a href=\"#First-check-on-SHA-1\" data-toc-modified-id=\"First-check-on-SHA-1-163\"><span class=\"toc-item-num\">1.6.3&nbsp;&nbsp;</span>First check on SHA-1</a></div><div class=\"lev3 toc-item\"><a href=\"#A-less-stupid-check-on-SHA-1\" data-toc-modified-id=\"A-less-stupid-check-on-SHA-1-164\"><span class=\"toc-item-num\">1.6.4&nbsp;&nbsp;</span>A less stupid check on SHA-1</a></div><div class=\"lev3 toc-item\"><a href=\"#Trying-1000-random-examples\" data-toc-modified-id=\"Trying-1000-random-examples-165\"><span class=\"toc-item-num\">1.6.5&nbsp;&nbsp;</span>Trying 1000 random examples</a></div><div class=\"lev2 toc-item\"><a href=\"#Third-real-example:-the-SHA-2-hashing-function\" data-toc-modified-id=\"Third-real-example:-the-SHA-2-hashing-function-17\"><span class=\"toc-item-num\">1.7&nbsp;&nbsp;</span>Third real example: the SHA-2 hashing function</a></div><div class=\"lev3 toc-item\"><a href=\"#Useful-functions-the-SHA-2-algorithm\" data-toc-modified-id=\"Useful-functions-the-SHA-2-algorithm-171\"><span class=\"toc-item-num\">1.7.1&nbsp;&nbsp;</span>Useful functions the SHA-2 algorithm</a></div><div class=\"lev3 toc-item\"><a href=\"#The-SHA2-class\" data-toc-modified-id=\"The-SHA2-class-172\"><span class=\"toc-item-num\">1.7.2&nbsp;&nbsp;</span>The <code>SHA2</code> class</a></div><div class=\"lev3 toc-item\"><a href=\"#First-check-on-SHA-2\" data-toc-modified-id=\"First-check-on-SHA-2-173\"><span class=\"toc-item-num\">1.7.3&nbsp;&nbsp;</span>First check on SHA-2</a></div><div class=\"lev3 toc-item\"><a href=\"#Check-on-SHA-2\" data-toc-modified-id=\"Check-on-SHA-2-174\"><span class=\"toc-item-num\">1.7.4&nbsp;&nbsp;</span>Check on SHA-2</a></div><div class=\"lev3 toc-item\"><a href=\"#Trying-1000-random-examples\" data-toc-modified-id=\"Trying-1000-random-examples-175\"><span class=\"toc-item-num\">1.7.5&nbsp;&nbsp;</span>Trying 1000 random examples</a></div><div class=\"lev2 toc-item\"><a href=\"#Comparison-:-MD5-vs-SHA-1-vs-SHA-2\" data-toc-modified-id=\"Comparison-:-MD5-vs-SHA-1-vs-SHA-2-18\"><span class=\"toc-item-num\">1.8&nbsp;&nbsp;</span>Comparison : MD5 vs SHA-1 vs SHA-2</a></div><div class=\"lev2 toc-item\"><a href=\"#Bonus-:-SHA-2-variants\" data-toc-modified-id=\"Bonus-:-SHA-2-variants-19\"><span class=\"toc-item-num\">1.9&nbsp;&nbsp;</span>Bonus : SHA-2 variants</a></div><div class=\"lev3 toc-item\"><a href=\"#SHA-224\" data-toc-modified-id=\"SHA-224-191\"><span class=\"toc-item-num\">1.9.1&nbsp;&nbsp;</span>SHA-224</a></div><div class=\"lev4 toc-item\"><a href=\"#The-SHA224-class\" data-toc-modified-id=\"The-SHA224-class-1911\"><span class=\"toc-item-num\">1.9.1.1&nbsp;&nbsp;</span>The <code>SHA224</code> class</a></div><div class=\"lev4 toc-item\"><a href=\"#Checks-on-SHA-224\" data-toc-modified-id=\"Checks-on-SHA-224-1912\"><span class=\"toc-item-num\">1.9.1.2&nbsp;&nbsp;</span>Checks on SHA-224</a></div><div class=\"lev3 toc-item\"><a href=\"#SHA-512\" data-toc-modified-id=\"SHA-512-192\"><span class=\"toc-item-num\">1.9.2&nbsp;&nbsp;</span>SHA-512</a></div><div class=\"lev4 toc-item\"><a href=\"#Useful-functions-the-SHA-512-algorithm\" data-toc-modified-id=\"Useful-functions-the-SHA-512-algorithm-1921\"><span class=\"toc-item-num\">1.9.2.1&nbsp;&nbsp;</span>Useful functions the SHA-512 algorithm</a></div><div class=\"lev4 toc-item\"><a href=\"#The-SHA512-class\" data-toc-modified-id=\"The-SHA512-class-1922\"><span class=\"toc-item-num\">1.9.2.2&nbsp;&nbsp;</span>The <code>SHA512</code> class</a></div><div class=\"lev4 toc-item\"><a href=\"#Checks-on-SHA-512\" data-toc-modified-id=\"Checks-on-SHA-512-1923\"><span class=\"toc-item-num\">1.9.2.3&nbsp;&nbsp;</span>Checks on SHA-512</a></div><div class=\"lev3 toc-item\"><a href=\"#SHA-384\" data-toc-modified-id=\"SHA-384-193\"><span class=\"toc-item-num\">1.9.3&nbsp;&nbsp;</span>SHA-384</a></div><div class=\"lev4 toc-item\"><a href=\"#The-SHA384-class\" data-toc-modified-id=\"The-SHA384-class-1931\"><span class=\"toc-item-num\">1.9.3.1&nbsp;&nbsp;</span>The <code>SHA384</code> class</a></div><div class=\"lev4 toc-item\"><a href=\"#Checks-on-SHA-384\" data-toc-modified-id=\"Checks-on-SHA-384-1932\"><span class=\"toc-item-num\">1.9.3.2&nbsp;&nbsp;</span>Checks on SHA-384</a></div><div class=\"lev3 toc-item\"><a href=\"#More-comparison\" data-toc-modified-id=\"More-comparison-194\"><span class=\"toc-item-num\">1.9.4&nbsp;&nbsp;</span>More comparison</a></div><div class=\"lev2 toc-item\"><a href=\"#Conclusion\" data-toc-modified-id=\"Conclusion-110\"><span class=\"toc-item-num\">1.10&nbsp;&nbsp;</span>Conclusion</a></div><div class=\"lev3 toc-item\"><a href=\"#Bonus\" data-toc-modified-id=\"Bonus-1101\"><span class=\"toc-item-num\">1.10.1&nbsp;&nbsp;</span>Bonus</a></div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Manual implementation of some hash functions\n",
    "\n",
    "This small [Jupyter notebook](https://www.Jupyter.org/) is a short experiment, to see if I can implement the some basic [Hashing functions](https://en.wikipedia.org/wiki/Hash_function), more specifically [cryptographic hashing functions](https://en.wikipedia.org/wiki/Cryptographic_hash_function), like `MD5`, `SHA1`, `SHA256`, `SHA512` etc\n",
    "\n",
    "And then I want compare my manual implementations with the functions implemented in [the `hashlib` module in Python standard library](https://docs.python.org/3/library/hashlib.html).\n",
    "Ideally, my implementation should work exactly like the reference one, only slower!\n",
    "\n",
    "\n",
    "- *Reference*: Wikipedia pages on [Hash functions](https://en.wikipedia.org/wiki/Hash_function), [MD5](https://en.wikipedia.org/wiki/MD5), [SHA-1](https://en.wikipedia.org/wiki/SHA1) and [SHA-2](https://en.wikipedia.org/wiki/SHA-2), as well as [the `hashlib` module in Python standard library](https://docs.python.org/3/library/hashlib.html).\n",
    "- *Date*: 13 May 2017 (first part about MD5), 19 June 2017 (second part about SHA-1), 19 June 2017 (last part about SHA-2).\n",
    "- *Author*: [Lilian Besson](https://GitHub.com/Naereen/notebooks).\n",
    "- *License*: [MIT Licensed](https://LBesson.MIT-License.org/)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "----\n",
    "## What is a hash function?\n",
    "> TL;DR : [Hash functions](https://en.wikipedia.org/wiki/Hash_function) and [cryptographic hashing functions](https://en.wikipedia.org/wiki/Cryptographic_hash_function) on Wikipedia."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "----\n",
    "## Common API for the different classes\n",
    "\n",
    "I will copy the API proposed by [the `hashlib` module in Python standard library](https://docs.python.org/3/library/hashlib.html), so it will be very easy to compare my implementations with the one provided with your default [Python](https://www.Python.org/) installation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class Hash(object):\n",
    "    \"\"\" Common class for all hash methods.\n",
    "    \n",
    "    It copies the one of the hashlib module (https://docs.python.org/3.5/library/hashlib.html).\n",
    "    \"\"\"\n",
    "    \n",
    "    def __init__(self, *args, **kwargs):\n",
    "        \"\"\" Create the Hash object.\"\"\"\n",
    "        self.name = self.__class__.__name__  # https://docs.python.org/3.5/library/hashlib.html#hashlib.hash.name\n",
    "        self.byteorder   = 'little'\n",
    "        self.digest_size = 0                # https://docs.python.org/3.5/library/hashlib.html#hashlib.hash.digest_size\n",
    "        self.block_size  = 0                # https://docs.python.org/3.5/library/hashlib.html#hashlib.hash.block_size\n",
    "\n",
    "    def __str__(self):\n",
    "        return self.name\n",
    "        \n",
    "    def update(self, arg):\n",
    "        \"\"\" Update the hash object with the object arg, which must be interpretable as a buffer of bytes.\"\"\"\n",
    "        pass\n",
    "\n",
    "    def digest(self):\n",
    "        \"\"\" Return the digest of the data passed to the update() method so far. This is a bytes object of size digest_size which may contain bytes in the whole range from 0 to 255.\"\"\"\n",
    "        return b\"\"\n",
    "\n",
    "    def hexdigest(self):\n",
    "        \"\"\" Like digest() except the digest is returned as a string object of double length, containing only hexadecimal digits. This may be used to exchange the value safely in email or other non-binary environments.\"\"\"\n",
    "        digest = self.digest()\n",
    "        raw = digest.to_bytes(self.digest_size, byteorder=self.byteorder)\n",
    "        format_str = '{:0' + str(2 * self.digest_size) + 'x}'\n",
    "        return format_str.format(int.from_bytes(raw, byteorder='big'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "----\n",
    "## Checking the [the `hashlib` module in Python standard library](https://docs.python.org/3/library/hashlib.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import hashlib"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can check [the available algorithms](https://docs.python.org/3.5/library/hashlib.html#hashlib.algorithms_available), some of them being [guaranteed to be on any platform](https://docs.python.org/3.5/library/hashlib.html#hashlib.algorithms_guaranteed), some are not."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['ripemd160',\n",
       " 'MD4',\n",
       " 'SHA224',\n",
       " 'sha384',\n",
       " 'md4',\n",
       " 'DSA-SHA',\n",
       " 'SHA256',\n",
       " 'SHA384',\n",
       " 'MD5',\n",
       " 'whirlpool',\n",
       " 'sha',\n",
       " 'sha1',\n",
       " 'RIPEMD160',\n",
       " 'DSA',\n",
       " 'dsaWithSHA',\n",
       " 'dsaEncryption',\n",
       " 'sha224',\n",
       " 'SHA',\n",
       " 'sha512',\n",
       " 'md5',\n",
       " 'SHA512',\n",
       " 'ecdsa-with-SHA1',\n",
       " 'SHA1',\n",
       " 'sha256']"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(hashlib.algorithms_available)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "I will need at least these ones:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "assert 'MD5' in hashlib.algorithms_available\n",
    "assert 'SHA1' in hashlib.algorithms_available\n",
    "assert 'SHA256' in hashlib.algorithms_available\n",
    "assert 'SHA224' in hashlib.algorithms_available\n",
    "assert 'SHA512' in hashlib.algorithms_available\n",
    "assert 'SHA384' in hashlib.algorithms_available"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets check that they have the block size and digest size announced:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "For MD5      : the block size is 64  and the digest size is 16.\n",
      "For SHA1     : the block size is 64  and the digest size is 20.\n",
      "For SHA256   : the block size is 64  and the digest size is 32.\n",
      "For SHA224   : the block size is 64  and the digest size is 28.\n",
      "For SHA512   : the block size is 128 and the digest size is 64.\n",
      "For SHA384   : the block size is 128 and the digest size is 48.\n"
     ]
    }
   ],
   "source": [
    "for name, s in zip(\n",
    "        ('MD5', 'SHA1', 'SHA256', 'SHA224', 'SHA512', 'SHA384'),\n",
    "        (hashlib.md5(), hashlib.sha1(), hashlib.sha256(), hashlib.sha224(), hashlib.sha512(), hashlib.sha384())\n",
    "    ):\n",
    "    print(\"For {:<8} : the block size is {:<3} and the digest size is {:<2}.\".format(name, s.block_size, s.digest_size))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "----\n",
    "## First stupid example: a stupid hashing function\n",
    "\n",
    "This \"stupid\" hashing function will use `digest_size` of 128 bytes (= 16 bits), and compute it by ... just looking at the first 128 bytes of the input data.\n",
    "\n",
    "This is just to check the API and how to read from a bytes buffer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class HeaderHash(Hash):\n",
    "    \"\"\" This \"stupid\" hashing function will use `digest_size` of 128 bytes, and compute it by ... just looking at the first 128 bytes of the input data.\n",
    "    \"\"\"\n",
    "    \n",
    "    def __init__(self):\n",
    "        # Common part\n",
    "        self.digest_size = 16\n",
    "        self.block_size  = 16\n",
    "        self.name = \"Header\"\n",
    "        # Specific part\n",
    "        self._data = b\"\"\n",
    "\n",
    "    def update(self, arg):\n",
    "        \"\"\" Update the hash object with the object arg, which must be interpretable as a buffer of bytes.\"\"\"\n",
    "        if len(self._data) == 0:\n",
    "            self._data = arg[:self.block_size]\n",
    "\n",
    "    def digest(self):\n",
    "        \"\"\" Return the digest of the data passed to the update() method so far. This is a bytes object of size digest_size which may contain bytes in the whole range from 0 to 255.\"\"\"\n",
    "        return self._data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us try it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "h1 = HeaderHash()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<__main__.HeaderHash at 0x7fc84c200630>"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Header\n"
     ]
    }
   ],
   "source": [
    "h1\n",
    "print(h1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us use some toy data, to test here and after."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "data = b\"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ\" * 100"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "b'0123456789ABCDEF'"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h1.update(data)\n",
    "h1.digest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Well... It seems to work, even if this first example is stupid."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "----\n",
    "## First real example: the MD5 hashing function\n",
    "Let start with a simple one: [the MD5 hashing function](https://en.wikipedia.org/wiki/MD5), from Rivest in 1992.\n",
    "\n",
    "<center><span style=\"font-size: large; color: red;\"><b>Warning</b>: it is considered broken since at least 2012, never use it for security purposes.</span></center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Useful functions for the MD5 algorithm\n",
    "Instead of writing the complete MD5 algorithm in the class below, I preferred to define here some useful functions, using [Bitwise operators](https://wiki.python.org/moin/BitwiseOperators)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def MD5_f1(b, c, d):\n",
    "    \"\"\" First ternary bitwise operation.\"\"\"\n",
    "    return ((b & c) | ((~b) & d)) & 0xFFFFFFFF\n",
    "\n",
    "def MD5_f2(b, c, d):\n",
    "    \"\"\" Second ternary bitwise operation.\"\"\"\n",
    "    return ((b & d) | (c & (~d))) & 0xFFFFFFFF\n",
    "\n",
    "def MD5_f3(b, c, d):\n",
    "    \"\"\" Third ternary bitwise operation.\"\"\"\n",
    "    return (b ^ c ^ d) & 0xFFFFFFFF\n",
    "\n",
    "def MD5_f4(b, c, d):\n",
    "    \"\"\" Forth ternary bitwise operation.\"\"\"\n",
    "    return (c ^ (b | (~d))) & 0xFFFFFFFF"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def leftrotate(x, c):\n",
    "    \"\"\" Left rotate the number x by c bytes.\"\"\"\n",
    "    x &= 0xFFFFFFFF\n",
    "    return ((x << c) | (x >> (32 - c))) & 0xFFFFFFFF"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def leftshift(x, c):\n",
    "    \"\"\" Left shift the number x by c bytes.\"\"\"\n",
    "    return x << c"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from math import floor, sin"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### The `MD5` class"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It is a direct implementation of the pseudo-code, as given for instance on the Wikipedia page, or the original research article by Rivest."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class MD5(Hash):\n",
    "    \"\"\"MD5 hashing, see https://en.wikipedia.org/wiki/MD5#Algorithm.\"\"\"\n",
    "    \n",
    "    def __init__(self):\n",
    "        self.name        = \"MD5\"\n",
    "        self.byteorder   = 'little'\n",
    "        self.block_size  = 64\n",
    "        self.digest_size = 16\n",
    "        # Internal data\n",
    "        s = [0] * 64\n",
    "        K = [0] * 64\n",
    "        # Initialize s, s specifies the per-round shift amounts\n",
    "        s[ 0:16] = [7, 12, 17, 22,  7, 12, 17, 22,  7, 12, 17, 22,  7, 12, 17, 22]\n",
    "        s[16:32] = [5,  9, 14, 20,  5,  9, 14, 20,  5,  9, 14, 20,  5,  9, 14, 20]\n",
    "        s[32:48] = [4, 11, 16, 23,  4, 11, 16, 23,  4, 11, 16, 23,  4, 11, 16, 23]\n",
    "        s[48:64] = [6, 10, 15, 21,  6, 10, 15, 21,  6, 10, 15, 21,  6, 10, 15, 21]\n",
    "        # Store it\n",
    "        self._s = s\n",
    "        # Use binary integer part of the sines of integers (Radians) as constants:\n",
    "        for i in range(64):\n",
    "            K[i] = floor(2**32 * abs(sin(i + 1))) & 0xFFFFFFFF\n",
    "        # Store it\n",
    "        self._K = K\n",
    "        # Initialize variables:\n",
    "        a0 = 0x67452301   # A\n",
    "        b0 = 0xefcdab89   # B\n",
    "        c0 = 0x98badcfe   # C\n",
    "        d0 = 0x10325476   # D\n",
    "        self.hash_pieces = [a0, b0, c0, d0]\n",
    "    \n",
    "    def update(self, arg):\n",
    "        s, K = self._s, self._K\n",
    "        a0, b0, c0, d0 = self.hash_pieces\n",
    "        # 1. Pre-processing\n",
    "        data = bytearray(arg)\n",
    "        orig_len_in_bits = (8 * len(data)) & 0xFFFFFFFFFFFFFFFF\n",
    "        # 1.a. Add a single '1' bit at the end of the input bits\n",
    "        data.append(0x80)\n",
    "        # 1.b. Padding with zeros as long as the input bits length ≡ 448 (mod 512)\n",
    "        while len(data) % 64 != 56:\n",
    "            data.append(0)\n",
    "        # 1.c. append original length in bits mod (2 pow 64) to message\n",
    "        data += orig_len_in_bits.to_bytes(8, byteorder='little')\n",
    "        assert len(data) % 64 == 0, \"Error in padding\"\n",
    "        # 2. Computations\n",
    "        # Process the message in successive 512-bit = 64-bytes chunks:\n",
    "        for offset in range(0, len(data), 64):\n",
    "            # 2.a. 512-bits = 64-bytes chunks\n",
    "            chunks = data[offset : offset + 64]\n",
    "            # 2.b. Break chunk into sixteen 32-bit = 4-bytes words M[j], 0 ≤ j ≤ 15\n",
    "            A, B, C, D = a0, b0, c0, d0\n",
    "            # 2.c. Main loop\n",
    "            for i in range(64):\n",
    "                if 0 <= i <= 15:\n",
    "                    F = MD5_f1(B, C, D)\n",
    "                    g = i\n",
    "                elif 16 <= i <= 31:\n",
    "                    F = MD5_f2(B, C, D)\n",
    "                    g = (5 * i + 1) % 16\n",
    "                elif 32 <= i <= 47:\n",
    "                    F = MD5_f3(B, C, D)\n",
    "                    g = (3 * i + 5) % 16\n",
    "                elif 48 <= i <= 63:\n",
    "                    F = MD5_f4(B, C, D)\n",
    "                    g = (7 * i) % 16\n",
    "                # Be wary of the below definitions of A, B, C, D\n",
    "                to_rotate = (A + F + K[i] + int.from_bytes(chunks[4*g : 4*g+4], byteorder='little')) & 0xFFFFFFFF\n",
    "                new_B = (B + leftrotate(to_rotate, s[i])) & 0xFFFFFFFF\n",
    "                A, B, C, D = D, new_B, B, C\n",
    "            # Add this chunk's hash to result so far:\n",
    "            a0 = (a0 + A) & 0xFFFFFFFF\n",
    "            b0 = (b0 + B) & 0xFFFFFFFF\n",
    "            c0 = (c0 + C) & 0xFFFFFFFF\n",
    "            d0 = (d0 + D) & 0xFFFFFFFF\n",
    "        # 3. Conclusion\n",
    "        self.hash_pieces = [a0, b0, c0, d0]\n",
    "\n",
    "    def digest(self):\n",
    "        return sum(leftshift(x, (32 * i)) for i, x in enumerate(self.hash_pieces))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also write a function to directly compute the hex digest from some bytes data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def hash_MD5(data):\n",
    "    \"\"\" Shortcut function to directly receive the hex digest from MD5(data).\"\"\"\n",
    "    h = MD5()\n",
    "    if isinstance(data, str):\n",
    "        data = bytes(data, encoding='utf8')\n",
    "    h.update(data)\n",
    "    return h.hexdigest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div style=\"text-align:right;\"><blockquote> *Note:* [This page helped for debugging](https://rosettacode.org/wiki/MD5/Implementation#Python).</blockquote></div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### First check on MD5\n",
    "\n",
    "Let us try it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<__main__.MD5 at 0x7fc84c46b668>"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "MD5\n"
     ]
    }
   ],
   "source": [
    "h2 = MD5()\n",
    "h2\n",
    "print(h2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "52666558089014014065978771967570616878"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h2.update(data)\n",
    "h2.digest()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'2e224cd661b6b83e0f3a0a06cb359f27'"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h2.hexdigest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### A less stupid check on MD5\n",
    "\n",
    "Let try the example from [MD5 Wikipedia page](https://en.wikipedia.org/wiki/MD5#MD5_hashes) :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'9e107d9d372bb6826bd81d3542a419d6'"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_MD5(\"The quick brown fox jumps over the lazy dog\")\n",
    "assert hash_MD5(\"The quick brown fox jumps over the lazy dog\") == '9e107d9d372bb6826bd81d3542a419d6'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Even a small change in the message will (with overwhelming probability) result in a mostly different hash, due to the [**avalanche effect**](https://en.wikipedia.org/wiki/Avalanche_effect). For example, adding a period to the end of the sentence:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'e4d909c290d0fb1ca068ffaddf22cbd0'"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_MD5(\"The quick brown fox jumps over the lazy dog.\")\n",
    "assert hash_MD5(\"The quick brown fox jumps over the lazy dog.\") == 'e4d909c290d0fb1ca068ffaddf22cbd0'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The hash of the zero-length string is:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'d41d8cd98f00b204e9800998ecf8427e'"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_MD5(\"\")\n",
    "assert hash_MD5(\"\") == 'd41d8cd98f00b204e9800998ecf8427e'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\\implies$ We obtained the same result, OK our function works!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Trying 1000 random examples\n",
    "On a small sentence:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'0ad8cb82874690906cf732223adeebbe'"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_MD5(\"My name is Zorro !\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'0ad8cb82874690906cf732223adeebbe'"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h = hashlib.md5()\n",
    "h.update(b\"My name is Zorro !\")\n",
    "h.hexdigest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It starts to look good."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def true_hash_MD5(data):\n",
    "    h = hashlib.md5()\n",
    "    if isinstance(data, str):\n",
    "        data = bytes(data, encoding='utf8')\n",
    "    h.update(data)\n",
    "    return h.hexdigest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "On some random data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import numpy.random as nr\n",
    "alphabets = \"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\"\n",
    "\n",
    "def random_string(size=10000):\n",
    "    return ''.join(alphabets[nr.randint(len(alphabets))] for _ in range(size))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'3rsYqmZDxE'"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "random_string(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from tqdm import tqdm_notebook as tqdm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "543521e9ef89409c94620250e345c1d7"
      }
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "CPU times: user 33.1 s, sys: 12 ms, total: 33.1 s\n",
      "Wall time: 33.1 s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "for _ in tqdm(range(1000)):\n",
    "    x = random_string()\n",
    "    assert hash_MD5(x) == true_hash_MD5(x), \"Error: x = {} gave two different MD5 hashes: my implementation = {} != hashlib implementation = {}...\".format(x, hash_MD5(x), true_hash_MD5(x))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "----\n",
    "## Second real example: the SHA-1 hashing function"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let now study and implement another hashing function, slightly harder to write but more secure: SHA1, \"Secure Hash Algorithm, version 1\".\n",
    "See [the SHA1 hashing function](https://en.wikipedia.org/wiki/SHA-1) on Wikipedia, if needed.\n",
    "\n",
    "<center><span style=\"font-size: large; color: red;\"><b>Warning</b>: it is considered broken since at least 2011, it is not advised to use it for real security purposes. SHA-2 or SHA-3 is better advised.</span></center>\n",
    "\n",
    "For instance, see [this nice blog post](https://konklone.com/post/why-google-is-hurrying-the-web-to-kill-sha-1)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Useful functions the SHA-1 algorithm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Pretty similar to the ones used for the MD5 algorithm."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def SHA1_f1(b, c, d):\n",
    "    \"\"\" First ternary bitwise operation.\"\"\"\n",
    "    return ((b & c) | ((~b) & d)) & 0xFFFFFFFF\n",
    "\n",
    "def SHA1_f2(b, c, d):\n",
    "    \"\"\" Second ternary bitwise operation.\"\"\"\n",
    "    return (b ^ c ^ d) & 0xFFFFFFFF\n",
    "\n",
    "def SHA1_f3(b, c, d):\n",
    "    \"\"\" Third ternary bitwise operation.\"\"\"\n",
    "    return ((b & c) | (b & d) | (c & d) ) & 0xFFFFFFFF\n",
    "\n",
    "def SHA1_f4(b, c, d):\n",
    "    \"\"\" Forth ternary bitwise operation, = SHA1_f1.\"\"\"\n",
    "    return (b ^ c ^ d) & 0xFFFFFFFF"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is exactly like for MD5."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def leftrotate(x, c):\n",
    "    \"\"\" Left rotate the number x by c bytes.\"\"\"\n",
    "    x &= 0xFFFFFFFF\n",
    "    return ((x << c) | (x >> (32 - c))) & 0xFFFFFFFF"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As SHA-1 plays with big-endian and little-endian integers, and at the end it requires a leftshift to combine the 5 hash pieces into one."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def leftshift(x, c):\n",
    "    \"\"\" Left shift the number x by c bytes.\"\"\"\n",
    "    return x << c"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### The `SHA1` class\n",
    "\n",
    "I will use a simple class, very similar to the class used for the MD5 algorithm (see above).\n",
    "It is a direct implementation of the pseudo-code, as given for instance on the Wikipedia page."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class SHA1(Hash):\n",
    "    \"\"\"SHA1 hashing, see https://en.wikipedia.org/wiki/SHA-1#Algorithm.\"\"\"\n",
    "    \n",
    "    def __init__(self):\n",
    "        self.name        = \"SHA1\"\n",
    "        self.byteorder   = 'big'\n",
    "        self.block_size  = 64\n",
    "        self.digest_size = 20\n",
    "        # Initialize variables\n",
    "        h0 = 0x67452301\n",
    "        h1 = 0xEFCDAB89\n",
    "        h2 = 0x98BADCFE\n",
    "        h3 = 0x10325476\n",
    "        h4 = 0xC3D2E1F0\n",
    "        # Store them\n",
    "        self.hash_pieces = [h0, h1, h2, h3, h4]\n",
    "    \n",
    "    def update(self, arg):\n",
    "        h0, h1, h2, h3, h4 = self.hash_pieces\n",
    "        # 1. Pre-processing, exactly like MD5\n",
    "        data = bytearray(arg)\n",
    "        orig_len_in_bits = (8 * len(data)) & 0xFFFFFFFFFFFFFFFF\n",
    "        # 1.a. Add a single '1' bit at the end of the input bits\n",
    "        data.append(0x80)\n",
    "        # 1.b. Padding with zeros as long as the input bits length ≡ 448 (mod 512)\n",
    "        while len(data) % 64 != 56:\n",
    "            data.append(0)\n",
    "        # 1.c. append original length in bits mod (2 pow 64) to message\n",
    "        data += orig_len_in_bits.to_bytes(8, byteorder='big')\n",
    "        assert len(data) % 64 == 0, \"Error in padding\"\n",
    "        # 2. Computations\n",
    "        # Process the message in successive 512-bit = 64-bytes chunks:\n",
    "        for offset in range(0, len(data), 64):\n",
    "            # 2.a. 512-bits = 64-bytes chunks\n",
    "            chunks = data[offset : offset + 64]\n",
    "            w = [0 for i in range(80)]\n",
    "            # 2.b. Break chunk into sixteen 32-bit = 4-bytes words w[i], 0 ≤ i ≤ 15\n",
    "            for i in range(16):\n",
    "                w[i] = int.from_bytes(chunks[4*i : 4*i + 4], byteorder='big')\n",
    "            # 2.c. Extend the sixteen 32-bit words into eighty 32-bit words\n",
    "            for i in range(16, 80):\n",
    "                w[i] = leftrotate(w[i-3] ^ w[i-8] ^ w[i-14] ^ w[i-16], 1)\n",
    "            # 2.d. Initialize hash value for this chunk\n",
    "            a, b, c, d, e = h0, h1, h2, h3, h4\n",
    "            # 2.e. Main loop, cf. http://www.faqs.org/rfcs/rfc3174.html\n",
    "            for i in range(80):\n",
    "                if 0 <= i <= 19 :\n",
    "                    f = SHA1_f1(b, c, d)\n",
    "                    k = 0x5A827999\n",
    "                elif 20 <= i <= 39 :\n",
    "                    f = SHA1_f2(b, c, d)\n",
    "                    k = 0x6ED9EBA1\n",
    "                elif 40 <= i <= 59 :\n",
    "                    f = SHA1_f3(b, c, d)\n",
    "                    k = 0x8F1BBCDC\n",
    "                elif 60 <= i <= 79 :\n",
    "                    f = SHA1_f4(b, c, d)\n",
    "                    k = 0xCA62C1D6\n",
    "\n",
    "                new_a = leftrotate(a, 5) + f + e + k + w[i] & 0xFFFFFFFF\n",
    "                new_c = leftrotate(b, 30)\n",
    "                # Rotate the 5 variables\n",
    "                a, b, c, d, e = new_a, a, new_c, c, d\n",
    "\n",
    "            # Add this chunk's hash to result so far:\n",
    "            h0 = (h0 + a) & 0xFFFFFFFF\n",
    "            h1 = (h1 + b) & 0xFFFFFFFF\n",
    "            h2 = (h2 + c) & 0xFFFFFFFF\n",
    "            h3 = (h3 + d) & 0xFFFFFFFF\n",
    "            h4 = (h4 + e) & 0xFFFFFFFF\n",
    "        # 3. Conclusion\n",
    "        self.hash_pieces = [h0, h1, h2, h3, h4]\n",
    "\n",
    "    def digest(self):\n",
    "        return sum(leftshift(x, 32 * i) for i, x in enumerate(self.hash_pieces[::-1]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also write a function to directly compute the hex digest from some bytes data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def hash_SHA1(data):\n",
    "    \"\"\" Shortcut function to directly receive the hex digest from SHA1(data).\"\"\"\n",
    "    h = SHA1()\n",
    "    if isinstance(data, str):\n",
    "        data = bytes(data, encoding='utf8')\n",
    "    h.update(data)\n",
    "    return h.hexdigest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### First check on SHA-1\n",
    "\n",
    "Let us try it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<__main__.SHA1 at 0x7fc82832f6a0>"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SHA1\n"
     ]
    }
   ],
   "source": [
    "h3 = SHA1()\n",
    "h3\n",
    "print(h3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "144539618284518333681008855956641116845695054279"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h3.update(data)\n",
    "h3.digest()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'0b1100101010001011000010111000111101000100111010111100000100010111111110001010100111110000001000110111111011010111001110100001011101000111111000011000111000111'"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "159"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "digest = h3.digest()\n",
    "bin(digest)\n",
    "len(bin(digest))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'19516171e89d7822ff153e046fdae742e8fc31c7'"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h3.hexdigest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### A less stupid check on SHA-1\n",
    "\n",
    "Let try the example from [SHA-1 Wikipedia page](https://en.wikipedia.org/wiki/SHA-1#SHA-1_hashes) :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'2fd4e1c67a2d28fced849ee1bb76e7391b93eb12'"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_SHA1(\"The quick brown fox jumps over the lazy dog\")\n",
    "assert hash_SHA1(\"The quick brown fox jumps over the lazy dog\") == '2fd4e1c67a2d28fced849ee1bb76e7391b93eb12'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Even a small change in the message will (with overwhelming probability) result in a mostly different hash, due to the [**avalanche effect**](https://en.wikipedia.org/wiki/Avalanche_effect). For example, changing one letter in the sentence:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'de9f2c7fd25e1b3afad3e85a0bd17d9b100db4b3'"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_SHA1(\"The quick brown fox jumps over the lazy cog\")\n",
    "assert hash_SHA1(\"The quick brown fox jumps over the lazy cog\") == 'de9f2c7fd25e1b3afad3e85a0bd17d9b100db4b3'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The hash of the zero-length string is:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'da39a3ee5e6b4b0d3255bfef95601890afd80709'"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_SHA1(\"\")\n",
    "assert hash_SHA1(\"\") == 'da39a3ee5e6b4b0d3255bfef95601890afd80709'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\\implies$ We obtained the same result, OK our function works!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Trying 1000 random examples\n",
    "On a small sentence:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'1f475bf19d9e7dd6b3714164116392ee1e477ec5'"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_SHA1(\"My name is Zorro !\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'1f475bf19d9e7dd6b3714164116392ee1e477ec5'"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h = hashlib.sha1()\n",
    "h.update(b\"My name is Zorro !\")\n",
    "h.hexdigest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It starts to look good."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def true_hash_SHA1(data):\n",
    "    h = hashlib.sha1()\n",
    "    if isinstance(data, str):\n",
    "        data = bytes(data, encoding='utf8')\n",
    "    h.update(data)\n",
    "    return h.hexdigest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "On some random data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'lzqCyNkHUQ'"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "random_string(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from tqdm import tqdm_notebook as tqdm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "15b0ae19b01744b591ef9a86f4ceafa4"
      }
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "CPU times: user 38.4 s, sys: 72 ms, total: 38.4 s\n",
      "Wall time: 38.5 s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "for _ in tqdm(range(1000)):\n",
    "    x = random_string()\n",
    "    assert hash_SHA1(x) == true_hash_SHA1(x), \"Error: x = {} gave two different SHA1 hashes: my implementation = {} != hashlib implementation = {}...\".format(x, hash_SHA1(x), true_hash_SHA1(x))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "----\n",
    "## Third real example: the SHA-2 hashing function"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let now study and implement a last hashing function, again slightly harder to write but more secure: SHA-2, \"Secure Hash Algorithm, version 2\".\n",
    "See [the SHA-2 hashing function](https://en.wikipedia.org/wiki/SHA-2) on Wikipedia, if needed.\n",
    "\n",
    "<center><span style=\"font-size: large; color: green;\"><i>Remark</i>: it is not (yet) considered broken, and it is the military standard for security and cryptographic hashing. SHA-3 is preferred for security purposes.</span></center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Useful functions the SHA-2 algorithm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is exactly like for MD5. But SHA-2 requires right-rotate as well."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def leftrotate(x, c):\n",
    "    \"\"\" Left rotate the number x by c bytes.\"\"\"\n",
    "    x &= 0xFFFFFFFF\n",
    "    return ((x << c) | (x >> (32 - c))) & 0xFFFFFFFF\n",
    "\n",
    "def rightrotate(x, c):\n",
    "    \"\"\" Right rotate the number x by c bytes.\"\"\"\n",
    "    x &= 0xFFFFFFFF\n",
    "    return ((x >> c) | (x << (32 - c))) & 0xFFFFFFFF"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As SHA-2 plays with big-endian and little-endian integers, and at the end it requires a leftshift to combine the 5 hash pieces into one."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def leftshift(x, c):\n",
    "    \"\"\" Left shift the number x by c bytes.\"\"\"\n",
    "    return x << c\n",
    "\n",
    "def rightshift(x, c):\n",
    "    \"\"\" Right shift the number x by c bytes.\"\"\"\n",
    "    return x >> c"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### The `SHA2` class\n",
    "\n",
    "I will use a simple class, very similar to the class used for the SHA-1 algorithm (see above).\n",
    "It is a direct implementation of the pseudo-code, as given for instance on the Wikipedia page.\n",
    "\n",
    "I will only implement the simpler one, SHA-256, of digest size of 256 bits. Other variants are SHA-224, SHA-384, SHA-512 (and others include SHA-512/224, SHA-512/256)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class SHA2(Hash):\n",
    "    \"\"\"SHA256 hashing, see https://en.wikipedia.org/wiki/SHA-2#Pseudocode.\"\"\"\n",
    "    \n",
    "    def __init__(self):\n",
    "        self.name        = \"SHA256\"\n",
    "        self.byteorder   = 'big'\n",
    "        self.block_size  = 64\n",
    "        self.digest_size = 32\n",
    "        # Note 2: For each round, there is one round constant k[i] and one entry in the message schedule array w[i], 0 ≤ i ≤ 63\n",
    "        # Note 3: The compression function uses 8 working variables, a through h\n",
    "        # Note 4: Big-endian convention is used when expressing the constants in this pseudocode,\n",
    "        #         and when parsing message block data from bytes to words, for example,\n",
    "        #         the first word of the input message \"abc\" after padding is 0x61626380\n",
    "\n",
    "        # Initialize hash values:\n",
    "        # (first 32 bits of the fractional parts of the square roots of the first 8 primes 2..19):\n",
    "        h0 = 0x6a09e667\n",
    "        h1 = 0xbb67ae85\n",
    "        h2 = 0x3c6ef372\n",
    "        h3 = 0xa54ff53a\n",
    "        h4 = 0x510e527f\n",
    "        h5 = 0x9b05688c\n",
    "        h6 = 0x1f83d9ab\n",
    "        h7 = 0x5be0cd19\n",
    "        \n",
    "        # Initialize array of round constants:\n",
    "        # (first 32 bits of the fractional parts of the cube roots of the first 64 primes 2..311):\n",
    "        self.k = [\n",
    "            0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,\n",
    "            0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,\n",
    "            0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,\n",
    "            0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,\n",
    "            0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,\n",
    "            0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,\n",
    "            0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,\n",
    "            0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2\n",
    "        ]\n",
    "\n",
    "        # Store them\n",
    "        self.hash_pieces = [h0, h1, h2, h3, h4, h5, h6, h7]\n",
    "    \n",
    "    def update(self, arg):\n",
    "        h0, h1, h2, h3, h4, h5, h6, h7 = self.hash_pieces\n",
    "        # 1. Pre-processing, exactly like MD5\n",
    "        data = bytearray(arg)\n",
    "        orig_len_in_bits = (8 * len(data)) & 0xFFFFFFFFFFFFFFFF\n",
    "        # 1.a. Add a single '1' bit at the end of the input bits\n",
    "        data.append(0x80)\n",
    "        # 1.b. Padding with zeros as long as the input bits length ≡ 448 (mod 512)\n",
    "        while len(data) % 64 != 56:\n",
    "            data.append(0)\n",
    "        # 1.c. append original length in bits mod (2 pow 64) to message\n",
    "        data += orig_len_in_bits.to_bytes(8, byteorder='big')\n",
    "        assert len(data) % 64 == 0, \"Error in padding\"\n",
    "        # 2. Computations\n",
    "        # Process the message in successive 512-bit = 64-bytes chunks:\n",
    "        for offset in range(0, len(data), 64):\n",
    "            # 2.a. 512-bits = 64-bytes chunks\n",
    "            chunks = data[offset : offset + 64]\n",
    "            w = [0 for i in range(64)]\n",
    "            # 2.b. Break chunk into sixteen 32-bit = 4-bytes words w[i], 0 ≤ i ≤ 15\n",
    "            for i in range(16):\n",
    "                w[i] = int.from_bytes(chunks[4*i : 4*i + 4], byteorder='big')\n",
    "            # 2.c.  Extend the first 16 words into the remaining 48\n",
    "            #       words w[16..63] of the message schedule array:\n",
    "            for i in range(16, 64):\n",
    "                s0 = (rightrotate(w[i-15], 7) ^ rightrotate(w[i-15], 18) ^ rightshift(w[i-15], 3)) & 0xFFFFFFFF\n",
    "                s1 = (rightrotate(w[i-2], 17) ^ rightrotate(w[i-2], 19) ^ rightshift(w[i-2], 10)) & 0xFFFFFFFF\n",
    "                w[i] = (w[i-16] + s0 + w[i-7] + s1) & 0xFFFFFFFF\n",
    "            # 2.d. Initialize hash value for this chunk\n",
    "            a, b, c, d, e, f, g, h = h0, h1, h2, h3, h4, h5, h6, h7\n",
    "            # 2.e. Main loop, cf. https://tools.ietf.org/html/rfc6234\n",
    "            for i in range(64):\n",
    "                S1 = (rightrotate(e, 6) ^ rightrotate(e, 11) ^ rightrotate(e, 25)) & 0xFFFFFFFF\n",
    "                ch = ((e & f) ^ ((~e) & g)) & 0xFFFFFFFF\n",
    "                temp1 = (h + S1 + ch + self.k[i] + w[i]) & 0xFFFFFFFF\n",
    "                S0 = (rightrotate(a, 2) ^ rightrotate(a, 13) ^ rightrotate(a, 22)) & 0xFFFFFFFF\n",
    "                maj = ((a & b) ^ (a & c) ^ (b & c)) & 0xFFFFFFFF\n",
    "                temp2 = (S0 + maj) & 0xFFFFFFFF\n",
    "\n",
    "                new_a = (temp1 + temp2) & 0xFFFFFFFF\n",
    "                new_e = (d + temp1) & 0xFFFFFFFF\n",
    "                # Rotate the 8 variables\n",
    "                a, b, c, d, e, f, g, h = new_a, a, b, c, new_e, e, f, g\n",
    "\n",
    "            # Add this chunk's hash to result so far:\n",
    "            h0 = (h0 + a) & 0xFFFFFFFF\n",
    "            h1 = (h1 + b) & 0xFFFFFFFF\n",
    "            h2 = (h2 + c) & 0xFFFFFFFF\n",
    "            h3 = (h3 + d) & 0xFFFFFFFF\n",
    "            h4 = (h4 + e) & 0xFFFFFFFF\n",
    "            h5 = (h5 + f) & 0xFFFFFFFF\n",
    "            h6 = (h6 + g) & 0xFFFFFFFF\n",
    "            h7 = (h7 + h) & 0xFFFFFFFF\n",
    "        # 3. Conclusion\n",
    "        self.hash_pieces = [h0, h1, h2, h3, h4, h5, h6, h7]\n",
    "\n",
    "    def digest(self):\n",
    "        # h0 append h1 append h2 append h3 append h4 append h5 append h6 append h7\n",
    "        return sum(leftshift(x, 32 * i) for i, x in enumerate(self.hash_pieces[::-1]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also write a function to directly compute the hex digest from some bytes data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def hash_SHA2(data):\n",
    "    \"\"\" Shortcut function to directly receive the hex digest from SHA2(data).\"\"\"\n",
    "    h = SHA2()\n",
    "    if isinstance(data, str):\n",
    "        data = bytes(data, encoding='utf8')\n",
    "    h.update(data)\n",
    "    return h.hexdigest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### First check on SHA-2\n",
    "\n",
    "Let us try it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<__main__.SHA2 at 0x7fc82727f470>"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SHA256\n"
     ]
    }
   ],
   "source": [
    "h4 = SHA2()\n",
    "h4\n",
    "print(h4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "72202089257263067108821672097406107534247682087137282803352466770222342186230"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h4.update(data)\n",
    "h4.digest()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'0b1001111110100000111011110010111110100111101111001011110000000101110110010001110010001110000100110000100010010011011010011010001000100000100010011100101100000010000011011001000001010000011011101110100010000011010011001100011000001001110011111010010011110110'"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "258"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "digest = h4.digest()\n",
    "bin(digest)\n",
    "len(bin(digest))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'9fa0ef2fa7bcbc05d91c8e13089369a22089cb020d90506ee8834cc609cfa4f6'"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h4.hexdigest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Check on SHA-2\n",
    "\n",
    "Let try the example from [SHA-2 Wikipedia page](https://en.wikipedia.org/wiki/SHA-2#Test_vectors) :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'d7a8fbb307d7809469ca9abcb0082e4f8d5651e46d3cdb762d02d0bf37c9e592'"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_SHA2(\"The quick brown fox jumps over the lazy dog\")\n",
    "assert hash_SHA2(\"The quick brown fox jumps over the lazy dog\") == 'd7a8fbb307d7809469ca9abcb0082e4f8d5651e46d3cdb762d02d0bf37c9e592'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Even a small change in the message will (with overwhelming probability) result in a mostly different hash, due to the [**avalanche effect**](https://en.wikipedia.org/wiki/Avalanche_effect). For example, adding a period at the end of the sentence:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'ef537f25c895bfa782526529a9b63d97aa631564d5d789c2b765448c8635fb6c'"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_SHA2(\"The quick brown fox jumps over the lazy dog.\")\n",
    "assert hash_SHA2(\"The quick brown fox jumps over the lazy dog.\") == 'ef537f25c895bfa782526529a9b63d97aa631564d5d789c2b765448c8635fb6c'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The hash of the zero-length string is:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'"
      ]
     },
     "execution_count": 58,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_SHA2(\"\")\n",
    "assert hash_SHA2(\"\") == 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\\implies$ We obtained the same result, OK our function works!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Trying 1000 random examples\n",
    "On a small sentence:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'10a6dec377c28d2d34001c103760339f8d6ab02660d200d6014329b86253552c'"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_SHA2(\"My name is Zorro !\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'10a6dec377c28d2d34001c103760339f8d6ab02660d200d6014329b86253552c'"
      ]
     },
     "execution_count": 60,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h = hashlib.sha256()\n",
    "h.update(b\"My name is Zorro !\")\n",
    "h.hexdigest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It starts to look good."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def true_hash_SHA2(data):\n",
    "    h = hashlib.sha256()\n",
    "    if isinstance(data, str):\n",
    "        data = bytes(data, encoding='utf8')\n",
    "    h.update(data)\n",
    "    return h.hexdigest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "On some random data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'CpGQBPn3dP'"
      ]
     },
     "execution_count": 62,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "random_string(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from tqdm import tqdm_notebook as tqdm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "dd08b930c4f04376a0827792342e12df"
      }
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "CPU times: user 1min 1s, sys: 52 ms, total: 1min 1s\n",
      "Wall time: 1min 1s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "for _ in tqdm(range(1000)):\n",
    "    x = random_string()\n",
    "    assert hash_SHA2(x) == true_hash_SHA2(x), \"Error: x = {} gave two different SHA2 hashes: my implementation = {} != hashlib implementation = {}...\".format(x, hash_SHA2(x), true_hash_SHA2(x))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "----\n",
    "## Comparison : MD5 vs SHA-1 vs SHA-2\n",
    "\n",
    "It can be interesting to compare each hash functions, with respect to its time complexity."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "36 ms ± 4.58 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
     ]
    }
   ],
   "source": [
    "def test_MD5():\n",
    "    x = random_string()\n",
    "    return hash_MD5(x) == true_hash_MD5(x)\n",
    "\n",
    "%timeit test_MD5()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "40.2 ms ± 2.22 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
     ]
    }
   ],
   "source": [
    "def test_SHA1():\n",
    "    x = random_string()\n",
    "    return hash_SHA1(x) == true_hash_SHA1(x)\n",
    "\n",
    "%timeit test_SHA1()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "61.4 ms ± 2.65 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
     ]
    }
   ],
   "source": [
    "def test_SHA2():\n",
    "    x = random_string()\n",
    "    return hash_SHA2(x) == true_hash_SHA2(x)\n",
    "\n",
    "%timeit test_SHA2()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As expected, the MD5 hash is the fastest, and SHA-2 is slower than SHA-1.\n",
    "\n",
    "$\\implies$ The more secure, the slowest. Of course."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "----\n",
    "## Bonus : SHA-2 variants\n",
    "\n",
    "Now that we have implemented SHA-256, it should be quick to add the SHA-224 variant, which is simply the SHA-256 with different initial values and a shorter digest.\n",
    "\n",
    "The SHA-512 variant is working with 64-bits words and 1024-bits chunks, instead of 32-bits words and 512-bits chunks, and the SHA-384 is very similar to SHA-512 with different initial values and a shorter digest.\n",
    "\n",
    "> Sorry about the length of this part, I know all these variants are very similar, but I wanted to write them all."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### SHA-224\n",
    "\n",
    "As said on the Wikipedia page:\n",
    "> SHA-224 is identical to SHA-256, except that:\n",
    "> - the initial hash values `h0` through `h7` are different, and\n",
    "> - the output is constructed by omitting `h7`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### The `SHA224` class"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class SHA224(Hash):\n",
    "    \"\"\"SHA224 hashing, see https://en.wikipedia.org/wiki/SHA-2#Pseudocode.\"\"\"\n",
    "    \n",
    "    def __init__(self):\n",
    "        self.name        = \"SHA224\"\n",
    "        self.byteorder   = 'big'\n",
    "        self.block_size  = 64\n",
    "        self.digest_size = 28\n",
    "        # Note 2: For each round, there is one round constant k[i] and one entry in the message schedule array w[i], 0 ≤ i ≤ 63\n",
    "        # Note 3: The compression function uses 8 working variables, a through h\n",
    "        # Note 4: Big-endian convention is used when expressing the constants in this pseudocode,\n",
    "        #         and when parsing message block data from bytes to words, for example,\n",
    "        #         the first word of the input message \"abc\" after padding is 0x61626380\n",
    "\n",
    "        # Initialize hash values:\n",
    "        # (The second 32 bits of the fractional parts of the square roots of the 9th through 16th primes 23..53)\n",
    "        h0 = 0xc1059ed8\n",
    "        h1 = 0x367cd507\n",
    "        h2 = 0x3070dd17\n",
    "        h3 = 0xf70e5939\n",
    "        h4 = 0xffc00b31\n",
    "        h5 = 0x68581511\n",
    "        h6 = 0x64f98fa7\n",
    "        h7 = 0xbefa4fa4\n",
    "        \n",
    "        # Initialize array of round constants:\n",
    "        # (first 32 bits of the fractional parts of the cube roots of the first 64 primes 2..311):\n",
    "        self.k = [\n",
    "            0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,\n",
    "            0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,\n",
    "            0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,\n",
    "            0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,\n",
    "            0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,\n",
    "            0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,\n",
    "            0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,\n",
    "            0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2\n",
    "        ]\n",
    "\n",
    "        # Store them\n",
    "        self.hash_pieces = [h0, h1, h2, h3, h4, h5, h6, h7]\n",
    "    \n",
    "    def update(self, arg):\n",
    "        h0, h1, h2, h3, h4, h5, h6, h7 = self.hash_pieces\n",
    "        # 1. Pre-processing, exactly like MD5\n",
    "        data = bytearray(arg)\n",
    "        orig_len_in_bits = (8 * len(data)) & 0xFFFFFFFFFFFFFFFF\n",
    "        # 1.a. Add a single '1' bit at the end of the input bits\n",
    "        data.append(0x80)\n",
    "        # 1.b. Padding with zeros as long as the input bits length ≡ 448 (mod 512)\n",
    "        while len(data) % 64 != 56:\n",
    "            data.append(0)\n",
    "        # 1.c. append original length in bits mod (2 pow 64) to message\n",
    "        data += orig_len_in_bits.to_bytes(8, byteorder='big')\n",
    "        assert len(data) % 64 == 0, \"Error in padding\"\n",
    "        # 2. Computations\n",
    "        # Process the message in successive 512-bit = 64-bytes chunks:\n",
    "        for offset in range(0, len(data), 64):\n",
    "            # 2.a. 512-bits = 64-bytes chunks\n",
    "            chunks = data[offset : offset + 64]\n",
    "            w = [0 for i in range(64)]\n",
    "            # 2.b. Break chunk into sixteen 32-bit = 4-bytes words w[i], 0 ≤ i ≤ 15\n",
    "            for i in range(16):\n",
    "                w[i] = int.from_bytes(chunks[4*i : 4*i + 4], byteorder='big')\n",
    "            # 2.c.  Extend the first 16 words into the remaining 48\n",
    "            #       words w[16..63] of the message schedule array:\n",
    "            for i in range(16, 64):\n",
    "                s0 = (rightrotate(w[i-15], 7) ^ rightrotate(w[i-15], 18) ^ rightshift(w[i-15], 3)) & 0xFFFFFFFF\n",
    "                s1 = (rightrotate(w[i-2], 17) ^ rightrotate(w[i-2], 19) ^ rightshift(w[i-2], 10)) & 0xFFFFFFFF\n",
    "                w[i] = (w[i-16] + s0 + w[i-7] + s1) & 0xFFFFFFFF\n",
    "            # 2.d. Initialize hash value for this chunk\n",
    "            a, b, c, d, e, f, g, h = h0, h1, h2, h3, h4, h5, h6, h7\n",
    "            # 2.e. Main loop, cf. https://tools.ietf.org/html/rfc6234\n",
    "            for i in range(64):\n",
    "                S1 = (rightrotate(e, 6) ^ rightrotate(e, 11) ^ rightrotate(e, 25)) & 0xFFFFFFFF\n",
    "                ch = ((e & f) ^ ((~e) & g)) & 0xFFFFFFFF\n",
    "                temp1 = (h + S1 + ch + self.k[i] + w[i]) & 0xFFFFFFFF\n",
    "                S0 = (rightrotate(a, 2) ^ rightrotate(a, 13) ^ rightrotate(a, 22)) & 0xFFFFFFFF\n",
    "                maj = ((a & b) ^ (a & c) ^ (b & c)) & 0xFFFFFFFF\n",
    "                temp2 = (S0 + maj) & 0xFFFFFFFF\n",
    "\n",
    "                new_a = (temp1 + temp2) & 0xFFFFFFFF\n",
    "                new_e = (d + temp1) & 0xFFFFFFFF\n",
    "                # Rotate the 8 variables\n",
    "                a, b, c, d, e, f, g, h = new_a, a, b, c, new_e, e, f, g\n",
    "\n",
    "            # Add this chunk's hash to result so far:\n",
    "            h0 = (h0 + a) & 0xFFFFFFFF\n",
    "            h1 = (h1 + b) & 0xFFFFFFFF\n",
    "            h2 = (h2 + c) & 0xFFFFFFFF\n",
    "            h3 = (h3 + d) & 0xFFFFFFFF\n",
    "            h4 = (h4 + e) & 0xFFFFFFFF\n",
    "            h5 = (h5 + f) & 0xFFFFFFFF\n",
    "            h6 = (h6 + g) & 0xFFFFFFFF\n",
    "            h7 = (h7 + h) & 0xFFFFFFFF\n",
    "        # 3. Conclusion\n",
    "        self.hash_pieces = [h0, h1, h2, h3, h4, h5, h6, h7]\n",
    "\n",
    "    def digest(self):\n",
    "        # h0 append h1 append h2 append h3 append h4 append h5 append h6\n",
    "        pieces_without_h7 = self.hash_pieces[:-1]\n",
    "        return sum(leftshift(x, 32 * i) for i, x in enumerate(pieces_without_h7[::-1]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also write a function to directly compute the hex digest from some bytes data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def hash_SHA224(data):\n",
    "    \"\"\" Shortcut function to directly receive the hex digest from SHA224(data).\"\"\"\n",
    "    h = SHA224()\n",
    "    if isinstance(data, str):\n",
    "        data = bytes(data, encoding='utf8')\n",
    "    h.update(data)\n",
    "    return h.hexdigest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Checks on SHA-224\n",
    "\n",
    "Let try the example from [SHA-2 Wikipedia page](https://en.wikipedia.org/wiki/SHA-2#Test_vectors) :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def true_hash_SHA224(data):\n",
    "    h = hashlib.sha224()\n",
    "    if isinstance(data, str):\n",
    "        data = bytes(data, encoding='utf8')\n",
    "    h.update(data)\n",
    "    return h.hexdigest()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'730e109bd7a8a32b1cb9d9a09aa2325d2430587ddbc0c38bad911525'"
      ]
     },
     "execution_count": 71,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_SHA224(\"The quick brown fox jumps over the lazy dog\")\n",
    "assert hash_SHA224(\"The quick brown fox jumps over the lazy dog\") == '730e109bd7a8a32b1cb9d9a09aa2325d2430587ddbc0c38bad911525'\n",
    "assert hash_SHA224(\"The quick brown fox jumps over the lazy dog\") == true_hash_SHA224(\"The quick brown fox jumps over the lazy dog\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Even a small change in the message will (with overwhelming probability) result in a mostly different hash, due to the [**avalanche effect**](https://en.wikipedia.org/wiki/Avalanche_effect). For example, adding a period at the end of the sentence:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'619cba8e8e05826e9b8c519c0a5c68f4fb653e8a3d8aa04bb2c8cd4c'"
      ]
     },
     "execution_count": 72,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_SHA224(\"The quick brown fox jumps over the lazy dog.\")\n",
    "assert hash_SHA224(\"The quick brown fox jumps over the lazy dog.\") == '619cba8e8e05826e9b8c519c0a5c68f4fb653e8a3d8aa04bb2c8cd4c'\n",
    "assert hash_SHA224(\"The quick brown fox jumps over the lazy dog.\") == true_hash_SHA224(\"The quick brown fox jumps over the lazy dog.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The hash of the zero-length string is:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'d14a028c2a3a2bc9476102bb288234c415a2b01f828ea62ac5b3e42f'"
      ]
     },
     "execution_count": 73,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_SHA224(\"\")\n",
    "assert hash_SHA224(\"\") == 'd14a028c2a3a2bc9476102bb288234c415a2b01f828ea62ac5b3e42f'\n",
    "assert hash_SHA224(\"\") == true_hash_SHA224(\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\\implies$ We obtained the same result, OK our function works!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### SHA-512\n",
    "\n",
    "As said on the Wikipedia page:\n",
    "\n",
    "> SHA-512 is identical in structure to SHA-256, but:\n",
    "> \n",
    "> - the message is broken into 1024-bit chunks,\n",
    "> - the initial hash values and round constants are extended to 64 bits,\n",
    "> - there are 80 rounds instead of 64,\n",
    "> - the message schedule array w has 80 64-bit words instead of 64 32-bit words,\n",
    "> - to extend the message schedule array w, the loop is from 16 to 79 instead of from 16 to 63,\n",
    "> - the round constants are based on the first 80 primes 2..409,\n",
    "> - the word size used for calculations is 64 bits long,\n",
    "> - the appended length of the message (before pre-processing), in bits, is a 128-bit big-endian integer, and\n",
    "> - the shift and rotate amounts used are different.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Useful functions the SHA-512 algorithm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is exactly like for SHA-256, except we work with 64-bits words instead of 32-bits."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def leftrotate_64(x, c):\n",
    "    \"\"\" Left rotate the number x by c bytes, for 64-bits numbers.\"\"\"\n",
    "    x &= 0xFFFFFFFFFFFFFFFF\n",
    "    return ((x << c) | (x >> (64 - c))) & 0xFFFFFFFFFFFFFFFF\n",
    "\n",
    "def rightrotate_64(x, c):\n",
    "    \"\"\" Right rotate the number x by c bytes, for 64-bits numbers.\"\"\"\n",
    "    x &= 0xFFFFFFFFFFFFFFFF\n",
    "    return ((x >> c) | (x << (64 - c))) & 0xFFFFFFFFFFFFFFFF"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### The `SHA512` class"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class SHA512(Hash):\n",
    "    \"\"\"SHA384 hashing, see https://en.wikipedia.org/wiki/SHA-2#Pseudocode.\"\"\"\n",
    "    \n",
    "    def __init__(self):\n",
    "        self.name        = \"SHA512\"\n",
    "        self.byteorder   = 'big'\n",
    "        self.block_size  = 128\n",
    "        self.digest_size = 64\n",
    "        # Note 2: For each round, there is one round constant k[i] and one entry in the message schedule array w[i], 0 ≤ i ≤ 79\n",
    "        # Note 3: The compression function uses 8 working variables, a through h\n",
    "        # Note 4: Big-endian convention is used when expressing the constants in this pseudocode\n",
    "\n",
    "        # Initialize hash values:\n",
    "        # (The second 64 bits of the fractional parts of the square roots of the first 8 primes 2..19)\n",
    "        h0 = 0x6a09e667f3bcc908\n",
    "        h1 = 0xbb67ae8584caa73b\n",
    "        h2 = 0x3c6ef372fe94f82b\n",
    "        h3 = 0xa54ff53a5f1d36f1\n",
    "        h4 = 0x510e527fade682d1\n",
    "        h5 = 0x9b05688c2b3e6c1f\n",
    "        h6 = 0x1f83d9abfb41bd6b\n",
    "        h7 = 0x5be0cd19137e2179\n",
    "\n",
    "        # Initialize array of round constants:\n",
    "        # (first 64 bits of the fractional parts of the cube roots of the first 80 primes 2..409):\n",
    "        self.k = [\n",
    "            0x428a2f98d728ae22, 0x7137449123ef65cd, 0xb5c0fbcfec4d3b2f, 0xe9b5dba58189dbbc, 0x3956c25bf348b538, \n",
    "            0x59f111f1b605d019, 0x923f82a4af194f9b, 0xab1c5ed5da6d8118, 0xd807aa98a3030242, 0x12835b0145706fbe, \n",
    "            0x243185be4ee4b28c, 0x550c7dc3d5ffb4e2, 0x72be5d74f27b896f, 0x80deb1fe3b1696b1, 0x9bdc06a725c71235, \n",
    "            0xc19bf174cf692694, 0xe49b69c19ef14ad2, 0xefbe4786384f25e3, 0x0fc19dc68b8cd5b5, 0x240ca1cc77ac9c65, \n",
    "            0x2de92c6f592b0275, 0x4a7484aa6ea6e483, 0x5cb0a9dcbd41fbd4, 0x76f988da831153b5, 0x983e5152ee66dfab, \n",
    "            0xa831c66d2db43210, 0xb00327c898fb213f, 0xbf597fc7beef0ee4, 0xc6e00bf33da88fc2, 0xd5a79147930aa725, \n",
    "            0x06ca6351e003826f, 0x142929670a0e6e70, 0x27b70a8546d22ffc, 0x2e1b21385c26c926, 0x4d2c6dfc5ac42aed, \n",
    "            0x53380d139d95b3df, 0x650a73548baf63de, 0x766a0abb3c77b2a8, 0x81c2c92e47edaee6, 0x92722c851482353b, \n",
    "            0xa2bfe8a14cf10364, 0xa81a664bbc423001, 0xc24b8b70d0f89791, 0xc76c51a30654be30, 0xd192e819d6ef5218, \n",
    "            0xd69906245565a910, 0xf40e35855771202a, 0x106aa07032bbd1b8, 0x19a4c116b8d2d0c8, 0x1e376c085141ab53, \n",
    "            0x2748774cdf8eeb99, 0x34b0bcb5e19b48a8, 0x391c0cb3c5c95a63, 0x4ed8aa4ae3418acb, 0x5b9cca4f7763e373, \n",
    "            0x682e6ff3d6b2b8a3, 0x748f82ee5defb2fc, 0x78a5636f43172f60, 0x84c87814a1f0ab72, 0x8cc702081a6439ec, \n",
    "            0x90befffa23631e28, 0xa4506cebde82bde9, 0xbef9a3f7b2c67915, 0xc67178f2e372532b, 0xca273eceea26619c, \n",
    "            0xd186b8c721c0c207, 0xeada7dd6cde0eb1e, 0xf57d4f7fee6ed178, 0x06f067aa72176fba, 0x0a637dc5a2c898a6, \n",
    "            0x113f9804bef90dae, 0x1b710b35131c471b, 0x28db77f523047d84, 0x32caab7b40c72493, 0x3c9ebe0a15c9bebc, \n",
    "            0x431d67c49c100d4c, 0x4cc5d4becb3e42b6, 0x597f299cfc657e2a, 0x5fcb6fab3ad6faec, 0x6c44198c4a475817\n",
    "        ]\n",
    "\n",
    "        # Store them\n",
    "        self.hash_pieces = [h0, h1, h2, h3, h4, h5, h6, h7]\n",
    "    \n",
    "    def update(self, arg):\n",
    "        h0, h1, h2, h3, h4, h5, h6, h7 = self.hash_pieces\n",
    "        # 1. Pre-processing, exactly like MD5\n",
    "        data = bytearray(arg)\n",
    "        orig_len_in_bits = (8 * len(data)) & 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF\n",
    "        # 1.a. Add a single '1' bit at the end of the input bits\n",
    "        data.append(0x80)\n",
    "        # 1.b. Padding with zeros as long as the input bits length ≡ 896 (mod 1024)\n",
    "        while len(data) % 128 != 112:\n",
    "            data.append(0)\n",
    "        # 1.c. append original length in bits mod (2 pow 128) to message\n",
    "        data += orig_len_in_bits.to_bytes(16, byteorder='big')\n",
    "        assert len(data) % 128 == 0, \"Error in padding\"\n",
    "        # 2. Computations\n",
    "        # Process the message in successive 1024-bit = 128-bytes chunks:\n",
    "        for offset in range(0, len(data), 128):\n",
    "            # 2.a. 1024-bits = 128-bytes chunks\n",
    "            chunks = data[offset : offset + 128]\n",
    "            w = [0 for i in range(80)]\n",
    "            # 2.b. Break chunk into sixteen 128-bit = 8-bytes words w[i], 0 ≤ i ≤ 15\n",
    "            for i in range(16):\n",
    "                w[i] = int.from_bytes(chunks[8*i : 8*i + 8], byteorder='big')\n",
    "            # 2.c.  Extend the first 16 words into the remaining 64\n",
    "            #       words w[16..79] of the message schedule array:\n",
    "            for i in range(16, 80):\n",
    "                s0 = (rightrotate_64(w[i-15], 1) ^ rightrotate_64(w[i-15], 8) ^ rightshift(w[i-15], 7)) & 0xFFFFFFFFFFFFFFFF\n",
    "                s1 = (rightrotate_64(w[i-2], 19) ^ rightrotate_64(w[i-2], 61) ^ rightshift(w[i-2], 6)) & 0xFFFFFFFFFFFFFFFF\n",
    "                w[i] = (w[i-16] + s0 + w[i-7] + s1) & 0xFFFFFFFFFFFFFFFF\n",
    "            # 2.d. Initialize hash value for this chunk\n",
    "            a, b, c, d, e, f, g, h = h0, h1, h2, h3, h4, h5, h6, h7\n",
    "            # 2.e. Main loop, cf. https://tools.ietf.org/html/rfc6234\n",
    "            for i in range(80):\n",
    "                S1 = (rightrotate_64(e, 14) ^ rightrotate_64(e, 18) ^ rightrotate_64(e, 41)) & 0xFFFFFFFFFFFFFFFF\n",
    "                ch = ((e & f) ^ ((~e) & g)) & 0xFFFFFFFFFFFFFFFF\n",
    "                temp1 = (h + S1 + ch + self.k[i] + w[i]) & 0xFFFFFFFFFFFFFFFF\n",
    "                S0 = (rightrotate_64(a, 28) ^ rightrotate_64(a, 34) ^ rightrotate_64(a, 39)) & 0xFFFFFFFFFFFFFFFF\n",
    "                maj = ((a & b) ^ (a & c) ^ (b & c)) & 0xFFFFFFFFFFFFFFFF\n",
    "                temp2 = (S0 + maj) & 0xFFFFFFFFFFFFFFFF\n",
    "\n",
    "                new_a = (temp1 + temp2) & 0xFFFFFFFFFFFFFFFF\n",
    "                new_e = (d + temp1) & 0xFFFFFFFFFFFFFFFF\n",
    "                # Rotate the 8 variables\n",
    "                a, b, c, d, e, f, g, h = new_a, a, b, c, new_e, e, f, g\n",
    "\n",
    "            # Add this chunk's hash to result so far:\n",
    "            h0 = (h0 + a) & 0xFFFFFFFFFFFFFFFF\n",
    "            h1 = (h1 + b) & 0xFFFFFFFFFFFFFFFF\n",
    "            h2 = (h2 + c) & 0xFFFFFFFFFFFFFFFF\n",
    "            h3 = (h3 + d) & 0xFFFFFFFFFFFFFFFF\n",
    "            h4 = (h4 + e) & 0xFFFFFFFFFFFFFFFF\n",
    "            h5 = (h5 + f) & 0xFFFFFFFFFFFFFFFF\n",
    "            h6 = (h6 + g) & 0xFFFFFFFFFFFFFFFF\n",
    "            h7 = (h7 + h) & 0xFFFFFFFFFFFFFFFF\n",
    "        # 3. Conclusion\n",
    "        self.hash_pieces = [h0, h1, h2, h3, h4, h5, h6, h7]\n",
    "\n",
    "    def digest(self):\n",
    "        # h0 append h1 append h2 append h3 append h4 append h5 append h6 append h7\n",
    "        return sum(leftshift(x, 64 * i) for i, x in enumerate(self.hash_pieces[::-1]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also write a function to directly compute the hex digest from some bytes data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def hash_SHA512(data):\n",
    "    \"\"\" Shortcut function to directly receive the hex digest from SHA512(data).\"\"\"\n",
    "    h = SHA512()\n",
    "    if isinstance(data, str):\n",
    "        data = bytes(data, encoding='utf8')\n",
    "    h.update(data)\n",
    "    return h.hexdigest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Checks on SHA-512\n",
    "\n",
    "Let try the example from [SHA-2 Wikipedia page](https://en.wikipedia.org/wiki/SHA-2#Test_vectors) :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 77,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def true_hash_SHA512(data):\n",
    "    h = hashlib.sha512()\n",
    "    if isinstance(data, str):\n",
    "        data = bytes(data, encoding='utf8')\n",
    "    h.update(data)\n",
    "    return h.hexdigest()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 78,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'07e547d9586f6a73f73fbac0435ed76951218fb7d0c8d788a309d785436bbb642e93a252a954f23912547d1e8a3b5ed6e1bfd7097821233fa0538f3db854fee6'"
      ]
     },
     "execution_count": 78,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_SHA512(\"The quick brown fox jumps over the lazy dog\")\n",
    "assert hash_SHA512(\"The quick brown fox jumps over the lazy dog\") == true_hash_SHA512(\"The quick brown fox jumps over the lazy dog\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Even a small change in the message will (with overwhelming probability) result in a mostly different hash, due to the [**avalanche effect**](https://en.wikipedia.org/wiki/Avalanche_effect). For example, adding a period at the end of the sentence:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'91ea1245f20d46ae9a037a989f54f1f790f0a47607eeb8a14d12890cea77a1bbc6c7ed9cf205e67b7f2b8fd4c7dfd3a7a8617e45f3c463d481c7e586c39ac1ed'"
      ]
     },
     "execution_count": 79,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_SHA512(\"The quick brown fox jumps over the lazy dog.\")\n",
    "assert hash_SHA512(\"The quick brown fox jumps over the lazy dog.\") == true_hash_SHA512(\"The quick brown fox jumps over the lazy dog.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The hash of the zero-length string is:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 80,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e'"
      ]
     },
     "execution_count": 80,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_SHA512(\"\")\n",
    "assert hash_SHA512(\"\") == 'cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e'\n",
    "assert hash_SHA512(\"\") == true_hash_SHA512(\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\\implies$ We obtained the same result, OK our function works!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### SHA-384\n",
    "\n",
    "As said on the Wikipedia page:\n",
    "\n",
    "> SHA-384 is identical to SHA-512, except that:\n",
    "> - the initial hash values h0 through h7 are different (taken from the 9th through 16th primes), and\n",
    "> - the output is constructed by omitting h6 and h7."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### The `SHA384` class"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class SHA384(Hash):\n",
    "    \"\"\"SHA384 hashing, see https://en.wikipedia.org/wiki/SHA-2#Pseudocode.\"\"\"\n",
    "    \n",
    "    def __init__(self):\n",
    "        self.name        = \"SHA384\"\n",
    "        self.byteorder   = 'big'\n",
    "        self.block_size  = 96\n",
    "        self.digest_size = 48\n",
    "        # Note 2: For each round, there is one round constant k[i] and one entry in the message schedule array w[i], 0 ≤ i ≤ 79\n",
    "        # Note 3: The compression function uses 8 working variables, a through h\n",
    "        # Note 4: Big-endian convention is used when expressing the constants in this pseudocode\n",
    "\n",
    "        # Initialize hash values:\n",
    "        # (The second 64 bits of the fractional parts of the square roots of the first 9th through 16th primes 23..53)\n",
    "        h0 = 0xcbbb9d5dc1059ed8\n",
    "        h1 = 0x629a292a367cd507\n",
    "        h2 = 0x9159015a3070dd17\n",
    "        h3 = 0x152fecd8f70e5939\n",
    "        h4 = 0x67332667ffc00b31\n",
    "        h5 = 0x8eb44a8768581511\n",
    "        h6 = 0xdb0c2e0d64f98fa7\n",
    "        h7 = 0x47b5481dbefa4fa4\n",
    " \n",
    "\n",
    "        # Initialize array of round constants:\n",
    "        # (first 64 bits of the fractional parts of the cube roots of the first 80 primes 2..409):\n",
    "        self.k = [\n",
    "            0x428a2f98d728ae22, 0x7137449123ef65cd, 0xb5c0fbcfec4d3b2f, 0xe9b5dba58189dbbc, 0x3956c25bf348b538, \n",
    "            0x59f111f1b605d019, 0x923f82a4af194f9b, 0xab1c5ed5da6d8118, 0xd807aa98a3030242, 0x12835b0145706fbe, \n",
    "            0x243185be4ee4b28c, 0x550c7dc3d5ffb4e2, 0x72be5d74f27b896f, 0x80deb1fe3b1696b1, 0x9bdc06a725c71235, \n",
    "            0xc19bf174cf692694, 0xe49b69c19ef14ad2, 0xefbe4786384f25e3, 0x0fc19dc68b8cd5b5, 0x240ca1cc77ac9c65, \n",
    "            0x2de92c6f592b0275, 0x4a7484aa6ea6e483, 0x5cb0a9dcbd41fbd4, 0x76f988da831153b5, 0x983e5152ee66dfab, \n",
    "            0xa831c66d2db43210, 0xb00327c898fb213f, 0xbf597fc7beef0ee4, 0xc6e00bf33da88fc2, 0xd5a79147930aa725, \n",
    "            0x06ca6351e003826f, 0x142929670a0e6e70, 0x27b70a8546d22ffc, 0x2e1b21385c26c926, 0x4d2c6dfc5ac42aed, \n",
    "            0x53380d139d95b3df, 0x650a73548baf63de, 0x766a0abb3c77b2a8, 0x81c2c92e47edaee6, 0x92722c851482353b, \n",
    "            0xa2bfe8a14cf10364, 0xa81a664bbc423001, 0xc24b8b70d0f89791, 0xc76c51a30654be30, 0xd192e819d6ef5218, \n",
    "            0xd69906245565a910, 0xf40e35855771202a, 0x106aa07032bbd1b8, 0x19a4c116b8d2d0c8, 0x1e376c085141ab53, \n",
    "            0x2748774cdf8eeb99, 0x34b0bcb5e19b48a8, 0x391c0cb3c5c95a63, 0x4ed8aa4ae3418acb, 0x5b9cca4f7763e373, \n",
    "            0x682e6ff3d6b2b8a3, 0x748f82ee5defb2fc, 0x78a5636f43172f60, 0x84c87814a1f0ab72, 0x8cc702081a6439ec, \n",
    "            0x90befffa23631e28, 0xa4506cebde82bde9, 0xbef9a3f7b2c67915, 0xc67178f2e372532b, 0xca273eceea26619c, \n",
    "            0xd186b8c721c0c207, 0xeada7dd6cde0eb1e, 0xf57d4f7fee6ed178, 0x06f067aa72176fba, 0x0a637dc5a2c898a6, \n",
    "            0x113f9804bef90dae, 0x1b710b35131c471b, 0x28db77f523047d84, 0x32caab7b40c72493, 0x3c9ebe0a15c9bebc, \n",
    "            0x431d67c49c100d4c, 0x4cc5d4becb3e42b6, 0x597f299cfc657e2a, 0x5fcb6fab3ad6faec, 0x6c44198c4a475817\n",
    "        ]\n",
    "\n",
    "        # Store them\n",
    "        self.hash_pieces = [h0, h1, h2, h3, h4, h5, h6, h7]\n",
    "    \n",
    "    def update(self, arg):\n",
    "        h0, h1, h2, h3, h4, h5, h6, h7 = self.hash_pieces\n",
    "        # 1. Pre-processing, exactly like MD5\n",
    "        data = bytearray(arg)\n",
    "        orig_len_in_bits = (8 * len(data)) & 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF\n",
    "        # 1.a. Add a single '1' bit at the end of the input bits\n",
    "        data.append(0x80)\n",
    "        # 1.b. Padding with zeros as long as the input bits length ≡ 896 (mod 1024)\n",
    "        while len(data) % 128 != 112:\n",
    "            data.append(0)\n",
    "        # 1.c. append original length in bits mod (2 pow 128) to message\n",
    "        data += orig_len_in_bits.to_bytes(16, byteorder='big')\n",
    "        assert len(data) % 128 == 0, \"Error in padding\"\n",
    "        # 2. Computations\n",
    "        # Process the message in successive 1024-bit = 128-bytes chunks:\n",
    "        for offset in range(0, len(data), 128):\n",
    "            # 2.a. 1024-bits = 128-bytes chunks\n",
    "            chunks = data[offset : offset + 128]\n",
    "            w = [0 for i in range(80)]\n",
    "            # 2.b. Break chunk into sixteen 128-bit = 8-bytes words w[i], 0 ≤ i ≤ 15\n",
    "            for i in range(16):\n",
    "                w[i] = int.from_bytes(chunks[8*i : 8*i + 8], byteorder='big')\n",
    "            # 2.c.  Extend the first 16 words into the remaining 64\n",
    "            #       words w[16..79] of the message schedule array:\n",
    "            for i in range(16, 80):\n",
    "                s0 = (rightrotate_64(w[i-15], 1) ^ rightrotate_64(w[i-15], 8) ^ rightshift(w[i-15], 7)) & 0xFFFFFFFFFFFFFFFF\n",
    "                s1 = (rightrotate_64(w[i-2], 19) ^ rightrotate_64(w[i-2], 61) ^ rightshift(w[i-2], 6)) & 0xFFFFFFFFFFFFFFFF\n",
    "                w[i] = (w[i-16] + s0 + w[i-7] + s1) & 0xFFFFFFFFFFFFFFFF\n",
    "            # 2.d. Initialize hash value for this chunk\n",
    "            a, b, c, d, e, f, g, h = h0, h1, h2, h3, h4, h5, h6, h7\n",
    "            # 2.e. Main loop, cf. https://tools.ietf.org/html/rfc6234\n",
    "            for i in range(80):\n",
    "                S1 = (rightrotate_64(e, 14) ^ rightrotate_64(e, 18) ^ rightrotate_64(e, 41)) & 0xFFFFFFFFFFFFFFFF\n",
    "                ch = ((e & f) ^ ((~e) & g)) & 0xFFFFFFFFFFFFFFFF\n",
    "                temp1 = (h + S1 + ch + self.k[i] + w[i]) & 0xFFFFFFFFFFFFFFFF\n",
    "                S0 = (rightrotate_64(a, 28) ^ rightrotate_64(a, 34) ^ rightrotate_64(a, 39)) & 0xFFFFFFFFFFFFFFFF\n",
    "                maj = ((a & b) ^ (a & c) ^ (b & c)) & 0xFFFFFFFFFFFFFFFF\n",
    "                temp2 = (S0 + maj) & 0xFFFFFFFFFFFFFFFF\n",
    "\n",
    "                new_a = (temp1 + temp2) & 0xFFFFFFFFFFFFFFFF\n",
    "                new_e = (d + temp1) & 0xFFFFFFFFFFFFFFFF\n",
    "                # Rotate the 8 variables\n",
    "                a, b, c, d, e, f, g, h = new_a, a, b, c, new_e, e, f, g\n",
    "\n",
    "            # Add this chunk's hash to result so far:\n",
    "            h0 = (h0 + a) & 0xFFFFFFFFFFFFFFFF\n",
    "            h1 = (h1 + b) & 0xFFFFFFFFFFFFFFFF\n",
    "            h2 = (h2 + c) & 0xFFFFFFFFFFFFFFFF\n",
    "            h3 = (h3 + d) & 0xFFFFFFFFFFFFFFFF\n",
    "            h4 = (h4 + e) & 0xFFFFFFFFFFFFFFFF\n",
    "            h5 = (h5 + f) & 0xFFFFFFFFFFFFFFFF\n",
    "            h6 = (h6 + g) & 0xFFFFFFFFFFFFFFFF\n",
    "            h7 = (h7 + h) & 0xFFFFFFFFFFFFFFFF\n",
    "        # 3. Conclusion\n",
    "        self.hash_pieces = [h0, h1, h2, h3, h4, h5, h6, h7]\n",
    "\n",
    "    def digest(self):\n",
    "        # h0 append h1 append h2 append h3 append h4 append h5\n",
    "        hash_pieces_without_67 = self.hash_pieces[:-2]\n",
    "        return sum(leftshift(x, 64 * i) for i, x in enumerate(hash_pieces_without_67[::-1]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also write a function to directly compute the hex digest from some bytes data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 82,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def hash_SHA384(data):\n",
    "    \"\"\" Shortcut function to directly receive the hex digest from SHA384(data).\"\"\"\n",
    "    h = SHA384()\n",
    "    if isinstance(data, str):\n",
    "        data = bytes(data, encoding='utf8')\n",
    "    h.update(data)\n",
    "    return h.hexdigest()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Checks on SHA-384\n",
    "\n",
    "Let try the example from [SHA-2 Wikipedia page](https://en.wikipedia.org/wiki/SHA-2#Test_vectors) :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def true_hash_SHA384(data):\n",
    "    h = hashlib.sha384()\n",
    "    if isinstance(data, str):\n",
    "        data = bytes(data, encoding='utf8')\n",
    "    h.update(data)\n",
    "    return h.hexdigest()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 84,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'ca737f1014a48f4c0b6dd43cb177b0afd9e5169367544c494011e3317dbf9a509cb1e5dc1e85a941bbee3d7f2afbc9b1'"
      ]
     },
     "execution_count": 84,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_SHA384(\"The quick brown fox jumps over the lazy dog\")\n",
    "assert hash_SHA384(\"The quick brown fox jumps over the lazy dog\") == true_hash_SHA384(\"The quick brown fox jumps over the lazy dog\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Even a small change in the message will (with overwhelming probability) result in a mostly different hash, due to the [**avalanche effect**](https://en.wikipedia.org/wiki/Avalanche_effect). For example, adding a period at the end of the sentence:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 85,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'ed892481d8272ca6df370bf706e4d7bc1b5739fa2177aae6c50e946678718fc67a7af2819a021c2fc34e91bdb63409d7'"
      ]
     },
     "execution_count": 85,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_SHA384(\"The quick brown fox jumps over the lazy dog.\")\n",
    "assert hash_SHA384(\"The quick brown fox jumps over the lazy dog.\") == true_hash_SHA384(\"The quick brown fox jumps over the lazy dog.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The hash of the zero-length string is:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 86,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'38b060a751ac96384cd9327eb1b1e36a21fdb71114be07434c0cc7bf63f6e1da274edebfe76f65fbd51ad2f14898b95b'"
      ]
     },
     "execution_count": 86,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash_SHA384(\"\")\n",
    "assert hash_SHA384(\"\") == '38b060a751ac96384cd9327eb1b1e36a21fdb71114be07434c0cc7bf63f6e1da274edebfe76f65fbd51ad2f14898b95b'\n",
    "assert hash_SHA384(\"\") == true_hash_SHA384(\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\\implies$ We obtained the same result, OK our function works!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### More comparison"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 87,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "63.2 ms ± 3.52 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
     ]
    }
   ],
   "source": [
    "def test_SHA224():\n",
    "    x = random_string()\n",
    "    return hash_SHA224(x) == true_hash_SHA224(x)\n",
    "\n",
    "%timeit test_SHA224()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 88,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "42.7 ms ± 465 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
     ]
    }
   ],
   "source": [
    "def test_SHA512():\n",
    "    x = random_string()\n",
    "    return hash_SHA512(x) == true_hash_SHA512(x)\n",
    "\n",
    "%timeit test_SHA512()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 89,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "42.8 ms ± 474 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
     ]
    }
   ],
   "source": [
    "def test_SHA384():\n",
    "    x = random_string()\n",
    "    return hash_SHA384(x) == true_hash_SHA384(x)\n",
    "\n",
    "%timeit test_SHA384()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`SHA512` and `SHA384` are slower than `SHA256` obviously, but it's weird that `SHA224` is slower than the 64-bits versions..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "----\n",
    "## Conclusion"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Well, it was fun and interesting to implement these hashing functions, manually.\n",
    "Using [Python](https://www.Python.org) made it easy!\n",
    "\n",
    "> Note that a Python 2 library implementing manually all these hashing functions already exist: [`pysha2`](https://github.com/thomdixon/pysha2), by [@thomdixon](https://github.com/thomdixon).\n",
    "> (I discovered it *after* writing this notebook!)\n",
    "\n",
    "[![made-with-jupyter](https://img.shields.io/badge/Made%20for-Jupyter%20notebook-1f425f.svg)](https://www.jupyter.org/)\n",
    "[![GitHub license](https://img.shields.io/github/license/Naereen/notebooks.svg)](https://github.com/Naereen/notebooks/blob/master/LICENSE.txt)\n",
    "[![forthebadge made-with-python](http://ForTheBadge.com/images/badges/made-with-python.svg)](https://www.python.org/) \n",
    "[![ForTheBadge built-with-science](http://ForTheBadge.com/images/badges/built-with-science.svg)](https://GitHub.com/Naereen/)\n",
    "[![ForTheBadge powered-by-electricity](http://ForTheBadge.com/images/badges/powered-by-electricity.svg)](http://ForTheBadge.com)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Bonus\n",
    "\"SHA\" is pronouced like the French word \"chat\", which means *cat*.\n",
    "\n",
    "![a cat playing on a computer](https://media.giphy.com/media/JIX9t2j0ZTN9S/giphy.gif)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> See [my GitHub `notebooks` project](https://GitHub.com/Naereen/notebooks/) for others notebooks."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.3"
  },
  "notify_time": "5",
  "toc": {
   "colors": {
    "hover_highlight": "#DAA520",
    "running_highlight": "#FF0000",
    "selected_highlight": "#FFD700"
   },
   "moveMenuLeft": true,
   "nav_menu": {
    "height": "303px",
    "width": "251px"
   },
   "navigate_menu": true,
   "number_sections": true,
   "sideBar": true,
   "threshold": 4,
   "toc_cell": true,
   "toc_position": {
    "height": "552px",
    "left": "0px",
    "right": "1164px",
    "top": "117px",
    "width": "212px"
   },
   "toc_section_display": "block",
   "toc_window_display": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}