{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Classical ciphers\n", "\n", "## Hill cipher\n", "\n", "We will try to decrypt the ciphertext `rbqdoobweruuis` if we know that the Hill cipher was used and the plaintext starts with `help`.\n", "\n", "Let us first define some helper functions." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "encode = lambda s: list(ord(x) - ord('a') for x in s)\n", "decode = lambda v: ''.join(chr(c + ord('a')) for c in v)\n", "add = lambda v1, v2: [(x + y) % 26 for x, y in zip(v1, v2)]\n", "mul = lambda x, v: [(x * y) % 26 for y in v]\n", "inv = lambda x: next(i for i in range(26) if (x * i) % 26 == 1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now encode the ciphertext and the known portion of the plaintext." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "c = 'rbqdoobweruuis'\n", "p = 'help'" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[17, 1, 16, 3, 14, 14, 1, 22, 4, 17, 20, 20, 8, 18]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ec = encode(c)\n", "ec" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[7, 4, 11, 15]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ep = encode(p)\n", "ep" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Assuming that a $2 \\times 2$ matrix $A$ has been used to encrypt the original plaintext, we look for its inverse $A^{-1}$ such that $A^{-1} C = P$, where\n", "$$\n", "C = \\begin{bmatrix} 17 & 16 \\\\ 1 & 3 \\end{bmatrix}\n", "\\quad \\text{and} \\quad\n", "P = \\begin{bmatrix} 7 & 11 \\\\ 4 & 15 \\end{bmatrix}.\n", "$$\n", "We will first compute $C^{-1}$. To do so, we will use Gaussian elimination on\n", "$$\n", "\\begin{bmatrix} 17 & 16 & | & 1 & 0 \\\\ 1 & 3 & | & 0 & 1 \\end{bmatrix}.\n", "$$" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "23" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "inv(17)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 4, 23, 0]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mul(23, [17, 16, 1, 0])" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 0, 9, 4]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "add([1, 4, 23, 0], mul(-4, [0, 1, 23, 25]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the above computations, we derive\n", "$$\n", "\\begin{bmatrix} 17 & 16 & | & 1 & 0 \\\\ 1 & 3 & | & 0 & 1 \\end{bmatrix}\n", "\\sim\n", "\\begin{bmatrix} 1 & 4 & | & 23 & 0 \\\\ 1 & 3 & | & 0 & 1 \\end{bmatrix}\n", "\\sim\n", "\\begin{bmatrix} 1 & 4 & | & 23 & 0 \\\\ 0 & 1 & | & 23 & 25 \\end{bmatrix}\n", "\\sim\n", "\\begin{bmatrix} 1 & 0 & | & 9 & 4 \\\\ 0 & 1 & | & 23 & 25 \\end{bmatrix}.\n", "$$\n", "We thus have\n", "$$\n", "C^{-1} = \\begin{bmatrix} 9 & 4 \\\\ 23 & 25 \\end{bmatrix}.\n", "$$\n", "We may now compute $A^{-1} = PC^{-1}$." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(7*9 + 11*23) % 26" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "17" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(7*4 + 11*25) % 26" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "17" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(4*9 + 15*23) % 26" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(4*4 + 15*25) % 26" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We obtain\n", "$$\n", "A^{-1} = \\begin{bmatrix} 4 & 17 \\\\ 17 & 1 \\end{bmatrix}.\n", "$$\n", "Let us now decrypt the ciphertext." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'helpisontheway'" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = [[4, 17], [17, 1]]\n", "decode((r[0]*ec[i] + r[1]*ec[i+1]) % 26 for i in range(0, len(ec), 2) for r in A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Vigenère cipher\n", "\n", "The *index of coincidence* tells us the probability that a randomly chosen pair of letters in a string will coincide." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "IC = lambda s: sum(f*(f-1) for f in map(lambda c: s.count(c), set(s))) / (len(s) * (len(s) - 1))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us first encrypt a short string with the Vigenère cipher." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "s = 'ljubljana'\n", "k = 'fri'" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'qacgcrfei'" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from itertools import repeat\n", "sc = decode((a+b) % 26 for a, b in zip(encode(s), (c for kk in repeat(encode(k)) for c in kk)))\n", "sc" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We compute the index of coincidence of the plaintext and the ciphertext." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.08333333333333333" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "IC(s)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.027777777777777776" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "IC(sc)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We note that the plaintext has a considerably larger index of coincidence.\n", "\n", "Next, we try to determine the key length of the Vigenère cipher which was used to obtain the next ciphertext. We use the Kasiski method - we notice that the substrings `NKA` and `BYI` repeat, so let us find the indices of their occurences." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0, 30]\n", "[6, 33]\n" ] } ], "source": [ "ss = 'NKASF BBYIY PWZCW TBIYK PFKUF KBJIA NKABY IYPWZ JMJ'.lower().replace(' ', '')\n", "print([i for i in range(len(ss)) if ss[i:i+3] == 'nka'])\n", "print([i for i in range(len(ss)) if ss[i:i+3] == 'byi'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We may deduce that the key length divides $\\gcd(30-0, 33-6) = 3$. Let us compute the indices of coincidence of the ciphertext and its substrings containing every third letter." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.058693244739756366" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "IC(ss)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0.0761904761904762, 0.13186813186813187, 0.10989010989010989]" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[IC(ss[i::3]) for i in range(3)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We note that the indices of coincidence for the substrings are significantly higher than the value for the entire ciphertext. This supports the suspicion given by the Kasiski method that the key length is $3$.\n", "\n", "We may now compare the letter frequencies with those of English (assuming that the plaintext really is in English) to find the key. Here, we just try using the same key as before." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'itsnotwhatyoulookatthatmattersitswhatyousee'" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "decode((a-b) % 26 for a, b in zip(encode(ss), (c for kk in repeat(encode(k)) for c in kk)))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" } }, "nbformat": 4, "nbformat_minor": 4 }