{ "cells": [ { "cell_type": "markdown", "id": "5003fda0-263c-4ebb-944f-ef9a8fd68713", "metadata": {}, "source": [ "## Prerequisites\n", "- Homomorphic encryption pdf\n", "- Modular arithmetic\n", "- Lattices" ] }, { "cell_type": "markdown", "id": "eedb3efb-5a5b-4c57-b5e6-69d0c791f944", "metadata": {}, "source": [ "# Theory" ] }, { "cell_type": "markdown", "id": "de8dc1a3-b5e2-457b-ae96-5695941b1d5a", "metadata": {}, "source": [ "**Homomorphic encryption -- Definition** \n", "Let `(KeyGen, Encrypt, Decrypt, Evaluate)` be a tuple of procedures $(KeyGen, E, D, V)$ \n", "Let $f \\in \\mathcal F$ be a function in a family of functions.\n", "- $(sk, pk) \\gets KeyGen(1^\\lambda, 1^d)$ - key generation, $\\lambda$ is a security parameter, $d$ is a functionality parameter (degree of the polynomial / multiplicative depth allowed). \n", "- $c_i \\gets E(pk, m_i)$ - Encryption of a message $m_i$. -- known as **fresh ciphertexts**. \n", "- $c^* = V(pk, f, c_1, ... c_n)$ - Evaluate the function $V$ on the ciphertexts. -- known as **evaluated ciphertexts**\n", "\n", "\n", "**Corectness** \n", "Correctly decrypt both fresh and evaluated ciphertexts\n", "$$D(sk, c^*) = m^* = f(m_1, ..., m_n)$$\n", "\n", "\n", "The HE can be classified based on the type of functions $f$ that it supports. \n", "1. **Partially homomorphic encryption** \n", " Given $E(m_1)$, and $E(m_2)$ you can do limited operations. Ex: Only addition or multiplication.\n", "2. **Somewhat homomorphic encryption** \n", " Limited number of multiplications (Circuits of a maximum depth) \n", " Given $E(m_1), ..., E(m_n)$ you can compute $E(f(m_1, ... m_n))$ where $f$ is a polynomial of a limited degree. \n", "3. **Fully homomorphic encryption** \n", " Unlimited multiplications and additions. " ] }, { "cell_type": "markdown", "id": "420bd3ff-be69-4526-bb52-55f0bf81f07d", "metadata": {}, "source": [ "## SHE over integers\n", "\n", "- [DGHV paper](https://eprint.iacr.org/2009/616.pdf) - read and follow the paper for details\n", "- [Public key compression for DGHV paper](https://eprint.iacr.org/2011/440.pdf)\n", "- [FHE over integers with shorter public keys](https://eprint.iacr.org/2011/441.pdf)\n", "- [An implementation](https://github.com/coron/fhe)" ] }, { "cell_type": "markdown", "id": "9a68cf39-6e56-4e02-8f2e-018c54fab2d9", "metadata": {}, "source": [ "### Symmetric version\n", "**Key generation** \n", "The secret key is an $\\eta$-bit odd integer $p$\n", "\n", "**Encryption** \n", "Let $m \\in \\{0, 1\\}$ be a bit. The ciphertext is an int whose residue mod $p$ which has the same bit parity as the plaintext. Randomly choose 2 integers, large random $q$ and small random $r$\n", "$$E(m) = pq + 2r + m$$\n", "\n", "**Decryption**\n", "$$D(c) = (c \\bmod p) \\bmod 2$$\n", "\n", "\n", "When the noise $r$ is sufficiently smaller than the secret key $p$ this encryption scheme is **somewhat homomorphic** - It can be used to evaluate low degree polynomias over encrypted data. If parameters are chosen corectly ($r \\approx 2^\\sqrt \\eta, q \\approx 2^{\\eta^3}$) the scheme may even be secure.\n", "\n", "**Evaluation** \n", "Given a multivariate function (circuit) that takes in ciphertexts, apply the addition and multiplication over the integers, and return the resulting integer. " ] }, { "cell_type": "markdown", "id": "7de8d153-56bd-4464-8584-154517d5f732", "metadata": {}, "source": [ "**Corectness**\n", "$$D(E(m)) = (c \\bmod p) \\bmod 2 = (pq + 2r + m \\bmod p) \\bmod 2 = 2r + m \\bmod 2 = m $$\n", "\n", "Notice that this decryption can happen only if $2r \\bmod p = 2r \\iff 2r < p$\n", "\n", "For the addition we have\n", "$$\n", "\\begin{align*}\n", "D(c_1 + c_2) &= (pq_1 + pq_2 + 2r_1 + 2r_2 + m_1 + m_2 \\bmod p) \\bmod 2 \\\\\n", "& = (2r_1 + 2r_2 + m_1 + m_2) \\bmod 2 \\\\\n", "&= m_1 + m_2 \\bmod 2 \n", "\\end{align*}$$\n", "with the similar constraint that $2r_1 + 2r_2 \\leq p$\n", "\n", "And for multiplication we have:\n", "$$\n", "\\begin{align*}\n", "D(c_1 \\cdot c_2) &= ((pq_1 + 2r_1 + m_1)(pq_2 + 2r_2 + m_2) \\bmod p) \\bmod 2 \\\\\n", "&= 4r_1r_2 + 2r_1m_2 + 2r_2m_1 + m_1m_2 \\bmod 2 \\\\\n", "&= m_1m_2 \n", "\\end{align*}\n", "$$\n", "with a similar constraint.\n", "\n", "**Remarks**:\n", "- We have to be careful how we bound the noise, because it will decide how many circuits we can evaluate and decrypt correctly after. If the final noise becomes bigger than the secret key $p$ the decryption will no longer be correct.\n", "- The noise doubles when doing additions but grows exponentially when doing multiplications. \n" ] }, { "cell_type": "markdown", "id": "d3894ffa-1dcf-48ac-8196-60d2dfb0a5fa", "metadata": {}, "source": [ "### PK version\n", "\n", "Turning the secret key scheme into a public key scheme is easy. The main idea is to base the PK security on the hardness of the [approximate integer GCD](https://martinralbrecht.wordpress.com/2020/03/21/the-approximate-gcd-problem/). The public key will consist of many \"encryptions of zero\":\n", "$$x_i = q_i \\cdot p + 2 \\cdot r_i$$\n", "where $q_i, r_i$ are chosen as above. The number of encryptions is denoted by $\\tau$ and is a security parameter. So we generate a set $\\{x_0, ... x_\\tau\\}$.Then reorder the set such that $x_0$ is the maxmimum from the set.\n", "\n", "Then we encrypt $m$ by adding a **subset-sum** of $x_i$'s instead of $pq$. So we choose a random subset from $S \\subseteq \\{1, 2, ..., \\tau\\}$ and we compute:\n", "$$c = m + 2r + 2 \\cdot \\sum_{i\\in S} x_i \\bmod x_0$$" ] }, { "cell_type": "markdown", "id": "426adf6f-b12c-44a2-aed9-94faa95d1265", "metadata": {}, "source": [ "### Security\n", "\n", "**Security reduction** \n", "The scheme security can be reduced to solving the [approximate integer GCD](https://martinralbrecht.wordpress.com/2020/03/21/the-approximate-gcd-problem/): \n", "- Given many an integer $p$ and many $x_i = q_i + p + 2r_i$ find $p$\n", "\n", "*Intuition for the assumption*: If you sample lots of numbers that are close to multiples of $p$ but not exactly multipes of $p$ they are indistinguishable from random. \n", "\n", "**Security parameters**\n", "\n", "To construct the SHE scheme the paper proposes many security parameters:\n", "- $\\eta$ is the bit-length of the secret key $p$ (which is the hidden approximate-gcd of all the public-key integers),\n", "- $\\gamma$ is the bit-length of the integers in the public key $x_i$,\n", "- $\\tau$ is the number of integers in the public key\n", "- $\\rho$ is the bit-length of the noise $r_i$ (i.e., the distance between the public key elements and the nearest multiples of the secret key) \n", "- $\\rho^\\prime$ secondary noise parameter used in encryption\n", "\n", "\n", "These parameters are set under various constraints to stop different attacks and support HE. \n", "\n", "\n", "**Circuit / function homomorphic bounds** \n", "Consider a multivariate function $f$. Denote $|\\vec f|$ its $l_1$ norm of the coefficient vector of $f$. Denote $d = \\deg(f)$. $f$ can be evaluated if the following condition is satisfied:\n", "$$d \\leq \\dfrac {\\eta - 4 - \\log(|\\vec f|)} {\\rho^\\prime + 2}$$\n", "\n", "*Proof*: \n", "- \"fresh\" ciphertexts have nosie at most $2^{\\rho^\\prime + 2}$. A poly with degree $d$ will raise this noise to $(2^{\\rho^\\prime + 2})^d$\n", "- ciphertext output after `Evaluate` has noise **at most** $2^{\\eta - 4} < \\frac p 8$. Although $2^{\\eta - 2} < \\frac p 2$ is sufficient, later the former bound will be used to allow decryption with a shallow circuit (function)\n", "- So we want the following condition: $|\\vec f| \\cdot (2^{\\rho^\\prime + 2})^d \\leq 2^{\\eta - 4}$. Applying the logarithm we get to the inequality above. " ] }, { "cell_type": "markdown", "id": "06154087-51ef-4d03-9762-bbc5904bb796", "metadata": {}, "source": [ "### Making the scheme more efficient\n", "\n", "There are a few problems with our scheme \n", "1. impractically big public keys \n", "2. ciphertext size doubling with each multiplication\n", "\n", "\n", "The following papers attempt to make the scheme more efficient:\n", "- [Public key compression for this scheme paper](https://eprint.iacr.org/2011/440.pdf)\n", "- [FHE over integers with shorter public keys](https://eprint.iacr.org/2011/441.pdf)" ] }, { "cell_type": "markdown", "id": "5c556f4e-476d-4143-af63-3c39003b426a", "metadata": {}, "source": [ "## Batching \n", "\n", "[Batch FHE over the integers](https://www.iacr.org/archive/eurocrypt2013/78810313/78810313.pdf)" ] }, { "cell_type": "markdown", "id": "49501d34-cdb6-422d-92a7-38bf52f6dd1d", "metadata": {}, "source": [ "## Making the scheme FHE\n", "\n", "In this transformation, we add to the public key some extra information about the secret key, and use this extra information to “post process” the ciphertext. The post-processed ciphertext can be decrypted more efficiently than the original ciphertext, thus making the scheme bootstrappable. \n", "\n", "We pay for this saving by having a larger ciphertext, and also by introducing another hardness assumption (basically assuming that the extra information in the public key does not help an attacker break the scheme).\n", "\n", "More details can be found in the [paper](https://eprint.iacr.org/2009/616.pdf)" ] }, { "cell_type": "markdown", "id": "de93127f-222a-4304-bfb4-4d98656d45d2", "metadata": {}, "source": [ "# Code" ] }, { "cell_type": "code", "execution_count": 116, "id": "0f296a3a-d9f1-45aa-adbf-3813825043b1", "metadata": {}, "outputs": [], "source": [ "import math\n", "import random\n", "\n", "import numpy as np\n", "from Crypto.Util.number import getPrime, getRandomInteger, getRandomNBitInteger\n", "from tqdm.notebook import tqdm" ] }, { "cell_type": "code", "execution_count": 91, "id": "d7730784-7839-4d7e-a986-c28b9fd4c84d", "metadata": {}, "outputs": [], "source": [ "# Utils\n", "def randodd(n: int) -> int:\n", " \"\"\"Generate n bit long odd number\"\"\"\n", " return 2 * (2 ** (n - 2) + random.randint(0, 2 ** (n - 2)) - 1) + 1\n", "\n", "\n", "def quot_near(a: int, b: int) -> int:\n", " \"\"\"nearest integer to a / b\"\"\"\n", " return (2 * a + b) // (2 * b)\n", "\n", "\n", "def mod_near(a: int, b: int) -> int:\n", " \"\"\"r = a mod b with r in [-b/2, b/2]\"\"\"\n", " return a - b * quot_near(a, b)" ] }, { "cell_type": "markdown", "id": "5d734b9b-c86b-48f1-9de8-8c9d289d3849", "metadata": {}, "source": [ "## SHE Scheme" ] }, { "cell_type": "code", "execution_count": 92, "id": "b0ed2c4e-491b-4372-8839-0bb1fe91c290", "metadata": {}, "outputs": [], "source": [ "def generate_x(p: int, gamma: int, rho: int) -> int:\n", " # 2**gamma // p - 1 in bits is gamma - eta becuase p is eta bits long\n", " q = randodd(gamma - eta)\n", " # q = random.randint(0, 2**gamma // p - 1)\n", " r = random.randint(-(2**rho) + 1, 2**rho - 1)\n", " x = p * q + r\n", " return x\n", "\n", "\n", "def generate_pk(p: int, gamma: int, rho: int, tau: int, trials=100) -> list[int]:\n", " while True:\n", " xs = [generate_x(p, gamma, rho) for _ in range(tau)]\n", " xs.sort(reverse=True)\n", " mod_x0 = mod_near(xs[0], p)\n", " if xs[0] % 2 == 1 and mod_x0 % 2 == 0:\n", " break\n", " return xs" ] }, { "cell_type": "code", "execution_count": 93, "id": "119b8638-d9a7-49db-9162-c0152e1a4c09", "metadata": {}, "outputs": [], "source": [ "def encrypt_sk(m: int, p: int, gamma: int, eta: int, rho: int) -> int:\n", " q = random.randint(1, 2 ** (gamma - eta - 1))\n", " r = random.randint(-(2**rho) + 1, 2**rho - 1)\n", " c = p * q + 2 * r + m\n", " return c\n", "\n", "\n", "def encrypt_pk(m: int, xs: list[int], rho_: int) -> int:\n", " r = random.randint(-(2**rho_) + 1, 2**rho_ - 1)\n", " x0 = xs[0]\n", "\n", " # Select a random subset\n", " selected = [random.choice([True, False]) for _ in range(len(xs) - 1)]\n", "\n", " # Compute the sum of the selected subset\n", " sum_selected = 0\n", " for xi, t in zip(xs, selected):\n", " if t:\n", " sum_selected = (sum_selected + xi) % x0\n", "\n", " c = (m + 2 * r + 2 * sum_selected) % x0\n", "\n", " return c\n", "\n", "\n", "def decrypt(c: int, p: int) -> int:\n", " # return (c % p) % 2\n", " return mod_near(c, p) % 2" ] }, { "cell_type": "code", "execution_count": 235, "id": "8496dcd4-a13b-46a7-a84a-7447feb98c1f", "metadata": {}, "outputs": [], "source": [ "lam = 42\n", "eta = 988 # Secret key p bit length\n", "rho = 26 # bit length of the noise\n", "rho_ = 42 # Secondary noise parameter\n", "\n", "gamma = 147456 # Bit length of integers in the public key\n", "tau = 158 # number of integers in the public key" ] }, { "cell_type": "code", "execution_count": 95, "id": "635e92f1-3c0e-402f-a862-e03022d53d6a", "metadata": {}, "outputs": [], "source": [ "p = getPrime(eta)" ] }, { "cell_type": "code", "execution_count": 112, "id": "912dc71a-92b2-4c28-9c82-1eea59e9d61a", "metadata": {}, "outputs": [], "source": [ "m0 = 0\n", "m1 = 1\n", "\n", "# Generate secret key encrypions\n", "c0 = encrypt_sk(m0, p, gamma, eta, rho)\n", "c1 = encrypt_sk(m1, p, gamma, eta, rho)" ] }, { "cell_type": "code", "execution_count": 113, "id": "aa73e195-fc44-4d01-9898-878a822769ef", "metadata": {}, "outputs": [], "source": [ "# basic decryptions\n", "assert decrypt(c0, p) == 0\n", "assert decrypt(c1, p) == 1\n", "\n", "# Homomorphic addition\n", "assert decrypt(c0 + c0, p) == 0\n", "assert decrypt(c0 + c1, p) == 1\n", "assert decrypt(c1 + c1, p) == 0\n", "\n", "# Homomorphic multiplication\n", "assert decrypt(c0 * c0, p) == 0\n", "assert decrypt(c0 * c1, p) == 0\n", "assert decrypt(c1 * c1, p) == 1" ] }, { "cell_type": "code", "execution_count": 114, "id": "f66ccd7a-464a-41dd-92f6-2fc777045c0f", "metadata": {}, "outputs": [], "source": [ "# Where does it break?\n", "n_iter = 0\n", "c_sum = c0\n", "while True:\n", " n_iter += 1\n", " c_sum = c_sum * c1\n", " if decrypt(c_sum, p) != 0:\n", " break" ] }, { "cell_type": "code", "execution_count": 115, "id": "7f58ecbc-d759-4bb5-b94b-7ffebfb7c63b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "38" ] }, "execution_count": 115, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Scheme didnt work after n_iter\n", "n_iter" ] }, { "cell_type": "code", "execution_count": 107, "id": "87bcf2fa-1e49-42f3-b55a-50775bb0e962", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "22.363636363636363" ] }, "execution_count": 107, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(eta - 4) / (rho_ + 2)" ] }, { "cell_type": "code", "execution_count": 175, "id": "6579579e-3298-4a12-86b4-2ff9e95ca1f4", "metadata": {}, "outputs": [], "source": [ "m0 = 0\n", "m1 = 1\n", "\n", "# Generate public key.\n", "xs = generate_pk(p, gamma, rho, tau)\n", "\n", "# Generate public key encryptions\n", "c0 = encrypt_pk(m0, xs, rho_)\n", "c1 = encrypt_pk(m1, xs, rho_)" ] }, { "cell_type": "code", "execution_count": 176, "id": "bdcbff90-4cee-4e14-b53d-8bf331ad5401", "metadata": {}, "outputs": [], "source": [ "# basic decryptions\n", "assert decrypt(c0, p) == 0\n", "assert decrypt(c1, p) == 1\n", "\n", "# Homomorphic addition\n", "assert decrypt(c0 + c0, p) == 0\n", "assert decrypt(c0 + c1, p) == 1\n", "assert decrypt(c1 + c1, p) == 0\n", "\n", "# Homomorphic multiplication\n", "assert decrypt(c0 * c0, p) == 0\n", "assert decrypt(c0 * c1, p) == 0\n", "assert decrypt(c1 * c1, p) == 1" ] }, { "cell_type": "code", "execution_count": 177, "id": "4891811e-680a-4346-a6c0-b50d8d79346b", "metadata": {}, "outputs": [], "source": [ "# Where does it break?\n", "n_iter = 0\n", "c_sum = c0\n", "while True:\n", " n_iter += 1\n", " c_sum = c_sum * c1\n", " if decrypt(c_sum, p) != 0:\n", " break" ] }, { "cell_type": "code", "execution_count": 178, "id": "cb381bc1-aeef-4d8f-a3fe-06cb77c6dd58", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "27" ] }, "execution_count": 178, "metadata": {}, "output_type": "execute_result" } ], "source": [ "n_iter" ] }, { "cell_type": "code", "execution_count": 227, "id": "63e511e4-34a1-427a-99ce-eb6be546b23e", "metadata": {}, "outputs": [], "source": [ "# Test the bound\n", "\n", "# Consider the following poly:\n", "# f(c0, c1) = a0 * c0**d + a1 * c1\n", "\n", "a0 = 10\n", "a1 = 20\n", "coef_norm = a0 + a1\n", "d = 24 # Degree of poly\n", "\n", "# Evaluate poly\n", "c_ = a0 * c0**d + a1 * c1\n", "m_ = (a0 * m0**d + a1 * m1) % 2" ] }, { "cell_type": "code", "execution_count": 228, "id": "dc3e9bca-596d-4c2b-bddd-65d3b7a77030", "metadata": {}, "outputs": [], "source": [ "def degree_bound(eta: int, rho_: int, coef_norm: int):\n", " return (eta - 4 - math.log2(coef_norm)) / (rho_ + 2)" ] }, { "cell_type": "code", "execution_count": 229, "id": "852e4017-17c3-4a9e-afa3-c787995cde26", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "22.25211612282708" ] }, "execution_count": 229, "metadata": {}, "output_type": "execute_result" } ], "source": [ "degree_bound(eta, rho_, coef_norm)" ] }, { "cell_type": "code", "execution_count": 230, "id": "3bc6f7ee-7e41-4c33-ac3d-49d74c3a05e4", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 230, "metadata": {}, "output_type": "execute_result" } ], "source": [ "decrypt(c_, p) == m_" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.4" } }, "nbformat": 4, "nbformat_minor": 5 }