{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "2LtjTw3rU7jX" }, "source": [ "# Problem 5\n", "Let $ f(z) = 1 / \\Gamma(z)$ where $ \\Gamma(z) $ is the gamma function, and let $p(z)$ be the cubic polynomial that best approximates $f(z)$ on the unit disk in the supremum norm $\\Vert{\\cdot}\\Vert_{\\infty}$. What is $\\Vert{f-p}\\Vert_{\\infty}$?" ] }, { "cell_type": "markdown", "metadata": { "id": "bhKq_qnBWtQb" }, "source": [ "### Some mathematical preliminaries\n", "\n", "Recall that the supremum norm of a function $f$ defined on domain $D$ is defined by $\\Vert{f}\\Vert_\\infty := \\max_{x \\in D} |f(x)| $. \n", "\n", "If we let our optimal polynomial $p(x) := ax^3 + bx^2 + cx + d $, we have that $$ \\epsilon := \\Vert f-p \\Vert_\\infty = \\max_{x \\in D} |f(x)-p(x)| $$ \n", "\n", "where $D$ is the unit disk.\n", "\n", "Now, note that because $1/\\Gamma(x)$ and $p(x)$ are entire functions, we have that, by the [maximum principle](https://en.wikipedia.org/wiki/Maximum_modulus_principle) that the maximum lies on the boundary of $D$, i.e. the unit circle.\n", "\n", "We therefore have \n", "$$\\epsilon = \\min_{a,b,c,d \\in \\mathbb{R}} \\max_{\\theta \\in [0,2 \\pi)} \\left|\\frac{1}{\\Gamma(e^{i\\theta})}- \\left(a e^{3i\\theta} + b e^{2i\\theta} + c e^{i \\theta} + d \\right) \\right| $$\n", "\n", "### Brute force\n", "Firstly, we attempt out-of-the-box minimization routines. For an initial guess, we use the coefficients of the Maclaurin series of $f(z)$, namely:\n", "\n", "$$ f(z) \\approx f_0(z) = f(0) + f'(0) z + \\frac{f''(0)}{2!} z^2 + \\frac{f'''(0)}{3!} z^3$$\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/latex": [ "$\\displaystyle z + \\gamma z^{2} + z^{3} \\left(- \\frac{\\pi^{2}}{12} + \\frac{\\gamma^{2}}{2}\\right) + O\\left(z^{4}\\right)$" ], "text/plain": [ "z + EulerGamma*z**2 + z**3*(-pi**2/12 + EulerGamma**2/2) + O(z**4)" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import sympy\n", "from sympy import gamma as sp_gamma\n", "from sympy.abc import z\n", "\n", "sympy.series(1/sp_gamma(z), z, n=4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have that:\n", "\n", "$$(a_0, b_0, c_0, d_0) = \\left( \\frac{\\gamma^2}{2} - \\frac{\\pi^2}{12} , \\gamma, 1, 0\\right) \\approx (-0.655878, 0.577216, 1, 0) $$\n", "\n", "We first use [differential evolution](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.differential_evolution.html) to find a rough global minimum of the error:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ " fun: 0.2143352172675715\n", " jac: array([0.86039537, 0.64650538, 0.09199849, 0.63088437])\n", " message: 'Optimization terminated successfully.'\n", " nfev: 7550\n", " nit: 121\n", " success: True\n", " x: array([-0.60337732, 0.62514893, 1.01969876, 0.0055078 ])" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from scipy.special import gamma\n", "from scipy.optimize import differential_evolution, minimize\n", "from numpy import cos, sin, pi\n", "from tqdm import tqdm_notebook as tqdm\n", "import numpy as np\n", "\n", "cis = lambda t: cos(t) + 1j*sin(t)\n", "\n", "a0, b0, c0, d0 = (-0.655878, 0.577216, 1, 0)\n", "\n", "g = lambda a,b,c,d: lambda t: abs( a*cis(3*t) + b*cis(2*t) + c*cis(t) + d - 1 / gamma(cis(t)))\n", "\n", "def fn_to_minimize(args, N=1000):\n", " # print(args)\n", " a,b,c,d = args\n", " X = np.linspace(0, 2*pi, N)\n", " Y = g(a,b,c,d)(X)\n", " return Y.max()\n", "\n", "rough_min = differential_evolution(fn_to_minimize, \n", " bounds = [(-1.1,1.1) for i in range(4)], \n", " tol=1e-6,\n", " x0=(a0,b0,c0,d0),\n", " seed = 42)\n", "\n", "rough_min" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To refine this estimate, we then use [local minimization techniques](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html) to get a better answer:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/tmp/ipykernel_11995/992212994.py:17: TqdmDeprecationWarning: This function will be removed in tqdm==5.0.0\n", "Please use `tqdm.notebook.tqdm` instead of `tqdm.tqdm_notebook`\n", " iterations = tqdm()\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "71d85a3cd28342d9bb55d0bc6e985109", "version_major": 2, "version_minor": 0 }, "text/plain": [ "0it [00:00, ?it/s]" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "minimum F seen: [-0.60337732 0.62514893 1.01969876 0.0055078 ] 0.2143355714038389\n", "minimum F seen: [-0.60337732 0.62514893 1.01969878 0.0055078 ] 0.21433556169152873\n", "minimum F seen: [-0.60337732 0.62514893 1.01969876 0.00550781] 0.2143355586581085\n", "minimum F seen: [-0.60337752 0.62514878 1.01969892 0.00550799] 0.21433538060117682\n", "minimum F seen: [-0.60337752 0.62514878 1.01969892 0.00550799] 0.21433538060117652\n", "minimum F seen: [-0.60337752 0.62514878 1.01969892 0.00550799] 0.2143353806011763\n" ] }, { "data": { "text/plain": [ " fun: 0.21433538060117682\n", " hess_inv: array([[ 1.06212648, 0.14313859, 0.29009927, 0.16611956],\n", " [ 0.14313859, 1.17835123, 0.14218146, 0.02721326],\n", " [ 0.29009927, 0.14218146, 0.52621387, -0.45964763],\n", " [ 0.16611956, 0.02721326, -0.45964763, 0.60953961]])\n", " jac: array([0.24456594, 0.4963477 , 0.90874575, 0.4963477 ])\n", " message: 'Desired error not necessarily achieved due to precision loss.'\n", " nfev: 272\n", " nit: 1\n", " njev: 52\n", " status: 2\n", " success: False\n", " x: array([-0.60337752, 0.62514878, 1.01969892, 0.00550799])" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from tqdm import tqdm_notebook as tqdm\n", "\n", "def maximize_function(F, a, b):\n", " F = np.vectorize(F)\n", " i = 0\n", " while b-a > 1e-16:\n", " X = np.linspace(a, b, 1000)\n", " Y = F(X)\n", " X_max = X[Y.argmax()]\n", " a, b = X_max - (b-a)/3, X_max + (b-a)/3\n", " if (a == X_max - (b-a)/3): break\n", " i += 1\n", " # print(i)\n", " return (F(a) + F(b)) / 2\n", "\n", "min_F_seen = 10000\n", "iterations = tqdm()\n", "\n", "def fn_to_minimize2(args):\n", " global min_F_seen, iterations\n", " a,b,c,d = args\n", " F = g(a,b,c,d)\n", " out = maximize_function(F, 0, 2*pi)\n", " if out < min_F_seen:\n", " print(\"minimum F seen: \", args, out)\n", " min_F_seen = out\n", " iterations.update()\n", " return out\n", "\n", "minimize(fn_to_minimize2, x0 = rough_min.x, tol=1e-12) # 0.2143352345" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With these techniques, we are able to get 6 accurate digits. Note that this is especially difficult because we are trying to find the minimum of a maximum of a function - therefore, any numerical error will get compounded rapidly" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Better maximization\n", "\n", "We next follow the solution submitted by [Kim McInturff and Peter Simon](https://web.archive.org/web/20040131081114/http://www.vcnet.com/~simonp/solutions.pdf). Namely, this approach gets rid of the numerical instability of calculating the inner maximum above.\n", "\n", "Let $f(z) = \\sum_{k=0}^N c_k z^k$ and $p(z) = \\sum_{k=0}^3 a_k z^k $. We therefore have that the error $ E(z) = f(z) - p(z) = \\sum_{k=0}^N E_k z^k $ is a polynomial in $z$.\n", "\n", "We find the maximum of $|E(z)|^2$ analytically (and therefore $|E(z)|$ since it is a non-negative quantity). We have: \n", "\n", "$$ \\begin{align*}\n", "|E(z)|^2 &= E(z)\\overline{E(z)} \\\\\n", "&= \\left(\\sum_{k=0}^N E_k z^k \\right) \\overline{\\left( \\sum_{k=0}^N E_k z^k \\right)} \\\\\n", "&= \\left(\\sum_{k=0}^N E_k z^k \\right) \\left( \\sum_{k=0}^N E_k \\left(\\overline{z}\\right)^k \\right) \\\\\n", "&= \\sum_{0\\le p,q \\le N} E_p E_q z^p \\left(\\overline{z}\\right)^q \\\\\n", "&= \\sum_{0\\le p,q \\le N} E_p E_q e^{(p-q)i\\theta} & (z \\mapsto e^{i\\theta}) \n", "\\end{align*}\n", "$$\n", "\n", "Therefore, a maximum must occur when $\\frac{d}{d\\theta} |E(z)|^2 = 0$. Taking the derivative, we have:\n", "\n", "$$ \\begin{align*}\n", "\\frac{d}{d\\theta} |E(z)|^2 &= \\frac{d}{d\\theta} \\left( \\sum_{0\\le p,q \\le N} E_p E_q e^{(p-q)i\\theta} \\right) \\\\\n", "&= \\sum_{0\\le p,q \\le N} E_p E_q \\left(\\frac{d}{d\\theta} e^{(p-q)i\\theta} \\right) \\\\\n", "&= \\sum_{0\\le p,q \\le N} E_p E_q \\left(i (p-q) e^{(p-q)i\\theta}\\right) \\\\\n", "&= \\sum_{0\\le p,q \\le N} E_p E_q \\left(i (p-q) z^{p-q}\\right) & (e^{i\\theta} \\mapsto z)\\\\\n", "\\end{align*}\n", "$$ \n", "\n", "We therefore have:\n", "\n", "$$ \\frac{d}{d\\theta} |E(z)|^2 = 0 \\iff \\sum_{0\\le p,q \\le N} E_p E_q (p-q) z^{p-q} = 0$$\n", "\n", "Note that this series has some negative coefficients of $z^i$, so we multiply through by $z^N$. We can solve for the roots of this polynomial $P(z) \\equiv \\sum_{0\\le p,q \\le N} E_p E_q (p-q) z^{N+p-q} $ for the critical points of $|E(z)|^2$ and then locate the maximum. This avoids the floating point issues of finding the maximum we were dealing with earlier.\n", "\n", "We find the coefficients $c_k$ for $f(z)$ via [the following recurrence](https://en.wikipedia.org/wiki/Reciprocal_gamma_function#Taylor_series):\n", "\n", "$$\\begin{cases}\n", "c_0, c_1, c_2 = 0, 1, \\gamma \\\\\n", "c_n = \\frac{c_2 c_{n-1} + \\sum_{j=2}^{n-1} (-1)^{j+1} \\zeta(j) c_{n-j}}{n-1}\n", "\\end{cases}\n", "$$" ] }, { "cell_type": "code", "execution_count": 134, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "new minimum: E([-0.61709254 0.63484243 1.04677988 0.01128302]) = 0.2354690226759797\n", "new minimum: E([-0.61392552 0.60879034 1.02295489 0.01537967]) = 0.23177731011348132\n", "new minimum: E([-0.6154001 0.61806856 1.00238181 -0.00354295]) = 0.22808068663267253\n", "new minimum: E([-0.6059147 0.61938119 1.02002916 0.00268617]) = 0.2207281019321637\n", "new minimum: E([-0.60481986 0.62270918 1.01769605 0.00498882]) = 0.2148275797479694\n", "new minimum: E([-0.60376176 0.62427757 1.01880487 0.0051676 ]) = 0.21440289839988855\n", "new minimum: E([-0.60325694 0.62537509 1.0199267 0.00562997]) = 0.21433660211622288\n", "new minimum: E([-0.60328167 0.62532152 1.01987165 0.00560401]) = 0.21433556274375193\n", "new minimum: E([-0.60334874 0.62520381 1.01975375 0.00553643]) = 0.2143352441053902\n", "new minimum: E([-0.60334201 0.62521689 1.01976683 0.00554315]) = 0.21433523787921516\n", "new minimum: E([-0.60334067 0.62521834 1.01976828 0.0055445 ]) = 0.2143352362675276\n", "new minimum: E([-0.60334324 0.62521188 1.01976182 0.00554194]) = 0.2143352345948142\n", "new minimum: E([-0.60334318 0.62521201 1.01976195 0.00554199]) = 0.21433523459104564\n", "new minimum: E([-0.60334322 0.62521191 1.01976185 0.00554195]) = 0.21433523459059878\n" ] }, { "data": { "text/plain": [ " fun: 0.21433523459048126\n", " message: 'Optimization terminated successfully.'\n", " nfev: 17145\n", " nit: 283\n", " success: True\n", " x: array([-0.60334323, 0.6252119 , 1.01976184, 0.00554194])" ] }, "execution_count": 134, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import mpmath\n", "import numpy as np\n", "from numpy.polynomial import Polynomial\n", "from collections import defaultdict\n", "\n", "N = 30\n", "\n", "mpmath.mp.dps = 50\n", "\n", "gamma = mpmath.mp.euler\n", "coeffs = [0, 1, gamma]\n", "for n in range(3, N+1):\n", " next_coeff = (coeffs[2]*coeffs[n-1] + sum([pow(-1, j+1)*mpmath.zeta(j)*coeffs[n-j] for j in range(2,n)])) / (n-1)\n", " coeffs.append(next_coeff)\n", "f_coeffs = [np.longdouble(str(i)) for i in coeffs]\n", "\n", "a0, b0, c0, d0 = f_coeffs[:4][::-1]\n", "\n", "min_F_seen = 10000\n", "DIGITS = 0\n", "\n", "def fn_to_minimize3(args):\n", " global min_F_seen, DIGITS\n", " p = list(args[::-1]) + [0 for i in range(N-3)]\n", " # Calculate coefficients of error\n", " E_coeffs = [f_i-p_i for f_i,p_i in zip(f_coeffs,p)]\n", " # Calculate p\n", " P_arr = np.zeros(2*N+1, dtype=float)\n", " for p in range(N+1):\n", " for q in range(N+1):\n", " P_arr[N+p-q] += E_coeffs[p]*E_coeffs[q]*(p-q)\n", " # Get roots of P\n", " p = Polynomial(P_arr)\n", " roots = p.roots()\n", " # Get max from critical points\n", " E = lambda z: sum([coeff*pow(z,n) for n,coeff in enumerate(E_coeffs)])\n", " potential_maxs = [ abs(E(r)) for r in roots if 0.95<=abs(r)<=1.05]\n", " out = max(potential_maxs)\n", " if out < min_F_seen:\n", " if abs(min_F_seen - out) <= pow(10, -(DIGITS+1)):\n", " print(f\"new minimum: E({args}) = {min_F_seen}\")\n", " DIGITS += 1\n", " min_F_seen = out\n", " return out\n", "\n", "differential_evolution(fn_to_minimize3, \n", " bounds=[(-1.1,1.1) for i in range(4)], \n", " x0=(a0,b0,c0,d0), \n", " tol=1e-12,\n", " seed=42)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With this approach, we are able to get more than 10 digits of the answer.\n", "\n", "### Addendum\n", "\n", "Note that this problem is a general case of finding the best approximation to a function in the supremum norm - in such cases, usually the [Remez algorithm](https://en.wikipedia.org/wiki/Remez_algorithm) is used. However, I was unable to find a Python implementation of this algorithm and my own implementation of the algorithm failed to work." ] } ], "metadata": { "colab": { "authorship_tag": "ABX9TyMeneRFVZdwa0IPauLZJO+E", "name": "Problem 5", "version": "" }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" } }, "nbformat": 4, "nbformat_minor": 1 }