{ "cells": [ { "cell_type": "markdown", "metadata": { "toc": "true" }, "source": [ "# Table of Contents\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Mémoïsation, en Python et en OCaml\n", "\n", "Ce document montre deux exemples d'implémentations d'un procédé générique (mais basique) de [mémoïsation](https://fr.wikipedia.org/wiki/M%C3%A9mo%C3%AFsation) en [Python](https://python.org/) et en [OCaml](https://ocaml.org/)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## En Python" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exemples de fonctions à mémoïser\n", "On commence avec des fonctions inutilement lentes :" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from time import sleep" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def f1(n):\n", " sleep(3)\n", " return n + 3\n", "\n", "def f2(n):\n", " sleep(4)\n", " return n * n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3 s ± 1.05 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n" ] } ], "source": [ "%timeit f1(10)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4 s ± 1.25 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n" ] } ], "source": [ "%timeit f2(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Mémoïsation générique, non typée\n", "C'est étrangement court !" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "def memo(f):\n", " memoire = {} # dictionnaire vide, {} ou dict()\n", " def memo_f(n): # nouvelle fonction\n", " if n not in memoire: # verification\n", " memoire[n] = f(n) # stockage\n", " return memoire[n] # lecture\n", " return memo_f # ==> f memoisée !" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Essais" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3 secondes...\n", "13\n", "0 secondes !\n", "13\n", "3 secondes...\n", "13\n", "3 secondes...\n", "13\n" ] } ], "source": [ "memo_f1 = memo(f1)\n", "\n", "print(\"3 secondes...\")\n", "print(memo_f1(10)) # 13, 3 secondes après\n", "print(\"0 secondes !\")\n", "print(memo_f1(10)) # instantanné !\n", "\n", "# différent de ces deux lignes !\n", "\n", "print(\"3 secondes...\")\n", "print(memo(f1)(10))\n", "print(\"3 secondes...\")\n", "print(memo(f1)(10)) # 3 secondes aussi !" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "122 ns ± 5.67 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)\n" ] } ], "source": [ "%timeit memo_f1(10) # instantanné !" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Et :" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4 secondes...\n", "100\n", "0 secondes !\n", "100\n" ] } ], "source": [ "memo_f2 = memo(f2)\n", "\n", "print(\"4 secondes...\")\n", "print(memo_f2(10)) # 100, 4 secondes après\n", "print(\"0 secondes !\")\n", "print(memo_f2(10)) # instantanné !" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "120 ns ± 3.43 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)\n" ] } ], "source": [ "%timeit memo_f2(10) # instantanné !" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Mémoïsation générique et typée\n", "Ce n'est pas tellement plus compliquée de typer la mémoïsation." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "def memo_avec_type(f):\n", " memoire = {} # dictionnaire vide, {} ou dict()\n", " def memo_f_avec_type(n):\n", " if (type(n), n) not in memoire:\n", " memoire[(type(n), n)] = f(n)\n", " return memoire[(type(n), n)]\n", " return memo_f_avec_type" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Avantage, on obtiens un résultat plus cohérent \"au niveau de la reproducibilité des résultats\", par exemple :" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "def fonction_sur_entiers_ou_flottants(n):\n", " if isinstance(n, int):\n", " return 'Int'\n", " elif isinstance(n, float):\n", " return 'Float'\n", " else:\n", " return '?'" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Int\n", "Float\n", "?\n" ] } ], "source": [ "test0 = fonction_sur_entiers_ou_flottants\n", "print(test0(1))\n", "print(test0(1.0)) # résultat correct !\n", "print(test0(\"1\"))" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Int\n", "Int\n", "?\n" ] } ], "source": [ "test1 = memo(fonction_sur_entiers_ou_flottants)\n", "print(test1(1))\n", "print(test1(1.0)) # résultat incorrect !\n", "print(test1(\"1\"))" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Int\n", "Float\n", "?\n" ] } ], "source": [ "test2 = memo_avec_type(fonction_sur_entiers_ou_flottants)\n", "print(test2(1))\n", "print(test2(1.0)) # résultat correct !\n", "print(test2(\"1\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bonus : on peut utiliser la syntaxe d'un décorateur en Python" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Test de fibo() non mémoisée :\n", "F_0 = 1\n", "F_1 = 1\n", "F_2 = 2\n", "F_3 = 3\n", "F_4 = 5\n", "F_5 = 8\n", "F_6 = 13\n", "F_7 = 21\n", "F_8 = 34\n", "F_9 = 55\n" ] } ], "source": [ "def fibo(n):\n", " if n <= 1: return 1\n", " else: return fibo(n-1) + fibo(n-2)\n", "\n", "print(\"Test de fibo() non mémoisée :\")\n", "for n in range(10):\n", " print(\"F_{} = {}\".format(n, fibo(n)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Cette fonction récursive est terriblement lente !" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3.15 s ± 74.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n" ] } ], "source": [ "%timeit fibo(35)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Test de fibo() mémoisée (plus rapide) :\n", "F_0 = 1\n", "F_1 = 1\n", "F_2 = 2\n", "F_3 = 3\n", "F_4 = 5\n", "F_5 = 8\n", "F_6 = 13\n", "F_7 = 21\n", "F_8 = 34\n", "F_9 = 55\n" ] } ], "source": [ "# version plus rapide !\n", "@memo\n", "def fibo2(n):\n", " if n <= 1: return 1\n", " else: return fibo2(n-1) + fibo2(n-2)\n", "\n", "print(\"Test de fibo() mémoisée (plus rapide) :\")\n", "for n in range(10):\n", " print(\"F_{} = {}\".format(n, fibo2(n)))" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "118 ns ± 2.59 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)\n" ] } ], "source": [ "%timeit fibo2(35)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Autre exemple, ou le gain de temps est moins significatif." ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Test de factorielle() non mémoisée :\n", "0! = 0\n", "1! = 1\n", "2! = 2\n", "3! = 6\n", "4! = 24\n", "5! = 120\n", "6! = 720\n", "7! = 5040\n", "8! = 40320\n", "9! = 362880\n" ] } ], "source": [ "def factorielle(n):\n", " if n <= 0: return 0\n", " elif n == 1: return 1\n", " else: return n * factorielle(n-1)\n", "\n", "print(\"Test de factorielle() non mémoisée :\")\n", "for n in range(10):\n", " print(\"{}! = {}\".format(n, factorielle(n)))" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4.53 µs ± 214 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n" ] } ], "source": [ "%timeit factorielle(30)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Test de factorielle() mémoisée :\n", "0! = 0\n", "1! = 1\n", "2! = 2\n", "3! = 6\n", "4! = 24\n", "5! = 120\n", "6! = 720\n", "7! = 5040\n", "8! = 40320\n", "9! = 362880\n" ] } ], "source": [ "@memo\n", "def factorielle2(n):\n", " if n <= 0: return 0\n", " elif n == 1: return 1\n", " else: return n * factorielle2(n-1)\n", "\n", "print(\"Test de factorielle() mémoisée :\")\n", "for n in range(10):\n", " print(\"{}! = {}\".format(n, factorielle2(n)))" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "124 ns ± 6.32 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)\n" ] } ], "source": [ "%timeit factorielle2(30)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Conclusion\n", "En Python, c'est facile, avec des dictionnaires génériques et une syntaxe facilitée avec un décorateur.\n", "\n", "Bonus : ce décorateur est dans la [bibliothèque standard](https://docs.python.org/3/library/functools.html#functools.lru_cache) dans le [module functools](https://docs.python.org/3/library/functools.html) !" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "from functools import lru_cache # lru = least recently updated" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Test de fibo() mémoisée avec functools.lru_cache (plus rapide) :\n", "F_0 = 1\n", "F_1 = 1\n", "F_2 = 2\n", "F_3 = 3\n", "F_4 = 5\n", "F_5 = 8\n", "F_6 = 13\n", "F_7 = 21\n", "F_8 = 34\n", "F_9 = 55\n" ] } ], "source": [ "@lru_cache(maxsize=None)\n", "def fibo3(n):\n", " if n <= 1: return 1\n", " else: return fibo3(n-1) + fibo3(n-2)\n", "\n", "print(\"Test de fibo() mémoisée avec functools.lru_cache (plus rapide) :\")\n", "for n in range(10):\n", " print(\"F_{} = {}\".format(n, fibo3(n)))" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "115 ns ± 2.83 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)\n" ] } ], "source": [ "%timeit fibo2(35)" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "86.6 ns ± 1.62 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)\n" ] } ], "source": [ "%timeit fibo3(35)" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "117 ns ± 1.44 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)\n" ] } ], "source": [ "%timeit fibo2(70)" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "86.2 ns ± 1.12 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)\n" ] } ], "source": [ "%timeit fibo3(70)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(On obtient presque les mêmes performances que notre implémentation manuelle)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "## En OCaml\n", "\n", "Je traite exactement les mêmes exemples.\n", "\n", "> J'expérimente l'utilisation de deux kernels Jupyter différents pour afficher des exemples de codes écrits dans deux langages dans le même notebook... Ce n'est pas très propre mais *ça marche*." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Préliminaires\n", "\n", "Quelques fonctions nécessaires pour ces exemples :" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val print : ('a, Format.formatter, unit) format -> 'a =