{ "cells": [ { "cell_type": "markdown", "metadata": { "toc": "true" }, "source": [ "# Table of Contents\n", "

1  BitSets en OCaml
1.1  Type abstrait
1.2  Implémentation
1.3  Exemples
1.3.1  Tests des opérations unaires
1.3.2  Tests des opérations binaires
1.4  Comparaison
1.4.1  Mesure un temps d'éxecution
1.4.2  Suite de test pour bit_set_32
1.4.3  Suite de test pour bool array
1.4.4  Suite de test pour Set.Make(Int)
1.4.5  Mesurer les temps d'exécution
1.5  Conclusion
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# BitSets en OCaml\n", "\n", "Let *BitSets* sont une implémentation efficace pour représenter, stocker et manipuler des ensembles de *petits* entiers.\n", "\n", "Avec un seul entier 32 bits, on peut représenter *tous* les ensembles d'entiers de $\\{0,\\dots,31\\}$.\n", "\n", "On utilise les [opérations bit à bit (\"bitwise\")](caml.inria.fr/pub/docs/manual-ocaml/libref/Pervasives.html#VALmin_int) de OCaml pour effectuer les opérations de base sur les ensembles.\n", "\n", "Cette implémentation est inspiré de [celle ci](http://ocaml-lib.sourceforge.net/doc/BitSet.html)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Type abstrait\n", "Je vais faire ça à l'arrache, mais on pourrait faire soit un module qui contient un type interne (caché), soit avec un champ d'enregistrement `{ n : int }`.\n", "\n", "Cette approche est la plus simple, mais montre un peu le fonctionnement interne (on pourrait vouloir l'éviter)." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "type bit_set_32 = int;;" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "let max_size_bit_set_32 = 32;;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Avec [Int64](http://caml.inria.fr/pub/docs/manual-ocaml/libref/Int64.html) on pourrait faire la même chose avec des entiers sur 64 bits. La flemme." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Implémentation\n", "Les ensembles ne seront pas mutables dynamiquement : comme une liste, toute fonction qui modifie un `bit_set_32` renvoit un nouveau `bit_set_32`." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "let empty () : bit_set_32 = 0\n", ";;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Le singleton $\\{i\\}$ est codé comme $2^i$ donc $1$ au bit en $i$ème position.\n", "En Python, c'est `1 << i` (ou `2**i`), et en OCaml c'est `1 lsl i`." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val create : int -> bit_set_32 = \n" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let create (i : int) : bit_set_32 =\n", " assert ((0 <= i) && (i < max_size_bit_set_32));\n", " 1 lsl i\n", ";;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Si on voulait utiliser la syntaxe Python, on pourrait faire ça :" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val ( << ) : int -> int -> int = \n" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "val ( >> ) : int -> int -> int = \n" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let ( << ) = ( lsl );;\n", "let ( >> ) = ( lsr );;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "La copie ne fait pas beaucoup de sens comme la stucture n'est pas mutable... Mais soit." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val copy : bit_set_32 -> bit_set_32 = \n" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "val clone : bit_set_32 -> bit_set_32 = \n" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let copy (b : bit_set_32) : bit_set_32 =\n", " b + 0 (* enough ? *)\n", ";;\n", "let clone = copy;;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`set b n` permet d'ajouter `n` dans l'ensemble `b`, et `unset b n` permet de l'enlever (peu importe s'il était présent ou non)." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val set : bit_set_32 -> int -> bit_set_32 = \n" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "val add : bit_set_32 -> int -> bit_set_32 = \n" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let set (b : bit_set_32) (n : int) : bit_set_32 =\n", " b lor (1 lsl n)\n", ";;\n", "\n", "let add = set;;" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val unset : bit_set_32 -> int -> bit_set_32 = \n" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "val remove : bit_set_32 -> int -> bit_set_32 = \n" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let unset (b : bit_set_32) (n : int) : bit_set_32 =\n", " b land (lnot (1 lsl n))\n", ";;\n", "\n", "let remove = unset;;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "On peut combiner `set` et `unset` pour faire :" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val put : bit_set_32 -> bool -> int -> bit_set_32 = \n" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let put (b : bit_set_32) (p : bool) (n : int) : bit_set_32 =\n", " if p then set b n else unset b n\n", ";;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ces deux autres opérations sont rapides :" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val is_set : bit_set_32 -> int -> bool = \n" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "val contains_int : bit_set_32 -> int -> bool = \n" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "val is_in : bit_set_32 -> int -> bool = \n" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let is_set (b : bit_set_32) (n : int) : bool =\n", " let i = (b land (1 lsl n)) lsr n in\n", " i = 1\n", ";;\n", "\n", "let contains_int = is_set;;\n", "let is_in = is_set;;" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val toggle : bit_set_32 -> int -> bit_set_32 = \n" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let toggle (b : bit_set_32) (n : int) : bit_set_32 =\n", " put b (not (is_set b n)) n\n", ";;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "La comparaison et le test d'égalité sont évidents." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val compare : bit_set_32 -> bit_set_32 -> int = \n" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let compare (b1 : bit_set_32) (b2 : bit_set_32) : int =\n", " Pervasives.compare b1 b2\n", ";;" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val equals : bit_set_32 -> bit_set_32 -> bool = \n" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let equals (b1 : bit_set_32) (b2 : bit_set_32) : bool =\n", " b1 = b2\n", ";;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "On peut chercher à être plus efficace que cette implémentation impérative et naïve. Essayez !" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val count : bit_set_32 -> int = \n" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "val length : bit_set_32 -> int = \n" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let count (b : bit_set_32) : int =\n", " let s = ref 0 in\n", " for n = 0 to max_size_bit_set_32 - 1 do\n", " if is_set b n then incr s\n", " done;\n", " !s\n", ";;\n", "\n", "let length = count;;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pour la création des exemples, on va aussi se permettre de convertir un `bit_set_32` en une `int list`, et inversement de créer un `bit_set_32` depuis une `int list`." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val bit_set_32_to_list : bit_set_32 -> int list = \n" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let bit_set_32_to_list (b : bit_set_32) : (int list) =\n", " let list_of_b = ref [] in\n", " for n = 0 to max_size_bit_set_32 - 1 do\n", " if is_set b n then\n", " list_of_b := n :: !list_of_b\n", " done;\n", " List.rev (!list_of_b)\n", ";;" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val bit_set_32_from_list : int list -> bit_set_32 = \n" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let bit_set_32_from_list (list_of_b : int list) : bit_set_32 =\n", " let b = ref (empty()) in\n", " List.iter (fun i -> b := add !b i) list_of_b;\n", " !b\n", ";;" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "ename": "error", "evalue": "compile_error", "output_type": "error", "traceback": [ "\u001b[32mFile \"[17]\", line 2, characters 17-40:\n\u001b[31mError: Unbound value bit_set_32_iter_to_list\n\u001b[36m 1: \u001b[30mlet bit_set_32_iter (f : int -> unit) (b : bit_set_32) : unit =\n\u001b[36m 2: \u001b[30m List.iter f (\u001b[4mbit_set_32_iter_to_list\u001b[0m\u001b[30m b)\n\u001b[36m 3: \u001b[30m;;\u001b[0m\n" ] } ], "source": [ "let bit_set_32_iter (f : int -> unit) (b : bit_set_32) : unit =\n", " List.iter f (bit_set_32_iter_to_list b)\n", ";;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pour l'affichage des exemples, on va aussi se permettre de convertir un `bit_set_32` en une `string`." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val bit_set_32_to_string : bit_set_32 -> string = \n" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let bit_set_32_to_string (b : bit_set_32) : string =\n", " let list_of_b = bit_set_32_to_list b in\n", " match List.length list_of_b with\n", " | 0 -> \"{}\"\n", " | 1 -> \"{\" ^ (string_of_int (List.hd list_of_b)) ^ \"}\"\n", " | _ -> begin\n", " let s = ref (\"{\" ^ (string_of_int (List.hd list_of_b))) in\n", " List.iter (fun i -> s := !s ^ \", \" ^ (string_of_int i)) (List.tl list_of_b);\n", " !s ^ \"}\"\n", " end\n", ";;" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val print_bit_set_32 : bit_set_32 -> unit = \n" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let print_bit_set_32 (b : bit_set_32) : unit =\n", " print_endline (bit_set_32_to_string b)\n", ";;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Comme le domaine est fixé (à $\\{0,\\dots,31\\}$), on peut prendre le complémentaire." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val complementaire : bit_set_32 -> bit_set_32 = \n" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let complementaire (b : bit_set_32) : bit_set_32 =\n", " lnot b\n", ";;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Les opérations intersection, union, différence et différence symétrique sont très faciles." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val intersection : bit_set_32 -> bit_set_32 -> bit_set_32 = \n" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let intersection (b1 : bit_set_32) (b2 : bit_set_32) : bit_set_32 =\n", " b1 land b2\n", ";;" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val union : bit_set_32 -> bit_set_32 -> bit_set_32 = \n" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let union (b1 : bit_set_32) (b2 : bit_set_32) : bit_set_32 =\n", " b1 lor b2\n", ";;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Avec l'union on peut facilement tester si `b1` est contenu dans `b2` ($b_1 \\subset b_2 \\equiv (b_1 \\cup b_2) = b_2$)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val contains : bit_set_32 -> bit_set_32 -> bool = \n" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let contains (b1 : bit_set_32) (b2 : bit_set_32) : bool =\n", " (union b1 b2) = b2\n", ";;" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val difference : bit_set_32 -> bit_set_32 -> bit_set_32 = \n" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let difference (b1 : bit_set_32) (b2 : bit_set_32) : bit_set_32 =\n", " intersection b1 (complementaire b2)\n", ";;" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val difference_sym : bit_set_32 -> bit_set_32 -> bit_set_32 = \n" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let difference_sym (b1 : bit_set_32) (b2 : bit_set_32) : bit_set_32 =\n", " b1 lxor b2\n", ";;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Exemples" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print_bit_set_32 (empty());;" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val b1 : bit_set_32 = 4097\n", "val b2 : bit_set_32 = 74\n", "val b3 : bit_set_32 = 131145\n" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "{0, 12}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "{1, 3, 6}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "{0, 3, 6, 17}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let b1 = bit_set_32_from_list [0; 12]\n", "and b2 = bit_set_32_from_list [1; 3; 6]\n", "and b3 = bit_set_32_from_list [0; 3; 6; 17]\n", ";;\n", "\n", "print_bit_set_32 b1;;\n", "print_bit_set_32 b2;;\n", "print_bit_set_32 b3;;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tests des opérations unaires" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Tester l'appartenance :" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "- : bool = false\n" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "- : bool = true\n" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "- : bool = true\n" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(is_in b1 3);;\n", "(is_in b2 3);;\n", "(is_in b3 3);;" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "- : bool = true\n" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "- : bool = false\n" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "- : bool = true\n" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(is_in b1 0);;\n", "(is_in b2 0);;\n", "(is_in b3 0);;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "On peut ajouter une valeur :" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{0, 12, 20}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "{1, 3, 6, 20}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "{0, 3, 6, 17, 20}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print_bit_set_32 (add b1 20);;\n", "print_bit_set_32 (add b2 20);;\n", "print_bit_set_32 (add b3 20);;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ou l'enlever :" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{0, 12}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "{1, 6}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "{0, 6, 17}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print_bit_set_32 (remove b1 3);;\n", "print_bit_set_32 (remove b2 3);;\n", "print_bit_set_32 (remove b3 3);;" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "- : int = 2\n" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "- : int = 3\n" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "- : int = 4\n" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "length b1;;\n", "length b2;;\n", "length b3;;" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "{0, 2, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "{1, 2, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "{2, 4, 5, 7, 8, 9, 10, 11, 13, 14, 15, 16, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print_bit_set_32 (complementaire b1);;\n", "print_bit_set_32 (complementaire b2);;\n", "print_bit_set_32 (complementaire b3);;\n", "\n", "print_bit_set_32 (complementaire (union (union b1 b2) b3));;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tests des opérations binaires" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "{0, 1, 3, 6, 12}\n", "{0, 3, 6, 12, 17}\n", "{0, 1, 3, 6, 17}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print_bit_set_32 (union b1 b2);;\n", "print_bit_set_32 (union b1 b3);;\n", "print_bit_set_32 (union b2 b3);;" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "- : bool = false\n" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "- : bool = false\n" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "- : bool = false\n" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "- : bool = true\n" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "- : bool = true\n" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "- : bool = false\n" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "contains b1 b2;;\n", "contains b1 b3;;\n", "contains b1 (intersection b1 b3);;\n", "contains (intersection b1 b3) b1;;\n", "contains b1 (union b1 b3);;\n", "contains b2 b3;;" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "{0}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "{3, 6}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print_bit_set_32 (intersection b1 b2);;\n", "print_bit_set_32 (intersection b1 b3);;\n", "print_bit_set_32 (intersection b2 b3);;" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{0, 12}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "{12}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "{1}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print_bit_set_32 (difference b1 b2);;\n", "print_bit_set_32 (difference b1 b3);;\n", "print_bit_set_32 (difference b2 b3);;" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{0, 1, 3, 6, 12}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "{3, 6, 12, 17}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "{0, 1, 17}\n" ] }, { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print_bit_set_32 (difference_sym b1 b2);;\n", "print_bit_set_32 (difference_sym b1 b3);;\n", "print_bit_set_32 (difference_sym b2 b3);;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Comparaison\n", "On va essayer de comparer notre implémentation avec une implémentation naïve utilisant des `bool array` et une utilisant le [module `Set`](http://caml.inria.fr/pub/docs/manual-ocaml/libref/Set.S.html) de la bibliothèque standard." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Mesure un temps d'éxecution" ] }, { "cell_type": "code", "execution_count": 132, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val time : int -> (unit -> 'a) -> float = \n" ] }, "execution_count": 132, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let time (n : int) (f : unit -> 'a) : float =\n", " let t = Sys.time() in\n", " for _ = 0 to n-1 do\n", " ignore (f ());\n", " done;\n", " let delta_t = Sys.time() -. t in\n", " let t_moyen = delta_t /. (float_of_int n) in\n", " Printf.printf \" Temps en secondes: %fs\\n\" t_moyen;\n", " flush_all ();\n", " t_moyen\n", ";;" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "- : unit = ()\n" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "val random_int_0_31 : unit -> int = \n" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Random.self_init();;\n", "\n", "let random_int_0_31 () =\n", " Random.int 32\n", ";;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Suite de test pour `bit_set_32`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notre test va consister à créer un ensemble vide, et ajouter 100 fois de suite des valeurs aléatoires, en enlever d'autres etc." ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val test_bit_set_32 : int -> int -> int -> unit -> int = \n" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let test_bit_set_32 n n1 n2 () =\n", " let b = ref (empty ()) in\n", " for _ = 0 to n do\n", " let nb_ajout = Random.int n1 in\n", " let nb_retrait = Random.int n2 in\n", " for i = 0 to nb_ajout + nb_retrait do\n", " let n = random_int_0_31 () in\n", " if i mod 2 = 0 then\n", " b := add !b n\n", " else\n", " b := remove !b n;\n", " done\n", " done;\n", " length !b\n", ";;" ] }, { "cell_type": "code", "execution_count": 91, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "- : int = 13\n" ] }, "execution_count": 91, "metadata": {}, "output_type": "execute_result" } ], "source": [ "test_bit_set_32 100 20 10 ();;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Suite de test pour `bool array`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Avec des `bool array`, on a l'avantage d'avoir une structure dynamique." ] }, { "cell_type": "code", "execution_count": 99, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val test_boolarray : int -> int -> int -> unit -> int = \n" ] }, "execution_count": 99, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let test_boolarray n n1 n2 () =\n", " let b = Array.make max_size_bit_set_32 false in\n", " for _ = 0 to n do\n", " let nb_ajout = Random.int n1 in\n", " let nb_retrait = Random.int n2 in\n", " for i = 0 to nb_ajout + nb_retrait do\n", " let n = random_int_0_31 () in\n", " if i mod 2 = 0 then\n", " b.(n) <- true\n", " else\n", " b.(n) <- false\n", " done;\n", " done;\n", " Array.fold_left (+) 0 (Array.map (fun x -> if x then 1 else 0) b)\n", ";;" ] }, { "cell_type": "code", "execution_count": 127, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "- : int = 15\n" ] }, "execution_count": 127, "metadata": {}, "output_type": "execute_result" } ], "source": [ "test_boolarray 100 20 10 ();;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Suite de test pour `Set.Make(Int)`" ] }, { "cell_type": "code", "execution_count": 116, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "module Int :\n", " sig type t = int val compare : bit_set_32 -> bit_set_32 -> int end\n" ] }, "execution_count": 116, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "module Int32Set :\n", " sig\n", " type elt = Int.t\n", " type t = Set.Make(Int).t\n", " val empty : t\n", " val is_empty : t -> bool\n", " val mem : elt -> t -> bool\n", " val add : elt -> t -> t\n", " val singleton : elt -> t\n", " val remove : elt -> t -> t\n", " val union : t -> t -> t\n", " val inter : t -> t -> t\n", " val diff : t -> t -> t\n", " val compare : t -> t -> int\n", " val equal : t -> t -> bool\n", " val subset : t -> t -> bool\n", " val iter : (elt -> unit) -> t -> unit\n", " val map : (elt -> elt) -> t -> t\n", " val fold : (elt -> 'a -> 'a) -> t -> 'a -> 'a\n", " val for_all : (elt -> bool) -> t -> bool\n", " val exists : (elt -> bool) -> t -> bool\n", " val filter : (elt -> bool) -> t -> t\n", " val partition : (elt -> bool) -> t -> t * t\n", " val cardinal : t -> int\n", " val elements : t -> elt list\n", " val min_elt : t -> elt\n", " val max_elt : t -> elt\n", " val choose : t -> elt\n", " val split : elt -> t -> t * bool * t\n", " val find : elt -> t -> elt\n", " val of_list : elt list -> t\n", " end\n" ] }, "execution_count": 116, "metadata": {}, "output_type": "execute_result" } ], "source": [ "module Int = struct\n", " type t = int\n", " let compare = compare\n", " end;;\n", "\n", "module Int32Set = Set.Make(Int);;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Avec des `Set`, on a l'avantage d'avoir une structure dynamique, mais moins facile à manipuler." ] }, { "cell_type": "code", "execution_count": 125, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "val test_Int32Set : int -> int -> int -> unit -> int = \n" ] }, "execution_count": 125, "metadata": {}, "output_type": "execute_result" } ], "source": [ "let test_Int32Set n n1 n2 () =\n", " let b = ref (Int32Set.empty) in\n", " for _ = 0 to n do\n", " let nb_ajout = Random.int n1 in\n", " let nb_retrait = Random.int n2 in\n", " for i = 0 to nb_ajout + nb_retrait do\n", " let n = random_int_0_31 () in\n", " if i mod 2 = 0 then\n", " b := Int32Set.add n !b\n", " else\n", " b := Int32Set.remove n !b\n", " done;\n", " done;\n", " Int32Set.cardinal !b\n", ";;" ] }, { "cell_type": "code", "execution_count": 126, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "- : int = 14\n" ] }, "execution_count": 126, "metadata": {}, "output_type": "execute_result" } ], "source": [ "test_Int32Set 100 20 10 ();;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Mesurer les temps d'exécution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "On va faire 500 répétitions de ces tests aléatoires, chacun avec 1000 fois des ajouts de 30 valeurs et des retraits de 20 valeurs." ] }, { "cell_type": "code", "execution_count": 134, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " Temps en secondes: 0.004636s\n" ] }, { "data": { "text/plain": [ "- : float = 0.00463642199999999768\n" ] }, "execution_count": 134, "metadata": {}, "output_type": "execute_result" } ], "source": [ "time 500 (test_bit_set_32 1000 30 20);;" ] }, { "cell_type": "code", "execution_count": 135, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " Temps en secondes: 0.004257s\n" ] }, { "data": { "text/plain": [ "- : float = 0.00425716600000000146\n" ] }, "execution_count": 135, "metadata": {}, "output_type": "execute_result" } ], "source": [ "time 500 (test_boolarray 1000 30 20);;" ] }, { "cell_type": "code", "execution_count": 136, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " Temps en secondes: 0.011854s\n" ] }, { "data": { "text/plain": [ "- : float = 0.0118542480000000013\n" ] }, "execution_count": 136, "metadata": {}, "output_type": "execute_result" } ], "source": [ "time 500 (test_Int32Set 1000 30 20);;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pour un second et dernier test, on va faire 500 répétitions de ces tests aléatoires, chacun avec 500 fois des ajouts de 100 valeurs et des retraits de 110 valeurs." ] }, { "cell_type": "code", "execution_count": 140, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " Temps en secondes: 0.009721s\n" ] }, { "data": { "text/plain": [ "- : float = 0.00972052400000000469\n" ] }, "execution_count": 140, "metadata": {}, "output_type": "execute_result" } ], "source": [ "time 500 (test_bit_set_32 500 100 110);;" ] }, { "cell_type": "code", "execution_count": 141, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " Temps en secondes: 0.008604s\n" ] }, { "data": { "text/plain": [ "- : float = 0.00860422200000000685\n" ] }, "execution_count": 141, "metadata": {}, "output_type": "execute_result" } ], "source": [ "time 500 (test_boolarray 500 100 110);;" ] }, { "cell_type": "code", "execution_count": 142, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " Temps en secondes: 0.024973s\n" ] }, { "data": { "text/plain": [ "- : float = 0.024972699999999997\n" ] }, "execution_count": 142, "metadata": {}, "output_type": "execute_result" } ], "source": [ "time 500 (test_Int32Set 500 100 110);;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Conclusion\n", "\n", "Tout ça n'a pas servi à grand chose, on a réussi à montrer que pour des petits entiers, utiliser un `bool array` de taille 32 (fixe) est plus efficace qu'utiliser ces `bit sets`.\n", "\n", "Ça suffit pour aujourd'hui !\n", "\n", "> Allez voir [d'autres notebooks](https://github.com/Naereen/notebooks/) que j'ai écrit." ] } ], "metadata": { "kernelspec": { "display_name": "OCaml 4.04.2", "language": "OCaml", "name": "ocaml-jupyter" }, "language_info": { "codemirror_mode": "text/x-ocaml", "file_extension": ".ml", "mimetype": "text/x-ocaml", "name": "OCaml", "nbconverter_exporter": null, "pygments_lexer": "OCaml", "version": "4.04.2" }, "toc": { "colors": { "hover_highlight": "#DAA520", "running_highlight": "#FF0000", "selected_highlight": "#FFD700" }, "moveMenuLeft": true, "nav_menu": { "height": "109px", "width": "250px" }, "navigate_menu": true, "number_sections": true, "sideBar": false, "threshold": 4, "toc_cell": true, "toc_position": { "height": "430px", "left": "1331px", "right": "48.6667px", "top": "168px", "width": "220px" }, "toc_section_display": "block", "toc_window_display": true }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }