{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Mathematical Preliminaries\n", "\n", "This chapter presents some basic mathematical notions, along with some more advanced results from probability, focusing on useful inequalities.\n", "\n", "It illustrates some of the mathematical notions using R, in some cases illustrating different algorithmic approaches\n", "to the same problem.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sets\n", "A set is a collection of objects, called members or elements of the set, without\n", "regard for their order.\n", "$a \\in A$, pronounced \"$a$ is an element of $A$,\" \"$a$ is in $A$,\" or \"$a$ is a member of\n", "$A$\" means that $a$ is an element of the set $A$.\n", "This is the same as writing $A \\ni a$, which is pronounced \"$A$ contains $a$.\"\n", "If $a$ is not an element of $A$, we write $a \\not \\in A$.\n", "Sets may be described explicitly by listing their contents, or implicitly\n", "by specifying a property that all elements of the set share, or a condition\n", "that they satisfy.\n", "The contents of sets are enclosed in curly braces: $\\{ \\}$.\n", "Examples:\n", "+ $A = \\{a, b, c, d \\}$: the set containing the four elements $a$, $b$, $c$, and $d$.\n", "+ $\\emptyset = \\{ \\}$: the empty set, the set that contains no elements.\n", "+ ${\\mathbf Z} \\equiv \\{ \\ldots, -2, -1, 0, 1, 2, \\ldots \\}$: the integers.\n", "+ ${\\mathbf N} \\equiv \\{1, 2, 3, \\ldots \\}$: the natural (counting) numbers.\n", "+ $\\Re \\equiv (-\\infty, \\infty)$: the real numbers.\n", "+ $\\Re^+ \\equiv [-\\infty, \\infty]$: the extended real numbers.\n", "+ ${\\mathbf C} \\equiv \\{ a + bi: a, b \\in \\Re \\}$, where $i = \\sqrt{-1}$: the complex numbers.\n", "+ ${\\mathbf Q} \\equiv \\{ a/b: a, b \\in {\\mathbf Z}, b \\ne 0\\}$: the rational numbers.\n", "\n", "\n", "$B$ is a _subset_ of $A$, written $B \\subset A$ or $A \\supset B$, if every element\n", "of the set $B$ is also an element of the set $A$.\n", "Thus ${\\mathbf N} \\subset {\\mathbf Z} \\subset {\\mathbf Q} \\subset \\Re \\subset {\\mathbf C}$.\n", "The empty set $\\emptyset$ is a subset of every set.\n", "If $A \\subset B$ and $B \\subset A$, $A$ and $B$ are the same set, and we write $A = B$.\n", "If $B$ is not a subset of $A$, we write $B \\not \\subset A$ or $A \\not \\supset B$.\n", "$B$ is a _proper subset_ of $A$ if $B \\subset A$ but\n", "$A \\not \\subset B$.\n", "\n", "The _complement_ of $A$ (with respect to the universe ${\\mathcal X}$), written $A^c$ or $A'$,\n", "is the set of all objects under consideration (${\\mathcal X}$) that are not elements of $A$.\n", "That is, $A^c \\equiv \\{ a \\in {\\mathcal X} : a \\not \\in A\\}$.\n", "\n", "The _intersection_ of $A$ and $B$, written $A \\cap B$ or $AB$, is the set of all objects that\n", "are elements of both $A$ and $B$:\n", "$$\n", "A \\cap B \\equiv \\{a: a \\in A \\mbox{ and } a \\in B \\}.\n", "$$\n", "If $A \\cap B = \\emptyset$, we say $A$ and $B$ are _disjoint_ or _mutually\n", "exclusive_.\n", "\n", "The _union_ of $A$ and $B$, written $A \\cup B$, is the set of all objects that\n", "are elements of $A$ or of $B$ (or both):\n", "$$\n", "A \\cup B \\equiv \\{a: a \\in A \\mbox{ or } a \\in B \\mbox{ or both} \\}.\n", "$$\n", "\n", "The _difference_ or _set difference_ of $A$ and $B$, $A \\setminus B$, pronounced \"$A$ minus $B$,\" is\n", "the set of all elements of $A$ that are not elements of $B$:\n", "$$\n", "A \\setminus B \\equiv \\{a \\in A : a \\not \\in B \\} = A \\cap B^c.\n", "$$\n", "\n", "_Intervals_ are special subsets of ${\\mathbf R}$:\n", "$$\n", "[a, b] \\equiv \\{x \\in \\Re : a \\le x \\le b\\}\n", "$$\n", "$$\n", "(a, b] \\equiv \\{x \\in \\Re : a < x \\le b\\}\n", "$$\n", "$$\n", "[a, b) \\equiv \\{x \\in \\Re : a \\le x < b\\}\n", "$$\n", "$$\n", "(a, b) \\equiv \\{x \\in \\Re : a < x < b\\}.\n", "$$\n", "Sometimes we have a collection of sets, indexed by elements of another\n", "set: $\\{A_\\beta : \\beta \\in B \\}$.\n", "Then $B$ is called an _index set_.\n", "If $B$ is a subset of the integers ${\\mathbf Z}$, usually we write $A_i$ or $A_j$, etc.,\n", "rather than $A_\\beta$.\n", "If $B = {\\mathbf N}$, we usually write $\\{A_j\\}_{j=1}^\\infty$ rather than\n", "$\\{A_\\beta : \\beta \\in {\\mathbf N} \\}$.\n", "$$\n", "\\bigcap_{\\beta \\in B} A_\\beta \\equiv \\{a: a \\in A_\\beta \\;\\;\\forall \\beta \\in B \\}.\n", "$$\n", "($\\forall$ means \"for all.\")\n", "If $B = \\{1, 2, \\ldots, n\\}$, we usually write $\\bigcap_{j=1}^n A_j$ rather than\n", "$\\bigcap_{j \\in \\{1, 2, \\ldots, n\\}} A_j$.\n", "The notation $\\bigcup_{\\beta \\in B} A_\\beta$ and $\\bigcup_{j=1}^n A_j$ are defined analogously.\n", "\n", "A collection of sets $\\{A_\\beta : \\beta \\in B \\}$ is _pairwise disjoint_\n", "if $A_\\beta \\cap A_{\\beta'} = \\emptyset$ whenever $\\beta \\ne \\beta'$.\n", "The collection $\\{A_\\beta : \\beta \\in B\\}$ _exhausts_ or _covers_\n", "the set $A$ if $A \\subset \\bigcup_{\\beta \\in B} A_\\beta$.\n", "The collection $\\{A_\\beta : \\beta \\in B \\}$ is a _partition_ of the\n", "set $A$ if $A = \\cup_{\\beta \\in B} A_\\beta$ and the sets $\\{A_\\beta : \\beta \\in B \\}$\n", "are pairwise disjoint.\n", "If $\\{A_\\beta : \\beta \\in B \\}$ are pairwise disjoint and exhaust $A$, then\n", "$\\{A_\\beta \\cap A : \\beta \\in B \\}$ is a partition of $A$.\n", "\n", "A set is _countable_ if its elements can be put in one-to-one correspondence with\n", "a subset of ${\\mathbf N}$.\n", "A set is _finite_ if its elements can be put in one-to-one correspondence with\n", "$\\{1, 2, \\ldots, n\\}$ for some $n \\in {\\mathbf N}$.\n", "If a set is not finite, it is infinite.\n", "${\\mathbf N}$, ${\\mathbf Z}$, and ${\\mathbf Q}$ are infinite but countable;\n", "${\\mathbf R}$ is infinite and uncountable.\n", "\n", "The notation $\\# A$, pronounced \"the cardinality of $A$\" is the size of the set $A$.\n", "If $A$ is finite, $\\# A$ is the number of elements in $A$.\n", "If $A$ is not finite but $A$ is countable (if its elements can be put in one-to-one\n", "correspondence with the elements of ${\\mathbf N}$), then $\\# A = \\aleph_0$ (aleph-null).\n", "If the elements of $A$ can be put in one-to-one correspondence with the elements of $\\mathbf{R}$, then\n", "$\\#A = c$, _the cardinality of the continuum_.\n", "\n", "The _power set_ of a set $A$, denoted $2^A$, is the set of all subsets of the set $A$.\n", "For example, the power set of $\\{a, b, c\\}$ is\n", "$$\n", "\\{ \\emptyset, \\{a\\}, \\{b\\}, \\{c\\}, \\{a, b\\}, \\{a, c\\}, \\{b, c\\}, \\{a, b, c\\} \\}.\n", "$$\n", "If $A$ is a finite set, the cardinality of the power set of $A$ is $2^{\\# A}$.\n", "This can be seen as follows: suppose $\\# A = n$ is finite.\n", "Consider the elements of $A$ to be written in some canonical order.\n", "We can specify an element of the power set by an $n$-digit binary number.\n", "The first digit is 1 if the first element of $A$ is in the subset, and 0 otherwise.\n", "The second digit is 1 if the second element of $A$ is in the subset, and 0 otherwise,\n", "etc.\n", "There are $2^n$ $n$-digit binary numbers, so there are $2^n$ subsets.\n", "The cardinality of the power set of ${\\mathbf N}$ is not $\\aleph_0$.\n", "\n", "If $A$ is a finite set, $B$ is a countable set\n", "and $\\{A_j : \\beta \\in B \\}$ is a partition of $A$, then\n", "$$\n", "\\# A = \\sum_{\\beta \\in B} \\# A_\\beta.\n", "$$\n", "\n", "Suppose $A = A_1 \\cup A_2$, but that possibly $A_1 A_2 \\ne \\emptyset$,\n", "so $\\{A_1, A_2\\}$ might not be a partition of $A$, because\n", "$A_1$ and $A_2$ are not necessarily disjoint.\n", "Then still \n", "$$\\#A = \\#A_1 + \\#A_2 - \\#A_1A_2.$$\n", "\n", "This is seen most easily using a Venn diagram, and can be proved\n", "by constructing a partition of $A$,\n", "$A = A_1A_2^c \\cup A_1^cA_2 \\cup A_1A_2$, and noting that\n", "$\\#A_1 + \\#A_2 = \\#A_1A_2^c + \\#A_1^cA_2 + 2\\#A_1A_2$.\n", "\n", "If \n", "$A = A_1 \\cup A_2 \\cup A_3$ but $\\{A_1, A_2, A_3\\}$ are not necessarily disjoint,\n", "then still\n", "\n", "$$\\#A = \\#A_1 + \\#A_2 + \\#A_3 - \\#A_1A_2 - \\#A_1A_3 - \\#A_2A_3 + \\#A_1A_2A_3.$$\n", "\n", "More generally, if $A = \\cup_{i=1}^n A_i$, then the _Inclusion-Exclusion Principle_ holds:\n", "\n", "$$ \\#A = \\sum_{i=1}^n \\#A_i - \\sum_{1 \\le i_1 < i_2 \\le n} \\#(A_{i_1}A_{i_2}) +\n", "\\sum_{1 \\le i_1 < i_2 < i_3 \\le n} \\#(A_{i_1}A_{i_2}A_{i_3}) - \\cdots \n", "+(-1)^{k-1} \\sum_{1 \\le i_1 < i_2 < \\cdots < i_k \\le n} \\# (A_{i_1}A_{i_2} \\cdots A_{i_k}) + \\cdots\n", "$$\n", "\n", "### Some useful set relations\n", "\n", "+ If $A\\subset B$ and $B\\subset A$ then $A = B$. (If two sets are subsets of each other, they are the same set.)\n", "\n", "+ $\\emptyset \\subset A$. (The empty set is a subset of every set.)\n", "\n", "+ $\\emptyset \\cup A = A$.\n", "(The union of the empty set and any other set is that set.)\n", "\n", "+ $\\emptyset \\cap A = \\emptyset$.\n", "(The intersection of the empty set and any other set is empty.)\n", "\n", "+ If $A \\subset B$ and $B \\subset C$ then $A \\subset C$. (The subset relation is transitive.)\n", "\n", "+ If $A \\subset B$ then $B^c \\subset A^c$.\n", "(Complementation reverses the subset relation.)\n", "\n", "+ $A \\cap B \\subset A$. Moreover, $A\\cap B = A$ if and only if $A \\subset B$.\n", "\n", "+ $A \\subset A\\cup B$. Moreover, $A = A\\cup B$ if and only if $B \\subset A$.\n", "\n", "+ $(A\\cap B)^c = A^c \\cup B^c$. (de Morgan)\n", "\n", "+ $(A\\cup B)^c = A^c\\cap B^c$. (de Morgan)\n", "\n", "+ $A \\cap B = B \\cap A$. (Intersection is commutative.)\n", "\n", "+ $A\\cup B = B\\cup A$. (Union is commutative.)\n", "\n", "+ $A\\cap (B\\cap C) = (A\\cap B)\\cap C$. (Intersection is associative.)\n", "\n", "+ $A\\cup (B\\cup C) = (A\\cup B)\\cup C$. (Union is associative.)\n", "\n", "+ $A\\cap (B\\cup C) = (A\\cap B)\\cup (A\\cap C)$. (Distribution of intersection over union.)\n", "\n", "+ $A\\cup (B\\cap C) = (A\\cup B)\\cap (A\\cup C)$. (Distribution of union over intersection.)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Sets in R" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A: 1 2 3 3 \n", "A as a set (duplicates removed): 1 2 3 \n", "1 is in A: TRUE \n", "1 is in A (2nd method): TRUE \n", "\"green\" is in A: FALSE \n", "\"green\" is in B: TRUE \n" ] } ], "source": [ "# Make three sets, A = {1, 2, 3}, B = {\"a\", \"b\", \"green\", 3}, C = {1, 2}\n", "A <- c(1, 2, 3, 3); # this is a list but not a set, because 3 is listed twice\n", "cat('A:',A,'\\n')\n", "A <- unique(A); # this is a set containing the unique elements of A\n", "cat('A as a set (duplicates removed):',A,'\\n')\n", "B <- c(\"a\", \"b\", \"green\", 3);\n", "C <- 1:2; #a different construction\n", "#\n", "# Set membership\n", "cat('1 is in A:',1 %in% A,'\\n') # is 1 an element of A?\n", "# another way to check set membership\n", "cat('1 is in A (2nd method):',is.element(1, A),'\\n')\n", "cat('\"green\" is in A:',\"green\" %in% A,'\\n')\n", "cat('\"green\" is in B:',\"green\" %in% B,'\\n')\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "C a subset of A: TRUE \n", "C a subset of B: FALSE \n", "C is a subset of C: TRUE \n", "C a subset of A: TRUE \n", "C a subset of B: FALSE \n", "C a subset of C: TRUE \n" ] } ], "source": [ "# Subsets. We will look at two approaches:\n", "\n", "# A is a subset of B if a is an element of B for every a in A\n", "cat('C a subset of A:',all(C %in% A),'\\n') # C is a subset of A\n", "cat('C a subset of B:',all(C %in% B),'\\n') # C is not a subset of B\n", "cat('C is a subset of C:',all(C %in% C),'\\n') # C is a subset of itself\n", "\n", "# A is a subset of B if A\\B is empty\n", "cat('C a subset of A:',length(setdiff(C,A)) == 0,'\\n') # C is a subset of A\n", "cat('C a subset of B:',length(setdiff(C,B)) == 0,'\\n') # C is not a subset of B\n", "cat('C a subset of C:',length(setdiff(C, C)) == 0,'\\n') # C is a subset of itself\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "AUB: 1 2 3 a b green \n", "AB: 3 \n", "A\\B: 1 2 \n" ] } ], "source": [ "# unions and intersections\n", "cat('AUB:',union(A,B), '\\n') # union of A and B\n", "cat('AB:',intersect(A, B), '\\n') # intersection of A and B\n", "\n", "# set difference\n", "cat('A\\\\B:',setdiff(A,B), '\\n') # A \\ B" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "3" ], "text/latex": [ "3" ], "text/markdown": [ "3" ], "text/plain": [ "[1] 3" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/html": [ "4" ], "text/latex": [ "4" ], "text/markdown": [ "4" ], "text/plain": [ "[1] 4" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Cardinality of a finite set\n", "\n", "length(unique(A)) # A has 3 (unique) elements\n", "length(unique(B)) # B has 4 (unique) elements" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "'2^A:'" ], "text/latex": [ "'2^A:'" ], "text/markdown": [ "'2^A:'" ], "text/plain": [ "[1] \"2^A:\"" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/html": [ "
    \n", "\t
    1. \n", "
    \n", "
  1. \n", "\t
  2. 1
  3. \n", "\t
  4. 2
  5. \n", "\t
    1. \n", "\t
    2. 1
    3. \n", "\t
    4. 2
    5. \n", "
    \n", "
  6. \n", "\t
  7. 3
  8. \n", "\t
    1. \n", "\t
    2. 1
    3. \n", "\t
    4. 3
    5. \n", "
    \n", "
  9. \n", "\t
    1. \n", "\t
    2. 2
    3. \n", "\t
    4. 3
    5. \n", "
    \n", "
  10. \n", "\t
    1. \n", "\t
    2. 1
    3. \n", "\t
    4. 2
    5. \n", "\t
    6. 3
    7. \n", "
    \n", "
  11. \n", "
\n" ], "text/latex": [ "\\begin{enumerate}\n", "\\item \\begin{enumerate*}\n", "\\end{enumerate*}\n", "\n", "\\item 1\n", "\\item 2\n", "\\item \\begin{enumerate*}\n", "\\item 1\n", "\\item 2\n", "\\end{enumerate*}\n", "\n", "\\item 3\n", "\\item \\begin{enumerate*}\n", "\\item 1\n", "\\item 3\n", "\\end{enumerate*}\n", "\n", "\\item \\begin{enumerate*}\n", "\\item 2\n", "\\item 3\n", "\\end{enumerate*}\n", "\n", "\\item \\begin{enumerate*}\n", "\\item 1\n", "\\item 2\n", "\\item 3\n", "\\end{enumerate*}\n", "\n", "\\end{enumerate}\n" ], "text/markdown": [ "1. \n", "\n", "\n", "2. 1\n", "3. 2\n", "4. 1. 1\n", "2. 2\n", "\n", "\n", "\n", "5. 3\n", "6. 1. 1\n", "2. 3\n", "\n", "\n", "\n", "7. 1. 2\n", "2. 3\n", "\n", "\n", "\n", "8. 1. 1\n", "2. 2\n", "3. 3\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "[[1]]\n", "numeric(0)\n", "\n", "[[2]]\n", "[1] 1\n", "\n", "[[3]]\n", "[1] 2\n", "\n", "[[4]]\n", "[1] 1 2\n", "\n", "[[5]]\n", "[1] 3\n", "\n", "[[6]]\n", "[1] 1 3\n", "\n", "[[7]]\n", "[1] 2 3\n", "\n", "[[8]]\n", "[1] 1 2 3\n" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/html": [ "'2^B:'" ], "text/latex": [ "'2^B:'" ], "text/markdown": [ "'2^B:'" ], "text/plain": [ "[1] \"2^B:\"" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/html": [ "
    \n", "\t
  1. \n", "\t
  2. 'a'
  3. \n", "\t
  4. 'b'
  5. \n", "\t
    1. \n", "\t
    2. 'a'
    3. \n", "\t
    4. 'b'
    5. \n", "
    \n", "
  6. \n", "\t
  7. 'green'
  8. \n", "\t
    1. \n", "\t
    2. 'a'
    3. \n", "\t
    4. 'green'
    5. \n", "
    \n", "
  9. \n", "\t
    1. \n", "\t
    2. 'b'
    3. \n", "\t
    4. 'green'
    5. \n", "
    \n", "
  10. \n", "\t
    1. \n", "\t
    2. 'a'
    3. \n", "\t
    4. 'b'
    5. \n", "\t
    6. 'green'
    7. \n", "
    \n", "
  11. \n", "\t
  12. '3'
  13. \n", "\t
    1. \n", "\t
    2. 'a'
    3. \n", "\t
    4. '3'
    5. \n", "
    \n", "
  14. \n", "\t
    1. \n", "\t
    2. 'b'
    3. \n", "\t
    4. '3'
    5. \n", "
    \n", "
  15. \n", "\t
    1. \n", "\t
    2. 'a'
    3. \n", "\t
    4. 'b'
    5. \n", "\t
    6. '3'
    7. \n", "
    \n", "
  16. \n", "\t
    1. \n", "\t
    2. 'green'
    3. \n", "\t
    4. '3'
    5. \n", "
    \n", "
  17. \n", "\t
    1. \n", "\t
    2. 'a'
    3. \n", "\t
    4. 'green'
    5. \n", "\t
    6. '3'
    7. \n", "
    \n", "
  18. \n", "\t
    1. \n", "\t
    2. 'b'
    3. \n", "\t
    4. 'green'
    5. \n", "\t
    6. '3'
    7. \n", "
    \n", "
  19. \n", "\t
    1. \n", "\t
    2. 'a'
    3. \n", "\t
    4. 'b'
    5. \n", "\t
    6. 'green'
    7. \n", "\t
    8. '3'
    9. \n", "
    \n", "
  20. \n", "
\n" ], "text/latex": [ "\\begin{enumerate}\n", "\\item \n", "\\item 'a'\n", "\\item 'b'\n", "\\item \\begin{enumerate*}\n", "\\item 'a'\n", "\\item 'b'\n", "\\end{enumerate*}\n", "\n", "\\item 'green'\n", "\\item \\begin{enumerate*}\n", "\\item 'a'\n", "\\item 'green'\n", "\\end{enumerate*}\n", "\n", "\\item \\begin{enumerate*}\n", "\\item 'b'\n", "\\item 'green'\n", "\\end{enumerate*}\n", "\n", "\\item \\begin{enumerate*}\n", "\\item 'a'\n", "\\item 'b'\n", "\\item 'green'\n", "\\end{enumerate*}\n", "\n", "\\item '3'\n", "\\item \\begin{enumerate*}\n", "\\item 'a'\n", "\\item '3'\n", "\\end{enumerate*}\n", "\n", "\\item \\begin{enumerate*}\n", "\\item 'b'\n", "\\item '3'\n", "\\end{enumerate*}\n", "\n", "\\item \\begin{enumerate*}\n", "\\item 'a'\n", "\\item 'b'\n", "\\item '3'\n", "\\end{enumerate*}\n", "\n", "\\item \\begin{enumerate*}\n", "\\item 'green'\n", "\\item '3'\n", "\\end{enumerate*}\n", "\n", "\\item \\begin{enumerate*}\n", "\\item 'a'\n", "\\item 'green'\n", "\\item '3'\n", "\\end{enumerate*}\n", "\n", "\\item \\begin{enumerate*}\n", "\\item 'b'\n", "\\item 'green'\n", "\\item '3'\n", "\\end{enumerate*}\n", "\n", "\\item \\begin{enumerate*}\n", "\\item 'a'\n", "\\item 'b'\n", "\\item 'green'\n", "\\item '3'\n", "\\end{enumerate*}\n", "\n", "\\end{enumerate}\n" ], "text/markdown": [ "1. \n", "2. 'a'\n", "3. 'b'\n", "4. 1. 'a'\n", "2. 'b'\n", "\n", "\n", "\n", "5. 'green'\n", "6. 1. 'a'\n", "2. 'green'\n", "\n", "\n", "\n", "7. 1. 'b'\n", "2. 'green'\n", "\n", "\n", "\n", "8. 1. 'a'\n", "2. 'b'\n", "3. 'green'\n", "\n", "\n", "\n", "9. '3'\n", "10. 1. 'a'\n", "2. '3'\n", "\n", "\n", "\n", "11. 1. 'b'\n", "2. '3'\n", "\n", "\n", "\n", "12. 1. 'a'\n", "2. 'b'\n", "3. '3'\n", "\n", "\n", "\n", "13. 1. 'green'\n", "2. '3'\n", "\n", "\n", "\n", "14. 1. 'a'\n", "2. 'green'\n", "3. '3'\n", "\n", "\n", "\n", "15. 1. 'b'\n", "2. 'green'\n", "3. '3'\n", "\n", "\n", "\n", "16. 1. 'a'\n", "2. 'b'\n", "3. 'green'\n", "4. '3'\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "[[1]]\n", "character(0)\n", "\n", "[[2]]\n", "[1] \"a\"\n", "\n", "[[3]]\n", "[1] \"b\"\n", "\n", "[[4]]\n", "[1] \"a\" \"b\"\n", "\n", "[[5]]\n", "[1] \"green\"\n", "\n", "[[6]]\n", "[1] \"a\" \"green\"\n", "\n", "[[7]]\n", "[1] \"b\" \"green\"\n", "\n", "[[8]]\n", "[1] \"a\" \"b\" \"green\"\n", "\n", "[[9]]\n", "[1] \"3\"\n", "\n", "[[10]]\n", "[1] \"a\" \"3\"\n", "\n", "[[11]]\n", "[1] \"b\" \"3\"\n", "\n", "[[12]]\n", "[1] \"a\" \"b\" \"3\"\n", "\n", "[[13]]\n", "[1] \"green\" \"3\" \n", "\n", "[[14]]\n", "[1] \"a\" \"green\" \"3\" \n", "\n", "[[15]]\n", "[1] \"b\" \"green\" \"3\" \n", "\n", "[[16]]\n", "[1] \"a\" \"b\" \"green\" \"3\" \n" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Power set of a finite set\n", "\n", "# Recall that there's a 1:1 mapping between a subset of a set A of size n and an n digit binary number.\n", "# We will enumerate the 2^n binary numbers of length n and map them to the corresponding subsets of A\n", "# 0 maps to {} and 2^n maps to A\n", "\n", "powerset <- function(A) {\n", " n <- length(A); # cardinality of A\n", " ps <- 2^(0:(n-1)); # powers of 2 from 0 to n-1: 2^0, 2^1, ..., 2^(n-1) \n", " return(lapply(0:(2^n-1), function(x) A[bitwAnd(ps, x) != 0]))\n", "}\n", "\n", "'2^A:'\n", "powerset(A)\n", "\n", "'2^B:'\n", "powerset(B)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cartesian Products\n", "The notation $(s_1, s_2, \\ldots, s_n) \\equiv (s_j)_{j=1}^n$ denotes an\n", "_ordered $n$-tuple_ consisting of\n", "$s_1$ in the first position, $s_2$ in the second position, etc.\n", "The parentheses are used instead of curly braces to distinguish\n", "$n$-tuples from sets: $(s_j)_{j=1}^n \\ne \\{s_j\\}_{j=1}^n$.\n", "The $k$th\n", "_component_ of the $n$-tuple $s = (s_j)_{j=1}^n$, is $s_k$,\n", "$k = 1, 2, \\ldots, n$.\n", "Two $n$-tuples are equal if their components are equal.\n", "That is, $(s_j)_{j=1}^n = (t_j)_{j=1}^n$ means that\n", "$s_j = t_j$ for $j = 1, \\ldots, n$.\n", "In particular, $(s, t) \\ne (t, s)$ unless $s=t$.\n", "In contrast, $\\{s, t \\} = \\{ t, s \\}$ always.\n", "\n", "The _Cartesian product of $S$ and $T$_ is\n", "$S \\bigotimes T \\equiv \\{(s, t): s \\in S \\mbox{ and } t \\in T\\}$.\n", "Unless $S = T$, $S \\bigotimes T \\ne T \\bigotimes S$.\n", "${\\mathbf R}^n$ is the Cartesian product of ${\\mathbf R}$ with itself, $n$ times; its elements\n", "are $n$-tuples of real numbers.\n", "If $s$ is the $n$-tuple $(s_1, s_2, \\ldots, s_n) = (s_j)_{j=1}^n$\n", "and $t$ is the $n$-tuple $(t_1, t_2, \\ldots, t_n) = (t_j)_{j=1}^n$,\n", "then $s = t$ iff $s_j = t_j$ for all $j=1, \\ldots, n$." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
Var1Var2Var3
11a1
22a1
33a1
41b1
52b1
63b1
71green1
82green1
93green1
10131
11231
12331
131a2
142a2
153a2
161b2
172b2
183b2
191green2
202green2
213green2
22132
23232
24332
\n" ], "text/latex": [ "\\begin{tabular}{r|lll}\n", " & Var1 & Var2 & Var3\\\\\n", "\\hline\n", "\t1 & 1 & a & 1\\\\\n", "\t2 & 2 & a & 1\\\\\n", "\t3 & 3 & a & 1\\\\\n", "\t4 & 1 & b & 1\\\\\n", "\t5 & 2 & b & 1\\\\\n", "\t6 & 3 & b & 1\\\\\n", "\t7 & 1 & green & 1\\\\\n", "\t8 & 2 & green & 1\\\\\n", "\t9 & 3 & green & 1\\\\\n", "\t10 & 1 & 3 & 1\\\\\n", "\t11 & 2 & 3 & 1\\\\\n", "\t12 & 3 & 3 & 1\\\\\n", "\t13 & 1 & a & 2\\\\\n", "\t14 & 2 & a & 2\\\\\n", "\t15 & 3 & a & 2\\\\\n", "\t16 & 1 & b & 2\\\\\n", "\t17 & 2 & b & 2\\\\\n", "\t18 & 3 & b & 2\\\\\n", "\t19 & 1 & green & 2\\\\\n", "\t20 & 2 & green & 2\\\\\n", "\t21 & 3 & green & 2\\\\\n", "\t22 & 1 & 3 & 2\\\\\n", "\t23 & 2 & 3 & 2\\\\\n", "\t24 & 3 & 3 & 2\\\\\n", "\\end{tabular}\n" ], "text/plain": [ " Var1 Var2 Var3\n", "1 1 a 1\n", "2 2 a 1\n", "3 3 a 1\n", "4 1 b 1\n", "5 2 b 1\n", "6 3 b 1\n", "7 1 green 1\n", "8 2 green 1\n", "9 3 green 1\n", "10 1 3 1\n", "11 2 3 1\n", "12 3 3 1\n", "13 1 a 2\n", "14 2 a 2\n", "15 3 a 2\n", "16 1 b 2\n", "17 2 b 2\n", "18 3 b 2\n", "19 1 green 2\n", "20 2 green 2\n", "21 3 green 2\n", "22 1 3 2\n", "23 2 3 2\n", "24 3 3 2" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/html": [ "
    \n", "\t
  1. '1 a 1'
  2. \n", "\t
  3. '2 a 1'
  4. \n", "\t
  5. '3 a 1'
  6. \n", "\t
  7. '1 b 1'
  8. \n", "\t
  9. '2 b 1'
  10. \n", "\t
  11. '3 b 1'
  12. \n", "\t
  13. '1 green 1'
  14. \n", "\t
  15. '2 green 1'
  16. \n", "\t
  17. '3 green 1'
  18. \n", "\t
  19. '1 3 1'
  20. \n", "\t
  21. '2 3 1'
  22. \n", "\t
  23. '3 3 1'
  24. \n", "\t
  25. '1 a 2'
  26. \n", "\t
  27. '2 a 2'
  28. \n", "\t
  29. '3 a 2'
  30. \n", "\t
  31. '1 b 2'
  32. \n", "\t
  33. '2 b 2'
  34. \n", "\t
  35. '3 b 2'
  36. \n", "\t
  37. '1 green 2'
  38. \n", "\t
  39. '2 green 2'
  40. \n", "\t
  41. '3 green 2'
  42. \n", "\t
  43. '1 3 2'
  44. \n", "\t
  45. '2 3 2'
  46. \n", "\t
  47. '3 3 2'
  48. \n", "
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item '1 a 1'\n", "\\item '2 a 1'\n", "\\item '3 a 1'\n", "\\item '1 b 1'\n", "\\item '2 b 1'\n", "\\item '3 b 1'\n", "\\item '1 green 1'\n", "\\item '2 green 1'\n", "\\item '3 green 1'\n", "\\item '1 3 1'\n", "\\item '2 3 1'\n", "\\item '3 3 1'\n", "\\item '1 a 2'\n", "\\item '2 a 2'\n", "\\item '3 a 2'\n", "\\item '1 b 2'\n", "\\item '2 b 2'\n", "\\item '3 b 2'\n", "\\item '1 green 2'\n", "\\item '2 green 2'\n", "\\item '3 green 2'\n", "\\item '1 3 2'\n", "\\item '2 3 2'\n", "\\item '3 3 2'\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. '1 a 1'\n", "2. '2 a 1'\n", "3. '3 a 1'\n", "4. '1 b 1'\n", "5. '2 b 1'\n", "6. '3 b 1'\n", "7. '1 green 1'\n", "8. '2 green 1'\n", "9. '3 green 1'\n", "10. '1 3 1'\n", "11. '2 3 1'\n", "12. '3 3 1'\n", "13. '1 a 2'\n", "14. '2 a 2'\n", "15. '3 a 2'\n", "16. '1 b 2'\n", "17. '2 b 2'\n", "18. '3 b 2'\n", "19. '1 green 2'\n", "20. '2 green 2'\n", "21. '3 green 2'\n", "22. '1 3 2'\n", "23. '2 3 2'\n", "24. '3 3 2'\n", "\n", "\n" ], "text/plain": [ ", , 1\n", "\n", " [,1] [,2] [,3] [,4] \n", "[1,] \"1 a 1\" \"1 b 1\" \"1 green 1\" \"1 3 1\"\n", "[2,] \"2 a 1\" \"2 b 1\" \"2 green 1\" \"2 3 1\"\n", "[3,] \"3 a 1\" \"3 b 1\" \"3 green 1\" \"3 3 1\"\n", "\n", ", , 2\n", "\n", " [,1] [,2] [,3] [,4] \n", "[1,] \"1 a 2\" \"1 b 2\" \"1 green 2\" \"1 3 2\"\n", "[2,] \"2 a 2\" \"2 b 2\" \"2 green 2\" \"2 3 2\"\n", "[3,] \"3 a 2\" \"3 b 2\" \"3 green 2\" \"3 3 2\"\n" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Cartesian products in R\n", "# Again, there are a variety of ways to do this, including SQL-like approaches (outer joins)\n", "expand.grid(A, B, C)\n", "\n", "# now using an outer product\n", "outer(outer(A, B, FUN=\"paste\"),C, FUN=\"paste\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Combinatorics\n", "Let $A$ be a finite set with $\\# A = n$.\n", "A _permutation_ of (the elements of) $A$ is an element $s$ of\n", "$\\bigotimes_{j=1}^n A = A^n$\n", "whose components are distinct elements of $A$.\n", "That is, $s = (s_j)_{j=1}^n \\in A^n$ is a permutation of $A$ if\n", "$\\# \\{s_j\\}_{j=1}^n = n$.\n", "There are $n! = n(n-1)\\cdots 1$ permutations of a set with $n$ elements:\n", "there are $n$ choices for the first component of the permutation, $n-1$ choices for\n", "the second (whatever the\n", "first might be), $n-2$ for the third (whatever the first two might be), etc.\n", "This is an illustration of the _Fundamental Rule of Counting_:\n", "in a sequence of $n$ choices, if there are $m_1$ possibilites for the first choice,\n", "$m_2$ possibilities for the second choice (no matter which was chosen in the first place),\n", "$m_3$ possibilities for the third choice\n", "(no matter which were chosen in the first two places),\n", "and so on, then there are $m_1 m_2 \\cdots m_n = \\prod_{j=1}^n m_j$ possible sequences\n", "of choices in all.\n", "\n", "The number of permutations of $n$ things taken $k$ at a time, ${}_nP_k$,\n", "is the number of ways there are of selecting $k$ of $n$ things, then permuting\n", "those $k$ things.\n", "There are $n$ choices for the object that will be in the first place in the permutation,\n", "$n-1$ for the second place (regardless of which is first), etc., and $n-k+1$ choices\n", "for the item that will be in the $k$th place.\n", "By the fundamental rule of counting, it follows that\n", "${}_nP_k = n(n-1)\\cdots(n-k+1) = n!/(n-k)!$.\n", "\n", "The number of subsets of size $k$ that can be formed from $n$ objects is\n", "$$\n", "{}_nC_k = {{n}\\choose{k}} =\n", "{}_nP_k/k! = n(n-1)\\cdots(n-k+1)/k! = \\frac{n!}{k!(n-k)!},\n", "$$\n", "since each subset of size $k$ can be ordered into $k!$ permutations.\n", "\n", "Here are some useful identities:\n", "\n", "$$ {{n}\\choose{k}} = {{n}\\choose{n-k}} $$\n", "$$ {{n}\\choose{0}} = 1$$\n", "$$ {{n}\\choose{1}} = n $$\n", "$$ {{n} \\choose {k}} = \\prod_{m=0}^{k-1} \\frac{n-m}{m+1} $$\n", "$$ {{n} \\choose {k}} = {{n-1}\\choose{k-1}} + {{n-1} \\choose {k}}.$$\n", "\n", "Because the power set of a set with $n$ elements can be partitioned as\n", "$$\n", "\\cup_{k=0}^n \\left \\{ \\mbox{all subsets of size $k$}\n", "\\right \\},\n", "$$\n", "and since the power set has $2^n$ elements, it follows that\n", "$$\n", "\\sum_{j=0}^n {{n} \\choose {k}} = 2^n.\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "### Exercise\n", "Prove the five identities given above.\n", "
" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "n!: 120 \n", "nCk: 10 \n", "nPk: 60 \n" ] } ], "source": [ "# Combinatorics in R\n", "# factorials\n", "n <- 5;\n", "k <- 3;\n", "cat('n!:',factorial(n),'\\n')\n", "\n", "# number of combinations of n objects taken k at a time\n", "cat('nCk:',choose(n,k),'\\n')\n", "\n", "# number of permutations of n objects taken k at a time\n", "permu <- function(n, k) {\n", " permu <- 1; \n", " s <- 0;\n", " while (s < k) {\n", " permu <- permu*(n-s);\n", " s <- s+1;\n", " }\n", " return(permu);\n", "}\n", "cat('nPk:',permu(n, k),'\\n')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "### Computational exercise\n", "\n", "Write an R function that constructs all permutations of a finite list.\n", "\n", "The function should take as input a list s, and produce as output all permutations of the elements of the list. Items should be considered unique based on their position in the list.\n", "\n", "For instance,\n", "\n", " s <- c('a','b','c');\n", " permute(s)\n", "\n", "should return the 6 permutations of a, b, and c:\n", "\n", " a b c\n", " a c b\n", " b a c\n", " b c a\n", " c a b\n", " c b a\n", "\n", "Write your function from scratch; i.e., do not use any libraries that are not part of the core of R.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Numerical stability and order of operations\n", "\n", "As a matter of mathematics, ${{n}\\choose{k}} = \\frac{n!}{k!(n-k)!}$, but this is a terrible way to compute ${{n}\\choose{k}}$.\n", "When $n$ is large, the numerator $n!$ is enormous, and can overflow finite-precision calculations even for values of $n$ and $k$ for which ${{n}\\choose{k}}$ is small.\n", "Moreover, relying on cancellation in finite-precision arithmetic can lead to large errors, whether the cancellation is through division or subtraction.\n", "\n", "So, rather than compute ${{n}\\choose{k}}$ by dividing factorials, it's better to build it by multiplying ratios.\n", "\n", "Let's have a look in R." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Warning message:\n", "In factorial(n): value out of range in 'gammafn'" ] }, { "name": "stdout", "output_type": "stream", "text": [ "1000!: Inf \n", "1000C2: 499500 \n", "1000C2 (2nd way): 499500 \n" ] } ], "source": [ "# illustrating overflow of factorial for large n\n", "n <- 1000;\n", "cat('1000!:',factorial(n),'\\n') # inf\n", "\n", "# however, 1000 choose 2 is only (1000*999)/2 = 499500\n", "cat('1000C2:',choose(n, 2),'\\n')\n", "\n", "# let's code this from scratch in a stable way\n", "myChoose <- function(n, k) {\n", " # no error checking. If this were for production, we'd check that the arguments are integers\n", " # and that n >= k >= 0.\n", " m <- 0;\n", " myc <- 1;\n", " while (m < k) {\n", " myc <- myc * (n-m)/(m+1);\n", " m <- m+1;\n", " }\n", " return(myc)\n", "}\n", "\n", "cat('1000C2 (2nd way):',myChoose(n, 2),'\\n')" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "A related same issue comes up in subtracting large numbers.\n", "\n", "For instance, algebraically, $x^2 - (x-1)^2 = 2x-1$. But if the mantissa of $x$ is large, squaring\n", "$x$ and $x-1$ will overflow the precision of the calculation, and their difference of the computed\n", "squares will be numerically wrong." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "x^2 - (x-1)^2: 0 \n", "2x-1: 2e+25 \n" ] } ], "source": [ "x <- 9999999999999999999999999; # 25 digits\n", "cat('x^2 - (x-1)^2:',x^2 - (x-1)^2,'\\n')\n", "cat('2x-1:', 2*x-1, '\\n')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Mappings and Functions\n", "Functions are subsets of Cartesian products.\n", "We write $f: {\\mathcal X} \\rightarrow {\\mathcal Y}$, pronounced \"$f$ maps ${\\mathcal X}$ into ${\\mathcal Y}$\"\n", "or \"$f$ is a function with domain ${\\mathcal X}$\n", "and co-domain ${\\mathcal Y}$\" if\n", "$f \\subset {\\mathcal X} \\bigotimes {\\mathcal Y}$\n", "such that for each $x \\in {\\mathcal X}$, $\\exists 1 y \\in {\\mathcal Y}$\n", "such that $(x, y) \\in f$.\n", "(The notation $\\exists 1 y$ means that there exists exactly one value of $y$.)\n", "The set ${\\mathcal X}$ is called the _domain_ of $f$ and ${\\mathcal Y}$ is called the co-domain of $f$.\n", "If the ${\\mathcal X}$-component of an element of $f$ is $x$, we denote the ${\\mathcal Y}$-component of that\n", "element of $f$ by $fx$ or $f(x)$, so that $(x, fx) \\in f$; we write $f: x \\mapsto y=f(x)$.\n", "The functions $f$ and $g$ are equal if they are the same subset of ${\\mathcal X} \\bigotimes {\\mathcal Y}$,\n", "which means that they have the same domain ${\\mathcal X}$, and $fx = gx$ $\\forall x \\in {\\mathcal X}$.\n", "\n", "Let $A \\subset {\\mathcal X}$.\n", "The _image_ of $A$ under $f$ is\n", "$$\n", "fA = f(A) \\equiv \\{ y \\in {\\mathcal Y} : (x, y) \\in f \\mbox{ for some } x \\in A\\}.\n", "$$\n", "More colloquially, we would write this as\n", "$$\n", "fA = \\{ y \\in {\\mathcal Y} : f(x) = y \\mbox{ for some } x \\in A \\}.\n", "$$\n", "If $f{\\mathcal X}$ is a proper subset of ${\\mathcal Y}$, $f$ is _into_.\n", "If $f{\\mathcal X} = {\\mathcal Y}$, $f$ is _onto_.\n", "For $B \\subset {\\mathcal Y}$, the _inverse image of $B$ under $f$_ or\n", "_pre-image of $B$ under $f$_ is\n", "$$\n", "f^{-1}B \\equiv \\{ x \\in {\\mathcal X} : fx \\in B \\}.\n", "$$\n", "Similarly, $f^{-1}y \\equiv \\{ x \\in {\\mathcal X} : fx = y \\}$\n", "If $\\forall y \\in {\\mathcal Y}$, $\\# \\{ f^{-1} y \\} \\le 1$, $f$ is _one-to-one_ (1:1).\n", "If $f$ is one-to-one and onto, i.e., if $\\forall y \\in {\\mathcal Y}$, $\\# \\{ f^{-1}y \\} = 1$,\n", "$f$ is a _bijection_.\n", "\n", "
\n", "### Exercise 1\n", "\n", "+ Does $f^{-1}(fA) = A$?\n", "+ Does $f(f^{-1}B) = B$?\n", "+ Does $f^{-1}(C \\cap D) = f^{-1}C \\cap f^{-1} D$?\n", "+ Does $f(C \\cap D) = fC \\cap fD$?\n", "+ Does $f(C \\cup D) = fC \\cup fD$?\n", "\n", "
\n", "\n", "## Groups\n", "
\n", "### Definition\n", "A _group_ is an ordered pair $({\\mathcal G}, \\times)$, where ${\\mathcal G}$ is\n", "a collection of objects (the elements of the group) and $\\times$ is a mapping\n", "from ${\\mathcal G} \\bigotimes {\\mathcal G}$ onto ${\\mathcal G}$,\n", "$$\n", "\\times : {\\mathcal G} \\bigotimes {\\mathcal G} \\rightarrow {\\mathcal G} \\\\\n", "(a, b) \\mapsto a \\times b,\n", "$$\n", "satisfying the following axioms:\n", "\n", "+ $\\exists e \\in {\\mathcal G}$ s.t. $\\forall a \\in {\\mathcal G}$, $e \\times a = a$.\n", "The element $e$ is called the _identity_.\n", "+ For each $a \\in {\\mathcal G}$, $\\exists a^{-1} \\in {\\mathcal G}$ s.t. $a^{-1}\\times a = e$.\n", "(Every element has an inverse.)\n", "+ If $a, b, c \\in {\\mathcal G}$, then $a \\times (b \\times c) = (a \\times b)\\times c$.\n", "(The group operation is associative.)\n", "\n", "If, in addition, for every $a, b \\in {\\mathcal G}$, $a \\times b = b \\times a$ (if the group\n", "operation commutes), we say that $({\\mathcal G}, \\times)$ is an _Abelian group_\n", "or _commutative group_.\n", "
\n", "\n", "Examples of groups include the real numbers together with ordinary addition, $({\\mathbf R}, +)$;\n", "the real numbers other than zero together with ordinary multiplication,\n", "$(\\Re \\setminus \\{0\\}, \\times)$; the rational numbers together with ordinary addition,\n", "$({\\mathbf Q}, +)$; and the integers 0 to $p-1$, $p$ prime, together with addition modulo $p$,\n", "$( \\{0, 1, \\ldots, p-1\\}, +)$.\n", "\n", "
\n", "### Exercise 2\n", "\n", "+ Show that $\\forall a \\in {\\mathcal G}$, $a \\times a^{-1} = e$.\n", "(The inverse from the left is also the\n", "inverse from the right; equivalently, $(a^{-1})^{-1} = a$.)\n", "+ Show that $\\forall a \\in {\\mathcal G}$, $ae = a$. (The identity from the left is\n", "also the identity from the right.)\n", "\n", "
\n", "\n", "## Fields\n", "\n", "
\n", "### Definition\n", "An ordered triple $({\\mathcal F}, \\times, +)$ is a _field_ if ${\\mathcal F}$ is\n", "a collection of objects and $\\times$ and $+$ are operations on ${\\mathcal F} \\times {\\mathcal F}$\n", "such that\n", "\n", "+ ${\\mathcal F}$ is an Abelian group under the operation $+$, with identity 0.\n", "+ ${\\mathcal F} \\setminus \\{0\\}$ is an Abelian group under the operation $\\times$,\n", "with identity $1$.\n", "+ $\\times$ is distributive over $+$. I.e., for any $a$, $b$, $c \\in {\\mathcal F}$\n", "$a \\times (b+c) = a \\times b + a \\times c$ and\n", "$(a+b) \\times c = a\\times c + b \\times c$.\n", "\n", "The additive inverse of $a$ is denoted $-a$; the multiplicative inverse of $a$ is\n", "$a^{-1} = 1/a$.\n", "
\n", "\n", "Examples:\n", "$({\\mathbf R}, \\times, +)$, where $\\times$ is ordinary (real) multiplication and $+$ is ordinary\n", "(real) addition.\n", "The complex numbers ${\\mathbf C}$, with complex multiplication and addition.\n", "\n", "These (and the extended reals) are the only fields we will use.\n", "\n", "## Arithmetic with $\\infty$\n", "We shall use the following conventions:\n", "\n", "+ $0 \\cdot \\infty = \\infty \\cdot 0 = 0$\n", "+ $x + \\infty = \\infty + x = \\infty$, $x \\in {\\mathbf R}$\n", "+ $x \\cdot \\infty = \\infty \\cdot x = \\infty$, $x > 0$\n", "\n", "With these conventions, $([-\\infty, \\infty], \\times, +)$ is a field." ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Linear Vector Spaces\n", "
\n", "### Definition\n", "A linear vector space is an ordered quadruple\n", "$\\left ( ({\\mathcal F}, \\times, +_1), {\\mathcal X}, \\cdot, +_2 \\right )$\n", "where $({\\mathcal F}, \\times, +_1)$ is a field, ${\\mathcal X}$ is a set of objects (the vectors),\n", "and $\\cdot$ is an operation on ${\\mathcal F} \\bigotimes {\\mathcal X}$ and $+_2$ is\n", "an operation on ${\\mathcal X} \\bigotimes {\\mathcal X}$ such that:\n", "\n", "+ $({\\mathcal X}, +_2)$ is an Abelian group with identity 0 (the zero vector)\n", "+ $\\cdot : {\\mathcal F} \\times {\\mathcal X} \\rightarrow {\\mathcal X}$; $(\\alpha, x) \\mapsto \\alpha \\cdot x$\n", "such that:\n", "\n", "+ If $1$ is the multiplicative identity on ${\\mathcal F}$, $1 \\cdot x = x$\n", "$\\forall x \\in {\\mathcal X}$.\n", "+ $\\alpha \\cdot (x +_2 y) = \\alpha \\cdot x +_2 \\alpha \\cdot y$\n", "$\\forall \\alpha \\in {\\mathcal F}$, $x, y \\in {\\mathcal X}$. (distribution)\n", "+ $\\alpha \\cdot (\\beta \\cdot x) = (\\alpha \\times \\beta) \\cdot x$\n", "$\\forall \\alpha, \\beta \\in {\\mathcal F}, x \\in {\\mathcal X}$. (association)\n", "+ $(\\alpha +_1 \\beta) \\cdot x = \\alpha \\cdot x +_2 \\beta \\cdot x$,\n", "$\\forall \\alpha, \\beta \\in {\\mathcal F}$, $x \\in {\\mathcal X}$. (distribution)\n", "\n", "\n", "
\n", "We rarely distinguish notationally between $+_1$ and $+_2$, between $\\times$ and $\\cdot$,\n", "or between the additive identity $0$ of the field ${\\mathcal F}$ and the identity $0$\n", "of the Abelian group ${\\mathcal X}$.\n", "Sometimes, the multiplication symbols are omitted; e.g., $\\alpha \\cdot x = \\alpha x$.\n", "Usually, one just calls ${\\mathcal X}$ a linear vector space (LVS), omitting mention of the other elements of\n", "the quadruple.\n", "For our purposes, ${\\mathcal F}$ is almost always ${\\mathbf R}$.\n", "In that case, ${\\mathcal X}$ is called a real linear vector space (RLVS).\n", "\n", "
\n", "### Definition\n", "A _functional_ on a linear vector space is a mapping from the vectors ${\\mathcal X}$\n", "to the field ${\\mathcal F}$ of the linear vector space.\n", "
\n", "\n", "
\n", "### Definition\n", "A _linear combination_ of $\\{ x_j\\}_{j=1}^n \\subset {\\mathcal X}$, ${\\mathcal X}$ a linear\n", "vector space, is a vector $x = \\sum_{j=1}^n \\alpha_j x_j$, where\n", "$\\{ \\alpha_j\\}_{j=1}^n \\subset {\\mathcal F}$.\n", "\n", "A set $\\{x_\\alpha : \\alpha \\in A \\} \\subset {\\mathcal X}$ is _linearly dependent_\n", "if there exist constants $\\{ \\beta_\\alpha : \\alpha \\in A \\} \\subset {\\mathcal F}$, not all\n", "equal to zero, such that $\\sum_{\\alpha \\in A} \\beta_\\alpha x_\\alpha = 0$.\n", "A set is _linearly independent_ if it is not linearly dependent.\n", "
\n", "\n", "
\n", "### Definition\n", "The _span_ of $\\{ x_j\\}_{j=1}^n \\subset {\\mathcal X}$, ${\\mathcal X}$ a linear\n", "vector space, is the set of all linear combinations of $\\{ x_j\\}_{j=1}^n$.\n", "
\n", "\n", "\n", "
\n", "### Definition\n", "A _subspace_ of a linear vector space ${\\mathcal X}$ is a subset of ${\\mathcal X}$ that is\n", "a linear vector space with the same field ${\\mathcal F}$ and operations $+_1, \\cdot, +_2, \\times$\n", "as ${\\mathcal X}$.\n", "
\n", "\n", "The span of a set of elements of ${\\mathcal X}$ is a subspace of ${\\mathcal X}$.\n", "\n", "\n", "\n", "
\n", "### Definition\n", "Let ${\\mathcal X}$ be a linear vector space, $A, B \\subset {\\mathcal X}$, $x \\in x$, $\\alpha \\in {\\mathcal F}$.\n", "\n", "+ $\\alpha A \\equiv \\{ \\alpha a : a \\in A\\}$\n", "+ $-A \\equiv \\{ -1 \\cdot a : a \\in A \\}$\n", "+ $x + A \\equiv \\{x+a : a \\in A \\} = A + x$\n", "+ $x - A \\equiv \\{ x - a: a \\in A \\}$\n", "+ $A + B \\equiv \\{ a + b: a \\in A, b \\in B \\}$\n", "\n", "
\n", "\n", "
\n", "### Exercise 3\n", "\n", "+ Does $x - A = -(A - x)$?\n", "+ Does $A + A = 2A$?\n", "+ When does $A - A = 2A$?\n", "\n", "If you cannot find necessary conditions, give sufficient ones.\n", "
\n", "\n", "
\n", "### Definition\n", "A mapping $M$ from a linear vector space ${\\mathcal X}$ into a linear vector space ${\\mathcal Y}$\n", "with the same field ${\\mathcal F}$ is _linear_ iff\n", "$$\n", "M(\\alpha x + \\beta y) = \\alpha M x + \\beta M y, \\;\\; \\forall \\alpha, \\beta \\in {\\mathcal F},\n", "\\;\\;x, y \\in {\\mathcal X}.\n", "$$\n", "
\n", "\n", "_Example_: Let ${\\mathcal X} = \\Re^n$, ${\\mathcal Y} = \\Re^m$, ${\\mathcal F} = \\Re$,\n", "and $M$ be an $m$ by $n$ matrix.\n", "\n", "
\n", "### Definition\n", "Let ${\\mathcal X}$ be a real linear vector space. A set $C \\subset {\\mathcal X}$ is _convex_\n", "iff $\\alpha C + (1-\\alpha)C \\subset C$ $\\forall \\alpha \\in [0, 1]$.\n", "A set $B \\subset {\\mathcal X}$ is _balanced_\n", "iff $\\alpha B \\subset B$ $\\forall \\alpha$ with $|\\alpha| \\le 1$.\n", "
\n", "Note that this requires us to define $|\\cdot |$ on the field ${\\mathcal F}$.\n", "For ${\\mathcal F} = {\\mathbf R}$, let $|\\cdot |$ be absolute value; for \n", "${\\mathcal F} = {\\mathbf C}$, let $|\\cdot|$ be\n", "the modulus.\n", "\n", "
\n", "### Exercise 4\n", "Show that if $C$, $D \\subset {\\mathcal X}$ are convex, then\n", "\n", "+ $\\alpha C$ is convex, $\\alpha \\in {\\mathbf R}$\n", "+ $C \\cap D$ is convex.\n", "\n", "
\n", "\n", "\n", "
\n", "### Definition\n", "A linear combination $\\sum_j \\beta_j x_j$ of elements $\\{x_j\\}$\n", "of a linear vector space is a _convex combination_ if\n", "\n", "+ $\\{ \\beta_j \\} \\subset {\\mathbf R}$,\n", "+ $\\beta_j \\ge 0$, $\\forall j$, and\n", "+ $\\sum_j \\beta_j = 1$.\n", "\n", "
\n", "\n", "\n", "
\n", "### Definition\n", "The _convex hull_ of a set $A \\subset {\\mathcal X}$ is the intersection of\n", "all convex sets that contain $A$.\n", "Equivalently, it is the set of all convex combinations of elements of $A$.\n", "If $C$ is convex, a point $x \\in C$ is an _extreme point_ of $C$ if $x$\n", "cannot be written as a convex combination of a subset of $C$ unless that subset\n", "contains $x$.\n", "A _polytope_ is the convex hull of a finite collection of points.\n", "
\n", "\n", "
\n", "### Definition\n", "A set $\\{ x_\\alpha : \\alpha \\in A \\}$ is a _basis_ for a linear vector\n", "space ${\\mathcal X}$ if every $x \\in {\\mathcal X}$ has a __unique__ representation\n", "$x = \\sum_{\\alpha \\in A} \\beta_\\alpha x_\\alpha$ with\n", "$\\{ \\beta_\\alpha: \\alpha \\in A\\} \\subset {\\mathcal F}$.\n", "If ${\\mathcal X}$ has a basis with $n$ elements, $n \\in {\\mathbf N}$, ${\\mathcal X}$ is _finite-dimensional_\n", "and the dimension of ${\\mathcal X}$, $\\dim({\\mathcal X})$, is $n$.\n", "If ${\\mathcal X}$ is not finite-dimensional, it is _infinite-dimensional_.\n", "
\n", "\n", "It is a theorem that all bases for ${\\mathcal X}$ have the same cardinality.\n", "\n", "\n", "
\n", "### Exercise 5\n", "Show that if ${\\mathcal Y}$ is a subspace of ${\\mathcal X}$ and $\\dim({\\mathcal X})=n$,\n", "then $\\dim({\\mathcal Y}) \\le n$.\n", "
\n", "\n", "
\n", "### Definition\n", "A _Hamel basis_ for a linear vector space ${\\mathcal X}$ is a maximal linearly independent\n", "subset of ${\\mathcal X}$.\n", "
\n", "\n", "## Normed linear vector spaces\n", "\n", "
\n", "### Definition\n", "A _norm_ on a linear vector space ${\\mathcal X}$ is a function $\\| \\cdot \\|: {\\mathcal X} \\rightarrow \\Re^+$ such\n", "that\n", "\n", "1. $\\forall x \\in {\\mathcal X}$, $\\|x \\| \\ge 0$ (nonnegative)\n", "2. $\\|x \\| = 0$ iff $x = 0$ (positive definite)\n", "3. $\\| \\alpha x \\| = | \\alpha| \\|x \\|$ for $\\alpha \\in \\Re$ (scalar homogeneity)\n", "4. $\\forall x, y \\in {\\mathcal X}$, $\\|x + y \\| \\le \\|x\\| + \\|y\\|$ (triangle inequality)\n", "\n", "A linear vector space endowed with a norm is called a _normed linear vector space_.\n", "
\n", "\n", "
\n", "### Definition\n", "A sequence $x_j$, $j=1, 2, \\ldots$ of elements of a normed linear vector space ${\\mathcal X}$ is a _Cauchy sequence_\n", "if $\\|x_i - x_j \\| \\rightarrow 0$ as $i, j \\rightarrow \\infty$.\n", "
\n", "\n", "
\n", "### Definition\n", "A normed linear vector space ${\\mathcal X}$ is _(sequentially) complete_ \n", "if every Cauchy sequence of elements of ${\\mathcal X}$ converges to an element of ${\\mathcal X}$.\n", "\n", "\n", "## Metric spaces\n", "\n", "\n", "## Hilbert Spaces\n", "\n", "### Inner products\n", "\n", "An (real) inner product on a linear vector space ${\\mathcal X}$ is a mapping $\\langle \\cdot , \\cdot \\rangle$\n", "from ${\\mathcal X} \\bigotimes {\\mathcal X}$\n", "into $\\Re$ such that $\\forall x, y \\in {\\mathcal X}$ and all $\\alpha \\in \\Re$,\n", "\n", "+ $\\langle x, y \\rangle = \\langle y, x \\rangle$\n", "+ $\\langle \\alpha x, y \\rangle = \\alpha \\langle y, x \\rangle$\n", "+ $\\langle x, x \\rangle \\ge 0$\n", "+ $\\langle x, x \\rangle = 0$ iff $x = 0$\n", "\n", "A (real) _Hilbert space_ is an inner product space that is sequentially complete:\n", "every Cauchy sequence has a limit.\n", "\n", "### Examples\n", "\n", "$\\Re$ with multiplication.\n", "\n", "$\\Re^n$ with the usual dot product, Euclidean norm.\n", "\n", "$\\ell^2$: infinite sequences with bounded sum of squares, $\\langle x, y \\rangle = \\sum_{j=1}^\\infty x_j y_j$.\n", "\n", "$L^2$, space of Lebesgue square-integrable functions, inner product $\\langle x, y \\rangle = \\int x(t)y(t)\\mu(dt)$.\n", "\n", "
\n", "### Exercise\n", "Show that $\\Re^n$ with the usual dot product is a Hilbert space over $\\Re$.\n", "
\n", "\n", "
\n", "### Definition\n", "The _norm_ of a linear functional $L$ on a normed linear vector space ${\\mathcal X}$ is\n", "$$ \\| L \\| \\equiv \\sup_{x \\in {\\mathcal X}}\\frac{|Lx|}{\\|x\\|} = \\sup_{x \\in {\\mathcal X} : \\|x\\|=1} |Lx|.$$\n", "
\n", "\n", "Hilbert spaces are self-dual: every bounded linear functional on a Hilbert space can be written as the\n", "inner product with an element of the space (this is the Riesz Hilbert space representation theorem).\n", "Moreover, there is an isometric isomorphism between ${\\mathcal H}^*$, normed dual of ${\\mathcal H}$, and ${\\mathcal H}$ itself:\n", "Every bounded linear functional on ${\\mathcal H}$ can be written as the inner product\n", "with an element of ${\\mathcal H}$; every element of ${\\mathcal H}$ defines a bounded linear functional on\n", "${\\mathcal H}$; and the operator norm of the linear functional and Hilbert-space norm of the element are equal.\n", "\n", "Finite-dimensional subspaces of Hilbert spaces are closed.\n", "\n", "### Orthogonality and orthogonal decompositions\n", "\n", "Two elements $x, y$ of a Hilbert space ${\\mathcal H}$ are _orthogonal_ if $\\langle x, y \\rangle = 0$;\n", "then we write $x \\perp y$.\n", "\n", "An element $x$ of a Hilbert space ${\\mathcal H}$ is orthogonal to a subset ${\\mathcal Y} \\subset {\\mathcal H}$\n", "iff $x \\perp y$ $\\forall y \\in {\\mathcal Y}$; then we write $x \\perp {\\mathcal Y}$.\n", "\n", "Let ${\\mathcal Y}$ be a closed subspace of Hilbert space ${\\mathcal H}$.\n", "Then each $x \\in {\\mathcal X}$ has a unique decomposition $x = x_\\| + x_\\perp$ where\n", "$x_\\| \\in {\\mathcal Y}$ and $x_\\perp \\perp {\\mathcal Y}$.\n", "\n", "__Proof.__\n", "[TO DO!]\n", "\n", "
\n", "### Definition\n", "An _orthonormal basis_ for a Hilbert space ${\\mathcal H}$ is a collection $\\{ x_\\alpha \\} \\subset {\\mathcal H}$\n", "such that $\\langle x_\\alpha, x_\\beta \\rangle = 0, \\alpha \\ne \\beta; 1 \\alpha=\\beta$.\n", "\n", "\n", "### The Gram Schmidt procedure\n", "[TO DO!]\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Partial order and convexity\n", "
\n", "### Definition\n", "A relation $\\le$ is a _partial order_ on a set ${\\mathcal X}$ if for all $x, y, z \\in {\\mathcal X}$,\n", "\n", "+ $x \\le x$, $\\forall x \\in {\\mathcal X}$\n", "+ if $x \\le y$ and $y \\le x$, then $x = y$\n", "+ if $x \\le y$ and $y \\le z$ then $x \\le z$\n", "\n", "If, in addition,\n", "for any $x$, $y \\in {\\mathcal X}$, either $x \\le y$ or $y \\le x$ (or both), then\n", "$\\le$ is an _order_.\n", "
\n", "\n", "The usual $\\le$ is an order on ${\\mathbf R}$.\n", "Set inclusion gives an order among the power set (set of all subsets) of a given set:\n", "$x \\le y$ if $x \\subset y$.\n", "One can think of orders as subsets of ${\\mathcal X} \\bigotimes {\\mathcal X}$ or as mappings from\n", "${\\mathcal X} \\bigotimes {\\mathcal X} \\rightarrow \\{0, 1\\}$.\n", "Henceforth, we take ${\\mathbf R}$ to be ordered by $\\le$.\n", "\n", "
\n", "### Definition\n", "A set $C \\subset {\\mathcal X}$, ${\\mathcal X}$ a linear vector space, is a _cone_ with\n", "vertex $0$ iff $\\alpha C \\subset C \\;\\;\\forall \\alpha \\ge 0$.\n", "A set $C \\subset {\\mathcal X}$, ${\\mathcal X}$ a linear vector space, is a _cone with vertex\n", "$p$_ if $C = p + C_0$, where $C_0$ is a cone with vertex $0$.\n", "
\n", "\n", "__Examples.__\n", "The positive numbers are a cone in $\\Re$.\n", "The positive orthant is a cone in $\\Re^n$.\n", "\n", "
\n", "### Definition\n", "Let ${\\mathcal X}$ be a LVS and let $P \\subset {\\mathcal X}$ be a convex cone with vertex $0$.\n", "For any $x, y \\in {\\mathcal X}$, we write $x \\le y$ (w.r.t. $P$) if $y - x \\in P$.\n", "The cone $P$ is called the _positive cone_ in ${\\mathcal X}$.\n", "$N \\equiv -P$ is the _negative cone_.\n", "We write $x \\ge y$ if $y - x \\in N$ (equivalently, if $x - y \\in P$).\n", "
\n", "\n", "__Examples.__ In ${\\mathbf R}$, $[0, \\infty)$ is the usual positive cone.\n", "In ${\\mathbf R}^n$, the positive orthant ($n$-tuples whose components are all non-negative)\n", "is the usual positive cone.\n", "The set of non-negative functions and the set of monotone functions can form\n", "the positive cones in some function spaces.\n", "\n", "Note: the relation $\\le$ defined above is almost--but not quite--a partial order on\n", "the linear vector space ${\\mathcal X}$: it does not satisfy the second axiom.\n", "If $P$ satisfies $(x \\in P \\mbox{ and } -x \\in P)$ implies $x = 0$, then the relation $\\le$\n", "is a partial order.\n", "\n", "
\n", "### Exercise 6\n", "For $x$, $y \\in {\\mathbf R}^n$, define\n", "$$\n", "R(x, y) = \\left \\{\n", "\\begin{array}{ll}\n", "\\mbox{true}, & \\max_{j=1}^n (y_j - x_j) \\ge 0 \\cr\n", "\\mbox{false}, & \\mbox{otherwise}.\n", "\\end{array}\n", "\\right .\n", "$$\n", "Does $R(\\cdot, \\cdot)$ define a partial order on ${\\mathbf R}^n$? Why or why not?\n", "
\n", "\n", "
\n", "### Definition\n", "Let ${\\mathcal X}$ and ${\\mathcal Y}$ be linear vector spaces, let $P$ be the positive\n", "cone on ${\\mathcal Y}$, and let $T: {\\mathcal X} \\rightarrow {\\mathcal Y}$ have domain $D$.\n", "$T$ is _convex_ if\n", "\n", "+ $D$ is a convex subset of ${\\mathcal X}$\n", "+ $T(\\alpha x_1 + (1-\\alpha) x_2) \\le \\alpha T x_1 + (1-\\alpha)Tx_2$,\n", "$\\forall x_1, x_2 \\in D$, and all $\\alpha \\in [0, 1]$.\n", "\n", "
\n", "\n", "Note that convexity depends on the definition of the positive cone $P$ in ${\\mathcal Y}$!\n", "Convexity and related (such as pseudoconvexity\n", "and quasiconvexity), play a crucial role in optimization theory.\n", "(A mapping $T$ is _quasiconvex_ if its domain is convex and\n", "$T(\\alpha x_1 + (1-\\alpha) x_2) \\le \\max(T x_1, Tx_2)$,\n", "$\\forall x_1, x_2 \\in D$, and all $\\alpha \\in [0, 1]$.)\n", "\n", "\n", "
\n", "### Definition\n", "Let $T$ be a subset of a linear vector space ${\\mathcal X}$.\n", "The _cone generated by $T$_ or _star of $T$_\n", "is the set\n", "$$\n", "C(T) \\equiv \\{ \\alpha t: \\alpha \\in [0, \\infty), t \\in T \\} \\subset {\\mathcal X}.\n", "$$\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Projection Theorem\n", "The Projection Theorem is one of the most widely used results in statistics.\n", "[TO DO!]\n", "\n", "### Two convex optimization problems\n", "\n", "#### Distance from a point to a convex set in Hilbert space\n", "\n", "Let ${\\mathcal X}$ be a Hilbert space and let ${\\mathcal Y}$ be a subspace of ${\\mathcal X}$.\n", "Let $x \\in {\\mathcal X}$.\n", "Consider the minimization problem\n", "$$\n", "\\min_{y \\in {\\mathcal Y}} \\| x - y \\|.\n", "$$\n", "Suppose $y_0$ is a solution to the problem.\n", "Then $x - y_0 \\perp {\\mathcal Y}$.\n", "\n", "#### Distance from a point to a subspace in a Euclidean space\n", "[TO DO!]\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## General-purpose Inequalities\n", "### The Arithmetic-Geometric Mean Inequality\n", "Let $\\{ a_j \\}_{j=1}^n$ and $\\{q_j\\}_{j=1}^n$ be sets of nonnegative numbers with\n", "$\\sum_j q_j = 1$.\n", "Then\n", "$$\n", "\\prod_{j=1}^n a_j^{q_j} \\le \\sum_{j=1}^n q_j a_j.\n", "$$\n", "\n", "### The Triangle Inequality and Generalizations\n", "[TO DO!]\n", "\n", "### Cauchy's Inequality\n", "[TO DO!]\n", "\n", "### The Cauchy-Schwartz Inequality\n", "If ${\\mathcal H}$ is a Hilbert space and $x , y \\in {\\mathcal H}$, then\n", "$$\n", "\\left |\\langle x, y \\rangle \\right | \\le \\|x\\| \\| y \\|.\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Probability Inequalities\n", "This follows Lugosi (2006) rather closely.\n", "\n", "#### A Helpful Identity: the tail-integral formula for expectation\n", "If $X$ is a nonnegative real-valued random variable,\n", "$$\n", "{\\mathbb E} X = \\int_0^\\infty {\\mathbb P} \\{X \\ge x\\} dx\n", "$$\n", "\n", "#### Jensen's Inequality\n", "If $\\phi$ is a convex function from ${\\mathcal X}$ to $\\Re$, then $\\phi({\\mathbb E} X) \\le {\\mathbb E} \\phi(X)$.\n", "\n", "#### Markov's, Chebychev's, and related inequalities\n", "From the tail-integral formula,\n", "$$\n", "{\\mathbb E} X \\ge \\int_0^t {\\mathbb P} \\{X \\ge x\\} dx \\ge t {\\mathbb P} \\{X \\ge t \\},\n", "$$\n", "which yields _Markov's Inequality_:\n", "$$\n", "{\\mathbb P} \\{ X \\ge t \\} \\le \\frac{{\\mathbb E} X}{t}.\n", "$$\n", "\n", "Moreover, for any strictly monotonic function $f$ and nonnegative $X$,\n", "we have the _Generalized Markov Inequality_:\n", "$$\n", "{\\mathbb P} \\{ X \\ge t \\} = {\\mathbb P} \\{ f(X) \\ge f(t) \\} \\le \\frac{{\\mathbb E} f(X)}{f(t)}.\n", "$$\n", "In particular, for any real-valued $X$ and real $q > 0$,\n", "$| X - {\\mathbb E} X |$ is a nonnegative\n", "random variable and $f(x) = x^q$ is strictly monotonic, so\n", "$$\n", "{\\mathbb P} \\{| X - {\\mathbb E} X | \\ge t \\} = {\\mathbb P} \\{ | X - {\\mathbb E} X |^q \\ge t^q \\} \\le\n", "\\frac{{\\mathbb E} | X - {\\mathbb E} X |^q}{t^q}.\n", "$$\n", "_Chebychev's inequality_ is a special case:\n", "$$\n", "{\\mathbb P} \\{ | X - {\\mathbb E} X |^2 \\ge t^2 \\} \\le \\frac{{\\mathbb E} | X - {\\mathbb E} X |^2}{t^2}\n", "= \\frac{{\\mbox{Var}} X}{t^2}.\n", "$$\n", "\n", "#### The Chernoff Bound\n", "Apply the Generalized Markov Inequality with $f(x) = e^{sx}$ for positive $s$\n", "to obtain the _Chernoff Bound_:\n", "$$\n", "{\\mathbb P} \\{ X \\ge t \\} = {\\mathbb P} \\{ e^{sX} \\ge e^{st} \\} \\le \\frac{{\\mathbb E} e^{sX}}{e^{st}}\n", "$$\n", "for all $s$.\n", "For particular $X$, one can optimize over $s$.\n", "\n", "\n", "#### Hoeffding's Inequality\n", "Let $\\{ X_j \\}_{j=1}^n$ be independent, and define\n", "$S_n \\equiv \\sum_{j=1}^n X_j$.\n", "Applying the Chernoff Bound yields\n", "$$\n", "{\\mathbb P} \\{ S_n - {\\mathbb E} S_n \\ge t \\} \\le e^{-st} {\\mathbb E} e^{sS_n} =\n", "e^{-st} \\prod_{j=1}^n e^{s(X_j - E X_j)}.\n", "$$\n", "Hoeffding bounds the moment generating function for a bounded random variable\n", "$X$:\n", "If $a \\le X \\le b$ with probability 1, then\n", "$$\n", "{\\mathbb E} e^{sX} \\le e^{s^2 (b-a)^2/8},\n", "$$\n", "from which follows _Hoeffding's tail bound_.\n", "\n", "If $\\{X_j\\}_{j=1}^n$ are independent and ${\\mathbb P} \\{a_j \\le X_j \\le b_j\\} = 1$,\n", "then\n", "$$ \n", "{\\mathbb P} \\{ S_n - {\\mathbb E} S_n \\ge t \\} \\le e^{-2t^2/\\sum_{j=1}^n (b_j - a_j)^2}\n", "$$\n", "and\n", "$$ \n", "{\\mathbb P} \\{ S_n - {\\mathbb E} S_n \\le -t \\} \\le e^{-2t^2/\\sum_{j=1}^n (b_j - a_j)^2}\n", "$$\n", "\n", "#### Hoeffding's Other Inequality\n", "Suppose $f$ is a convex, real function and ${\\mathcal X}$ is a finite set.\n", "Let $\\{X_j \\}_{j=1}^n$ be a simple random sample from ${\\mathcal X}$ and\n", "let $\\{Y_j \\}_{j=1}^n$ be an iid uniform random sample (with replacement) from ${\\mathcal X}$.\n", "Then\n", "$$ \n", "{\\mathbb E} f \\left ( \\sum_{j=1}^n X_j \\right ) \\le {\\mathbb E} f \\left ( \\sum_{j=1}^n Y_j \\right ).\n", "$$\n", "\n", "#### Bernstein's Inequality\n", "Suppose $\\{X_j \\}_{j=1}^n$ are independent with ${\\mathbb E} X_j = 0$ for all $j$,\n", "${\\mathbb P} \\{ | X_j | \\le c\\} = 1$,\n", "$\\sigma_j^2 = {\\mathbb E} X_j^2$ and $\\sigma = \\frac{1}{n} \\sum_{j=1}^n \\sigma_j^2$.\n", "Then for any $\\epsilon > 0$,\n", "$$\n", "{\\mathbb P} \\{ S_n/n > \\epsilon \\} \\le e^{-n \\epsilon^2 / 2(\\sigma^2 + c\\epsilon/3)}.\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "Next chapter: [Linear Algebra](linAlg.ipynb)" ] } ], "metadata": { "kernelspec": { "display_name": "R", "language": "", "name": "ir" }, "language_info": { "codemirror_mode": "r", "file_extension": ".r", "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "3.1.3" } }, "nbformat": 4, "nbformat_minor": 0 }