{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "

cs1001.py, Tel Aviv University, Spring 2020

\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Recitation 4" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Complexity\n", "\n", "We analyzed the time complexity of different solutions for a given problem.\n", "Then, we proved several claims using the formal definition of Big $O(\\cdot)$.\n", "\n", "#### Takeaways:\n", "\n", "* Analyzing the running time complexity of a function can teach us how the running time changes as a function of the input size\n", "* Two functions that have the same running time complexity may differ in their running time (one can be more efficient than the other)\n", "* Big $O(\\cdot)$ notation effectively \"hides\" additive and multiplicative constants\n", "* When analyzing nested loops, we can sum over the iterations of the outer loop, where each summand is the number of inner-loop iterations performed in that iteration of the outer loop " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Problem: \n", "Given a natural number $p$, how many right triangles exist with integer side lengths whose perimeter is $p$?\n", "\n", "#### Formally:\n", "Given a natural number $p > 0$, how many triplets $(a,b,c)$ are there such that:\n", "
1. $a^2 + b^2 == c^2$\n", "2. $a + b + c == p$\n", "3. $a,b,c > 0$ are all integers
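\n", "\n", "For example, for $p = 12$ both orderings $(3,4,5)$ and $(4,3,5)$ satisfy conditions 1-3, even though they describe the same triangle. A quick check:\n", "\n", "```python\n", "from itertools import permutations\n", "\n", "p = 12\n", "# keep every ordering of the sides 3, 4, 5 that satisfies conditions 1-3\n", "matches = [(a, b, c) for (a, b, c) in permutations((3, 4, 5))\n", "           if a**2 + b**2 == c**2 and a + b + c == p and a > 0 and b > 0 and c > 0]\n", "print(matches)   # [(3, 4, 5), (4, 3, 5)]\n", "```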
\n", "In order to avoid counting the same triangle more than once, we require that $0 < a < b < c$.\n", "Note that:\n", "
1. $a \neq b$ and $b \neq c$\n", "2. The case $a = b$ cannot hold, as it would imply $2a^2 = c^2 \implies c = \sqrt{2} a$, so $c$ would not be an integer\n", "3. The condition $b < c$ is redundant: since $a,b,c > 0$ and $a^2 + b^2 == c^2$, we have $c^2 > b^2$ and hence $c > b$
\n", "\n", "\n", "\n", "\n", "When computing code complexity, we\n", "
1. define the entire content of the innermost loop as an “atomic operation” which takes constant time.\n", "2. describe the complexity as a function of $p$ (i.e., use $p$ as the “problem size” or “input size”).\n", "Note, however, that the formal definition of “problem size” for a numeric input is the number of bits needed to represent the input, i.e. $n = \log(p)$.\n", "Working with $p$ here is easier and more intuitive (see the note below).
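\n", "\n", "To see why this distinction matters: if $n \approx \log_2(p)$ is the number of bits needed to write $p$, then $p \approx 2^n$, so a running time that is linear in $p$ is exponential in the formal input size $n$, i.e. $O(p) = O(2^n)$.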
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Trivial solution (v1):\n", "\n", "Go over all triplets of numbers in the relevant range, count only those where the above conditions hold.\n", "\n", "There are $p-1$ options for each variable $a,b,c$, so we need $(p-1)^3\\approx p^3$ iterations and we can check each triplet in constant time.\n", "\n", "Complexity: $O(p^3)$" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "def num_of_triangles_v1(p):\n", " cnt = 0\n", " for a in range(1, p):\n", " for b in range(1, p):\n", " for c in range(1, p):\n", " if a < b and a + b + c == p and a**2 + b**2 == c**2:\n", " cnt += 1\n", " return cnt\n", " \n", " \n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Second version (v2):\n", "\n", "If we look at a pair of values for $a, b$, we can compute $c$ directly instead of going over all possible $c$ values in $\\{1,2,\\ldots,p-1\\}$.\n", "\n", "We now use two nested loops instead of three (and omit the condition $a + b + c = p$). \n", "\n", "There are $(p-1)^2\\approx p^2$ iterations, each one is again computed in constant time.\n", "\n", "Complexity: $O(p^2)$" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def num_of_triangles_v2(p):\n", " cnt = 0\n", " for a in range(1, p):\n", " for b in range(1, p):\n", " c = p - a - b\n", " if a < b and a**2 + b**2 == c**2 and c > 0:\n", " cnt += 1\n", " return cnt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Third version (v3):\n", "\n", "Since we require $a < b$, in each iteration of the outer loop for $a$ we can define a lower bound for $b$, i.e. - in each iteration we need to check $b$ values between $a+1,\\ldots, p-1$ instead of $1,\\ldots,p-1$.\n", "\n", "The loops are now dependent and, therefore, to compute the number of atomic operations, we take a sum over $a$ of the number of $b$ values tested. \n", "\n", "In the $i$th iteration of $a$ (that is, for $a=i$) we need to check $p - 1 - i = p - (1 + i)$ values of $b$, so the total number of iterations is: $$\\sum_{a = 1}^{p-1} (p - (1 + a)) = \\sum_{i = 0}^{p-2} i = \\frac{(p-1)(p-2)}{2} $$\n", "\n", "Complexity: $O(p^2)$." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "def num_of_triangles_v3(p):\n", " cnt = 0\n", " for a in range(1, p):\n", " for b in range(a + 1, p):\n", " c = p - a - b\n", " if a**2 + b**2 == c**2 and c > 0:\n", " cnt += 1\n", " return cnt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Fourth version (v4):\n", "\n", "We now further improve our efficiency by defining **upper bounds** for $a,b$. we use the same strategy as in v3 to count operations.\n", "\n", "First of all, since $a < b < c$ and $a + b + c = p$, clearly $a < p/3$. That is, the maximal possible value of $a$ is $p//3$ (we will iterate till $p//3 + 1$ in order to include $p//3$ in the range).\n", "\n", "\n", "Next, $b = p-a-c < p-a-b \\implies 2b < (p-a)$. Thus: $b < (p-a)/2$. 
That is, the maximal possible value of $b$ is $(p-a)//2$ (again, we iterate till $(p-a)//2 + 1$ in order to include $(p-a)//2$ in the range).\n", "\n", "\n", "The number of iterations is therefore: $$\\sum_{a = 1}^{p/3} \\left(\\left\\lfloor\\frac{p-a}{2}\\right\\rfloor + 1 - (a+1)\\right)\\approx \\sum_{a = 1}^{p/3} \\frac{p}{2} - \\frac{a}{2} - a =\\frac{p^2}{6} - \\frac{3}{2}\\sum_{a = 1}^{p/3}a = \\frac{p^2}{6} - \\frac{3}{2}(p/3 + 1)\\cdot \\frac{p}{6} = \\frac{p^2}{6} - \\frac{p^2}{12} - \\frac{p}{4} = \\frac{p^2}{12} - \\frac{p}{4}$$\n", "\n", "\n", "\n", "\n", "Complexity: $O(p^2)$." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "def num_of_triangles_v4(p):\n", " cnt = 0\n", " for a in range(1, p//3 + 1):\n", " for b in range(a + 1, (p - a)//2 + 1):\n", " c = p - a - b\n", " if a**2 + b**2 == c**2:\n", " cnt += 1\n", " return cnt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Fifth version (v5):\n", "\n", "We realize we have two equations in three variables, therefore there’s only a single free parameter here. \n", "\n", "$a+b+c=p \\implies c = p-a-b$\n", "\n", "Substitute $c$ with $p-a-b$ in $a^2+b^2=c^2$ to get $a^2 + b^2 = (p -a -b)^2$, open and isolate $b$ to get $b = \\frac{p^2-2ap}{2(p-a)}$.\n", " \n", "So what do we need to do? We loop only over $a$, but need to make sure that the resulting $b$ is integral, and that $a < b$. Note that we do not have to calculate $c$ here\n", "\n", "\n", "The number of iterations is $p/3$ and in each iteration we still do only a constant number of operations!\n", "\n", "Complexity: $O(p)$" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "def num_of_triangles_v5(p):\n", " cnt = 0\n", " for a in range(1, p//3 + 1):\n", " b = (p**2 - 2*p*a)/(2*(p - a))\n", " if int(b) == b and a < b:\n", " cnt += 1\n", " return cnt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Experiment:\n", "\n", "We compare the change in running time of each of the functions as $p$ increases twofold.\n", "\n", "As expected, when $p$ increases by 2 and for large enough $p$ (so that asymptotic computations are valid):\n", "\n", "the cubical version ($O(p^3)$) runs in time which is roughly $2^3 = 8$ times longer,\n", "\n", "the quadratic versions ($O(p^2)$) runs in time which is roughly $2^2 = 4$ times longer,\n", "\n", "and the linear version ($O(p)$) runs in time which is roughly $2$ times longer.\n", "\n" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "v1, p = 240 took 2.0098203190000277 secs\n", "v2, p = 240 took 0.07364331099995525 secs\n", "v3, p = 240 took 0.055745446999935666 secs\n", "v4, p = 240 took 0.013294021999968209 secs\n", "v5, p = 240 took 0.00012039499995353253 secs\n", "\n", "v1, p = 480 took 17.873038491999978 secs\n", "v2, p = 480 took 0.337060590999954 secs\n", "v3, p = 480 took 0.3156459970000469 secs\n", "v4, p = 480 took 0.044534067999961735 secs\n", "v5, p = 480 took 0.00019973799999206676 secs\n", "\n", "Since v5 is too fast for some machines to time, we check it for input size which is x100 and x200 bigger:\n", "v5, p = 24000 took 0.01722720300006131 secs\n", "v5, p = 48000 took 0.03454993299999387 secs\n" ] } ], "source": [ "import time\n", "\n", "def elapsed(expression, number = 1):\n", " ''' computes elapsed time for executing code\n", " number of times (default is 1 time). 
expression should\n", " be a string representing a Python expression. '''\n", " \n", " t1=time.perf_counter()\n", " for i in range(number):\n", " x = eval(expression)\n", " t2=time.perf_counter()\n", " return t2-t1\n", "\n", "\n", "print(\"v1, p = 240 took\",elapsed(\"num_of_triangles_v1(240)\"), \"secs\")\n", "print(\"v2, p = 240 took\",elapsed(\"num_of_triangles_v2(240)\"), \"secs\")\n", "print(\"v3, p = 240 took\",elapsed(\"num_of_triangles_v3(240)\"), \"secs\")\n", "print(\"v4, p = 240 took\",elapsed(\"num_of_triangles_v4(240)\"), \"secs\")\n", "print(\"v5, p = 240 took\",elapsed(\"num_of_triangles_v5(240)\"), \"secs\")\n", "print(\"\")\n", "print(\"v1, p = 480 took\",elapsed(\"num_of_triangles_v1(480)\"), \"secs\")\n", "print(\"v2, p = 480 took\",elapsed(\"num_of_triangles_v2(480)\"), \"secs\")\n", "print(\"v3, p = 480 took\",elapsed(\"num_of_triangles_v3(480)\"), \"secs\")\n", "print(\"v4, p = 480 took\",elapsed(\"num_of_triangles_v4(480)\"), \"secs\")\n", "print(\"v5, p = 480 took\",elapsed(\"num_of_triangles_v5(480)\"), \"secs\")\n", "print(\"\")\n", "print(\"Since v5 is too fast for some machines to time, we check it for input size which is x100 and x200 bigger:\")\n", "print(\"v5, p = 24000 took\",elapsed(\"num_of_triangles_v5(24000)\"), \"secs\")\n", "print(\"v5, p = 48000 took\",elapsed(\"num_of_triangles_v5(48000)\"), \"secs\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Big O notation\n", "\n", "Given two functions $f(n)$ and $g(n)$, we say that: $$f(n) = O(g(n))$$ \n", "And we sometimes ommit the indeterminate and simply sat that: $$f = O(g)$$\n", "If there exists a constant $c > 0$ and $n_{0}\\in \\mathbb{N}$ such that: $$\\forall n>n_0: |f(n)| \\leq c\\cdot|g(n)|$$ \n", " \n", " \n", "When $f,g$ are positive, an equivalent condition which is sometimes easier to check is that $$f(n) = O(g(n)) \\iff \\lim_{n \\to \\infty} \\frac{f(n)}{g(n)} < \\infty$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Time complexity hierarchy\n", " \n", "Let $n$ denote the size of the input and $c > 1$ denote a constant.\n", "Here are some common classes of functions we encounter:\n", "\n", "* Constant - $O(1)$\n", "* Logarithmic - $O(\\log(n))$\n", "* Poly-logarithmic - $O(\\log^c(n))$\n", "* Root/fractional - $O(n^{1/c})$\n", "* Linear - $O(n)$\n", "* Loglinear - $O(n \\log(n))$\n", "* Polynomial - $O(n^{c})$\n", "* Exponential - $O(c^{n})$\n", "\n", "See also this list on [Wikipedia](http://en.wikipedia.org/wiki/Time_complexity#Table_of_common_time_complexities).\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### $O(\\cdot)$ - Facts\n", "\n", "Transitivity:\n", "\n", "* If $f(n) = O(g(n))$ and $g(n) = O(h(n))$ then $f(n) = O(h(n))$\n", "\n", "Non-symmetry:\n", "\n", "* $f(n) = O(g(n))$ does not imply $g(n) = O(f(n))$ or $g(n) \\neq O(f(n))$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Prove or disprove\n", "1. $2^n = O(2^{n/2})$\n", " * Wrong, assume there exists $n_0, c>0$ such that for $n > n_0$ $2^n \\leq c \\cdot 2^{n/2}$. This implies $c \\geq 2^{n/2} \\to \\infty$ as $n \\to \\infty$ in contradiction to $c$ being constant.\n", "2. $\\log n! = O(n \\cdot \\log n)$ \n", " * True. Note that as $\\log(\\cdot)$ is monotone increasing and that $\\log (ab) = \\log a + \\log b$, thus: $\\log n! = \\log 1 + \\ldots + \\log n = \\sum_{i=1}^n \\log i \\leq \\sum_{i=1}^n \\log n = n \\cdot \\log n$\n", "3. 
If $f_1(n) = O(g_1(n)), f_2(n) = O(g_2(n))$ then $f_1(n) + f_2(n) = O(g_1(n)+g_2(n)) = O(\\max \\{g_1(n),g_2(n)\\})$\n", "    * True. Let $n_1, c_1$ and $n_2, c_2$ be the constants such that $n > n_1 \\implies f_1(n) \\leq c_1 g_1(n)$ and $n > n_2 \\implies f_2(n) \\leq c_2 g_2(n)$ and note that for $n_3 = n_1 + n_2$ we have that if $n > n_3$: $$f_1(n) + f_2(n) \\leq c_1 g_1(n) + c_2 g_2(n) \\leq (c_1 + c_2) (g_1(n) + g_2(n)) \\leq 2 (c_1 + c_2) \\max\\{g_1(n),g_2(n)\\}$$ Thus the first claim holds with constants $n_3, (c_1 + c_2)$ and the second with $n_3, 2(c_1 + c_2)$\n", "    * The above is also true for any $k$ functions $f_1, \\ldots, f_k$ where $k$ is a constant. A direct consequence is:\n", "4. For any constant $k \\geq 1$ and constants $a_0, \\ldots, a_k \\in \\mathbb{R}^{+}$: $$f(n) = a_0 \\cdot n^0 + a_1 \\cdot n^1 + \\cdots + a_k\\cdot n^k = O(n^k)$$ \n", "    * True. Let $a_\\max = \\max\\{a_0,\\ldots,a_k\\}$ and observe that $f(n) \\leq a_\\max n^0 + a_\\max n^1 + \\cdots + a_\\max n^k \\leq (k+1) a_\\max n^k$. The claim follows as $(k+1) a_\\max$ is a constant\n", "5. For any constant $k > 1$: $\\log_2^k n = O\\left(\\log_{10}^k n\\right)$\n", "    * True. Basic logarithm identities show us that $\\log_2^k n = \\left(\\frac{\\log_{10} n}{ \\log_{10} 2}\\right)^k = \\left(\\frac{1}{ \\log_{10} 2}\\right)^k \\cdot \\log_{10}^k n$. The claim follows as $\\left(\\frac{1}{ \\log_{10} 2}\\right)^k$ is a constant.\n", "    * Letting $k = 1$ we can generalize the above: for any constants $a,b$ we have both $\\log_a n = O(\\log_b n)$ and $\\log_b n = O(\\log_a n)$. Because of this, we usually omit the base of the logarithm when using $O(\\cdot)$ notation (if it is constant!) and simply say $O(\\log n)$\n", "\n", "\n", "### If time permits\n", "1. For any two functions $f,g$, $f(n) = O(g(n)) \\implies 2^{f(n)} = O(2^{g(n)})$ \n", "    * Wrong. Letting $f(n) = n, g(n) = n/2$ this is the first example from the previous section.\n", "2. $\\sum_{i = 0}^{\\log n}\\frac{i}{2^n} = O(1)$ \n", "    * True. We first see that $\\sum_{i = 0}^{\\log n}\\frac{i}{2^n} \\leq \\frac{\\log^2 n}{2^n} = \\frac{2^{\\log \\log^2 n}}{2^n} =\\frac{2^{2\\log \\log n}}{2^n}$. The claim follows by noting that for large enough $n$ we have $n > 2\\log \\log n$, so this ratio tends to $0$.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.2" } }, "nbformat": 4, "nbformat_minor": 1 }