{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"code_folding": [],
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# setup\n",
"from IPython.core.display import display,HTML\n",
"display(HTML(''))\n",
"display(HTML(open('rise.css').read()))\n",
"\n",
"# imports\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns\n",
"%matplotlib inline\n",
"sns.set(style=\"whitegrid\", font_scale=1.5, rc={'figure.figsize':(12, 6)})\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# CMPS 2200\n",
"# Introduction to Algorithms\n",
"\n",
"## Functional Programming\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Today's agenda:\n",
"\n",
"- What is functional programming?\n",
"- What's the connection between functional programming and parallelism?\n",
"- How do we perform computation in functional languages?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Specifying algorithms\n",
"\n",
"We need a way to specify both **what** and algorithm does and **how** it does it.\n",
"\n",
"**Algorithm specification**: defines what an algorithm should do.\n",
"\n",
"> Given a sequence $A$ of $n$ elements, return a sequence $B$ such that $B[i] \\leq B[j]$ for all $0 \\leq i \\le j \\le n$\n",
"\n",
"May also include **cost specification**, e.g. $O(n \\log n)$ work and. $O(\\log^2 n)$ span."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"**Algorithm implementation** (or just **algorithm**): defines **how** an algorithm works.\n",
"\n",
"This could be real, working code, or pseudo-code.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"Our textbook uses a pseudo code language called **SPARC**\n",
"- based on a functional language called ML.\n",
"- suitable for parallel algorithms\n",
"- good level of abstraction for talking about parallel algorithms\n",
"\n",
"When possible, we will also show Python versions of key algorithms."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Functional languages\n",
"\n",
"In functional languages, functions act like mathematical functions.\n",
"\n",
"Two key properties:\n",
"\n",
"1. function maps an input to an output $f : X \\mapsto Y$\n",
" - no **side effects**\n",
" \n",
" \n",
"2. function can be treated as values\n",
" - function A can be passed to function B"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Pure function\n",
"\n",
"A function is **pure** if it maps an input to an output with no **side effects.**\n",
"\n",
"A computation is **pure** if all of its functions are pure.\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"20"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def double(value):\n",
" return 2 * value\n",
"\n",
"double(10)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"We can view the `double` function as a mathematical function, defined by the mapping:\n",
"\n",
"$$ \\{(0, 0), (1, 2), (2, 4), \\ldots \\}$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"versus..."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[1, 2, 3, 6]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def append_sum(mylist):\n",
" return mylist.append(sum(mylist))\n",
"\n",
"mylist = [1,2,3]\n",
"append_sum(mylist)\n",
"mylist"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"This has the side effecet of changing (or *mutating*) `mylist`."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"though compare with..."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[1, 2, 3]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def append_sum(mylist):\n",
" return list(mylist).append(sum(mylist))\n",
"\n",
"mylist = [1,2,3]\n",
"append_sum(mylist)\n",
"mylist"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Almost all \"real\" computations have some side effects. Consider:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"def do_sum(mylist):\n",
" total = 0\n",
" for v in mylist:\n",
" total += v\n",
" return total"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"`do_sum` has the side effect of modifying `total`. But, this effect is not visible outside of `do_sum`, due to variable scoping.\n",
"\n",
"> **benign effect:** a side-effect that is not observable from outside of the function.\n",
"\n",
"A function with benign effects is still considered pure."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Why is pure computation good for parallel programming?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
" Recall our **race condition** example:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"188980\n"
]
}
],
"source": [
"from multiprocessing.pool import ThreadPool\n",
"\n",
"def in_parallel(f1, arg1, f2, arg2):\n",
" with ThreadPool(2) as pool:\n",
" result1 = pool.apply_async(f1, [arg1]) # launch f1\n",
" result2 = pool.apply_async(f2, [arg2]) # launch f2\n",
" return (result1.get(), result2.get()) # wait for both to finish\n",
" \n",
"total = 0\n",
"\n",
"def count(size):\n",
" global total\n",
" for _ in range(size):\n",
" total += 1\n",
" \n",
"def race_condition_example():\n",
" global total\n",
" in_parallel(count, 100000,\n",
" count, 100000)\n",
" print(total)\n",
" \n",
"race_condition_example()"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"The `count` function has a side-effect of changing the global variable `total`."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Heisenbugs\n",
"\n",
"\n",
"\n",
"- Race conditions can lead to bugs that only appear, e.g., 1 out of 1000 runs of the program.\n",
"- Reference to Heisenberg uncertainty principal (the bug disappears when you study it, but reappears when you stop studying it)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"\n",
"More generally, if we want to parallelize two functions $f(a)$ and $g(b)$, we want the same result **no matter which order they are run in.**\n",
"\n",
"> Because of the lack of side-effects, pure functions satisfy this condition."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Data Persistence\n",
"\n",
"In pure computation no data can ever be overwritten, only new data can be created. \n",
"\n",
"Data is therefore always **persistent**\n",
" —if you keep a reference to a data structure, it will always be there and in the same state as it started.\n",
"\n",
"### Isn't this horribly space inefficient?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"> garbage collection"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Functional languages\n",
"\n",
"In functional languages, functions act like mathematical functions.\n",
"\n",
"Two key properties:\n",
"\n",
"1. function maps an input to an output $f : X \\mapsto Y$\n",
" - no **side effects**\n",
" \n",
" \n",
"2. function can be treated as values\n",
" - function A can be passed to function B"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Functions as values\n",
"\n",
"Many languages allow functions to be passed to other functions.\n",
"\n",
"Functions as \"first-class values.\""
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"12"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def double(value):\n",
" return 2 * value\n",
"\n",
"def double_and_sum(double_fn, vals):\n",
" total = 0\n",
" for v in vals:\n",
" total += double_fn(v)\n",
" return total\n",
"\n",
"# pass the function double to the function double_and_sum\n",
"double_and_sum(double, [1,2,3]) \n",
"# 1*2 + 2*2 + 3*3"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"`double_and_sum` is called a **higher-order function**, since it takes another function as input.\n",
"\n",
"Why is this useful?"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[2, 4, 6]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def map_function(function, values):\n",
" for v in values:\n",
" yield function(v)\n",
"\n",
"list(map_function(double, [1,2,3]))"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[1, 4, 9]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def square(value):\n",
" return value * value\n",
"\n",
"list(map_function(square, [1,2,3]))"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[2, 8, 18]"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"list(map_function(double, map_function(square, [1,2,3])))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"- If we know that `function` is pure, then we can trivially parallelize `map_function` for many inputs.\n",
"\n",
"\n",
"\n",
"- By using higher-order functions, we can define a few primitive, high-order functions that will make it easier to reason about and analyze run-time of parallel computations."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Lambda Calculus\n",
"\n",
"- pure language developed by Alonzo Church in the 1930s\n",
"- inherently parallel"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"Consists of expressions $e$ in one of three forms:\n",
"\n",
"1. a **variable**, e.g., $x$\n",
"2. a **lambda abstraction**, e.g., $(\\lambda \\: x \\: . \\: e)$, where $e$ is a function body.\n",
"3. an **application**, written $(e_1, e_2)$ for expressions $e_1$, $e_2$."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Beta reduction\n",
"\n",
"A generic way of applying a sequence of anonymous functions.\n",
"\n",
"$$\n",
"(\\lambda \\: x \\: . \\: e_1) e_2 \\mapsto e_1 [x / e_2]\n",
"$$\n",
"\n",
"$e_1 [x / e_2]$: for every free occurence of $x$ in $e_1$, substitute it with $e_2$.\n",
"\n",
"Can read as \"substitute $e_2$ for $x$ in $e_1$.\"\n",
"\n",
"E.g.\n"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"100"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# lambda functions exist in Python.\n",
"# these are anonymous functions (no names)\n",
"# Here, e_2 is a variable.\n",
"(lambda x: x*x)(10)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"\n",
"$\n",
"(\\lambda \\: x \\: . \\: x * x) \\: 10 \\mapsto 10 * 10 \\mapsto 100\n",
"$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"We can also chain functions together. E.g., $e_2$ can be another function."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"144"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(lambda x: x*x)((lambda x: x+2)(10))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"beta reduction:\n",
"\n",
"$ (\\lambda \\: x \\: . \\: x * x)((\\lambda \\: x \\: . \\: x+2) \\: 10 ) \\mapsto$ \n",
"$(\\lambda \\: x \\: . x * x) \\: (10+2) \\mapsto$ \n",
"$(\\lambda \\: x \\: . x * x) \\: 12 \\mapsto$ \n",
"$( 12 * 12 ) \\mapsto$ \n",
"$ 144 $\n",
"\n",
"
\n",
"\n",
"**Could we have done this in any other order?**"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"$( (\\lambda \\: x \\: . \\: x * x) (\\lambda \\: x \\: . \\: x+2)) \\: 10 \\mapsto$ \n",
"$(\\lambda \\: x \\: . (x+2) * (x+2)) \\: 10 \\mapsto$ \n",
"$(10+2) * (10+2) \\mapsto$ \n",
"$ 144 $\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Beta reduction order\n",
"\n",
"$$\n",
"(\\lambda \\: x \\: . \\: e_1) e_2 \\mapsto e_1 [x / e_2]\n",
"$$\n",
"\n",
"**call-by-value:** (first example): $e_2$ is evaluated to a value first, then reduction is applied\n",
"- square$(10+2)$\n",
"- square$(12)$\n",
"- $12*12$\n",
"- $144$\n",
"- Languages that use call-by-value are called **strict**: argument is always evaluated before applying function\n",
"\n",
"**call-by-need:** (second example): $e_2$ is first copied into $e_1$, then $e_1$ is evaluated.\n",
"- square$(10+2)$\n",
"- $(10+2) * (10+2)$\n",
"- $144$\n",
"\n",
"
\n",
"We will be using call-by-value, which is more amenable to parallelism."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Computation via beta reductions\n",
"\n",
"**Computation** in lambda calculus means to apply beta reductions until there is nothing left to reduce.\n",
"\n",
"When there is nothing left to reduce, the expression is in **normal form**.\n",
"\n",
"
\n",
"\n",
"How can we make an infinite loop in lambda computation?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"$ ( (\\lambda \\: x \\: . \\: (x \\: x)) (\\lambda \\: x \\: . \\: (x \\: x))) \\mapsto$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"$ ( (\\lambda \\: x \\: . \\: (x \\: x)) (\\lambda \\: x \\: . \\: (x \\: x)))$"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Doing computation via lambda calculus seems limiting. \n",
"\n",
"Can it compute everything we need?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"Surprising result:\n",
"\n",
"The Lambda calculus is equivalent to Turing Machines.\n",
"\n",
"That is... \n",
"\n",
"Anything that can be computed by a Turing Machine can also be computed by the Lambda calculus, and visa versa.\n",
"\n",
"\n",
"More details: \n",
"- CMPS/MATH 3250\n",
"- [**Church-Turing Thesis**](https://en.wikipedia.org/wiki/Church%E2%80%93Turing_thesis)\n"
]
}
],
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
},
"rise": {
"autolaunch": true,
"controls": false,
"enable_chalkboard": true,
"scroll": true,
"theme": "simple",
"transition": "fade"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": true,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 4
}