% Notes for NDMI011 Combinatorics and Graph Theory
% Charles University in Prague, Summer 2019
% Instructor: Prof. Andreas Feldmann
% Notes by Marcel Goh

\input fontmac
\input mathmac

\leftrighttop{NDMI011 COMBINATORICS AND GRAPH THEORY}{MARCEL GOH}

\maketitle{NDMI011 Combinatorics and Graph Theory\footnote{$^*$}{\eightpt Course given by Prof. Andreas Feldmann at Charles University in Prague}}{Notes by}{Marcel K. Goh {\rm (Prague, Czech Rep.)}}{10 June 2019}

\section 1. PRELIMINARIES

We say that a function $f(n)$ is $O\big(g(n)\big)$ (read ``big-oh of $g(n)$'') if there exist constants $n_0$ and $C$ such that for all $n\geq n_0$, $f(n)\leq C\cdot g(n)$. If the ratio ${f(n)\over g(n)}$ approaches 0 as $n$ approaches infinity, then we say $f(n)$ is $o\big(g(n)\big)$ (read ``little-oh of $g(n)$''). If $f(n)$ is both $O\big(g(n)\big)$ and $\Omega\big(g(n)\big)$ (read ``big-omega of $g(n)$''; this holds exactly when $g(n)$ is $O\big(f(n)\big)$), then $f(n)$ is $\Theta\big(g(n)\big)$ (read ``big-theta of $g(n)$''). Finally, if the ratio ${f(n)\over g(n)}$ approaches 1 as $n$ approaches infinity, then we write $f(n) \sim g(n)$.

Let $G=(V,E)$ be a graph. The graph $G'=(V',E')$ is a {\it subgraph} of $G$ if $V'\subseteq V$ and $E'\subseteq E$. A {\it cycle} is a graph on distinct vertices $v_1, \ldots, v_n$, where $n\geq 3$, with edges $\{v_1, v_2\}, \ldots, \{v_{n-1}, v_n\}, \{v_n, v_1\}$. A {\it tree} is a connected graph that contains no cycle (i.e.\ does not have a cycle as a subgraph).

\proclaim Theorem D. In any graph $G=(V,E)$, $$\displaystyle\sum_{v\in V} \deg v = 2\cdot|E|.$$

\proof Every edge $\{u, v\}\in E$ adds 1 to the degree of $u$ and adds 1 to the degree of $v$. Hence if we sum over all vertices, each edge gets counted twice.\slug

\proclaim Theorem E. In any graph, the number of vertices with odd degree is even.

\proof Divide $V$ into $V_1$ and $V_2$, where $V_1$ is the set of vertices with odd degree and $V_2$ is the set of vertices with even degree. Then we have $\sum_{v\in V} \deg v = \sum_{v\in V_1} \deg v + \sum_{v\in V_2} \deg v$. The sum $\sum_{v\in V_2}\deg v$ is a sum of even numbers, say $2x$, so the sum over all odd-degree vertices $\sum_{v\in V_1}\deg v$ equals $2\cdot|E| - 2x = 2(|E| - x)$, meaning it is even. Since this sum is even and each of its summands is odd, the number of summands, i.e.\ the number of odd-degree vertices, must be even.\slug

\proclaim Theorem T. A tree with $n$ vertices has $n-1$ edges.

\proof Let $G = (V,E)$ be a tree and let $n$ denote $|V|$, the number of vertices. Using induction on $n$, we show that it has $n-1$ edges. The base case $n = 1$ is easy, since a tree with only one vertex has no edges. Now assume that every tree with $n$ vertices has $n-1$ edges, and consider the case where $G$ has $n+1$ vertices. Let us create $G'$ by removing a vertex from $G$. We cannot remove an internal vertex, since then $G'$ would not be connected. So we must remove a leaf, along with the single edge that connected it to the rest of $G$; such a leaf exists, because in a connected graph with at least two vertices in which every vertex has degree at least 2 we could find a cycle. Now $G'$ is a tree with $n$ vertices, so by the induction hypothesis it has $n-1$ edges. This implies that $G$ had $n$ edges. \slug

\section 2. GENERATING FUNCTIONS

\vskip -\medskipamount

\section 2.1. Power series and generating functions

Given an infinite sequence $a_0, a_1, \ldots = (a_i)_{i=0}^\infty$, the power series $$\sum_{i=0}^\infty a_ix^i$$ is called the {\it generating function} of $(a_i)_{i=0}^\infty$ and is denoted $a(x)$. For example, the sequence $1, 1, 1, \ldots$ has generating function $\sum_{i=0}^\infty x^i$, which converges to $1/(1-x)$ for $x\in (-1, 1)$. Since we only really care about the coefficients of a generating function, we assume that $x$ is within the power series' interval of convergence.
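
As a quick sanity check of this last example, the following Python sketch (assuming the {\tt sympy} library is available) expands $1/(1-x)$ about $x=0$ and recovers the coefficient sequence $1, 1, 1, \ldots$:

\begintt
# A sketch, assuming the sympy library is available.
from sympy import symbols, series

x = symbols('x')
# Expand 1/(1 - x) as a power series about x = 0; every coefficient
# should be 1, matching the sequence 1, 1, 1, ...
print(series(1/(1 - x), x, 0, 8))  # 1 + x + x**2 + ... + x**7 + O(x**8)
\endtt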
\section 2.2. Binomial coefficients

For $r\in \R$ and $k\in \N$, the {\it binomial coefficient} is given by $${r\choose k} = \left(\prod_{i=0}^{k-1} (r-i)\right) /k!$$ where, in particular, ${r\choose 0} = 1$. If $r\in \N$, then we can think of ${r\choose k}$ as the number of ways to choose $k$ objects out of a pool of $r$ objects.

\parenproclaim Theorem G (Generalised binomial theorem). For any $r\in \R$, $$\sum_{i\geq 0} {r\choose i} x^i = (1 + x)^r.$$

\proof In this proof we will gloss over technicalities relating to convergence. Let $a(x)$ denote $(1+x)^r$. Then the derivative of $a(x)$ is $a'(x)=r(1+x)^{r-1}$, the second derivative $a''(x)$ is $r(r-1)(1+x)^{r-2}$, and so on. Generally, $$a^{(i)}(x) = \left(\prod_{j=0}^{i-1} (r-j)\right)(1+x)^{r-i}.$$ If we evaluate the function at $x=0$, we get $$a^{(i)}(0) = \prod_{j=0}^{i-1} (r-j),$$ so we can use Taylor's theorem to get $$\eqalign{ a(x) &= \sum_{i\geq 0} {a^{(i)}(0)\over i!} x^i\cr &= \sum_{i\geq 0}{r\choose i} x^i, }$$ which is what we wanted.\slug

\section 2.3. Existence of a closed form

It is clear from the theorem above that the generating function for the sequence ${r\choose 0}, {r\choose 1}, {r\choose 2}, \ldots$ is $(1+x)^r$. We may wonder what kinds of sequences have a {\it closed form} generating function. The following theorem, presented without proof, answers this question.

\proclaim Theorem S. Let $(a_i)_{i=0}^\infty$ be a sequence of reals such that there exists some $k>0$ for which $|a_i|\leq k^i$ for every $i\geq 1$. Then for all $x$ in the interval $(-1/k, 1/k)$, the power series $$a(x) = \sum_{i=0}^\infty a_ix^i$$ converges, so the generating function $a(x)$ has a closed form. Moreover, for any arbitrarily small $\epsilon >0$, the values of $a(x)$ for $x$ in the interval $(-\epsilon, \epsilon)$ uniquely determine the sequence $(a_i)_{i=0}^\infty$ where, by Taylor's theorem, $a_i = a^{(i)}(0)/i!$\slug

\section 2.4. Application to counting

Generating functions give us a method for counting the number of elements with some parameter $i$. For example, we could count the number of trees with $i$ vertices or the number of steps performed by an algorithm with input size $i$. Often, $a_i$ is defined inductively or recursively. For example, suppose that for some number $c$ our sequence is given by $a_0 = c$, and $a_i = a_{i-1} + c$ for $i\geq 1$. Then we get that $$\eqalign{ a(x) &= \sum_{i\geq 0} a_ix^i\cr &= cx^0 + \sum_{i\geq 1} (a_{i-1} + c) x^i\cr &= cx^0 + \sum_{i\geq 1}cx^i + \sum_{i\geq 1} a_{i-1}x^i\cr &= {c\over 1-x} + x\cdot a(x). }$$ This implies that $a(x) = c/(1-x)^2$. By Taylor's theorem, $a_i = a^{(i)}(0)/i!$, but calculating this directly can be very tedious, so very often we will express $a(x)$ in terms of generating functions that we know. In the case of this example, we note that $$\eqalign{ {c\over (1-x)^2} &= c\cdot {1\over 1-x}\cdot {1\over 1-x} \cr &= c\left(\sum_{i\geq 0} x^i\right) \left(\sum_{i\geq 0} x^i\right)\cr &= \sum_{i\geq 0} c(i+1) x^i, }$$ which gives us that $a_i = c(i+1)$ (which we could have easily gotten by observing the recurrence).
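
The closed form is easy to confirm numerically; here is a small Python sketch that iterates the recurrence (with the arbitrary choice $c = 3$) and compares it against $a_i = c(i+1)$:

\begintt
# Sketch: iterate a_0 = c, a_i = a_{i-1} + c and compare against the
# closed form a_i = c*(i + 1) read off from c/(1-x)^2.  The value c = 3
# is an arbitrary choice.
c = 3
a = [c]
for i in range(1, 10):
    a.append(a[i-1] + c)
assert a == [c*(i + 1) for i in range(10)]
print(a)  # [3, 6, 9, 12, ...]
\endtt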
\section 2.5. Operations on generating functions

Some useful operations on generating functions include:

\medskip

{\bf Addition.} $a(x) + b(x) = \displaystyle\sum_{i\geq 0} (a_i + b_i) x^i$.

\smallskip

{\bf Scalar multiplication.} $\beta a(x) = \displaystyle\sum_{i\geq 0} (\beta a_i) x^i$.

\smallskip

{\bf Scaling.} $a(\beta x) = \displaystyle\sum_{i\geq 0} (\beta^i a_i) x^i$.

\smallskip

{\bf Gaps.} $a(x^k) = \displaystyle\sum_{i\geq 0} a_i x^{ki}$.

\smallskip

{\bf Shift right.} $xa(x) = \displaystyle\sum_{i\geq 1} a_{i-1} x^{i}$.

\smallskip

{\bf Shift left.} $\left(a(x) - a_0\right)/x = \displaystyle\sum_{i\geq 0} a_{i+1} x^{i}$.

\smallskip

{\bf Derivative.} $a'(x) = \displaystyle\sum_{i\geq 0} \big((i+1) a_{i+1}\big) x^{i}$.

\smallskip

{\bf Integral.} $\displaystyle\int_0^x a(t)\,dt = \displaystyle\sum_{i\geq 1} (a_{i-1}/i) x^{i}$.

\smallskip

{\bf Convolution.} $a(x)b(x) = \displaystyle\sum_{i\geq 0} \left(\displaystyle\sum_{k=0}^i a_k b_{i-k}\right) x^{i}$.

\smallskip

{\bf Partial sums.} $a(x)/(1-x) = \displaystyle\sum_{i\geq 0} \left(\displaystyle\sum_{k=0}^i a_k \right) x^{i}$.

\medskip

\section 2.6. Some applications

Imagine a situation at a grocery store. A customer has six \$1 coins, five \$2 coins, and four \$5 bills. Suppose she has to pay \$21. How many ways can she do this? To find the answer, we need only compute the coefficient of $x^{21}$ in the polynomial $$ (1 + x + x^2 + \cdots + x^6)\cdot(1+x^2+x^4+\cdots+x^{10})\cdot(1+x^5+x^{10}+\cdots+x^{20}).$$ Now suppose that a box contains 30 red balls, 40 blue balls, and 50 white balls. In how many ways can one draw 70 balls? As you may have suspected, we need to get the coefficient of $x^{70}$ in $$ (1+x+\cdots+x^{30})\cdot(1+x+\cdots+x^{40})\cdot(1+x+\cdots+x^{50}).$$

It may be instructive to work through a full example. We will use generating functions to count binary trees. A {\it binary tree} $T$ is either empty ($T=\emptyset$) or consists of a root adjacent to a left subtree $T_l$ and a right subtree $T_r$. Let $b_n$ denote the number of binary trees with $n$ vertices. We will try to find a closed formula for $b_n$. There is exactly one binary tree with 0 vertices (the empty one), and one binary tree with 1 vertex. From the definition, $n=0$ or $n = 1 + n_l + n_r$, where $n_l = k$ and $n_r = n-1-k$ for some $k\in \{0,\ldots, n-1\}$. Hence we obtain the recurrence $b_n = b_0\cdot b_{n-1} + b_1\cdot b_{n-2} + \cdots + b_{n-1}\cdot b_0$ for $n\geq 1$, i.e. $$ b_n = \sum_{k=0}^{n-1} b_k\cdot b_{n-1-k}. $$ Then the generating function $b(x)$ can be derived: $$\eqalign{ b(x) &= \sum_{n\geq 0} b_nx^n\cr &= 1\cdot x^0 + \sum_{n\geq 1} \left(\sum_{k=0}^{n-1} b_k\cdot b_{n-1-k}\right) x^n\cr &= 1 + x\sum_{n\geq 0} \left(\sum_{k=0}^{n} b_k\cdot b_{n-k}\right) x^n\cr &= 1 + xb(x)b(x). }$$ This implies that $xb(x)^2 - b(x) + 1 = 0$, which means that $$b(x) = {1\pm\sqrt{1-4x}\over 2x}.$$ Which one is correct? Well, note that $b(0) = \sum_{n\geq 0} b_n0^n = b_0 = 1$, whereas $$\lim_{x\rightarrow 0} {1 + \sqrt{1-4x}\over 2x} = +\infty.$$ We try the other root, starting with $$\lim_{x\rightarrow 0} {1 - \sqrt{1-4x}\over 2x},$$ upon which we can apply l'Hospital's rule to get a limit of 1. So this is the generating function we want. Applying Theorem G (Newton's formula) and the scaling operation, we observe that $$\sqrt{1-4x}=\big(1+(-4x)\big)^{1/2} = \sum_{n\geq 0} (-4)^n {{1/2}\choose n}x^n.$$ We can substitute this into our generating function to get that $$b(x) = {1\over 2x}\left( 1 - \sum_{n\geq 0} (-4)^n {{1/2}\choose n}x^n\right).$$ Since the $n=0$ term of the sum is exactly 1, it cancels against the leading 1 and we can further simplify, obtaining $$\eqalign{ b(x) &= {1\over x}\sum_{n\geq 1} -{1\over 2}(-4)^n{{1/2}\choose n}x^n\cr &= \sum_{n\geq 0} -{1\over 2} (-4)^{n+1}{{1/2}\choose n+1}x^n\cr &= \sum_{n\geq 0} {1\over n+1}{2n\choose n}x^n. }$$ So the number of binary trees with $n$ vertices is ${2n\choose n}/(n+1)$. These are the {\it Catalan numbers}.
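
As a sanity check, the following Python sketch runs the recurrence $b_n = \sum_{k=0}^{n-1} b_kb_{n-1-k}$ and compares it with the closed form ${2n\choose n}/(n+1)$:

\begintt
from math import comb

# Sketch: compare the recurrence b_n = sum_k b_k * b_{n-1-k} with the
# closed form C(2n, n)/(n + 1), i.e. the Catalan numbers.
b = [1]
for n in range(1, 12):
    b.append(sum(b[k] * b[n-1-k] for k in range(n)))
assert b == [comb(2*n, n) // (n + 1) for n in range(12)]
print(b)  # [1, 1, 2, 5, 14, 42, 132, ...]
\endtt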
\section 3. GRAPHS

\vskip -\medskipamount

\section 3.1. Network flow

\vskip -\medskipamount

\section 3.1.1. Edge capacities

Suppose we are given a directed graph $G=(V,E)$ ($E\subseteq V\times V$) with designated source $s\in V$ and target $t\in V$. To each edge $e$ is assigned a capacity $c(e)$, which may be a non-negative real number or $+\infty$. Then an {\it s-t-flow} is a function $f:E\rightarrow \R$ such that for every edge $e\in E$, $0\leq f(e)\leq c(e)$. These are called the {\it capacity constraints}. Our flow must also obey the law of {\it flow conservation}, also known as {\it Kirchhoff's First Law} (which was originally observed in electrical circuits): For every vertex $v\in V\setminus\{s, t\}$, $\sum_{uv\in E} f(uv) = \sum_{vu\in E} f(vu)$. We want to maximise the flow that leaves $s$, subject to these constraints. The {\it value} of an s-t-flow is $$|f| = \sum_{su\in E} f(su) - \sum_{us\in E} f(us),$$ and a {\it max-flow} is an s-t-flow with maximum value.

The intuitive way to check that a flow is maximal is to study ``bottlenecks''. Concretely, if we let $S\subseteq V$ be such that $s\in S$ and $t\notin S$, then the set $\delta^+(S) = \{uv\in E : u\in S, v\notin S\}$ is an {\it s-t-cut} or {\it s-t-edge-cut}. After removing $\delta^+(S)$ from $G$, there is no longer a path from $s$ to $t$, i.e.\ $\delta^+(S)$ {\it separates} $s$ from $t$. (Since we only removed edges {\it leaving} $S$, there may still be a path from $t$ to $s$.) The {\it capacity} of an s-t-cut is given by $$ c\big(\delta^+(S)\big) = \sum_{uv\in \delta^+(S)} c(uv).$$ Then we can define a {\it min-cut} to be an s-t-cut with minimum capacity.

\proclaim Lemma D. For any s-t-flow $f$ and s-t-cut $\delta^+(S)$, $$|f| = \sum_{e\in \delta^+(S)} f(e) - \sum_{e\in \delta^-(S)} f(e),$$ where $\delta^-(S)$ is the set $\{uv : u\notin S, v\in S\}$.

\proof By flow conservation, we know that for every $v \neq s,t$, $$\sum_{vu\in E} f(vu) - \sum_{uv\in E} f(uv) = 0.$$ Summing this expression over all vertices of $S$ contributes nothing except for the term at $v = s$, which contributes exactly $|f|$; hence $$|f| = \sum_{v\in S} \left( \sum_{vu\in E} f(vu) - \sum_{uv\in E} f(uv) \right).$$ We need to consider three possible cases for an edge $e = uv$ with at least one endpoint in $S$:

\medskip

\item {1.} ($u, v\in S$) If both $u$ and $v$ are in $S$, then we will add $f(uv)$ and then subtract $f(uv)$, so this case does not contribute to $|f|$.

\smallskip

\item {2.} ($u\in S$, $v\notin S$) In this case, $uv\in \delta^+(S)$, so we will have a contribution of $+f(uv)$ to $|f|$.

\smallskip

\item {3.} ($u\notin S$, $v\in S$) In this case, $uv\in \delta^-(S)$, so we will have a contribution of $-f(uv)$ to $|f|$.

\medskip

This implies that $$ |f| = \sum_{e\in \delta^+(S)} f(e) - \sum_{e\in \delta^-(S)} f(e) ,$$ which is exactly what we wanted to prove.\slug

We can use this to derive an easy corollary:

\proclaim Corollary. For any s-t-flow $f$ and any s-t-cut $\delta^+(S)$, $|f| \leq c\big(\delta^+(S)\big)$. In particular, the value of a max-flow is at most the capacity of a min-cut.

\proof By the preceding lemma, $|f| \leq \sum_{e\in \delta^+(S)} f(e)$, which by the capacity constraints is no greater than $\sum_{e\in \delta^+(S)} c(e)$. But by the definition of the capacity of a cut, this is just $c\big(\delta^+(S)\big)$. \slug
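
Lemma D is easy to watch in action. In the Python sketch below (a hypothetical five-edge network with a hand-chosen feasible flow), the net flow across every s-t-cut comes out to the same value $|f|$:

\begintt
# Sketch: a hypothetical network with a hand-chosen feasible s-t-flow.
# Lemma D says the net flow across any s-t-cut equals the value |f|.
flow = {('s','a'): 2, ('s','b'): 1, ('a','b'): 1, ('a','t'): 1, ('b','t'): 2}

def net_across(S):
    out_f = sum(f for (u, v), f in flow.items() if u in S and v not in S)
    in_f  = sum(f for (u, v), f in flow.items() if u not in S and v in S)
    return out_f - in_f

value = net_across({'s'})                     # |f| = 3
for S in [{'s'}, {'s','a'}, {'s','b'}, {'s','a','b'}]:
    assert net_across(S) == value
\endtt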
In fact, the value of a max-flow {\it equals} the capacity of a min-cut, but the other inequality requires a more involved proof.

\proclaim Theorem C. If $f$ is a max-flow and $\delta^+(S)$ is a min-cut, then $|f| = c\big(\delta^+(S)\big)$.

\proof By the preceding corollary, it suffices to show that $|f| \geq c\big(\delta^+(S)\big)$. For any max-flow $f$, the idea is to construct some s-t-cut $\delta^+(S)$ with $|f| = c\big(\delta^+(S)\big)$. This would imply that any {\it min-cut} has capacity no greater than $|f|$. Consider the following procedure. We start with $S\gets \{s\}$. While there exists an edge $uv\in \delta^+(S)$ such that $c(uv) > f(uv)$ {\it or} there exists an edge $vu\in \delta^-(S)$ such that $f(vu) > 0$, we add $v$ to $S$. The claim is that if $f$ is a max-flow, then this procedure will produce an $S$ such that $t\notin S$, i.e.\ $\delta^+(S)$ is an s-t-cut. If this claim is true, then we are done; once the procedure terminates, every edge $e\in \delta^+(S)$ has $f(e) = c(e)$ and every edge $e\in \delta^-(S)$ has $f(e) = 0$, so by Lemma D we will have $|f| = c\big(\delta^+(S)\big)$.

Suppose, towards a contradiction, that $t\in S$. Then there exists a sequence of vertices $v_0,\ldots, v_k\in S$ such that $v_0 = s$, $v_k = t$, and for all $i\in \{0,\ldots, k-1\}$, either

\medskip

\item {a)} $v_iv_{i+1}\in E$ and $c(v_iv_{i+1}) > f(v_iv_{i+1})$, or

\smallskip

\item {b)} $v_{i+1}v_i\in E$ and $f(v_{i+1}v_i) > 0$.

\medskip

For each $i$, define a positive real number $\epsilon_i$ given by $$\epsilon_i = \cases{ c(v_iv_{i+1}) - f(v_iv_{i+1}) & \hbox{if case (a) holds}\cr f(v_{i+1}v_i) & \hbox{if case (b) holds} }$$ Then we can let $\epsilon = \min\{\epsilon_i : 0\leq i\leq k-1\}$. Note that, by construction, $\epsilon > 0$. Now we create a new function $f':E\rightarrow \R$ given by $$ f'(e) = \cases{ f(e) + \epsilon & \hbox{if}\ e\ \hbox{satisfies case (a)}\cr f(e) - \epsilon & \hbox{if}\ e\ \hbox{satisfies case (b)}\cr f(e) & \hbox{otherwise} }$$ The claim is that $f'$ is an s-t-flow. The values of $f'(e)$ are non-negative, because $f'(e)\geq f(e)\geq 0$ in case (a), and in case (b), $\epsilon\leq f(e)$ implies that $f'(e) = f(e) - \epsilon \geq 0$. The capacity constraints are satisfied, since in case (a), $\epsilon\leq c(e) - f(e)$ implies that $f'(e) = f(e) + \epsilon \leq c(e)$; and in case (b), $f'(e)\leq f(e)\leq c(e)$. And we can see that the law of flow conservation is obeyed as well, since for every $v\in V\setminus\{s, t\}$, either the flow is unchanged at $v$ or exactly two of the path edges incident to $v$ are changed, and their changes cancel out at $v$. Computing the value of the flow $f'$, we find that $$ |f'| = \sum_{sv\in E} f'(sv) - \sum_{vs\in E} f'(vs) = |f| + \epsilon.$$ This implies that $f$ is not a max-flow, a contradiction.\slug

So to determine if a flow is maximal, we can figure out what $S$ is and see if $t\in S$. If not, then $f$ is a max-flow and $\delta^+(S)$ is a min-cut. The proof of the theorem also suggests an algorithm to compute a max-flow and a min-cut:

\algbegin Algorithm M (Find max-flow/min-cut). Given a directed graph $G=(V,E)$ with edge capacities, a source $s$, and a target $t$, this algorithm finds a max-flow $f$ and a set $S\subset V$ such that $\delta^+(S)$ is a min-cut.

\algstep M1. [Initialise.] Set $i\gets 0$ and $f_0(e) \gets 0$ for every $e\in E$.

\algstep M2. [Find $S_i$.] Construct the set $S_i$ for the flow $f_i$ using the procedure outlined in the preceding proof.

\algstep M3. [Is $t$ in $S_i$?] If $t\notin S_i$, the flow $f_i$ is maximal; output $f_i$ and $S_i$ and terminate.

\algstep M4. [Increase flow.] Find the ``path'' $P$ from $s$ to $t$ in $S_i$, and compute $\epsilon$ as in the preceding proof. Let $f_{i+1}$ be the result of augmenting $f_i$ by $\epsilon$ along $P$. Set $i\gets i+1$.

\algstep M5. [Repeat.] Go to step M2.\slug

It is clear that if the algorithm terminates, it will output the correct answer. So it is natural to wonder when the algorithm is sure to terminate.
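
Before analysing termination, here is a short Python sketch of Algorithm M; the function name and data representation (a dictionary of capacities keyed by edge pairs) are ad hoc, not part of the course. On irrational capacities the loop may never terminate, in line with the discussion below.

\begintt
from collections import deque

def algorithm_m(cap, s, t):
    # Sketch of Algorithm M on a capacity dictionary cap[(u, v)].
    # Returns (value of a max-flow, a set S with delta^+(S) a min-cut).
    f = dict.fromkeys(cap, 0)
    nodes = {v for e in cap for v in e}
    while True:
        # Step M2: grow S from s, recording how each vertex was reached,
        # ('fwd', u) for case (a) and ('bwd', u) for case (b).
        how, S, queue = {}, {s}, deque([s])
        while queue:
            u = queue.popleft()
            for v in nodes - S:
                if cap.get((u, v), 0) > f.get((u, v), 0):
                    how[v] = ('fwd', u); S.add(v); queue.append(v)
                elif f.get((v, u), 0) > 0:
                    how[v] = ('bwd', u); S.add(v); queue.append(v)
        if t not in S:  # Step M3: f is maximal, delta^+(S) is a min-cut
            value = sum(f[e] for e in f if e[0] == s) \
                  - sum(f[e] for e in f if e[1] == s)
            return value, S
        # Step M4: walk back from t to recover the path P, then augment.
        path, v = [], t
        while v != s:
            kind, u = how[v]
            path.append((kind, u, v))
            v = u
        eps = min(cap[(u, v)] - f[(u, v)] if kind == 'fwd' else f[(v, u)]
                  for kind, u, v in path)
        for kind, u, v in path:
            if kind == 'fwd':
                f[(u, v)] += eps
            else:
                f[(v, u)] -= eps

caps = {('s','a'): 3, ('s','b'): 2, ('a','b'): 1, ('a','t'): 2, ('b','t'): 3}
print(algorithm_m(caps, 's', 't'))  # (5, {'s'})
\endtt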
In the case of integer capacities, termination is easy to prove:

\proclaim Theorem I. If for all $e\in E$, $c(e)\in \N_0$, then the capacity $n$ of the min-cut is integral and Algorithm M terminates in at most $n$ iterations.

\proof We prove, by induction on $i$, that at every step of the algorithm, $f_i(e)\in \N_0$ for every $e\in E$, and hence also $|f_i|\in \N_0$. The base case is simple because the algorithm starts with $f_0(e)=0$ for all $e$. Now assume that all values of $f_i$ are in $\N_0$. In the step that increases $i$ to $i+1$, we choose $\epsilon$ to be the minimum value computed over a path $P$ of edges. So for some $e\in P$, either $\epsilon = c(e) - f_i(e)$ or $\epsilon = f_i(e)$. Since both $c(e), f_i(e)\in \N_0$, we have that $\epsilon\in \N_0$ and the values of the flow at each iteration are integral. Then because $\epsilon > 0$, we have from integrality that $\epsilon \geq 1$, meaning that for every iteration of the algorithm, the value $|f_{i+1}| = |f_i| + \epsilon \geq |f_i| + 1$. Since by Theorem C the value of the max-flow is exactly $n$, the algorithm terminates in at most $n$ iterations.\slug

\section 3.1.2. Vertex capacities

Instead of constraining flow at the edges, we can instead assign capacities $d(v)$ to vertices $v\in V\setminus\{s,t\}$. Then for all $v\in V\setminus\{s,t\}$, an s-t-flow $f:E\rightarrow \R$ must satisfy $\sum_{uv\in E} f(uv)\leq d(v)$ and $\sum_{uv\in E} f(uv)= \sum_{vu\in E} f(vu)$. Note that by this definition, if $st\in E$, then the value of the max-flow will be infinite, so assume that $st\notin E$. Now we define an {\it s-t-vertex-cut} to be a set $T\subseteq V\setminus\{s,t\}$ such that the graph $G\setminus T$ has no path from $s$ to $t$. For ease of notation, let $d(T) = \sum_{v\in T} d(v)$.

\proclaim Theorem J. Let $f$ be a max-flow in a network with vertex capacities. If $T$ is a min-cut, then $|f| = d(T)$, and furthermore, if all capacities are integers, then there is an integer max-flow.

\proof We simply convert our vertex-constrained network to an edge-constrained one. Replace each constrained vertex $v_i$ with two new vertices $v_{i1}$ and $v_{i2}$, redirecting every edge that entered $v_i$ to enter $v_{i1}$ and every edge that left $v_i$ to leave $v_{i2}$. Connect these two vertices with a single edge $e_i$ with $c(e_i) = d(v_i)$. Set the capacities of all other edges in the new graph to $+\infty$. Then the theorem follows from our previous theorems regarding edge-constrained networks.\slug
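
The reduction in this proof is mechanical enough to write down directly. The Python sketch below (names and representation are ad hoc) performs the vertex-splitting on an edge list with a dictionary of vertex capacities:

\begintt
INF = float('inf')

def split_vertices(edges, d):
    # Sketch of the reduction in the proof of Theorem J.  Each vertex v
    # with capacity d[v] becomes v_in and v_out, joined by an edge of
    # capacity d[v]; all original edges get capacity infinity.
    cap = {}
    for v, dv in d.items():
        cap[(v + '_in', v + '_out')] = dv
    for u, v in edges:
        tail = u + '_out' if u in d else u
        head = v + '_in'  if v in d else v
        cap[(tail, head)] = INF
    return cap

print(split_vertices([('s', 'a'), ('a', 't')], {'a': 2}))
# {('a_in', 'a_out'): 2, ('s', 'a_in'): inf, ('a_out', 't'): inf}
\endtt

Feeding the result to the {\tt algorithm\_m} sketch above then computes a max-flow of the vertex-capacitated network.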
\section 3.2. Bipartite matchings

A graph $G=(V,E)$ is {\it bipartite} if $V$ can be split into disjoint subsets $A$ and $B$ such that every edge in $E$ has one endpoint in $A$ and the other in $B$. A {\it matching} is a set $M\subseteq E$ such that no two edges of $M$ share a vertex. A {\it maximum matching} is a matching of maximum size.

Consider the so-called {\it marriage problem}. Given a set $A$ of boys and $B$ of girls, can we marry off all boys to girls that they know? We can model this problem with a bipartite graph $(A\cup B, E)$, where an edge exists between a boy and a girl if they know each other. First we define the set of {\it neighbours} of a vertex set $S\subseteq A$ to be $N(S) = \{v\in B : uv\in E\ \hbox{for some}\ u\in S\}$. Then the following theorem tells us exactly when the marriage problem is solvable.

\parenproclaim Theorem H (Hall). A bipartite graph $G=(A\cup B, E)$ (with $A$ and $B$ disjoint) has a matching $M\subseteq E$ of size $|M| = |A|$ if and only if $|N(S)|\geq |S|$ for all $S\subseteq A$.

\proof The ``only if'' direction is obvious. We prove the ``if'' direction by contrapositive, i.e.\ we show that if there is no matching of size $|A|$ then there exists some $S\subseteq A$ such that $|N(S)| < |S|$. The idea is to use the vertex version of the max-flow min-cut theorem. We construct a directed graph $H=(V',E')$ from $G$. Introduce two new vertices $s$ and $t$ and set $V' = A\cup B \cup \{s,t\}$. We want to connect $s$ to every vertex in $A$ and connect every vertex in $B$ to $t$. Then for every edge already in the graph, ensure it is directed from $A$ to $B$. So $$ E' = \{ su : u\in A\}\cup\{vt : v\in B\}\cup\{uv : uv\in E, u\in A, v\in B\}.$$ Now we set the capacity $d(v)$ of every vertex $v\in A\cup B$ to 1. We know from the max-flow min-cut theorem that there exists an integer max-flow, call it $f$, which is a {\it 0-1 flow}, meaning that $f(e)\in \{0,1\}$ for all $e\in E'$. Note that if $|f|\geq |A|$, then there exists a matching of size $|A|$, since the 0-1 flow uses at most one incoming and one outgoing edge of any vertex in $A\cup B$. But we assumed that no matching of size $|A|$ exists, so $|f| < |A|$. Let $T$ be a min-cut. From the max-flow min-cut theorem, we know that $d(T) = |f| < |A|$. Consider the sets $X = A\cap T$ and $Y = B\cap T$. Since $|X| + |Y| = |T| = \sum_{v\in T} 1 = d(T)$, we have that $|Y| < |A| - |X|$. There is no edge from $A\setminus X$ to $B\setminus Y$, as $T$ is an s-t-vertex-cut. So $N(A\setminus X)\subseteq Y$ in $G$ and thus $|N(A\setminus X)| \leq |Y| < |A| - |X| = |A\setminus X|$. The set $A\setminus X$ is exactly the $S$ we were looking for.\slug

In a graph $G = (V,E)$, a {\it vertex cover} is a set $C\subseteq V$ such that every edge is incident to at least one vertex in $C$. A {\it minimum vertex cover} is a vertex cover of minimum cardinality. Notice that every vertex cover of $G$ is an s-t-vertex-cut in $H$ (otherwise there would be a path from $s$ to $t$), and conversely every s-t-vertex-cut in $H$ is a vertex cover of $G$. So we can derive the following theorem as a corollary.

\parenproclaim Theorem K (K\H onig's theorem). Let $G$ be a bipartite graph with maximum matching $M$ and minimum vertex cover $C$. Then $|M| = |C|$.\slug
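
K\H onig's theorem is easy to check by brute force on small instances; the following Python sketch (with an arbitrary toy graph) compares the largest matching with the smallest vertex cover:

\begintt
from itertools import combinations

# Sketch: brute-force check of Koenig's theorem on a small bipartite
# graph (an arbitrary toy example).
A, B = ['a1', 'a2', 'a3'], ['b1', 'b2', 'b3']
E = [('a1','b1'), ('a1','b2'), ('a2','b1'), ('a3','b3')]

def is_matching(M):
    ends = [v for e in M for v in e]
    return len(ends) == len(set(ends))          # no shared vertices

def is_cover(C):
    return all(u in C or v in C for u, v in E)  # every edge touched

max_matching = max(k for k in range(len(E) + 1)
                   for M in combinations(E, k) if is_matching(M))
min_cover = min(k for k in range(len(A + B) + 1)
                for C in combinations(A + B, k) if is_cover(C))
assert max_matching == min_cover                # Koenig: |M| = |C|
print(max_matching, min_cover)                  # 3 3
\endtt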
\section 3.3. Graph connectivity

\vskip -\medskipamount

\section 3.3.1. Definitions and inequalities

A graph $G=(V,E)$ is {\it connected} if there is a path between any two vertices. Otherwise, we say $G$ is {\it disconnected}. A {\it component} of $G$ is a maximal connected subgraph. If $G$ is connected and removing a set $W$ of vertices or edges causes it to become disconnected, then we say $W$ {\it separates} $G$. We have special names for $W$ if $|W| = 1$. If $W\subseteq V$ and $|W| = 1$, we call its element a {\it cut vertex}. If $W\subseteq E$ and $|W| = 1$, we call its element a {\it bridge}. A graph $G=(V,E)$ is {\it k-connected} if $|V|>k$ and no vertex set $W\subseteq V$ of size $|W| < k$ separates $G$.

\section 7. RAMSEY THEORY

\vskip -\medskipamount

\section 7.1. Ramsey numbers

Let $\omega(G)$ denote the largest size of a clique in $G$ and let $\alpha(G)$ denote the largest size of an independent set in $G$. The {\it Ramsey number} $R(k)$ is the smallest $n$ such that every graph $G$ on $n$ vertices has $\omega(G)\geq k$ or $\alpha(G)\geq k$.

\parenproclaim Theorem R (Erd\H os). For all $k\geq 3$, $R(k) > 2^{k/2}$.

\proof First we introduce some basic notions from probability that we will need. Let $\Omega$ be a finite probability space with the uniform distribution and let $X\subseteq\Omega$ be an event. Then $\Pr[X] = |X|/|\Omega| < 1$ if and only if $|X| < |\Omega|$, in which case the complement of $X$ is non-empty. Now we create a random graph $G$ by starting with a set of $n$ vertices and adding each possible edge independently with probability $1/2$. We are trying to prove that if $n\leq 2^{k/2}$, then there is a graph $G$ with $|V| = n$ such that $\alpha(G) < k$ and $\omega(G) < k$, so it suffices to show that $$\Pr[\alpha(G) \geq k\ \hbox{or}\ \omega(G)\geq k] < 1$$ whenever $n\leq 2^{k/2}$. Let $A$ be a subset of vertices with $|A|=k$. Let $K_A$ denote the event that $A$ forms a clique, let $I_A$ denote the event that $A$ forms an independent set, and let $X_A = K_A \cup I_A$. We have $\Pr[K_A] = (1/2)^{k\choose 2} = \Pr[I_A]$, and since $K_A \cap I_A = \emptyset$, $$\Pr[X_A] = 2\cdot\left({1\over 2}\right)^{k\choose 2} = 2^{1-{k\choose 2}}.$$ Let $Y$ denote the event that $\alpha(G) \geq k$ or $\omega(G) \geq k$; this is the event that $X_A$ holds for some $A\subseteq V$ with $|A| = k$. So $$\Pr[Y] \leq \sum_{A\subseteq V\atop |A| = k} \Pr[X_A] = {n\choose k}2^{1-{k\choose 2}} \leq {n^k\over k!}2^{1-{k\choose 2}},$$ and since $k! > 2^{k/2+1}$ for all $k\geq 3$, we can further derive the strict inequality $${n^k\over k!}2^{1-{k\choose 2}} < {n^k\over {2^{k/2+1}}}2^{1-(k^2/2-k/2)} = \left({n\over 2^{k/2}}\right)^k,$$ which is at most $1$, since $n\leq 2^{k/2}$. Hence $\Pr[Y] < 1$. \slug

\section 7.2. Increasing/decreasing subsequences

Given a finite sequence $S = (x_1, x_2, \ldots, x_n)$ of numbers, an {\it increasing} ({\it decreasing}) {\it subsequence} of length $k$ is a sequence $(x_{i_1}, x_{i_2}, \ldots, x_{i_k})$ such that for all $j,l$ with $1\leq j < l\leq k$, $i_j < i_l$ and $x_{i_j} \leq x_{i_l}$ ($x_{i_j} \geq x_{i_l}$).

\parenproclaim Theorem E (Erd\H os-Szekeres). For any finite sequence $S$ of at least $(r-1)(s-1) + 1$ numbers, there is an increasing subsequence of length $r$ or a decreasing subsequence of length $s$.

\proof Suppose, towards a contradiction, that every increasing subsequence has length at most $r-1$ and every decreasing subsequence has length at most $s-1$. Then we label each $x_i$ of $S$ with a pair $(a_i, b_i)$, where $a_i$ is the length of the longest increasing subsequence ending in $x_i$ and $b_i$ is the length of the longest decreasing subsequence ending in $x_i$. We claim that any two of the numbers $x_i, x_j$ have different labels. Let $i\neq j$ and without loss of generality, assume $i < j$. If $x_i\leq x_j$, then $a_j > a_i$, as any increasing subsequence ending in $x_i$ can be extended by $x_j$; otherwise $x_i > x_j$, and then $b_j > b_i$, as any decreasing subsequence ending in $x_i$ can be extended by $x_j$. By our assumption, $a_i\leq r-1$ and $b_i\leq s-1$ for all $i$. So the number of possible labels $(a_i, b_i)$ is at most $(r-1)(s-1)$, which implies that the length of the sequence $S$ is at most $(r-1)(s-1)$, a contradiction.\slug

The given bound is tight; there are sequences of $(r-1)(s-1)$ numbers that neither have an increasing subsequence of length $r$, nor have a decreasing subsequence of length $s$. To construct one, take an $(r-1)\times(s-1)$ grid of points in the plane and rotate it {\it very} slightly counterclockwise, such that no two points are on the same horizontal or vertical line. Then we have a set of points $(x_1, y_1),\ldots,(x_n,y_n)$ where $n=(r-1)(s-1)$. Now read off the $y$-coordinates {\sl in order of increasing $x$-coordinate}. Any increasing subsequence can take at most one element from each column, so it has length at most $r-1$, and any decreasing subsequence can take at most one element from each row, so it has length at most $s-1$.
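
The labelling used in the proof doubles as an algorithm. The Python sketch below computes the pairs $(a_i, b_i)$ for an arbitrary example sequence and checks that, with $r = s = 3$, five numbers already force a monotone subsequence of length 3:

\begintt
def labels(seq):
    # a[i] (resp. b[i]) is the length of the longest increasing (resp.
    # decreasing) subsequence ending at position i, as in the proof.
    a, b = [1]*len(seq), [1]*len(seq)
    for j in range(len(seq)):
        for i in range(j):
            if seq[i] <= seq[j]:
                a[j] = max(a[j], a[i] + 1)
            if seq[i] >= seq[j]:
                b[j] = max(b[j], b[i] + 1)
    return list(zip(a, b))

# r = s = 3: any (3-1)(3-1) + 1 = 5 numbers contain a monotone
# subsequence of length 3, and all labels are distinct.
L = labels([2, 4, 1, 5, 3])                    # an arbitrary example
assert len(set(L)) == len(L)
assert any(ai >= 3 or bi >= 3 for ai, bi in L)
print(L)  # [(1, 1), (2, 1), (1, 2), (3, 1), (2, 2)]
\endtt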
\section 8. ERROR-CORRECTING CODES

Suppose we have two people who need to communicate over a channel that is ``noisy'' in the sense that errors may be introduced into the message. To detect and solve this issue, checksums may be used. Suppose we have an alphabet $\Sigma = \{0,\ldots,q-1\}$ where $q$ is prime. Let $x$ denote a message $x\in \Sigma^k$ for some positive integer $k$. The {\it checksum} $c$ of $x$ is $\sum_{i=1}^k x_i \pmod{q}$ and we can encode the message as $y\in {\bf F}_q^{k+1}$ where $$ y_i = \cases{ x_i & for $i\leq k$ \cr c & for $i = k+1$ }$$ Since $q$ is prime, for any $x,y\in {\bf F}_q^k$, if $x$ and $y$ differ at {\it exactly one} place, then their checksums will differ. But what happens if more errors occur? Also, is it possible for the receiver to determine where the error occurred without asking the sender to resend the message?

To formalise the problem further, we introduce some more definitions. The {\it block length} $n$ is the length of the encoded message. Suppose the original (non-encoded) message has length $k$. Then the {\it encoding function} $E : \Sigma^k \rightarrow \Sigma^n $ is applied by the sender before transmission and the {\it decoding function} $D : \Sigma^n \rightarrow \Sigma^k$ is applied by the receiver upon the message's arrival. The {\it code} $C\subseteq \Sigma^n$ is the image of $E$. A code is {\it binary} if $\Sigma = \{0,1\}$. Checksums work by increasing the distance between two messages $x,y\in \Sigma^k$ if $x\neq y$. The {\it Hamming distance} between two messages $x,y\in \Sigma^m$ is given by $$ \Delta(x,y) = \sum_{i=1}^m [x_i\neq y_i].$$ For a code $C\subseteq \Sigma^n$, the {\it minimum distance} of $C$ is $$ \Delta(C) = \min_{x,y\in C\atop x\neq y} \{\Delta(x,y)\}.$$ An {\it $(n,k,d)_q$-code} is a code $C$ such that

\medskip

\item {i)} $C\subseteq \Sigma^n$;

\smallskip

\item {ii)} $k = \log_q |C|$;

\smallskip

\item {iii)} and $d = \Delta(C)$.

\medskip

Note that $|C|$ is the size of the code and $k$ need not be an integer. A checksum is an $(n, n-1, 2)_q$-code: Suppose that $x\in C$. Let $y$ be such that $\Delta(x,y) = 1$. Is it possible that $y\in C$? No, because for all $x,y\in C$, if $x\neq y$ then $\Delta(x,y)\geq 2$. If, for a code $C$, we can be sure that at most $r$ errors occur during transmission, then:

\medskip

\item {i)} If $\Delta(C) \geq r+1$, then the error can be {\it detected}, because for any $x\in C$, if we receive a $y$ that differs from $x$ in at least one but at most $r$ places, i.e.\ $1\leq\Delta(x,y)\leq r$, then $y\notin C$.

\smallskip

\item {ii)} If $r\leq \lfloor (\Delta(C) - 1)/2 \rfloor$, then the error can be {\it corrected}. Construct a graph with $V = \Sigma^n$ with an edge between $x,y\in \Sigma^n$ if $\Delta(x,y) = 1$. Now if we have $y\in \Sigma^n$ which we know is a garbled message, we can find the closest codeword in the graph. If the error is not too large, then the closest codeword will have been the intended one.

\medskip

The general aim is to construct codes with small $n$ and large $\Delta(C)$. For any finite projective plane $(X,{\cal L})$ of order $m$, we can construct an $(m^2+m+1, \log_2(m^2+m+1), 2m)_2$-code by encoding each letter of an alphabet of size $m^2+m+1$ as the characteristic vector of a line $L\in {\cal L}$. For example, the letter associated with the line $\{1,5,6\}$ in the Fano plane will be coded as $1000110$. Since for any $L_1, L_2\in {\cal L}$ with $L_1\neq L_2$ we have $|L_1\cap L_2| = 1$, and each of $L_1, L_2$ has $m+1$ points, $\Delta(y_{L_1}, y_{L_2}) = 2m$, where $y_{L_i}$ is the codeword associated with the line $L_i$.
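
To make this concrete, here is a Python sketch for the Fano plane ($m = 2$), using one standard labelling of its seven lines (the labelling used in lecture may differ); all pairwise distances come out to $2m = 4$:

\begintt
from itertools import combinations

def dist(x, y):
    # Hamming distance between two equal-length words.
    return sum(a != b for a, b in zip(x, y))

# One standard labelling of the seven lines of the Fano plane.
lines = [{1,2,3}, {1,4,5}, {1,6,7}, {2,4,6}, {2,5,7}, {3,4,7}, {3,5,6}]
code = [''.join('1' if p in L else '0' for p in range(1, 8))
        for L in lines]
assert all(dist(u, v) == 4 for u, v in combinations(code, 2))
print(min(dist(u, v) for u, v in combinations(code, 2)))  # Delta(C) = 4
\endtt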
\smallskip A {\it linear} code $C\subseteq {\bf F}_q^n$ is a linear subspace of ${\bf F}_q^n$, i.e.\ for all $x,y\in C$ and $\alpha\in {\bf F}_q$, $\alpha x+y\in C$. If $C$ has dimension $k$, there exist $k$ linearly independent vectors $x_1,\ldots, x_k\in {\bf F}_q^n$ such that $$C = \bigg\{\sum_{i=1}^k \alpha_ix_i : \alpha_1,\ldots, \alpha_k\in {\bf F}_q\bigg\}.$$ More succinctly, we can express $C$ using the generator matrix $G = \left[\matrix{ x_1 & \cdots & x_k }\right]\in {\bf F}_q^{n\times k}$, whose columns are the vectors $x_i$. Multiplying any message $\alpha\in {\bf F}_q^k$ by $G$ on the left encodes it, and $C = \{G\alpha : \alpha \in {\bf F}_q^k\}$. Alternatively, we may represent $C$ as the {\it null space} of a matrix. Recall that for any $k$-dimensional linear subspace $C$, there is an $(n-k)$-dimensional linear subspace $C^\perp$ such that for all $x\in C, y\in C^\perp$, $x^Ty = 0$. Using the generator matrix $H\in {\bf F}_q^{n\times (n-k)}$ of $C^\perp$, we can express $C = \{ x : x^TH = 0\}$. Then $H$ is called the {\it parity check} matrix of $C$. In a sense, linear codes are simple because we can detect errors simply by computing $x^TH$. The {\it support} of a vector $x$ is the set $\{i : x_i \neq 0\}$ and the {\it Hamming weight} of $x$ is $\hbox{wt}\,x = |\{i : x_i \neq 0\}|$.

\proclaim Theorem L. For a linear code $C$, $$ \Delta(C) = \min\{\hbox{wt}\,x : x\in C\ \hbox{and}\ x\neq 0\}.$$

\proof For ease of notation, let $d$ denote $\min\{\hbox{wt}\,x : x\in C\ \hbox{and}\ x\neq 0\}$. First we show that $d\leq \Delta(C)$. Let $y,z\in C$ be distinct and such that $\Delta(y,z) = \Delta(C)$. Since $C$ is linear, $x = y-z \in C$, and $x\neq 0$. Note that $\Delta(y,z) = |\{i : x_i \neq 0\}|$, so $d \leq \hbox{wt}\,x = \Delta(C)$. Now we show that $d\geq \Delta(C)$. Note that $0\in C$. Let $x\neq 0$ be such that $\hbox{wt}\,x = d$. Then $\Delta(C)\leq \Delta(0,x) = |\{i : x_i \neq 0\}| = \hbox{wt}\,x = d$.\slug

We can use this theorem to construct a linear code with $\Delta(C) \geq 3$. We find a parity check matrix $H$ such that $x^TH \neq 0$ for any non-zero $x$ with $\hbox{wt}\,x\in \{1,2\}$. For simplicity, we will work with $q=2$. If $\hbox{wt}\,x = 1$, then $x$ has exactly one 1-entry, say at position $i$. So $x^TH$ is the $i$-th row $H_i$ of $H$, and we need $H_i$ to be non-zero for every $i$. If $\hbox{wt}\,x = 2$, then $x$ has exactly two non-zero entries, say at positions $i$ and $j$. Then $x^TH = H_i + H_j$, hence we need $H_i\neq H_j$ for all $i\neq j$. If all rows of $H$ are distinct and non-zero, then we get $\Delta(C) \geq 3$. The largest possible number of such rows of a matrix $H\in {\bf F}_2^{n\times l}$ is $2^l-1$, so we take $n = 2^l-1$, where $l = n-k$. Hence we obtain an $(n, n-\log_2(n+1), 3)_2$-code called the {\it Hamming code}.

Let $x\in {\bf F}_q^n$. The {\it ball} around $x$ of radius $r$ is given by $$B(x,r) = \{y\in {\bf F}_q^n : \Delta(x,y) \leq r\}.$$ The {\it volume} of such a ball is $$\hbox{vol}(r,n) = |B(x,r)| = \sum_{i=0}^r {n\choose i} (q-1)^i.$$

\parenproclaim Theorem H (Hamming bound). If an $(n,k,d)_q$-code exists, then $$q^k\cdot\hbox{\rm vol}\left(\left\lfloor {d-1\over 2} \right\rfloor, n\right) \leq q^n.$$

\proof Let $r = \lfloor (d-1)/2 \rfloor$ and consider $\bigcup_{x\in C} B(x,r)\subseteq \Sigma^n$. Since $d\geq 2r+1$, the balls $B(x,r)$ for $x\in C$ are pairwise disjoint, so $$\left|\bigcup_{x\in C} B(x,r)\right| = \sum_{x\in C} \hbox{vol}\,(r,n) = q^k\cdot \hbox{vol}\,(r,n) \leq q^n.\noskipslug$$

Note that if $d = 3$ and $q = 2$ then the Hamming bound is $2^k(n+1)\leq 2^n$, and the Hamming code matches this bound with equality, as $k=n-\log_2(n+1)$. An $(n,k,d)_q$-code is called {\it perfect} if $q^k\cdot\hbox{vol}\,(\lfloor (d-1)/2 \rfloor, n) = q^n$. For given $n$ and $d$, no code has size $k$ larger than that of an $(n, n-d+1, d)_q$-code (even if we let $q>2$), as the following theorem shows.

\parenproclaim Theorem S (Singleton bound). If $C$ is an $(n,k,d)_q$-code, then $k\leq n-d+1$.

\proof Define a function $f:\Sigma^n\rightarrow\Sigma^{k-1}$ such that $f(x_1,\ldots, x_n) = (x_1,\ldots, x_{k-1})$. As $|\Sigma^{k-1}| = q^{k-1}$ and $|C| = q^k > q^{k-1}$, there are $x,y\in C$, with $x\neq y$ and $f(x) = f(y)$. So $x$ and $y$ can only differ in the last $n-k+1$ entries. Since $d = \Delta(C) \leq \Delta(x,y) \leq n-k+1$, we get $k\leq n-d+1$.\slug

An $(n,k,d)_q$-code is called {\it maximum-distance separable} or MDS if $k = n-d+1$.
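
As a final illustration, the following Python sketch (assuming the {\tt numpy} library is available) builds the parity check matrix of the binary Hamming code for $l = 3$, i.e.\ $n = 7$, and confirms that it has $2^4 = 16$ codewords and minimum distance 3:

\begintt
import numpy as np                      # assumes numpy is available
from itertools import product

# The rows of H are the seven distinct non-zero vectors of F_2^3; a
# word x is a codeword iff x^T H = 0 over F_2.  Theorem L then gives
# Delta(C) as the minimum weight of a non-zero codeword.
H = np.array([[(i >> j) & 1 for j in range(3)] for i in range(1, 8)])
code = [x for x in product([0, 1], repeat=7)
        if not (np.array(x) @ H % 2).any()]
print(len(code))                             # 16 = 2^(7-3) codewords
print(min(sum(x) for x in code if any(x)))   # minimum weight, i.e. 3
\endtt

\end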