\n",
"\n",
"\n",
"Let $D=\\{x_1,\\dots, x_n\\}$ be a set of values with $x_1\\leq \\cdots\\leq x_n$.\n",
"The median of the set is defined to be the following number:\n",
"1. If $n$ is odd and $n=2k+1$ then \n",
"\n",
"$$\\text{Median}(D)=x_{k+1}.$$\n",
"\n",
"2. If $n$ is even and $n=2k$ then\n",
"\n",
"$$\\text{Median}(D)=\\frac{x_{k}+x_{k+1}}{2}.$$\n",
"\n",
"\n",
"
"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
\n",
"
\n",
"\n",
"
\n",
"
"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Theorem.**\n",
"\n",
"\n",
"
\n",
" \n",
" \n",
"The minimum of the sum\n",
"\n",
"$$g(c)=\\sum\\limits_{i=1}^n|x_i-c|$$\n",
"\n",
"is attained at the median. \n",
"\n",
"
"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Proof.**\n",
"\n",
"Let us take the case of only two points $D=\\{x_1, x_2\\}$. Using the triangle inequality \n",
"\n",
"$$|x_1-c|+|x_2-c|\\geq |x_1-x_2|=x_2-x_1.$$\n",
"\n",
"Moreover, for any $c\\in [x_1,x_2]$, \n",
"\n",
"$$|x_1-c|+|x_2-c|=(c-x_1)+(x_2-c)=x_2-x_1.$$\n",
"\n",
"So any point in $[x_1,x_2]$ minimizes $|x_1-c|+|x_2-c|$.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* **When n is even**\n",
"\n",
"Then rewrite \n",
"\n",
"$$\\sum\\limits_{i=1}^{2k}|x_i-c|= \\color{blue}{(|x_1-c|+|x_{2k}-c|)}+\\color{blue}{(|x_2-c|+|x_{2k-1}-c|)}+\\cdots + \\color{blue}{(|x_{k}-c|+|x_{k+1}-c|)}$$\n",
"\n",
"
\n",
"
\n",
"\n",
"According to the discussion above the first term is minimal when $c\\in [x_1, x_{2k}]$, the second term is minimal when $c\\in [x_2, x_{2k-1}]$ and so on. Thus, when $c\\in [x_k,x_{k+1}]$ all the terms have their smallest values. In particular, the median is in that interval: $\\frac{x_{k}+x_{k+1}}{2}\\in [x_k,x_{k+1}]$ so we have that the median the sum takes its minimal value. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* **When n is odd**\n",
"\n",
"Then rewrite \n",
"\n",
"$$\\sum\\limits_{i=1}^{2k+1}|x_i-c|= \\color{blue}{(|x_1-c|+|x_{2k+1}-c|)}+\\color{blue}{(|x_2-c|+|x_{2k}-c|)}+\\cdots + \\color{blue}{(|x_{k}-c|+|x_{k+2}-c|)}+\\color{green}{|x_{k+1}-c|}.$$\n",
"\n",
"
\n",
"
\n",
"\n",
"Again, according to the discussion above the first term is minimal when $c\\in [x_1, x_{2k+1}]$, the second term is minimal when $c\\in [x_2, x_{2k}]$ and so on. Fiinally, the last term $|x_{k+1}-c|$ is minimal when $c=x_{k+1}$. Thus, when $c=x_{k+1}$ all the terms have their smallest values since it belongs to all the above intervals. And we concude thay the minimal value of their sum is at the median. $\\blacksquare$\n",
"\n",
"A more general proof of the theorem can be found in \"The theory of probability\" by Gnedenko (p. 190) among other places. "
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.7"
}
},
"nbformat": 4,
"nbformat_minor": 4
}