{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Derivatives with respect to vectors and matrices\n",
    "\n",
    "<p style=\"text-align: right;\">2018.01.02 조준우 metamath@gmail.com</p>\n",
    "\n",
    "<hr/>\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Scalar-vector, scalar-matrix, and vector-vector derivatives\n",
    "\n",
    "### Scalar by scalar\n",
    "\n",
    "- This is the ordinary case and needs no separate explanation\n",
    "\n",
    "### Scalar by vector\n",
    "\n",
    "- $\\mathbf{x}: n \\times 1$\n",
    "\n",
    "- Numerator layout\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, x}{\\partial \\, \\mathbf{x}} = \\begin{bmatrix}\n",
    "\\dfrac{\\partial x}{\\partial x_{1}} & \\dfrac{\\partial x}{\\partial x_{2}} & \\cdots & \\dfrac{\\partial x}{\\partial x_{n}}\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "- Denominator layout\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, x}{\\partial \\, \\mathbf{x}} = \\begin{bmatrix}\n",
    "\\dfrac{\\partial x}{\\partial x_{1}} \\\\\n",
    "\\dfrac{\\partial x}{\\partial x_{2}} \\\\\n",
    "\\vdots \\\\\n",
    "\\dfrac{\\partial x}{\\partial x_{n}} \\\\\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "- Which layout to use is up to the writer; what matters is using one of them consistently\n",
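    "\n",
    "As a small illustration (the function $f$ here is a hypothetical example, not taken from the text above), let $f(\\mathbf{x}) = x_1^2 + 3 x_1 x_2$ with $\\mathbf{x} : 2 \\times 1$. The two layouts hold the same partial derivatives and differ only in shape:\n",
    "\n",
    "$$\n",
    "\\text{numerator layout: } \\frac{\\partial \\, f}{\\partial \\, \\mathbf{x}} = \\begin{bmatrix} 2x_1 + 3x_2 & 3x_1 \\end{bmatrix}, \\qquad\n",
    "\\text{denominator layout: } \\frac{\\partial \\, f}{\\partial \\, \\mathbf{x}} = \\begin{bmatrix} 2x_1 + 3x_2 \\\\ 3x_1 \\end{bmatrix}\n",
    "$$\n",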
    "\n",
    "### Scalar by matrix\n",
    "\n",
    "-  $\\mathbf{X} : m \\times n$\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, x}{\\partial \\,  \\mathbf{X}} =\n",
    "\\begin{bmatrix}\n",
    "\\dfrac{\\partial x}{\\partial X_{11}} & \\dfrac{\\partial x}{\\partial X_{12}} & \\cdots & \\dfrac{\\partial x}{\\partial X_{1n}} \\\\\n",
    "\\dfrac{\\partial x}{\\partial X_{21}} & \\dfrac{\\partial x}{\\partial X_{22}} & \\cdots & \\dfrac{\\partial x}{\\partial X_{2n}} \\\\\n",
    "\\vdots                              & \\vdots                              & \\ddots & \\vdots                              \\\\\n",
    "\\dfrac{\\partial x}{\\partial X_{m1}} & \\dfrac{\\partial x}{\\partial X_{m2}} & \\cdots & \\dfrac{\\partial x}{\\partial X_{mn}}\n",
    "\\end{bmatrix}\n",
    "$$\n",
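    "\n",
    "For instance, taking the $2 \\times 2$ determinant as an assumed example function, $f(\\mathbf{X}) = X_{11}X_{22} - X_{12}X_{21}$, each entry of the result is the partial derivative with respect to the entry of $\\mathbf{X}$ in the same position:\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, f}{\\partial \\, \\mathbf{X}} =\n",
    "\\begin{bmatrix}\n",
    "X_{22} & -X_{21} \\\\\n",
    "-X_{12} & X_{11}\n",
    "\\end{bmatrix}\n",
    "$$\n",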
    "\n",
    "\n",
    "### Vector by scalar\n",
    "\n",
    "- Numerator layout\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, \\mathbf{x}}{\\partial \\, x} =\n",
    "\\begin{bmatrix}\n",
    "\\dfrac{\\partial x_{1}}{\\partial x} \\\\\n",
    "\\dfrac{\\partial x_{2}}{\\partial x} \\\\\n",
    "\\vdots \\\\\n",
    "\\dfrac{\\partial x_{n}}{\\partial x} \n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "- Denominator layout\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, \\mathbf{x}}{\\partial \\, x} =\n",
    "\\begin{bmatrix}\n",
    "\\dfrac{\\partial x_{1}}{\\partial x} &\n",
    "\\dfrac{\\partial x_{2}}{\\partial x} &\n",
    "\\cdots &\n",
    "\\dfrac{\\partial x_{n}}{\\partial x} \n",
    "\\end{bmatrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Vector by vector\n",
    "\n",
    "- $\\mathbf{f} : m \\times 1$, $\\mathbf{x} : n \\times 1$\n",
    "\n",
    "- Numerator layout (this is exactly the Jacobian)\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, \\mathbf{f}}{\\partial \\, \\mathbf{x}} =\n",
    "\\begin{bmatrix}\n",
    "\\dfrac{\\partial f_{1}}{\\partial x_{1}} & \\dfrac{\\partial f_{1}}{\\partial x_{2}} & \\cdots & \\dfrac{\\partial f_{1}}{\\partial x_{n}} \\\\\n",
    "\\dfrac{\\partial f_{2}}{\\partial x_{1}} & \\dfrac{\\partial f_{2}}{\\partial x_{2}} & \\cdots & \\dfrac{\\partial f_{2}}{\\partial x_{n}} \\\\\n",
    "\\vdots                                 & \\vdots                                 & \\ddots & \\vdots                                 \\\\\n",
    "\\dfrac{\\partial f_{m}}{\\partial x_{1}} & \\dfrac{\\partial f_{m}}{\\partial x_{2}} & \\cdots & \\dfrac{\\partial f_{m}}{\\partial x_{n}}\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "- Denominator layout\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, \\mathbf{f}}{\\partial \\, \\mathbf{x}} =\n",
    "\\begin{bmatrix}\n",
    "\\dfrac{\\partial f_{1}}{\\partial x_{1}} & \\dfrac{\\partial f_{2}}{\\partial x_{1}} & \\cdots & \\dfrac{\\partial f_{m}}{\\partial x_{1}} \\\\\n",
    "\\dfrac{\\partial f_{1}}{\\partial x_{2}} & \\dfrac{\\partial f_{2}}{\\partial x_{2}} & \\cdots & \\dfrac{\\partial f_{m}}{\\partial x_{2}} \\\\\n",
    "\\vdots                                 & \\vdots                                 & \\ddots & \\vdots                                 \\\\\n",
    "\\dfrac{\\partial f_{1}}{\\partial x_{n}} & \\dfrac{\\partial f_{2}}{\\partial x_{n}} & \\cdots & \\dfrac{\\partial f_{m}}{\\partial x_{n}}\n",
    "\\end{bmatrix}\n",
    "$$"
   ]
  },
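  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A short worked example may help here. The map below is a hypothetical choice with $m = 3$ and $n = 2$; the denominator-layout result is simply the transpose of the numerator-layout one.\n",
    "\n",
    "$$\n",
    "\\mathbf{f}(\\mathbf{x}) = \\begin{bmatrix} x_1 x_2 \\\\ x_1^2 \\\\ x_2 \\end{bmatrix}\n",
    "\\quad \\Rightarrow \\quad\n",
    "\\underbrace{\\begin{bmatrix} x_2 & x_1 \\\\ 2x_1 & 0 \\\\ 0 & 1 \\end{bmatrix}}_{\\text{numerator layout, } 3 \\times 2}\n",
    "\\qquad\n",
    "\\underbrace{\\begin{bmatrix} x_2 & 2x_1 & 0 \\\\ x_1 & 0 & 1 \\end{bmatrix}}_{\\text{denominator layout, } 2 \\times 3}\n",
    "$$"
   ]
  },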
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Chain rule for vector-by-vector derivatives\n",
    "\n",
    "For three vector variables $\\mathbf{x}$, $\\mathbf{y}$, $\\mathbf{z}$ related by $\\mathbf{y} = f(\\mathbf{x})$ and $\\mathbf{z}  = g(\\mathbf{y})$, consider $\\dfrac{\\partial \\, \\mathbf{z}}{\\partial \\, \\mathbf{x}}$.\n",
    "\n",
    "$$\n",
    "\\mathbf{x} = \\begin{bmatrix}\n",
    "x_{1} \\\\\n",
    " x_{2} \\\\\n",
    "\\vdots \\\\\n",
    "x_{n} \\\\\n",
    "\\end{bmatrix} \\qquad\n",
    "\\mathbf{y} = \\begin{bmatrix}\n",
    "y_{1} \\\\\n",
    "y_{2} \\\\\n",
    "\\vdots \\\\\n",
    "y_{r} \\\\\n",
    "\\end{bmatrix} \\qquad\n",
    "\\mathbf{z} = \\begin{bmatrix}\n",
    "z_{1} \\\\\n",
    "z_{2} \\\\\n",
    "\\vdots \\\\\n",
    "z_{m} \\\\\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "- Numerator layout\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, \\mathbf{z}}{\\partial \\, \\mathbf{x}} =\n",
    "\\begin{bmatrix}\n",
    "\\dfrac{\\partial z_{1}}{\\partial x_{1}} & \\dfrac{\\partial z_{1}}{\\partial x_{2}} & \\cdots & \\dfrac{\\partial z_{1}}{\\partial x_{n}} \\\\\n",
    "\\dfrac{\\partial z_{2}}{\\partial x_{1}} & \\dfrac{\\partial z_{2}}{\\partial x_{2}} & \\cdots & \\dfrac{\\partial z_{2}}{\\partial x_{n}} \\\\\n",
    "\\vdots                                 & \\vdots                                 & \\ddots & \\vdots                                 \\\\\n",
    "\\dfrac{\\partial z_{m}}{\\partial x_{1}} & \\dfrac{\\partial z_{m}}{\\partial x_{2}} & \\cdots & \\dfrac{\\partial z_{m}}{\\partial x_{n}}\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
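    "- By the chain rule, each entry is a sum over the components of $\\mathbf{y}$ (in the matrices below the repeated index $k$ is summed from $1$ to $r$)\n",
    "\n",
    "$$\n",
    "\\frac{\\partial\\,  z_i}{\\partial \\, x_j} = \\sum_{k=1}^{r} \\frac{\\partial\\,  z_i}{\\partial \\, y_k}\\frac{\\partial \\, y_k}{\\partial \\, x_j}\n",
    "$$\n",
    "\n",
    "- Applying the chain rule to every entry and factoring the result into a matrix product gives\n",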
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\frac{\\partial \\, \\mathbf{z}}{\\partial \\, \\mathbf{x}} &=\n",
    "\\begin{bmatrix}\n",
    "\\dfrac{\\partial z_{1}}{\\partial y_k} \\dfrac{\\partial y_k}{\\partial x_{1}} & \n",
    "\\dfrac{\\partial z_{1}}{\\partial y_k} \\dfrac{\\partial y_k}{\\partial x_{2}} & \n",
    "\\cdots  & \n",
    "\\dfrac{\\partial z_{1}}{\\partial y_k} \\dfrac{\\partial y_k}{\\partial x_{n}} \\\\\n",
    "\\dfrac{\\partial z_{2}}{\\partial y_k} \\dfrac{\\partial y_k}{\\partial x_{1}} & \n",
    "\\dfrac{\\partial z_{2}}{\\partial y_k} \\dfrac{\\partial y_k}{\\partial x_{2}} & \n",
    "\\cdots & \n",
    "\\dfrac{\\partial z_{2}}{\\partial y_k} \\dfrac{\\partial y_k}{\\partial x_{n}} \\\\\n",
    "\\vdots                                 & \\vdots                                 & \\ddots & \\vdots                                 \\\\\n",
    "\\dfrac{\\partial z_{m}}{\\partial y_k} \\dfrac{\\partial y_k}{\\partial x_{1}} & \n",
    "\\dfrac{\\partial z_{m}}{\\partial y_k} \\dfrac{\\partial y_k}{\\partial x_{2}} & \n",
    "\\cdots &\n",
    "\\dfrac{\\partial z_{m}}{\\partial y_k} \\dfrac{\\partial y_k}{\\partial x_{n}}\n",
    "\\end{bmatrix} \\\\[10pt] &=\n",
    "\\begin{bmatrix}\n",
    "\\dfrac{\\partial z_{1}}{\\partial y_{1}} & \\dfrac{\\partial z_{1}}{\\partial y_{2}} & \\cdots & \\dfrac{\\partial z_{1}}{\\partial y_{r}} \\\\\n",
    "\\dfrac{\\partial z_{2}}{\\partial y_{1}} & \\dfrac{\\partial z_{2}}{\\partial y_{2}} & \\cdots & \\dfrac{\\partial z_{2}}{\\partial y_{r}} \\\\\n",
    "\\vdots                                 & \\vdots                                 & \\ddots & \\vdots                                 \\\\\n",
    "\\dfrac{\\partial z_{m}}{\\partial y_{1}} & \\dfrac{\\partial z_{m}}{\\partial y_{2}} & \\cdots & \\dfrac{\\partial z_{m}}{\\partial y_{r}}\n",
    "\\end{bmatrix} \\,\n",
    "\\begin{bmatrix}\n",
    "\\dfrac{\\partial y_{1}}{\\partial x_{1}} & \\dfrac{\\partial y_{1}}{\\partial x_{2}} & \\cdots & \\dfrac{\\partial y_{1}}{\\partial x_{n}} \\\\\n",
    "\\dfrac{\\partial y_{2}}{\\partial x_{1}} & \\dfrac{\\partial y_{2}}{\\partial x_{2}} & \\cdots & \\dfrac{\\partial y_{2}}{\\partial x_{n}} \\\\\n",
    "\\vdots                                 & \\vdots                                 & \\ddots & \\vdots                                 \\\\\n",
    "\\dfrac{\\partial y_{r}}{\\partial x_{1}} & \\dfrac{\\partial y_{r}}{\\partial x_{2}} & \\cdots & \\dfrac{\\partial y_{r}}{\\partial x_{n}}\n",
    "\\end{bmatrix}\\\\[10pt] &=\n",
    "\\frac{\\partial \\, \\mathbf{z}}{\\partial \\, \\mathbf{y}} \\frac{\\partial \\, \\mathbf{y}}{\\partial \\, \\mathbf{x}}\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
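    "- In numerator layout this looks no different from the scalar chain rule\n",
    "\n",
    "- For scalars,\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, z}{\\partial \\, x} = \\frac{\\partial \\, z}{\\partial \\, y} \\frac{\\partial \\, y }{\\partial \\, x} = \\frac{\\partial \\, y }{\\partial \\, x} \\frac{\\partial \\, z}{\\partial \\, y} \n",
    "$$\n",
    "\n",
    "the factors can be written in either order, but by convention the chain is written from left to right as in the first form.\n",
    "\n",
    "- In denominator layout, however, every derivative is the transpose of its numerator-layout form, so (with the derivatives inside the parentheses written in numerator layout)\n",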
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\left(\\frac{\\partial \\, \\mathbf{z}}{\\partial \\, \\mathbf{x}}\\right)^{\\text{T}}  \n",
    "&= \\left(\\frac{\\partial \\, \\mathbf{z}}{\\partial \\, \\mathbf{y}} \\frac{\\partial \\, \\mathbf{y}}{\\partial \\, \\mathbf{x}} \\right)^{\\text{T}} \\\\[5pt]\n",
    "&=  \\left( \\frac{\\partial \\, \\mathbf{y}}{\\partial \\, \\mathbf{x}} \\right)^{\\text{T}}  \\left(\\frac{\\partial \\, \\mathbf{z}}{\\partial \\, \\mathbf{y}} \\right)^{\\text{T}}\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "- Note that in denominator layout the chain rule therefore runs from right to left"
   ]
  },
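  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a sanity check under an assumed linear setting (the constant matrices $\\mathbf{A} : r \\times n$ and $\\mathbf{B} : m \\times r$ are introduced here only for illustration), let $\\mathbf{y} = \\mathbf{A}\\mathbf{x}$ and $\\mathbf{z} = \\mathbf{B}\\mathbf{y}$. In numerator layout $\\partial \\mathbf{y} / \\partial \\mathbf{x} = \\mathbf{A}$ and $\\partial \\mathbf{z} / \\partial \\mathbf{y} = \\mathbf{B}$, so the two orderings read:\n",
    "\n",
    "$$\n",
    "\\text{numerator layout: } \\frac{\\partial \\, \\mathbf{z}}{\\partial \\, \\mathbf{x}} = \\mathbf{B}\\mathbf{A} \\;\\; (m \\times n), \\qquad\n",
    "\\text{denominator layout: } \\left( \\mathbf{B}\\mathbf{A} \\right)^{\\text{T}} = \\mathbf{A}^{\\text{T}} \\mathbf{B}^{\\text{T}} \\;\\; (n \\times m)\n",
    "$$\n",
    "\n",
    "Both agree with differentiating $\\mathbf{z} = \\mathbf{B}\\mathbf{A}\\mathbf{x}$ directly."
   ]
  },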
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Matrix by scalar\n",
    "\n",
    "-  $\\mathbf{X} : m \\times n$\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, \\mathbf{X}}{\\partial \\, x} =\n",
    "\\begin{bmatrix}\n",
    "\\dfrac{\\partial X_{11}}{\\partial x} & \\dfrac{\\partial X_{12}}{\\partial x} & \\cdots & \\dfrac{\\partial X_{1n}}{\\partial x} \\\\\n",
    "\\dfrac{\\partial X_{21}}{\\partial x} & \\dfrac{\\partial X_{22}}{\\partial x} & \\cdots & \\dfrac{\\partial X_{2n}}{\\partial x} \\\\\n",
    "\\vdots                              & \\vdots                              & \\ddots & \\vdots                              \\\\\n",
    "\\dfrac{\\partial X_{m1}}{\\partial x} & \\dfrac{\\partial X_{m2}}{\\partial x} & \\cdots & \\dfrac{\\partial X_{mn}}{\\partial x}\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "## New operators\n",
    "\n",
    "- Define the operators needed to differentiate a vector by a matrix or a matrix by a matrix\n",
    "\n",
    "### Kronecker product<sup>[1]</sup>\n",
    "\n",
    "\n",
    "\n",
    "- For $\\mathbf{A}: p \\times q $ and $\\mathbf{B}: r \\times s $, the Kronecker product is the following $pr \\times qs$ matrix.\n",
    "\n",
    "$$\n",
    "\\mathbf{A} \\otimes \\mathbf{B} = \\{ a_{ij}\\mathbf{B} \\}\n",
    "$$\n",
    "\n",
    "When $\\mathbf{A}$ is a $2 \\times 2$ matrix, this becomes\n",
    "\n",
    "$$\n",
    "\\begin{bmatrix} a_{11} & a_{12} \\\\ a_{21} & a_{22} \\end{bmatrix} \\otimes \\mathbf{B} = \\begin{bmatrix} a_{11}\\mathbf{B} & a_{12}\\mathbf{B} \\\\ a_{21}\\mathbf{B} & a_{22}\\mathbf{B} \\end{bmatrix}\n",
    "$$\n",
    "\n",
    "- A concrete example\n",
    "\n",
    "$$\n",
    "\\mathbf{A} = \\begin{bmatrix} \\color{RoyalBlue}{3} & \\color{OrangeRed}{5} \\\\ \\color{YellowGreen}{9} & \\color{Goldenrod}{7} \\end{bmatrix} \\qquad \\text{and} \\qquad \\mathbf{b} = \\begin{bmatrix} 4 \\\\ 5\\\\ 6 \\end{bmatrix}\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\mathbf{A} \\otimes \\mathbf{b} = \\begin{bmatrix}\n",
    "\\color{RoyalBlue}{3 \\cdot 4} & \\color{OrangeRed}{5 \\cdot 4} \\\\\n",
    "\\color{RoyalBlue}{3 \\cdot 5} & \\color{OrangeRed}{5 \\cdot 5} \\\\\n",
    "\\color{RoyalBlue}{3 \\cdot 6} & \\color{OrangeRed}{5 \\cdot 6} \\\\\n",
    "\\color{YellowGreen}{9 \\cdot 4} & \\color{Goldenrod}{7 \\cdot 4} \\\\\n",
    "\\color{YellowGreen}{9 \\cdot 5} & \\color{Goldenrod}{7 \\cdot 5} \\\\\n",
    "\\color{YellowGreen}{9 \\cdot 6} & \\color{Goldenrod}{7 \\cdot 6} \n",
    "\\end{bmatrix}=  \\begin{bmatrix}\n",
    "\\color{RoyalBlue}{12} & \\color{OrangeRed}{20} \\\\\n",
    "\\color{RoyalBlue}{15} & \\color{OrangeRed}{25} \\\\\n",
    "\\color{RoyalBlue}{18} & \\color{OrangeRed}{30} \\\\\n",
    "\\color{YellowGreen}{36} & \\color{Goldenrod}{28} \\\\\n",
    "\\color{YellowGreen}{45} & \\color{Goldenrod}{35} \\\\\n",
    "\\color{YellowGreen}{54} & \\color{Goldenrod}{42} \n",
    "\\end{bmatrix}\n",
    "$$\n",
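    "\n",
    "When $\\mathbf{B}$ is itself a matrix the same block rule applies. The numbers below are made up for illustration (they do not come from [1]); the second equation shows that swapping the factors generally gives a different matrix.\n",
    "\n",
    "$$\n",
    "\\begin{bmatrix} 1 & 2 \\\\ 3 & 4 \\end{bmatrix} \\otimes \\begin{bmatrix} 0 & 1 \\\\ 1 & 0 \\end{bmatrix} =\n",
    "\\begin{bmatrix} 0 & 1 & 0 & 2 \\\\ 1 & 0 & 2 & 0 \\\\ 0 & 3 & 0 & 4 \\\\ 3 & 0 & 4 & 0 \\end{bmatrix}\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\begin{bmatrix} 0 & 1 \\\\ 1 & 0 \\end{bmatrix} \\otimes \\begin{bmatrix} 1 & 2 \\\\ 3 & 4 \\end{bmatrix} =\n",
    "\\begin{bmatrix} 0 & 0 & 1 & 2 \\\\ 0 & 0 & 3 & 4 \\\\ 1 & 2 & 0 & 0 \\\\ 3 & 4 & 0 & 0 \\end{bmatrix}\n",
    "$$\n",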
    "\n",
    "\n",
    "### vec and vec-transpose<sup>[2]</sup>\n",
    "\n",
    "- vec turns a matrix into a vector by stacking its columns on top of one another\n",
    "\n",
    "$$\n",
    "\\text{vec}\\left( \\begin{bmatrix} \\color{RoyalBlue}{a_{11}} & \\color{OrangeRed}{a_{12}} \\\\ \\color{RoyalBlue}{a_{21}} & \\color{OrangeRed}{a_{22}} \\end{bmatrix} \\right) = \n",
    "\\begin{bmatrix} \\color{RoyalBlue}{a_{11} \\\\ a_{21}} \\\\ \\color{OrangeRed}{a_{12} \\\\ a_{22}} \\end{bmatrix}\n",
    "$$\n",
    "\n",
    "- vec-transpose: a generalization of the transpose\n",
    "\n",
    "$$\n",
    "\\begin{bmatrix}\n",
    "\\color{RoyalBlue}{a_{11}}   & \\color{Goldenrod}{a_{12}} \\\\\n",
    "\\color{RoyalBlue}{a_{21}}   & \\color{Goldenrod}{a_{22}} \\\\\n",
    "\\color{OrangeRed}{a_{31}}   & \\color{Violet}{a_{32}} \\\\\n",
    "\\color{OrangeRed}{a_{41}}   & \\color{Violet}{a_{42}} \\\\\n",
    "\\color{YellowGreen}{a_{51}} & \\color{Emerald}{a_{52}} \\\\\n",
    "\\color{YellowGreen}{a_{61}} & \\color{Emerald}{a_{62}} \\\\\n",
    "\\end{bmatrix}^{(2)} =\n",
    "\\begin{bmatrix}\n",
    "\\color{RoyalBlue}{a_{11}} & \\color{OrangeRed}{a_{31}} & \\color{YellowGreen}{a_{51}} \\\\\n",
    "\\color{RoyalBlue}{a_{21}} & \\color{OrangeRed}{a_{41}} & \\color{YellowGreen}{a_{61}} \\\\\n",
    "\\color{Goldenrod}{a_{12}} & \\color{Violet}{a_{32}} &  \\color{Emerald}{a_{52}} \\\\\n",
    "\\color{Goldenrod}{a_{22}} & \\color{Violet}{a_{42}} &  \\color{Emerald}{a_{62}}\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\begin{bmatrix}\n",
    "\\color{RoyalBlue}{a_{11}}   & \\color{Goldenrod}{a_{12}} \\\\\n",
    "\\color{RoyalBlue}{a_{21}}   & \\color{Goldenrod}{a_{22}} \\\\\n",
    "\\color{RoyalBlue}{a_{31}}   & \\color{Goldenrod}{a_{32}} \\\\\n",
    "\\color{OrangeRed}{a_{41}}   & \\color{Violet}{a_{42}} \\\\\n",
    "\\color{OrangeRed}{a_{51}} & \\color{Violet}{a_{52}} \\\\\n",
    "\\color{OrangeRed}{a_{61}} & \\color{Violet}{a_{62}} \\\\\n",
    "\\end{bmatrix}^{(3)} =\n",
    "\\begin{bmatrix}\n",
    "\\color{RoyalBlue}{a_{11}} & \\color{OrangeRed}{a_{41}} \\\\\n",
    "\\color{RoyalBlue}{a_{21}} & \\color{OrangeRed}{a_{51}} \\\\\n",
    "\\color{RoyalBlue}{a_{31}} & \\color{OrangeRed}{a_{61}} \\\\\n",
    "\\color{Goldenrod}{a_{12}} & \\color{Violet}{a_{42}} \\\\\n",
    "\\color{Goldenrod}{a_{22}} & \\color{Violet}{a_{52}} \\\\\n",
    "\\color{Goldenrod}{a_{32}} & \\color{Violet}{a_{62}}\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "- The (1)-transpose is the ordinary transpose: $\\mathbf{A}^{(1)} = \\mathbf{A}^{\\text{T}}$\n",
    "\n",
    "- The (number of rows)-transpose is vec: $\\mathbf{A}^{(rows(\\mathbf{A}))} = \\text{vec}(\\mathbf{A})$\n",
    "\n",
    "- The index $(r)$ must be a natural number that divides the number of rows\n",
    "\n",
    "- Hence a row vector admits only the (1)-transpose"
   ]
  },
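  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One way to read the vec-transpose, inferred from the two examples above rather than quoted from [2]: $\\mathbf{A}^{(r)}$ cuts every column of $\\mathbf{A}$ into blocks of $r$ entries, places the blocks of one column side by side, and stacks the results for successive columns. Under this reading an $m \\times n$ matrix becomes $rn \\times (m/r)$. With an illustrative $4 \\times 2$ matrix of made-up numbers:\n",
    "\n",
    "$$\n",
    "\\mathbf{A} = \\begin{bmatrix} 1 & 5 \\\\ 2 & 6 \\\\ 3 & 7 \\\\ 4 & 8 \\end{bmatrix}, \\qquad\n",
    "\\mathbf{A}^{(1)} = \\begin{bmatrix} 1 & 2 & 3 & 4 \\\\ 5 & 6 & 7 & 8 \\end{bmatrix}, \\qquad\n",
    "\\mathbf{A}^{(2)} = \\begin{bmatrix} 1 & 3 \\\\ 2 & 4 \\\\ 5 & 7 \\\\ 6 & 8 \\end{bmatrix}, \\qquad\n",
    "\\mathbf{A}^{(4)} = \\begin{bmatrix} 1 \\\\ 2 \\\\ 3 \\\\ 4 \\\\ 5 \\\\ 6 \\\\ 7 \\\\ 8 \\end{bmatrix}\n",
    "$$\n",
    "\n",
    "Here $\\mathbf{A}^{(1)}$ is the ordinary transpose and $\\mathbf{A}^{(4)}$ is $\\text{vec}(\\mathbf{A})$, matching the two special cases listed above."
   ]
  },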
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
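    "## Vector by matrix\n",
    "\n",
    "- $\\mathbf{x} : p \\times 1$\n",
    "- $\\mathbf{X} : m \\times n$\n",
    "- Use the Kronecker product to reduce the problem to differentiating the vector by each scalar entry of $\\mathbf{X}$, then differentiate the vector in numerator layout\n",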
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\frac{\\partial \\, \\mathbf{x}}{\\partial \\, \\mathbf{X}} = \\frac{\\partial}{\\partial \\, \\mathbf{X}} \\otimes \\mathbf{x} \n",
    "&= \\begin{bmatrix}\n",
    "\\dfrac{\\partial}{\\partial \\, X_{11}} & \\dfrac{\\partial}{\\partial \\, X_{12}} & \\cdots & \\dfrac{\\partial}{\\partial \\, X_{1n}} \\\\\n",
    "\\dfrac{\\partial}{\\partial \\, X_{21}} & \\dfrac{\\partial}{\\partial \\, X_{22}} & \\cdots & \\dfrac{\\partial}{\\partial \\, X_{2n}} \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "\\dfrac{\\partial}{\\partial \\, X_{m1}} & \\dfrac{\\partial}{\\partial \\, X_{m2}} & \\cdots & \\dfrac{\\partial}{\\partial \\, X_{mn}}\n",
    "\\end{bmatrix} \\otimes \\mathbf{x} \\\\[5pt]\n",
    "&=\\begin{bmatrix}\n",
    "\\dfrac{\\partial \\, \\mathbf{x}}{\\partial \\, X_{11}} & \\dfrac{\\partial \\, \\mathbf{x}}{\\partial \\, X_{12}} & \\cdots & \\dfrac{\\partial \\, \\mathbf{x}}{\\partial \\, X_{1n}} \\\\\n",
    "\\dfrac{\\partial \\, \\mathbf{x}}{\\partial \\, X_{21}} & \\dfrac{\\partial \\, \\mathbf{x}}{\\partial \\, X_{22}} & \\cdots & \\dfrac{\\partial \\, \\mathbf{x}}{\\partial \\, X_{2n}} \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "\\dfrac{\\partial \\, \\mathbf{x}}{\\partial \\, X_{m1}} & \\dfrac{\\partial \\, \\mathbf{x}}{\\partial \\, X_{m2}} & \\cdots & \\dfrac{\\partial \\, \\mathbf{x}}{\\partial \\, X_{mn}}\n",
    "\\end{bmatrix} \\\\[5pt]\n",
    "&=\\begin{bmatrix}\n",
    "\\begin{pmatrix} \\dfrac{\\partial \\, x_1}{\\partial \\, X_{11}} \\\\ \\dfrac{\\partial \\, x_2}{\\partial \\, X_{11}} \\\\ \\vdots \\\\ \\dfrac{\\partial \\, x_p}{\\partial \\, X_{11}}  \\end{pmatrix} & \n",
    "\\begin{pmatrix} \\dfrac{\\partial \\, x_1}{\\partial \\, X_{12}} \\\\ \\dfrac{\\partial \\, x_2}{\\partial \\, X_{12}} \\\\ \\vdots \\\\ \\dfrac{\\partial \\, x_p}{\\partial \\, X_{12}}  \\end{pmatrix} &\n",
    "\\cdots &\n",
    "\\begin{pmatrix} \\dfrac{\\partial \\, x_1}{\\partial \\, X_{1n}} \\\\ \\dfrac{\\partial \\, x_2}{\\partial \\, X_{1n}} \\\\ \\vdots \\\\ \\dfrac{\\partial \\, x_p}{\\partial \\, X_{1n}}  \\end{pmatrix} \\\\\n",
    "\\begin{pmatrix} \\dfrac{\\partial \\, x_1}{\\partial \\, X_{21}} \\\\ \\dfrac{\\partial \\, x_2}{\\partial \\, X_{21}} \\\\ \\vdots \\\\ \\dfrac{\\partial \\, x_p}{\\partial \\, X_{21}}  \\end{pmatrix} & \n",
    "\\begin{pmatrix} \\dfrac{\\partial \\, x_1}{\\partial \\, X_{22}} \\\\ \\dfrac{\\partial \\, x_2}{\\partial \\, X_{22}} \\\\ \\vdots \\\\ \\dfrac{\\partial \\, x_p}{\\partial \\, X_{22}}  \\end{pmatrix} &\n",
    "\\cdots &\n",
    "\\begin{pmatrix} \\dfrac{\\partial \\, x_1}{\\partial \\, X_{2n}} \\\\ \\dfrac{\\partial \\, x_2}{\\partial \\, X_{2n}} \\\\ \\vdots \\\\ \\dfrac{\\partial \\, x_p}{\\partial \\, X_{2n}}  \\end{pmatrix} \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "\\begin{pmatrix} \\dfrac{\\partial \\, x_1}{\\partial \\, X_{m1}} \\\\ \\dfrac{\\partial \\, x_2}{\\partial \\, X_{m1}} \\\\ \\vdots \\\\ \\dfrac{\\partial \\, x_p}{\\partial \\, X_{m1}}  \\end{pmatrix} & \n",
    "\\begin{pmatrix} \\dfrac{\\partial \\, x_1}{\\partial \\, X_{m2}} \\\\ \\dfrac{\\partial \\, x_2}{\\partial \\, X_{m2}} \\\\ \\vdots \\\\ \\dfrac{\\partial \\, x_p}{\\partial \\, X_{m2}}  \\end{pmatrix} &\n",
    "\\cdots &\n",
    "\\begin{pmatrix} \\dfrac{\\partial \\, x_1}{\\partial \\, X_{mn}} \\\\ \\dfrac{\\partial \\, x_2}{\\partial \\, X_{mn}} \\\\ \\vdots \\\\ \\dfrac{\\partial \\, x_p}{\\partial \\, X_{mn}}  \\end{pmatrix} \\\\\n",
    "\\end{bmatrix} \n",
    "\\end{align}\n",
    "$$\n",
    "\n",
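    "- Alternatively, apply the vec operator to the denominator and differentiate a vector by a vector (verified in the example below)\n",
    "\n",
    "- Both approaches apply in the same way to differentiating a matrix by a matrix"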
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example<sup>[3]</sup>\n",
    "- $\\mathbf{X} : m \\times n $, $\\mathbf{b} : n \\times 1$, $\\mathbf{Xb} : m \\times 1$\n",
    "\n",
    "- Differentiating by the matrix in the denominator as it is\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\frac{\\partial \\, \\mathbf{Xb}}{\\partial \\, \\mathbf{X}} \n",
    "&= \\frac{\\partial }{\\partial \\mathbf{X}} \\otimes \\mathbf{Xb} \\\\[5pt]\n",
    "&= \\begin{bmatrix}\n",
    "\\dfrac{\\partial \\, \\mathbf{Xb}}{\\partial \\, X_{11}} & \\dfrac{\\partial \\, \\mathbf{Xb}}{\\partial \\, X_{12}} & \\cdots & \\dfrac{\\partial \\, \\mathbf{Xb}}{\\partial \\, X_{1n}} \\\\\n",
    "\\dfrac{\\partial \\, \\mathbf{Xb}}{\\partial \\, X_{21}} & \\dfrac{\\partial \\, \\mathbf{Xb}}{\\partial \\, X_{22}} & \\cdots & \\dfrac{\\partial \\, \\mathbf{Xb}}{\\partial \\, X_{2n}} \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "\\dfrac{\\partial \\, \\mathbf{Xb}}{\\partial \\, X_{m1}} & \\dfrac{\\partial \\, \\mathbf{Xb}}{\\partial \\, X_{m2}} & \\cdots & \\dfrac{\\partial \\, \\mathbf{Xb}}{\\partial \\, X_{mn}}\n",
    "\\end{bmatrix} \\\\[5pt]\n",
    "&= \\begin{bmatrix}\n",
    "\\begin{pmatrix} \\dfrac{\\partial \\, (\\mathbf{Xb})_{1}}{\\partial \\, X_{11}}\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{2}}{\\partial \\, X_{11}}\\\\\\vdots\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{m}}{\\partial \\, X_{11}} \\end{pmatrix} & \\begin{pmatrix} \\dfrac{\\partial \\, (\\mathbf{Xb})_{1}}{\\partial \\, X_{12}}\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{2}}{\\partial \\, X_{12}}\\\\\\vdots\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{m}}{\\partial \\, X_{12}} \\end{pmatrix} & \\cdots & \\begin{pmatrix} \\dfrac{\\partial \\, (\\mathbf{Xb})_{1}}{\\partial \\, X_{1n}}\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{2}}{\\partial \\, X_{1n}}\\\\\\vdots\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{m}}{\\partial \\, X_{1n}} \\end{pmatrix} \\\\\n",
    "\\begin{pmatrix} \\dfrac{\\partial \\, (\\mathbf{Xb})_{1}}{\\partial \\, X_{21}}\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{2}}{\\partial \\, X_{21}}\\\\\\vdots\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{m}}{\\partial \\, X_{21}} \\end{pmatrix} & \\begin{pmatrix} \\dfrac{\\partial \\, (\\mathbf{Xb})_{1}}{\\partial \\, X_{22}}\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{2}}{\\partial \\, X_{22}}\\\\\\vdots\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{m}}{\\partial \\, X_{22}} \\end{pmatrix} & \\cdots & \\begin{pmatrix} \\dfrac{\\partial \\, (\\mathbf{Xb})_{1}}{\\partial \\, X_{2n}}\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{2}}{\\partial \\, X_{2n}}\\\\\\vdots\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{m}}{\\partial \\, X_{2n}} \\end{pmatrix} \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "\\begin{pmatrix} \\dfrac{\\partial \\, (\\mathbf{Xb})_{1}}{\\partial \\, X_{m1}}\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{2}}{\\partial \\, X_{m1}}\\\\\\vdots\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{m}}{\\partial \\, X_{m1}} \\end{pmatrix} & \\begin{pmatrix} \\dfrac{\\partial \\, (\\mathbf{Xb})_{1}}{\\partial \\, X_{m2}}\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{2}}{\\partial \\, X_{m2}}\\\\\\vdots\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{m}}{\\partial \\, X_{m2}} \\end{pmatrix} & \\cdots & \\begin{pmatrix} \\dfrac{\\partial \\, (\\mathbf{Xb})_{1}}{\\partial \\, X_{mn}}\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{2}}{\\partial \\, X_{mn}}\\\\\\vdots\\\\\\dfrac{\\partial \\, (\\mathbf{Xb})_{m}}{\\partial \\, X_{mn}} \\end{pmatrix}\n",
    "\\end{bmatrix} \\\\[5pt]\n",
    "&= \\begin{bmatrix}\n",
    "\\begin{pmatrix} b_1 \\\\ 0   \\\\ \\vdots \\\\ 0 \\end{pmatrix} & \n",
    "\\begin{pmatrix} b_2 \\\\ 0   \\\\ \\vdots \\\\ 0 \\end{pmatrix} & \\cdots & \n",
    "\\begin{pmatrix} b_n \\\\ 0   \\\\ \\vdots \\\\ 0 \\end{pmatrix} \\\\\n",
    "\\begin{pmatrix} 0   \\\\ b_1 \\\\ \\vdots \\\\ 0 \\end{pmatrix} & \n",
    "\\begin{pmatrix} 0   \\\\ b_2 \\\\ \\vdots \\\\ 0 \\end{pmatrix} & \\cdots & \n",
    "\\begin{pmatrix} 0   \\\\ b_n \\\\ \\vdots \\\\ 0 \\end{pmatrix} \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "\\begin{pmatrix} 0   \\\\ 0 \\\\ \\vdots \\\\ b_1 \\end{pmatrix} & \n",
    "\\begin{pmatrix} 0   \\\\ 0 \\\\ \\vdots \\\\ b_2 \\end{pmatrix} & \\cdots & \n",
    "\\begin{pmatrix} 0   \\\\ 0 \\\\ \\vdots \\\\ b_n \\end{pmatrix} \n",
    "\\end{bmatrix}\n",
    "\\end{align}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Replacing the denominator with $\\text{vec}(\\mathbf{X})$ before differentiating (denominator layout)\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\frac{\\partial \\, \\mathbf{Xb}}{\\partial \\, \\mathbf{X}} &= \\frac{\\partial \\, \\mathbf{Xb}}{\\partial \\, \\left(\\text{vec}(\\mathbf{X}) \\right)} \\\\[5pt]\n",
    "&= \\begin{bmatrix}\n",
    "\\dfrac{\\partial \\, (\\mathbf{Xb})_{1}}{\\partial \\, X_{11}} & \\dfrac{\\partial \\, (\\mathbf{Xb})_{2}}{\\partial \\, X_{11}} & \\cdots & \\dfrac{\\partial \\, (\\mathbf{Xb})_{m}}{\\partial \\, X_{11}} \\\\\n",
    "\\dfrac{\\partial \\, (\\mathbf{Xb})_{1}}{\\partial \\, X_{21}} & \\dfrac{\\partial \\, (\\mathbf{Xb})_{2}}{\\partial \\, X_{21}} & \\cdots & \\dfrac{\\partial \\, (\\mathbf{Xb})_{m}}{\\partial \\, X_{21}} \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "\\dfrac{\\partial \\, (\\mathbf{Xb})_{1}}{\\partial \\, X_{mn}} & \\dfrac{\\partial \\, (\\mathbf{Xb})_{2}}{\\partial \\, X_{mn}} & \\cdots & \\dfrac{\\partial \\, (\\mathbf{Xb})_{m}}{\\partial \\, X_{mn}}\n",
    "\\end{bmatrix} \\\\[5pt]\n",
    "&= \\begin{bmatrix}\n",
    "b_1 & 0 & \\cdots & 0 \\\\\n",
    "0   & b_1 & \\cdots & 0 \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "0 & 0 & \\cdots & b_1 \\\\ \n",
    "b_2 & 0 & \\cdots & 0 \\\\\n",
    "0   & b_2 & \\cdots & 0 \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "0 & 0 & \\cdots & b_2 \\\\ - & - & - & - \\\\\n",
    "\\vdots & \\vdots & \\vdots & \\vdots \\\\ - & - & - & - \\\\\n",
    "b_n & 0 & \\cdots & 0 \\\\\n",
    "0   & b_n & \\cdots & 0 \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "0 & 0 & \\cdots & b_n\n",
    "\\end{bmatrix}\n",
    "\\end{align}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Are the two results different? Applying the $(m)$-transpose to the first result yields the second.\n",
    "\n",
    "$$\n",
    "\\begin{bmatrix}\n",
    "\\begin{pmatrix} \\color{RoyalBlue}{b_1 \\\\ 0   \\\\ \\vdots \\\\ 0} \\end{pmatrix} & \n",
    "\\begin{pmatrix} \\color{OrangeRed}{b_2 \\\\ 0   \\\\ \\vdots \\\\ 0} \\end{pmatrix} & \\cdots & \n",
    "\\begin{pmatrix} \\color{YellowGreen}{b_n \\\\ 0   \\\\ \\vdots \\\\ 0} \\end{pmatrix} \\\\\n",
    "\\begin{pmatrix} \\color{RoyalBlue}{0   \\\\ b_1 \\\\ \\vdots \\\\ 0} \\end{pmatrix} & \n",
    "\\begin{pmatrix} \\color{OrangeRed}{0   \\\\ b_2 \\\\ \\vdots \\\\ 0} \\end{pmatrix} & \\cdots & \n",
    "\\begin{pmatrix} \\color{YellowGreen}{0   \\\\ b_n \\\\ \\vdots \\\\ 0} \\end{pmatrix} \\\\\n",
    "\\color{RoyalBlue}{\\vdots} & \\vdots & \\ddots & \\vdots \\\\\n",
    "\\begin{pmatrix} \\color{RoyalBlue}{0   \\\\ 0 \\\\ \\vdots \\\\ b_1} \\end{pmatrix} & \n",
    "\\begin{pmatrix} \\color{OrangeRed}{0   \\\\ 0 \\\\ \\vdots \\\\ b_2} \\end{pmatrix} & \\cdots & \n",
    "\\begin{pmatrix} \\color{YellowGreen}{0   \\\\ 0 \\\\ \\vdots \\\\ b_n} \\end{pmatrix} \n",
    "\\end{bmatrix}^{(m)} = \\begin{bmatrix}\n",
    "\\color{RoyalBlue}{b_1} & \\color{RoyalBlue}{0} & \\color{RoyalBlue}{\\cdots} & \\color{RoyalBlue}{0} \\\\\n",
    "\\color{RoyalBlue}{0}   & \\color{RoyalBlue}{b_1} & \\color{RoyalBlue}{\\cdots} & \\color{RoyalBlue}{0} \\\\\n",
    "\\color{RoyalBlue}{\\vdots} & \\color{RoyalBlue}{\\vdots} & \\color{RoyalBlue}{\\ddots} & \\color{RoyalBlue}{\\vdots} \\\\\n",
    "\\color{RoyalBlue}{0} & \\color{RoyalBlue}{0} & \\color{RoyalBlue}{\\cdots} & \\color{RoyalBlue}{b_1} \\\\ \n",
    "\\color{OrangeRed}{b_2} & \\color{OrangeRed}{0} & \\color{OrangeRed}{\\cdots} & \\color{OrangeRed}{0} \\\\\n",
    "\\color{OrangeRed}{0}   & \\color{OrangeRed}{b_2} & \\color{OrangeRed}{\\cdots} & \\color{OrangeRed}{0} \\\\\n",
    "\\color{OrangeRed}{\\vdots} & \\color{OrangeRed}{\\vdots} & \\color{OrangeRed}{\\ddots} & \\color{OrangeRed}{\\vdots} \\\\\n",
    "\\color{OrangeRed}{0} & \\color{OrangeRed}{0} & \\color{OrangeRed}{\\cdots} & \\color{OrangeRed}{b_2} \\\\ - & - & - & - \\\\\n",
    "\\vdots & \\vdots & \\vdots & \\vdots \\\\ - & - & - & - \\\\\n",
    "\\color{YellowGreen}{b_n} & \\color{YellowGreen}{0} & \\color{YellowGreen}{\\cdots} & \\color{YellowGreen}{0} \\\\\n",
    "\\color{YellowGreen}{0}   & \\color{YellowGreen}{b_n} & \\color{YellowGreen}{\\cdots} & \\color{YellowGreen}{0} \\\\\n",
    "\\color{YellowGreen}{\\vdots} & \\color{YellowGreen}{\\vdots} & \\color{YellowGreen}{\\ddots} & \\color{YellowGreen}{\\vdots} \\\\\n",
    "\\color{YellowGreen}{0} & \\color{YellowGreen}{0} & \\color{YellowGreen}{\\cdots} & \\color{YellowGreen}{b_n}\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "- Therefore\n",
    "\n",
    "$$\n",
    "\\left( \\frac{\\partial }{\\partial \\mathbf{X}} \\otimes \\mathbf{Xb} \\right)^{(m)} = \\frac{\\partial \\, \\mathbf{Xb}}{\\partial \\, \\left(\\text{vec}(\\mathbf{X}) \\right)} \n",
    "$$"
   ]
  },
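  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The identity can be checked by hand in the smallest nontrivial case, $m = n = 2$ (a size chosen here only for illustration). Since $(\\mathbf{Xb})_i = X_{i1} b_1 + X_{i2} b_2$, the Kronecker form is a $4 \\times 2$ matrix, and its $(2)$-transpose reproduces the vec form whose rows are ordered $X_{11}, X_{21}, X_{12}, X_{22}$:\n",
    "\n",
    "$$\n",
    "\\begin{bmatrix}\n",
    "b_1 & b_2 \\\\ 0 & 0 \\\\ 0 & 0 \\\\ b_1 & b_2\n",
    "\\end{bmatrix}^{(2)} =\n",
    "\\begin{bmatrix}\n",
    "b_1 & 0 \\\\ 0 & b_1 \\\\ b_2 & 0 \\\\ 0 & b_2\n",
    "\\end{bmatrix} =\n",
    "\\frac{\\partial \\, \\mathbf{Xb}}{\\partial \\, \\left(\\text{vec}(\\mathbf{X}) \\right)}\n",
    "$$"
   ]
  },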
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Product rule<sup>[1]</sup>\n",
    "\n",
    "For $\\mathbf{X} : m \\times n$, $\\mathbf{Y} : n \\times r$, and $\\mathbf{Z} : p \\times q$, the derivative is an $mp \\times rq$ matrix:\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, (\\mathbf{XY})}{\\partial \\, \\mathbf{Z}} = \\left( \\frac{\\partial \\, \\mathbf{X} }{\\partial \\, \\mathbf{Z}} \\right) \\left( \\mathbf{I}_{q} \\otimes \\mathbf{Y} \\right) + \\left( \\mathbf{I}_{p} \\otimes \\mathbf{X} \\right)\\left( \\frac{\\partial \\, \\mathbf{Y}}{\\partial \\, \\mathbf{Z}} \\right)\n",
    "$$\n",
    "\n",
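    "The product rule carries over to differentiation by a matrix, but care is needed to make the dimensions match.\n",
    "\n",
    "The derivative above does not take the following simpler form:\n",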
    "\n",
    "$$\n",
    "\\frac{\\partial \\, (\\mathbf{XY})}{\\partial \\, \\mathbf{Z}} = \\left( \\frac{\\partial \\, \\mathbf{X} }{\\partial \\, \\mathbf{Z}} \\right)   \\mathbf{Y}  +   \\mathbf{X} \\left( \\frac{\\partial \\, \\mathbf{Y}}{\\partial \\, \\mathbf{Z}} \\right)\n",
    "$$\n",
    "\n",
    "This is because $\\frac{\\partial \\, \\mathbf{X} }{\\partial \\, \\mathbf{Z}}$ is $mp \\times nq$, so it cannot be multiplied directly by $\\mathbf{Y}$, which is $n \\times r$.\n",
    "\n",
    "To see what form the trailing $\\mathbf{Y}$ must take for the product to stay well defined, consider $\\mathbf{X} : 1 \\times 2$, $\\mathbf{Y} : 2 \\times 1$, $\\mathbf{Z} : 2 \\times 2$ as an example:\n",
    "\n",
    "$$\n",
    "\\mathbf{X}\\mathbf{Y} = \n",
    "\\begin{bmatrix} \\color{RoyalBlue}{X_1} & \\color{OrangeRed}{X_2} \\end{bmatrix} \n",
    "\\begin{bmatrix} \\color{RoyalBlue}{Y_1} \\\\ \\color{OrangeRed}{Y_2} \\end{bmatrix} = \\color{RoyalBlue}{X_1} \\color{RoyalBlue}{Y_1} + \\color{OrangeRed}{X_2} \\color{OrangeRed}{Y_2}\n",
    "$$\n",
    "\n",
    "As shown, the product of $\\mathbf{X}$ and $\\mathbf{Y}$ must pair terms as $X_i Y_i$.\n",
    "\n",
    "For $\\mathbf{Y}$ to multiply the differentiated $\\mathbf{X}$ below in this $X_i Y_i$ pattern,\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, \\mathbf{X} }{\\partial \\, \\mathbf{Z}} =\n",
    "\\begin{bmatrix}\n",
    "\\dfrac{\\partial \\, \\color{RoyalBlue}{X_1}}{\\partial \\, Z_{11}} & \\dfrac{\\partial \\, \\color{OrangeRed}{X_2}}{\\partial \\, Z_{11}} & \\dfrac{\\partial \\, \\color{RoyalBlue}{X_1}}{\\partial \\, Z_{12}} & \\dfrac{\\partial \\, \\color{OrangeRed}{X_2}}{\\partial \\, Z_{12}} \\\\\n",
    "\\dfrac{\\partial \\, \\color{RoyalBlue}{X_1}}{\\partial \\, Z_{21}} & \\dfrac{\\partial \\, \\color{OrangeRed}{X_2}}{\\partial \\, Z_{21}} & \\dfrac{\\partial \\, \\color{RoyalBlue}{X_1}}{\\partial \\, Z_{22}} & \\dfrac{\\partial \\, \\color{OrangeRed}{X_2}}{\\partial \\, Z_{22}}\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "$\\mathbf{Y}$ must be expanded into the following form.\n",
    "\n",
    "$$\n",
    "\\begin{bmatrix}\n",
    "\\color{RoyalBlue}{Y_1} & 0 \\\\ \\color{OrangeRed}{Y_2} & 0 \\\\ 0 & \\color{RoyalBlue}{Y_1} \\\\ 0 & \\color{OrangeRed}{Y_2}\n",
    "\\end{bmatrix} = \\mathbf{I}_{2} \\otimes \\mathbf{Y}\n",
    "$$"
   ]
  },
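  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Multiplying the two factors above for the $1 \\times 2$, $2 \\times 1$, $2 \\times 2$ case gives the following sketch of the first term of the product rule; every entry pairs $X_i$ with $Y_i$ as required.\n",
    "\n",
    "$$\n",
    "\\left( \\frac{\\partial \\, \\mathbf{X} }{\\partial \\, \\mathbf{Z}} \\right) \\left( \\mathbf{I}_{2} \\otimes \\mathbf{Y} \\right) =\n",
    "\\begin{bmatrix}\n",
    "\\dfrac{\\partial \\, X_1}{\\partial \\, Z_{11}} Y_1 + \\dfrac{\\partial \\, X_2}{\\partial \\, Z_{11}} Y_2 & \\dfrac{\\partial \\, X_1}{\\partial \\, Z_{12}} Y_1 + \\dfrac{\\partial \\, X_2}{\\partial \\, Z_{12}} Y_2 \\\\\n",
    "\\dfrac{\\partial \\, X_1}{\\partial \\, Z_{21}} Y_1 + \\dfrac{\\partial \\, X_2}{\\partial \\, Z_{21}} Y_2 & \\dfrac{\\partial \\, X_1}{\\partial \\, Z_{22}} Y_1 + \\dfrac{\\partial \\, X_2}{\\partial \\, Z_{22}} Y_2\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "Entry $(i, j)$ is the part of $\\partial (X_1 Y_1 + X_2 Y_2) / \\partial Z_{ij}$ in which $\\mathbf{Y}$ is held fixed."
   ]
  },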
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Product-rule example: posed by 임성빈 of TensorFlow Korea, https://www.facebook.com/groups/TensorFlowKR/permalink/581553265519069/\n",
    "\n",
    "Let $\\mathbf{X} : m \\times n = m \\times 1$, $\\mathbf{Y} : n \\times r = 1 \\times 1$, and $\\mathbf{Z} : p \\times q = p \\times 1$, and consider $\\dfrac{\\partial \\, (\\mathbf{XY})}{\\partial \\, \\mathbf{Z}}$.\n",
    "\n",
    "- The numerator $\\mathbf{XY}$ is an $m \\times 1$ vector, so this amounts to differentiating an $m \\times 1$ vector by a $p \\times 1$ vector, and the result is the Jacobian matrix below (numerator layout).\n",
    "\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, (\\mathbf{XY})}{\\partial \\, \\mathbf{Z}}=\n",
    "\\begin{bmatrix}\n",
    "\\dfrac{\\partial \\, \\left(\\mathbf{XY}\\right)_1}{\\partial \\, \\mathbf{Z}_1} & \\dfrac{\\partial \\, \\left(\\mathbf{XY}\\right)_1}{\\partial \\, \\mathbf{Z}_2} & \\cdots & \\dfrac{\\partial \\, \\left(\\mathbf{XY}\\right)_1}{\\partial \\, \\mathbf{Z}_p} \\\\\n",
    "\\dfrac{\\partial \\, \\left(\\mathbf{XY}\\right)_2}{\\partial \\, \\mathbf{Z}_1} & \\dfrac{\\partial \\, \\left(\\mathbf{XY}\\right)_2}{\\partial \\, \\mathbf{Z}_2} & \\cdots & \\dfrac{\\partial \\, \\left(\\mathbf{XY}\\right)_2}{\\partial \\, \\mathbf{Z}_p} \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "\\dfrac{\\partial \\, \\left(\\mathbf{XY}\\right)_m}{\\partial \\, \\mathbf{Z}_1} & \\dfrac{\\partial \\, \\left(\\mathbf{XY}\\right)_m}{\\partial \\, \\mathbf{Z}_2} & \\cdots & \\dfrac{\\partial \\, \\left(\\mathbf{XY}\\right)_m}{\\partial \\, \\mathbf{Z}_p}\n",
    "\\end{bmatrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
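    "- If this is differentiated using the vec operator, both the numerator and the denominator are already vectors, so the result is identical to the one above.\n",
    "\n",
    "- Expressing it with the Kronecker-product method, however, is slightly more involved:\n",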
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\frac{\\partial \\, (\\mathbf{XY})}{\\partial \\, \\mathbf{Z}} &= \\left( \\frac{\\partial \\, \\mathbf{X} }{\\partial \\, \\mathbf{Z}} \\right) \\left( \\mathbf{I}_{1} \\otimes \\mathbf{Y} \\right) + \\left( \\mathbf{I}_{p} \\otimes \\mathbf{X} \\right)\\left( \\frac{\\partial \\, \\mathbf{Y}}{\\partial \\, \\mathbf{Z}} \\right) \\\\\n",
    "&=\\left( \\frac{\\partial \\,}{\\partial \\, \\mathbf{Z}}  \\otimes  \\mathbf{X} \\right) \\left( \\mathbf{I}_{1} \\otimes \\mathbf{Y} \\right) + \\left( \\mathbf{I}_{p} \\otimes \\mathbf{X} \\right)\\left( \\frac{\\partial \\, }{\\partial \\, \\mathbf{Z}} \\otimes \\mathbf{Y} \\right)\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
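    "Here $\\left( \\frac{\\partial \\, \\mathbf{X} }{\\partial \\, \\mathbf{Z}} \\right)$ is a vector differentiated by a vector, so it could also be written as a Jacobian. When derivatives are computed with the Kronecker product, however, every derivative must be written consistently in Kronecker-product form; otherwise the dimensions no longer match. Expanding every derivative as a Kronecker product gives\n",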
    "\n",
    "$$\n",
    "\\underbrace{\\left( \\frac{\\partial \\,}{\\partial \\, \\mathbf{Z}}  \\otimes  \\mathbf{X} \\right)}_{mp \\times 1} \\underbrace{\\left( \\mathbf{I}_{1} \\otimes \\mathbf{Y} \\right)}_{1 \\times 1} + \\underbrace{\\left( \\mathbf{I}_{p} \\otimes \\mathbf{X} \\right)}_{mp \\times p} \\underbrace{\\left( \\frac{\\partial \\, }{\\partial \\, \\mathbf{Z}} \\otimes \\mathbf{Y} \\right)}_{p \\times 1}\n",
    "$$\n",
    "\n",
    "The result is therefore $mp \\times 1$, and applying the (m)-transpose to it yields the $m \\times p$ Jacobian. Computing each factor directly:\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\frac{\\partial \\, \\mathbf{X} }{\\partial \\, \\mathbf{Z}} \n",
    "&= \\frac{\\partial \\,}{\\partial \\, \\mathbf{Z}}  \\otimes  \\mathbf{X} = \\begin{bmatrix}\n",
    "\\dfrac{\\partial}{\\partial \\, Z_{1}} \\\\\n",
    "\\dfrac{\\partial}{\\partial \\, Z_{2}} \\\\\n",
    "\\vdots \\\\\n",
    "\\dfrac{\\partial}{\\partial \\, Z_{p}}\n",
    "\\end{bmatrix}\\otimes  \\mathbf{X} = \\left[\n",
    "\\begin{array}{c}\n",
    "\\color{RoyalBlue}{\\begin{matrix}\n",
    "\\dfrac{\\partial \\, X_1}{\\partial \\, Z_1} \\\\\n",
    "\\dfrac{\\partial \\, X_2}{\\partial \\, Z_1} \\\\\n",
    "\\vdots \\\\\n",
    "\\dfrac{\\partial \\, X_m}{\\partial \\, Z_1}\n",
    "\\end{matrix}} \\\\\n",
    "\\hline\n",
    "\\color{OrangeRed}{\\begin{matrix}\n",
    "\\dfrac{\\partial \\, X_1}{\\partial \\, Z_2} \\\\\n",
    "\\dfrac{\\partial \\, X_2}{\\partial \\, Z_2} \\\\\n",
    "\\vdots \\\\\n",
    "\\dfrac{\\partial \\, X_m}{\\partial \\, Z_2}\n",
    "\\end{matrix}} \\\\\n",
    "\\hline\n",
    "\\vdots \\\\\n",
    "\\hline\n",
    "\\color{YellowGreen}{\\begin{matrix}\n",
    "\\dfrac{\\partial \\, X_1}{\\partial \\, Z_p} \\\\\n",
    "\\dfrac{\\partial \\, X_2}{\\partial \\, Z_p} \\\\\n",
    "\\vdots \\\\\n",
    "\\dfrac{\\partial \\, X_m}{\\partial \\, Z_p}\n",
    "\\end{matrix}}\n",
    "\\end{array}\n",
    "\\right]\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\mathbf{I}_{1} \\otimes \\mathbf{Y} = \\left[ Y_1 \\right]\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\mathbf{I}_p \\otimes \\mathbf{X} &= \\begin{bmatrix}\n",
    "1 & 0 & \\cdots & 0 \\\\\n",
    "0 & 1 & \\cdots & 0 \\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "0 & 0 & \\cdots & 1\n",
    "\\end{bmatrix} \\otimes \\mathbf{X} = \\left[\n",
    "\\begin{array}{c}\n",
    "\\begin{matrix}\n",
    "    X_1 & 0 & \\cdots & 0 \\\\\n",
    "    X_2 & 0 & \\cdots & 0 \\\\\n",
    "    \\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "    X_m & 0 & \\cdots & 0 \n",
    "\\end{matrix} \\\\\n",
    "\\hline \n",
    "\\begin{matrix}\n",
    "    0 & X_1 & \\cdots & 0 \\\\\n",
    "    0 & X_2 & \\cdots & 0 \\\\\n",
    "    \\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "    0 & X_m & \\cdots & 0 \n",
    "\\end{matrix} \\\\\n",
    "\\hline\n",
    "\\vdots \\\\\n",
    "\\hline \n",
    "\\begin{matrix}\n",
    "    0 & 0 & \\cdots & X_1 \\\\\n",
    "    0 & 0 & \\cdots & X_2 \\\\\n",
    "    \\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "    0 & 0 & \\cdots &  X_m \n",
    "\\end{matrix}\n",
    "\\end{array}\n",
    "\\right]\n",
    "\\end{align}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\frac{\\partial \\, \\mathbf{Y}}{\\partial \\, \\mathbf{Z}} = \\frac{\\partial}{\\partial \\, \\mathbf{Z}} \\otimes \\mathbf{Y} = \\begin{bmatrix}\n",
    "\\dfrac{\\partial \\, Y_1}{\\partial \\, Z_1} \\\\\n",
    "\\dfrac{\\partial \\, Y_1}{\\partial \\, Z_2} \\\\\n",
    "\\vdots \\\\\n",
    "\\dfrac{\\partial \\, Y_1}{\\partial \\, Z_p}\n",
    "\\end{bmatrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Multiplying out each term gives\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "&\\underbrace{\\left( \\frac{\\partial \\,}{\\partial \\, \\mathbf{Z}}  \\otimes  \\mathbf{X} \\right)}_{mp \\times 1} \\underbrace{\\left( \\mathbf{I}_{1} \\otimes \\mathbf{Y} \\right)}_{1 \\times 1} + \\underbrace{\\left( \\mathbf{I}_{p} \\otimes \\mathbf{X} \\right)}_{mp \\times p} \\underbrace{\\left( \\frac{\\partial \\, }{\\partial \\, \\mathbf{Z}} \\otimes \\mathbf{Y} \\right)}_{p \\times 1} \\\\\n",
    "=& \\, \\left[\n",
    "\\begin{array}{c}\n",
    "\\color{RoyalBlue}{\\begin{matrix}\n",
    "Y_1 \\dfrac{\\partial \\, X_1}{\\partial \\, Z_1} \\\\\n",
    "Y_1 \\dfrac{\\partial \\, X_2}{\\partial \\, Z_1} \\\\\n",
    "\\vdots \\\\\n",
    "Y_1 \\dfrac{\\partial \\, X_m}{\\partial \\, Z_1}\n",
    "\\end{matrix}} \\\\\n",
    "\\hline\n",
    "\\color{OrangeRed}{\\begin{matrix}\n",
    "Y_1 \\dfrac{\\partial \\, X_1}{\\partial \\, Z_2} \\\\\n",
    "Y_1 \\dfrac{\\partial \\, X_2}{\\partial \\, Z_2} \\\\\n",
    "\\vdots \\\\\n",
    "Y_1 \\dfrac{\\partial \\, X_m}{\\partial \\, Z_2}\n",
    "\\end{matrix}} \\\\\n",
    "\\hline\n",
    "\\vdots \\\\\n",
    "\\hline\n",
    "\\color{YellowGreen}{\\begin{matrix}\n",
    "Y_1 \\dfrac{\\partial \\, X_1}{\\partial \\, Z_p} \\\\\n",
    "Y_1 \\dfrac{\\partial \\, X_2}{\\partial \\, Z_p} \\\\\n",
    "\\vdots \\\\\n",
    "Y_1 \\dfrac{\\partial \\, X_m}{\\partial \\, Z_p}\n",
    "\\end{matrix}}\n",
    "\\end{array}\n",
    "\\right] + \n",
    "\\left[\n",
    "\\begin{array}{c}\n",
    "\\color{RoyalBlue}{\\begin{matrix}\n",
    "X_1 \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_1} \\\\\n",
    "X_2 \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_1} \\\\\n",
    "\\vdots \\\\\n",
    "X_m \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_1}\n",
    "\\end{matrix}} \\\\\n",
    "\\hline\n",
    "\\color{OrangeRed}{\\begin{matrix}\n",
    "X_1 \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_2} \\\\\n",
    "X_2 \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_2} \\\\\n",
    "\\vdots \\\\\n",
    "X_m \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_2}\n",
    "\\end{matrix}} \\\\\n",
    "\\hline\n",
    "\\vdots \\\\\n",
    "\\hline\n",
    "\\color{YellowGreen}{\\begin{matrix}\n",
    "X_1 \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_p} \\\\\n",
    "X_2 \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_p} \\\\\n",
    "\\vdots \\\\\n",
    "X_m \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_p}\n",
    "\\end{matrix}}\n",
    "\\end{array}\n",
    "\\right] \\quad = \\quad\n",
    "\\left[\n",
    "\\begin{array}{c}\n",
    "\\color{RoyalBlue}{\\begin{matrix}\n",
    "Y_1 \\dfrac{\\partial \\, X_1}{\\partial \\, Z_1} + X_1 \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_1} \\\\\n",
    "Y_1 \\dfrac{\\partial \\, X_2}{\\partial \\, Z_1} + X_2 \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_1} \\\\\n",
    "\\vdots \\\\\n",
    "Y_1 \\dfrac{\\partial \\, X_m}{\\partial \\, Z_1} + X_m \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_1}\n",
    "\\end{matrix}} \\\\\n",
    "\\hline\n",
    "\\color{OrangeRed}{\\begin{matrix}\n",
    "Y_1 \\dfrac{\\partial \\, X_1}{\\partial \\, Z_2} + X_1 \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_2} \\\\\n",
    "Y_1 \\dfrac{\\partial \\, X_2}{\\partial \\, Z_2} + X_2 \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_2} \\\\\n",
    "\\vdots \\\\\n",
    "Y_1 \\dfrac{\\partial \\, X_m}{\\partial \\, Z_2} + X_m \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_2}\n",
    "\\end{matrix}} \\\\\n",
    "\\hline\n",
    "\\vdots \\\\\n",
    "\\hline\n",
    "\\color{YellowGreen}{\\begin{matrix}\n",
    "Y_1 \\dfrac{\\partial \\, X_1}{\\partial \\, Z_p} + X_1 \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_p} \\\\\n",
    "Y_1 \\dfrac{\\partial \\, X_2}{\\partial \\, Z_p} + X_2 \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_p} \\\\\n",
    "\\vdots \\\\\n",
    "Y_1 \\dfrac{\\partial \\, X_m}{\\partial \\, Z_p} + X_m \\dfrac{\\partial \\, Y_1}{\\partial \\, Z_p}\n",
    "\\end{matrix}}\n",
    "\\end{array}\n",
    "\\right] \\quad = \\quad\n",
    "\\left[\n",
    "\\begin{array}{c}\n",
    "\\color{RoyalBlue}{\\begin{matrix}\n",
    "\\dfrac{\\partial \\, X_1 Y_1}{\\partial \\, Z_1} \\\\\n",
    "\\dfrac{\\partial \\, X_2 Y_1}{\\partial \\, Z_1} \\\\\n",
    "\\vdots \\\\\n",
    "\\dfrac{\\partial \\, X_m Y_1}{\\partial \\, Z_1}\n",
    "\\end{matrix}} \\\\\n",
    "\\hline\n",
    "\\color{OrangeRed}{\\begin{matrix}\n",
    "\\dfrac{\\partial \\, X_1 Y_1}{\\partial \\, Z_2} \\\\\n",
    "\\dfrac{\\partial \\, X_2 Y_1}{\\partial \\, Z_2} \\\\\n",
    "\\vdots \\\\\n",
    "\\dfrac{\\partial \\, X_m Y_1}{\\partial \\, Z_2}\n",
    "\\end{matrix}} \\\\\n",
    "\\hline\n",
    "\\vdots \\\\\n",
    "\\hline\n",
    "\\color{YellowGreen}{\\begin{matrix}\n",
    "\\dfrac{\\partial \\, X_1 Y_1}{\\partial \\, Z_p} \\\\\n",
    "\\dfrac{\\partial \\, X_2 Y_1}{\\partial \\, Z_p} \\\\\n",
    "\\vdots \\\\\n",
    "\\dfrac{\\partial \\, X_m Y_1}{\\partial \\, Z_p}\n",
    "\\end{matrix}}\n",
    "\\end{array}\n",
    "\\right]\n",
    "\\end{align}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Applying the (m)-transpose to the final result gives the Jacobian.\n",
    "\n",
    "$$\n",
    "\\left[\\left( \\frac{\\partial \\,}{\\partial \\, \\mathbf{Z}}  \\otimes  \\mathbf{X} \\right) \\left( \\mathbf{I}_{1} \\otimes \\mathbf{Y} \\right) + \\left( \\mathbf{I}_{p} \\otimes \\mathbf{X} \\right)\\left( \\frac{\\partial \\, }{\\partial \\, \\mathbf{Z}} \\otimes \\mathbf{Y} \\right)\\right]^{(m)} =\n",
    "\\left[\n",
    "\\begin{array}{c}\n",
    "\\color{RoyalBlue}{\\begin{matrix}\n",
    "\\dfrac{\\partial \\, X_1 Y_1}{\\partial \\, Z_1} \\\\\n",
    "\\dfrac{\\partial \\, X_2 Y_1}{\\partial \\, Z_1} \\\\\n",
    "\\vdots \\\\\n",
    "\\dfrac{\\partial \\, X_m Y_1}{\\partial \\, Z_1}\n",
    "\\end{matrix}} \\\\\n",
    "\\hline\n",
    "\\color{OrangeRed}{\\begin{matrix}\n",
    "\\dfrac{\\partial \\, X_1 Y_1}{\\partial \\, Z_2} \\\\\n",
    "\\dfrac{\\partial \\, X_2 Y_1}{\\partial \\, Z_2} \\\\\n",
    "\\vdots \\\\\n",
    "\\dfrac{\\partial \\, X_m Y_1}{\\partial \\, Z_2}\n",
    "\\end{matrix}} \\\\\n",
    "\\hline\n",
    "\\vdots \\\\\n",
    "\\hline\n",
    "\\color{YellowGreen}{\\begin{matrix}\n",
    "\\dfrac{\\partial \\, X_1 Y_1}{\\partial \\, Z_p} \\\\\n",
    "\\dfrac{\\partial \\, X_2 Y_1}{\\partial \\, Z_p} \\\\\n",
    "\\vdots \\\\\n",
    "\\dfrac{\\partial \\, X_m Y_1}{\\partial \\, Z_p}\n",
    "\\end{matrix}}\n",
    "\\end{array}\n",
    "\\right] ^{(m)} = \\quad\n",
    "\\begin{bmatrix}\n",
    "\\color{RoyalBlue}{\\dfrac{\\partial \\, X_1 Y_1}{\\partial \\, Z_1}} & \\color{OrangeRed}{\\dfrac{\\partial \\, X_1 Y_1}{\\partial \\, Z_2}} & \\cdots & \\color{YellowGreen}{\\dfrac{\\partial \\, X_1 Y_1}{\\partial \\, Z_p}} \\\\\n",
    "\\color{RoyalBlue}{\\dfrac{\\partial \\, X_2 Y_1}{\\partial \\, Z_1}} & \\color{OrangeRed}{\\dfrac{\\partial \\, X_2 Y_1}{\\partial \\, Z_2}} & \\cdots & \\color{YellowGreen}{\\dfrac{\\partial \\, X_2 Y_1}{\\partial \\, Z_p}}\\\\\n",
    "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "\\color{RoyalBlue}{\\dfrac{\\partial \\, X_m Y_1}{\\partial \\, Z_1}} & \\color{OrangeRed}{\\dfrac{\\partial \\, X_m Y_1}{\\partial \\, Z_2}} & \\cdots & \\color{YellowGreen}{\\dfrac{\\partial \\, X_m Y_1}{\\partial \\, Z_p}}\n",
    "\\end{bmatrix} = \\frac{\\partial \\, \\mathbf{XY}}{\\partial \\, \\mathbf{Z}}\n",
    "$$"
   ]
  },
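  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The whole example can be reproduced numerically. The sketch below assumes NumPy and picks, purely for illustration, the concrete functions $\\mathbf{X}(\\mathbf{Z}) = \\mathbf{A}\\mathbf{Z}$ and $\\mathbf{Y}(\\mathbf{Z}) = \\mathbf{c}^{\\text{T}}\\mathbf{Z}$; the names `A`, `c`, `z` and the sizes `m`, `p` are assumptions that do not appear in the text. It implements the (m)-transpose as a column-major reshape, which is one way to read the operation used above.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "m, p = 3, 2\n",
    "rng = np.random.default_rng(0)\n",
    "A = rng.standard_normal((m, p))   # X(z) = A z, so dX_i/dZ_j = A[i, j]\n",
    "c = rng.standard_normal((p, 1))   # Y(z) = c^T z, so dY/dZ_j = c[j]\n",
    "z = rng.standard_normal((p, 1))\n",
    "\n",
    "X = A @ z                         # m x 1\n",
    "Y = c.T @ z                       # 1 x 1\n",
    "\n",
    "# Kronecker-style derivatives, stacked block by block over Z_1 ... Z_p.\n",
    "dX_dZ = A.reshape(m * p, 1, order='F')   # mp x 1\n",
    "dY_dZ = c                                # p x 1\n",
    "\n",
    "# Product rule in Kronecker form; the result is mp x 1.\n",
    "stacked = dX_dZ @ np.kron(np.eye(1), Y) + np.kron(np.eye(p), X) @ dY_dZ\n",
    "\n",
    "# (m)-transpose: cut the mp x 1 column into p blocks of length m and\n",
    "# place them side by side as the columns of an m x p matrix.\n",
    "jacobian_kron = stacked.reshape(m, p, order='F')\n",
    "\n",
    "# Direct Jacobian of XY = (A z)(c^T z): Y * A + X c^T.\n",
    "jacobian_direct = Y * A + X @ c.T\n",
    "\n",
    "print(np.allclose(jacobian_kron, jacobian_direct))\n",
    "```\n",
    "\n",
    "Replacing the Kronecker-form `dX_dZ` by the $m \\times p$ matrix `A` in the first product would make the shapes incompatible, which illustrates why all derivatives have to be kept in the same form."
   ]
  },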
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
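    "### Some matrix derivative formulas<sup>[4]</sup>\n",
    "\n",
    "- When studying machine learning, matrix derivatives come up, for example, in the backpropagation algorithm or when deriving the MLE of a normal distribution. A few formulas that are useful in those situations are collected here.\n",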
    "\n",
    "#### [1]  $\\dfrac{\\partial \\, \\mathbf{X}^{-1}}{\\partial \\,  x} = -\\mathbf{X}^{-1}\\dfrac{\\partial \\, \\mathbf{X}}{\\partial \\, x}\\mathbf{X}^{-1}$  (matrix cookbook eq.59)\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "&\\mathbf{X}^{-1} \\mathbf{X} = \\mathbf{I} \\\\[5pt]\n",
    "&\\mathbf{X}^{-1} \\frac{\\partial \\, \\mathbf{X}}{\\partial \\, x} + \\frac{\\partial \\, \\mathbf{X}^{-1}}{\\partial \\, x} \\mathbf{X} = \\mathbf{0} \\\\[5pt]\n",
    "& \\frac{\\partial \\, \\mathbf{X}^{-1}}{\\partial \\, x} \\mathbf{X} = - \\mathbf{X}^{-1} \\frac{\\partial \\, \\mathbf{X}}{\\partial \\, x} \\\\[5pt]\n",
    "&\\frac{\\partial \\, \\mathbf{X}^{-1}}{\\partial \\, x} = - \\mathbf{X}^{-1} \\frac{\\partial \\, \\mathbf{X}}{\\partial \\, x} \\mathbf{X}^{-1}\n",
    "\\end{align} \n",
    "$$\n",
    "\n",
    "#### [2] $\\dfrac{\\partial \\, }{\\partial \\, \\mathbf{A}} \\text{tr}(\\mathbf{AB}) = \\mathbf{B}^{\\text{T}}$  (matrix cookbook eq.100)\n",
    "\n",
    "$$\n",
    "\\text{tr}(\\mathbf{AB}) =\\sum_{i} \\sum_{j} (\\mathbf{A})_{ij}(\\mathbf{B})_{ji}\n",
    "$$\n",
    "\n",
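    "In index form, differentiating with respect to a single entry gives the following, where $k$ and $l$ are used as summation indices to keep them distinct from the fixed $i$ and $j$.\n",
    "\n",
    "$$\n",
    "\\dfrac{\\partial \\, }{\\partial \\, (\\mathbf{A})_{ij}} \\sum_{k} \\sum_{l} (\\mathbf{A})_{kl}(\\mathbf{B})_{lk} = (\\mathbf{B})_{ji}\n",
    "$$\n",
    "\n",
    "Therefore\n",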
    "\n",
    "$$\\dfrac{\\partial \\, }{\\partial \\, \\mathbf{A}} \\text{tr}(\\mathbf{AB}) = \\mathbf{B}^{\\text{T}}$$\n",
    "\n",
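    "In the same way,\n",
    "\n",
    "$$\n",
    "\\dfrac{\\partial \\, }{\\partial \\, (\\mathbf{B})_{ji}} \\sum_{k} \\sum_{l} (\\mathbf{A})_{kl}(\\mathbf{B})_{lk} = (\\mathbf{A})_{ij}\n",
    "$$\n",
    "\n",
    "Arranged in the same shape as $\\mathbf{B}$, these entries form $\\mathbf{A}^{\\text{T}}$. Arranged in the transposed (numerator) layout, which is the convention of the trace identity in Eq. (2) below, the result reads\n",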
    "\n",
    "$$\\dfrac{\\partial \\, }{\\partial \\, \\mathbf{B}} \\text{tr}(\\mathbf{AB}) = \\mathbf{A}$$"
   ]
  },
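  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Formulas [1] and [2] can be sanity-checked with finite differences. This is a minimal sketch, assuming NumPy; the helper `numerical_grad` and all matrix values are hypothetical and introduced only for this check. The helper returns the gradient with the same shape as its argument, so for $\\mathbf{B}$ it produces $\\mathbf{A}^{\\text{T}}$, the transpose of the numerator-layout result $\\mathbf{A}$.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def numerical_grad(f, X, h=1e-6):\n",
    "    # Central differences; entry (i, j) approximates df / dX_ij,\n",
    "    # so the result has the same shape as X.\n",
    "    G = np.zeros_like(X)\n",
    "    for idx in np.ndindex(*X.shape):\n",
    "        E = np.zeros_like(X)\n",
    "        E[idx] = h\n",
    "        G[idx] = (f(X + E) - f(X - E)) / (2 * h)\n",
    "    return G\n",
    "\n",
    "# Formula [2]: the gradient of tr(AB) with respect to A is B^T.\n",
    "A = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])      # 2 x 3\n",
    "B = np.array([[1.0, 0.5], [-2.0, 1.5], [0.0, 3.0]])   # 3 x 2\n",
    "grad_A = numerical_grad(lambda M: np.trace(M @ B), A)\n",
    "grad_B = numerical_grad(lambda M: np.trace(A @ M), B)\n",
    "print(np.allclose(grad_A, B.T, atol=1e-6))\n",
    "print(np.allclose(grad_B, A.T, atol=1e-6))\n",
    "\n",
    "# Formula [1]: take X(x) = X0 + x * X1, so dX/dx = X1, and evaluate at x = 0.\n",
    "X0 = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])\n",
    "X1 = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]])\n",
    "h = 1e-6\n",
    "fd = (np.linalg.inv(X0 + h * X1) - np.linalg.inv(X0 - h * X1)) / (2 * h)\n",
    "X0_inv = np.linalg.inv(X0)\n",
    "formula = -X0_inv @ X1 @ X0_inv\n",
    "print(np.allclose(fd, formula, atol=1e-6))\n",
    "```\n",
    "\n",
    "Finite differences only confirm the formulas at the chosen points and up to rounding error; they are a check, not a proof."
   ]
  },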
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### [3] $\\dfrac{\\partial \\,  \\lvert \\mathbf{X} \\rvert}{\\partial \\, \\mathbf{X}} = \\lvert \\mathbf{X} \\rvert \\left(\\mathbf{X}^{-1}\\right)^{\\text{T}}$ (matrix cookbook eq.49)\n",
    "\n",
    "Showing this requires a few preliminaries. First, the inverse matrix can be computed as follows.<sup>[5]</sup>\n",
    "\n",
    "$$\n",
    "\\mathbf{X}^{-1} = \\frac{1}{\\lvert \\mathbf{X} \\rvert } \\left[ C_{ij} \\right]^{\\text{T}}\n",
    "$$\n",
    "\n",
    "Here $C_{ij}$ is the cofactor defined below, and $M_{ij}$ denotes the determinant of the submatrix obtained by deleting row $i$ and column $j$ of $\\mathbf{X}$.\n",
    "\n",
    "$$\n",
    "C_{ij} = (-1)^{i+j}M_{ij}\n",
    "$$\n",
    "\n",
    "The transpose of this cofactor matrix, $\\left[ C_{ij} \\right]^{\\text{T}}$, is called the adjugate matrix<sup>[6]</sup> and is written as follows.\n",
    "\n",
    "$$\n",
    "\\text{adj}(\\mathbf{X}) = \\left[ C_{ij} \\right]^{\\text{T}}\n",
    "$$\n",
    "\n",
    "Using it, the inverse can be rewritten as\n",
    "\n",
    "$$\n",
    "\\mathbf{X}^{-1} = \\frac{1}{\\lvert \\mathbf{X} \\rvert } \\text{adj}(\\mathbf{X}) \\tag{1}\n",
    "$$\n",
    "\n",
    "For a canonical differential form, the equivalent derivative form can be written down in several cases as follows (stated in the numerator layout).<sup>[7]</sup>\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "dy = a dx &\\implies \\frac{dy}{dx} = a \\\\[5pt]\n",
    "dy = \\mathbf{a} d\\mathbf{x} &\\implies \\frac{dy}{\\text{d}\\mathbf{x}} = \\mathbf{a} \\\\[5pt]\n",
    "dy = \\text{tr}(\\mathbf{A} \\text{d}\\mathbf{X}) &\\implies \\frac{dy}{\\text{d}\\mathbf{X}} = \\mathbf{A}\n",
    "\\end{align} \\tag{2}\n",
    "$$\n",
    "\n",
    "Here $dy$, $dx$, and $\\text{d}\\mathbf{X}$ are differentials (infinitesimals) that represent infinitesimal changes of the variables, and the ratios of these changes, $\\frac{dy}{dx}$ and $\\frac{dy}{\\text{d}\\mathbf{X}}$, are called derivatives.<sup>[8],[9]</sup>\n",
    "\n",
    "The third identity follows directly from $\\frac{\\partial \\, }{\\partial \\, \\mathbf{B}} \\text{tr}(\\mathbf{AB}) = \\mathbf{A}$ shown above, since $\\text{d}\\left(\\text{tr}(\\mathbf{A}\\mathbf{X})\\right) = \\text{tr}(\\mathbf{A} \\text{d}\\mathbf{X})$ for a constant $\\mathbf{A}$:\n",
    "\n",
    "$$\n",
    "\\frac{ \\text{tr}(\\mathbf{A} \\text{d}\\mathbf{X}) }{\\text{d}\\mathbf{X}} =  \\frac{ \\text{d}\\left(\\text{tr}(\\mathbf{A} \\mathbf{X})\\right) }{\\text{d}\\mathbf{X}}= \\mathbf{A}\n",
    "$$\n",
    "\n",
    "There is also Jacobi's formula<sup>[10]</sup> for the differential of a determinant. Proving it here would be long and tedious, so for now the following result is simply accepted.\n",
    "\n",
    "$$\n",
    "\\text{d} \\lvert \\mathbf{X} \\rvert = \\text{tr} \\left( \\text{adj}(\\mathbf{X}) \\text{d}\\mathbf{X} \\right) \\tag{3}\n",
    "$$\n",
    "\n",
    "A very detailed proof is available on Wikipedia.\n",
    "\n",
    "With these preliminaries, the derivative in question can be shown fairly easily. From Eq. (1),\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\mathbf{X}^{-1} &= \\frac{1}{\\lvert \\mathbf{X} \\rvert} \\text{adj}(\\mathbf{X}) \\\\[5pt]\n",
    "\\lvert \\mathbf{X} \\rvert  \\mathbf{X}^{-1} &= \\text{adj}(\\mathbf{X}) \\\\[5pt]\n",
    "\\lvert \\mathbf{X} \\rvert  \\mathbf{X}^{-1} \\partial \\mathbf{X} &= \\text{adj}(\\mathbf{X})\\partial \\mathbf{X} \\\\[5pt]\n",
    "\\text{tr}\\left(\\lvert \\mathbf{X} \\rvert  \\mathbf{X}^{-1} \\partial \\mathbf{X}\\right) &= \\text{tr}\\left(\\text{adj}(\\mathbf{X})\\partial \\mathbf{X}\\right)\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "By Eq. (3), the right-hand side equals the differential of the determinant, so\n",
    "\n",
    "$$\n",
    "\\partial \\lvert \\mathbf{X} \\rvert = \\text{tr}\\left(\\lvert \\mathbf{X} \\rvert  \\mathbf{X}^{-1} \\partial \\mathbf{X}\\right)\n",
    "$$\n",
    "\n",
    "Applying the third identity of Eq. (2) then gives\n",
    "\n",
    "$$\n",
    "\\partial \\lvert \\mathbf{X} \\rvert = \\text{tr}\\left(\\lvert \\mathbf{X} \\rvert  \\mathbf{X}^{-1}  \\partial \\mathbf{X}\\right) \\implies \\frac{ \\partial \\, \\lvert \\mathbf{X} \\rvert}{ \\partial\\mathbf{X}} = \\lvert \\mathbf{X} \\rvert  \\mathbf{X}^{-1}\n",
    "$$\n",
    "\n",
    "This is the derivative written in the numerator layout of Eq. (2).\n",
    "\n",
    "\n",
    "Written out more explicitly, using $\\frac{ \\text{tr}(\\mathbf{A} \\text{d}\\mathbf{X}) }{\\text{d}\\mathbf{X}} =  \\mathbf{A}$,\n",
    "\n",
    "$$\n",
    "\\frac{ \\partial \\, \\lvert \\mathbf{X} \\rvert}{ \\partial \\, \\mathbf{X}} = \\frac{ \\text{tr}\\left(\\lvert \\mathbf{X} \\rvert  \\mathbf{X}^{-1} \\color{RoyalBlue}{ \\partial \\mathbf{X}}\\right) }{\\color{RoyalBlue}{ \\partial\\mathbf{X}}} = \\lvert \\mathbf{X} \\rvert  \\mathbf{X}^{-1}\n",
    "$$\n",
    "\n",
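    "On the other hand, since $\\text{tr}(\\mathbf{AB}) = \\text{tr}(\\mathbf{BA})$, applying formula [2], $\\frac{\\partial \\, }{\\partial \\, \\mathbf{A}} \\text{tr}(\\mathbf{AB}) = \\mathbf{B}^{\\text{T}}$, whose result has the same shape as the matrix being differentiated, gives the transposed form stated in the heading:\n",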
    "\n",
    "$$\n",
    "\\frac{ \\partial \\, \\lvert \\mathbf{X} \\rvert}{ \\partial\\mathbf{X}} = \\frac{ \\text{tr}\\left(\\lvert \\mathbf{X} \\rvert  \\mathbf{X}^{-1} \\color{RoyalBlue}{ \\partial \\mathbf{X}}\\right) }{\\color{RoyalBlue}{ \\partial \\mathbf{X}}} = \\frac{ \\text{tr}\\left( \\color{RoyalBlue}{  \\partial \\mathbf{X}} \\lvert \\mathbf{X} \\rvert  \\mathbf{X}^{-1} \\right) }{\\color{RoyalBlue}{  \\partial \\mathbf{X}}} = \\left( \\lvert \\mathbf{X} \\rvert  \\mathbf{X}^{-1} \\right)^{\\text{T}} =  \\lvert \\mathbf{X} \\rvert \\left(  \\mathbf{X}^{-1} \\right)^{\\text{T}}\n",
    "$$\n"
   ]
  },
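  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The determinant formula and Jacobi's formula can also be checked numerically. The following is a minimal sketch, assuming NumPy; the helper `numerical_grad`, the matrix `X`, and the perturbation `dX` are hypothetical values used only for this check. The gradient is computed entry by entry, so it has the same shape as $\\mathbf{X}$ and should match $\\lvert \\mathbf{X} \\rvert \\left(\\mathbf{X}^{-1}\\right)^{\\text{T}}$, while its transpose should match $\\text{adj}(\\mathbf{X}) = \\lvert \\mathbf{X} \\rvert \\mathbf{X}^{-1}$.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def numerical_grad(f, X, h=1e-6):\n",
    "    # Central differences; entry (i, j) approximates df / dX_ij.\n",
    "    G = np.zeros_like(X)\n",
    "    for idx in np.ndindex(*X.shape):\n",
    "        E = np.zeros_like(X)\n",
    "        E[idx] = h\n",
    "        G[idx] = (f(X + E) - f(X - E)) / (2 * h)\n",
    "    return G\n",
    "\n",
    "X = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])\n",
    "det_X = np.linalg.det(X)\n",
    "X_inv = np.linalg.inv(X)\n",
    "\n",
    "# Adjugate from Eq. (1): adj(X) = |X| X^{-1}.\n",
    "adj_X = det_X * X_inv\n",
    "\n",
    "# Element-wise gradient of the determinant (same shape as X).\n",
    "grad_det = numerical_grad(np.linalg.det, X)\n",
    "print(np.allclose(grad_det, det_X * X_inv.T, atol=1e-5))  # formula [3]\n",
    "print(np.allclose(grad_det.T, adj_X, atol=1e-5))          # numerator-layout form\n",
    "\n",
    "# Jacobi's formula, Eq. (3): d|X| is approximately tr(adj(X) dX) for a small dX.\n",
    "dX = 1e-6 * np.array([[1.0, -2.0, 0.5], [0.0, 1.0, 3.0], [2.0, 0.0, -1.0]])\n",
    "d_det = np.linalg.det(X + dX) - det_X\n",
    "print(np.isclose(d_det, np.trace(adj_X @ dX), rtol=1e-3, atol=0.0))\n",
    "```\n",
    "\n",
    "The two `allclose` lines differ only by a transpose, which is exactly the difference between the two forms of the result derived above."
   ]
  },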
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### [4] $\\dfrac{\\partial \\, }{\\partial \\, \\mathbf{X}} \\ln \\lvert \\mathbf{X} \\rvert = \\left( \\mathbf{X}^{-1} \\right)^{\\text{T}}$  (matrix cookbook eq.57)\n",
    "\n",
    "Using the result above together with the chain rule gives, in the numerator layout,\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, }{\\partial \\, \\mathbf{X}} \\ln \\lvert \\mathbf{X} \\rvert  = \\frac{1}{ \\lvert \\mathbf{X} \\rvert } \\frac{\\partial \\, \\lvert \\mathbf{X} \\rvert}{\\partial \\, \\mathbf{X}} = \\frac{1}{ \\lvert \\mathbf{X} \\rvert }  \\lvert \\mathbf{X} \\rvert \\mathbf{X}^{-1}=\\mathbf{X}^{-1}\n",
    "$$\n",
    "\n",
    "or, in the layout that has the same shape as $\\mathbf{X}$,\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, }{\\partial \\, \\mathbf{X}} \\ln \\lvert \\mathbf{X} \\rvert  = \\frac{1}{ \\lvert \\mathbf{X} \\rvert } \\frac{\\partial \\, \\lvert \\mathbf{X} \\rvert}{\\partial \\, \\mathbf{X}} = \\frac{1}{ \\lvert \\mathbf{X} \\rvert }  \\lvert \\mathbf{X} \\rvert \\left(\\mathbf{X}^{-1}\\right)^{\\text{T}}= \\left(\\mathbf{X}^{-1}\\right)^{\\text{T}}\n",
    "$$\n",
    "\n",
    "#### [5] $\\dfrac{\\partial \\, \\mathbf{a}^{\\text{T}} \\mathbf{X} \\mathbf{b}}{\\partial \\, \\mathbf{X}}=\\mathbf{ab}^{\\text{T}}$ (matrix cookbook eq.70)\n",
    "\n",
    "Assume arbitrary vectors and a matrix with $\\mathbf{a}^{\\text{T}} : 1 \\times m$, $\\mathbf{X} : m \\times n$, $\\mathbf{b} : n \\times 1$.\n",
    "\n",
    "Since $\\mathbf{a}^{\\text{T}} \\mathbf{X} \\mathbf{b}$ is a scalar, wrapping it in a trace, $\\text{tr}(\\mathbf{a}^{\\text{T}} \\mathbf{X} \\mathbf{b})$, does not change its value. Hence the formula can be shown simply by using $\\frac{\\partial \\, }{\\partial \\, \\mathbf{A}} \\text{tr}(\\mathbf{AB}) = \\mathbf{B}^{\\text{T}}$, as follows.\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, \\mathbf{a}^{\\text{T}} \\mathbf{X} \\mathbf{b}}{\\partial \\, \\mathbf{X}} =  \\frac{\\partial \\, \\text{tr}\\left(\\mathbf{a}^{\\text{T}} \\mathbf{X} \\mathbf{b}\\right)}{\\partial \\, \\mathbf{X}} =  \\frac{\\partial \\, \\text{tr}\\left(\\mathbf{X} \\mathbf{b} \\mathbf{a}^{\\text{T}} \\right)}{\\partial \\, \\mathbf{X}} = \\left( \\mathbf{b} \\mathbf{a}^{\\text{T}} \\right)^{\\text{T}} = \\mathbf{a}\\mathbf{b}^{\\text{T}}\n",
    "$$\n",
    "\n",
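    "Alternatively, although it is somewhat more tedious, the same result can be shown by applying the Kronecker product and the product rule directly.\n",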
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\begin{align}\n",
    "\\frac{\\partial \\, \\mathbf{a}^{\\text{T}} \\mathbf{X} \\mathbf{b}}{\\partial \\, \\mathbf{X}} \n",
    "&= \\frac{\\partial \\, \\left(\\mathbf{a}^{\\text{T}} \\mathbf{X} \\right) \\mathbf{b}}{\\partial \\, \\mathbf{X}} \\\\[5pt]\n",
    "&= \\underbrace{\\frac{\\partial \\, \\mathbf{a}^{\\text{T}} \\mathbf{X}}{\\partial \\, \\mathbf{X}}}_{m \\times n^2} \\underbrace{\\left( \\mathbf{I}_n \\otimes \\mathbf{b} \\right)}_{n^2 \\times n} + \\underbrace{\\left(\\mathbf{I}_m \\otimes \\mathbf{a}^{\\text{T}} \\mathbf{X}\\right)}_{m \\times mn} \\underbrace{\\frac{\\partial \\, \\mathbf{b}}{\\partial \\, \\mathbf{X}}}_{mn \\times n} \\\\[5pt]\n",
    "&= \\left\\{ \\underbrace{\\frac{\\partial \\, \\mathbf{a}^{\\text{T}}}{\\partial \\, \\mathbf{X}}}_{m \\times mn} \\underbrace{ \\left(\\mathbf{I}_n \\otimes \\mathbf{X} \\right)}_{mn \\times n^2} + \\underbrace{\\left( \\mathbf{I}_m \\otimes \\mathbf{a}^{\\text{T}}\\right)}_{m \\times m^2} \\underbrace{ \\frac{\\partial \\, \\mathbf{X}}{\\partial \\, \\mathbf{X}}}_{m^2 \\times n^2} \\right\\} \\left( \\mathbf{I}_n \\otimes \\mathbf{b} \\right) \\\\[5pt]\n",
    "&= \\left( \\mathbf{I}_m \\otimes \\mathbf{a}^{\\text{T}}\\right) \\frac{\\partial \\, \\mathbf{X}}{\\partial \\, \\mathbf{X}} \\left( \\mathbf{I}_n \\otimes \\mathbf{b} \\right)\n",
    "\\end{align}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
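    "In the last step, the terms containing $\\frac{\\partial \\, \\mathbf{a}^{\\text{T}}}{\\partial \\, \\mathbf{X}}$ and $\\frac{\\partial \\, \\mathbf{b}}{\\partial \\, \\mathbf{X}}$ vanish because $\\mathbf{a}$ and $\\mathbf{b}$ do not depend on $\\mathbf{X}$.\n",
    "\n",
    "$\\left( \\mathbf{I}_m \\otimes \\mathbf{a}^{\\text{T}}\\right)$ is an $m \\times m^2$ matrix made of $m$ submatrices of size $m \\times m$ placed side by side. The $k$-th submatrix has $\\mathbf{a}^{\\text{T}}$ as its $k$-th row and zeros elsewhere.\n",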
    "\n",
    "$$\n",
    "\\left( \\mathbf{I}_m \\otimes \\mathbf{a}^{\\text{T}}\\right) =\n",
    "\\begin{bmatrix}\n",
    "\\color{RoyalBlue}{\\begin{matrix}a_1 & a_2 & \\cdots & a_m\\end{matrix}} & | & \\mathbf{0}^{\\text{T}} & | & \\cdots & | & \\mathbf{0}^{\\text{T}} \\\\\n",
    "\\mathbf{0}^{\\text{T}} & | & \\color{RoyalBlue}{\\begin{matrix}a_1 & a_2 & \\cdots & a_m\\end{matrix}} & | & \\cdots &  | & \\mathbf{0}^{\\text{T}} \\\\\n",
    "\\vdots & | & \\vdots & | & \\ddots & | & \\vdots \\\\\n",
    "\\mathbf{0}^{\\text{T}} & | & \\mathbf{0}^{\\text{T}} & | & \\cdots & | & \\color{RoyalBlue}{\\begin{matrix}a_1 & a_2 & \\cdots & a_m\\end{matrix}} \n",
    "\\end{bmatrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\\frac{\\partial \\, \\mathbf{X}}{\\partial \\, \\mathbf{X}}$ is an $m^2 \\times n^2$ matrix made of $m \\times n$ submatrices arranged in an $m \\times n$ grid. Each submatrix has a 1 at the position that the submatrix itself occupies in the grid and 0 everywhere else.\n",
    "\n",
    "That is, as in the expression below, the first submatrix has a 1 only at (1,1) and zeros elsewhere, the submatrix to its right has a 1 only at (1,2) and zeros elsewhere, and so on.\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, \\mathbf{X}}{\\partial \\, \\mathbf{X}} =\n",
    "\\left[\n",
    "\\begin{array}{c|c|c|c}\n",
    "\\begin{matrix}\n",
    "1 & 0 & \\cdots & 0 \\\\ 0 & 0 & \\cdots & 0 \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\  0 & 0 & \\cdots & 0\n",
    "\\end{matrix} & \n",
    "\\begin{matrix}\n",
    "0 & 1 & \\cdots & 0 \\\\ 0 & 0 & \\cdots & 0 \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\  0 & 0 & \\cdots & 0\n",
    "\\end{matrix} &\n",
    "\\begin{matrix}\\cdots \\\\ \\cdots \\\\ \\cdots \\\\ \\cdots \\end{matrix} &\n",
    "\\begin{matrix}\n",
    "0 & 0 & \\cdots & 1 \\\\ 0 & 0 & \\cdots & 0 \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\  0 & 0 & \\cdots & 0\n",
    "\\end{matrix} \\\\\n",
    "\\begin{matrix}-&-&-&-\\end{matrix} & \\begin{matrix}-&-&-&-\\end{matrix} & \\begin{matrix}-&-&-&-\\end{matrix} & \\begin{matrix}-&-&-&-\\end{matrix} \\\\\n",
    "\\begin{matrix}\n",
    "0 & 0 & \\cdots & 0 \\\\ 1 & 0 & \\cdots & 0 \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\  0 & 0 & \\cdots & 0\n",
    "\\end{matrix} & \n",
    "\\begin{matrix}\n",
    "0 & 0 & \\cdots & 0 \\\\ 0 & 1 & \\cdots & 0 \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\  0 & 0 & \\cdots & 0\n",
    "\\end{matrix} &\n",
    "\\begin{matrix}\\cdots \\\\ \\cdots \\\\ \\cdots \\\\ \\cdots \\end{matrix} &\n",
    "\\begin{matrix}\n",
    "0 & 0 & \\cdots & 0 \\\\ 0 & 0 & \\cdots & 1 \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\  0 & 0 & \\cdots & 0\n",
    "\\end{matrix} \\\\\n",
    "\\begin{matrix}-&-&-&-\\end{matrix} & \\begin{matrix}-&-&-&-\\end{matrix} & \\begin{matrix}-&-&-&-\\end{matrix} & \\begin{matrix}-&-&-&-\\end{matrix} \\\\\n",
    "\\begin{matrix}\\vdots&\\vdots&\\vdots&\\vdots\\end{matrix} & \\begin{matrix}\\vdots&\\vdots&\\vdots&\\vdots\\end{matrix} & \\begin{matrix}\\vdots&\\vdots&\\vdots&\\vdots\\end{matrix} & \\begin{matrix}\\vdots&\\vdots&\\vdots&\\vdots\\end{matrix} \\\\\n",
    "\\begin{matrix}-&-&-&-\\end{matrix} & \\begin{matrix}-&-&-&-\\end{matrix} & \\begin{matrix}-&-&-&-\\end{matrix} & \\begin{matrix}-&-&-&-\\end{matrix} \\\\\n",
    "\\begin{matrix}\n",
    "0 & 0 & \\cdots & 0 \\\\ 0 & 0 & \\cdots & 0 \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\  1 & 0 & \\cdots & 0\n",
    "\\end{matrix} & \n",
    "\\begin{matrix}\n",
    "0 & 0 & \\cdots & 0 \\\\ 0 & 0 & \\cdots & 0 \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\  0 & 1 & \\cdots & 0\n",
    "\\end{matrix} &\n",
    "\\begin{matrix}\\cdots \\\\ \\cdots \\\\ \\cdots \\\\ \\cdots \\end{matrix} &\n",
    "\\begin{matrix}\n",
    "0 & 0 & \\cdots & 0 \\\\ 0 & 0 & \\cdots & 0 \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\  0 & 0 & \\cdots & 1\n",
    "\\end{matrix}\n",
    "\\end{array}\n",
    "\\right]\n",
    "$$\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
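    "Multiplying the two matrices above first gives an $m \\times n^2$ matrix made of $n$ submatrices of size $m \\times n$ placed side by side, where the $k$-th submatrix has $\\mathbf{a}$ as its $k$-th column and zeros elsewhere.\n",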
    "\n",
    "$$\n",
    "\\left( \\mathbf{I}_m \\otimes \\mathbf{a}^{\\text{T}}\\right) \\frac{\\partial \\, \\mathbf{X}}{\\partial \\, \\mathbf{X}} = \n",
    "\\left[\n",
    "\\begin{array}{c|c|c|c}\n",
    "\\begin{matrix}\n",
    "a_1 & 0 & \\cdots & 0 \\\\ a_2 & 0 & \\cdots & 0 \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\  a_m & 0 & \\cdots & 0\n",
    "\\end{matrix} &\n",
    "\\begin{matrix}\n",
    "0 & a_1 & \\cdots & 0 \\\\ 0 & a_2 & \\cdots & 0 \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\  0 & a_m & \\cdots & 0\n",
    "\\end{matrix} & \n",
    "\\begin{matrix}\\cdots \\\\ \\cdots \\\\ \\cdots \\\\ \\cdots \\end{matrix} &\n",
    "\\begin{matrix}\n",
    "0 & 0 & \\cdots & a_1 \\\\ 0 & 0 & \\cdots & a_2 \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\  0 & 0 & \\cdots & a_m\n",
    "\\end{matrix}\n",
    "\\end{array}\n",
    "\\right]\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Meanwhile, $\\left( \\mathbf{I}_n \\otimes \\mathbf{b} \\right)$ is an $n^2 \\times n$ matrix made of $n$ submatrices of size $n \\times n$ stacked vertically, where the $k$-th submatrix has $\\mathbf{b}$ as its $k$-th column and zeros elsewhere.\n",
    "\n",
    "$$\n",
    "\\left( \\mathbf{I}_n \\otimes \\mathbf{b} \\right) = \n",
    "\\begin{bmatrix}\n",
    "\\begin{matrix}\n",
    "b_1 & 0 & \\cdots & 0 \\\\ b_2 & 0 & \\cdots & 0 \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\ b_n & 0 & \\cdots & 0\n",
    "\\end{matrix} \\\\\n",
    "\\begin{matrix} - & - & - & - \\end{matrix} \\\\\n",
    "\\begin{matrix}\n",
    "0 & b_1 & \\cdots & 0 \\\\ 0 & b_2 & \\cdots & 0 \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\ 0 & b_n & \\cdots & 0\n",
    "\\end{matrix} \\\\\n",
    "\\begin{matrix} - & - & - & - \\end{matrix} \\\\\n",
    "\\begin{matrix} \\vdots & \\vdots & \\vdots & \\vdots \\end{matrix}\\\\\n",
    "\\begin{matrix} - & - & - & - \\end{matrix} \\\\\n",
    "\\begin{matrix}\n",
    "0 & 0 & \\cdots & b_1 \\\\ 0 & 0 & \\cdots & b_2 \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\ 0 & 0 & \\cdots & b_n\n",
    "\\end{matrix} \n",
    "\\end{bmatrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, multiplying the two matrices gives the desired result.\n",
    "\n",
    "$$\n",
    "\\left( \\mathbf{I}_m \\otimes \\mathbf{a}^{\\text{T}}\\right) \\frac{\\partial \\, \\mathbf{X}}{\\partial \\, \\mathbf{X}} \\left( \\mathbf{I}_n \\otimes \\mathbf{b} \\right) = \n",
    "\\begin{bmatrix}\n",
    "a_1 b_1 & a_1 b_2 & \\cdots & a_1 b_n \\\\\n",
    "a_2 b_1 & a_2 b_2 & \\cdots & a_2 b_n \\\\\n",
    "\\vdots  & \\vdots  & \\ddots & \\vdots \\\\\n",
    "a_m b_1 & a_m b_2 & \\cdots & a_m b_n\n",
    "\\end{bmatrix} = \\mathbf{a} \\mathbf{b}^{\\text{T}}\n",
    "$$"
   ]
  },
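  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Kronecker-product derivation above can be replayed with small matrices. This is a minimal sketch, assuming NumPy; the sizes `m`, `n` and the random vectors are arbitrary choices for illustration. It builds $\\frac{\\partial \\, \\mathbf{X}}{\\partial \\, \\mathbf{X}}$ block by block, exactly as described in the text, and compares the triple product with $\\mathbf{a}\\mathbf{b}^{\\text{T}}$.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "m, n = 3, 2\n",
    "rng = np.random.default_rng(1)\n",
    "a = rng.standard_normal((m, 1))\n",
    "b = rng.standard_normal((n, 1))\n",
    "\n",
    "# dX/dX as an m^2 x n^2 matrix: block (k, l) is the m x n matrix E_kl\n",
    "# with a single 1 at position (k, l).\n",
    "dX_dX = np.zeros((m * m, n * n))\n",
    "for k in range(m):\n",
    "    for l in range(n):\n",
    "        E_kl = np.zeros((m, n))\n",
    "        E_kl[k, l] = 1.0\n",
    "        dX_dX += np.kron(E_kl, E_kl)\n",
    "\n",
    "left = np.kron(np.eye(m), a.T)    # Kronecker product of I_m and a^T: m x m^2\n",
    "right = np.kron(np.eye(n), b)     # Kronecker product of I_n and b:   n^2 x n\n",
    "\n",
    "result = left @ dX_dX @ right     # m x n\n",
    "print(np.allclose(result, a @ b.T))\n",
    "```\n",
    "\n",
    "The intermediate shapes (`left` is $m \\times m^2$, `dX_dX` is $m^2 \\times n^2$, `right` is $n^2 \\times n$) match the dimensions annotated in the derivation."
   ]
  },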
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### [6] $\\dfrac{\\partial \\, \\mathbf{a}^{\\text{T}} \\mathbf{X}^{-1} \\mathbf{b}}{\\partial \\, \\mathbf{X}}= -\\mathbf{X}^{-\\text{T}} \\mathbf{ab}^{\\text{T}} \\mathbf{X}^{-\\text{T}}$ (matrix cookbook eq.61)\n",
    "\n",
    "This derivative can be shown using the matrix cookbook eq.70 formula above together with the following two properties of the Kronecker product.<sup>[11]</sup>\n",
    "\n",
    "$$\n",
    "\\left( \\mathbf{A} \\otimes \\mathbf{B} \\right)^{-1} = \\mathbf{A}^{-1} \\otimes \\mathbf{B}^{-1}\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\left( \\mathbf{A} \\otimes \\mathbf{B} \\right)\\left( \\mathbf{C} \\otimes \\mathbf{D} \\right) = \\mathbf{A}\\mathbf{C} \\otimes  \\mathbf{B}\\mathbf{D}\n",
    "$$\n",
    "\n",
    "These two properties are what the derivation below relies on.\n",
    "\n",
    "Assume an invertible matrix $\\mathbf{X} : m \\times m$ and arbitrary vectors $\\mathbf{a}^{\\text{T}} : 1 \\times m$, $\\mathbf{b} : m \\times 1$.\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\frac{\\partial \\, \\mathbf{a}^{\\text{T}} \\mathbf{X}^{-1} \\mathbf{b}}{\\partial \\, \\mathbf{X}} \n",
    "&= \\frac{\\partial \\, \\mathbf{a}^{\\text{T}} \\mathbf{X}^{-1}}{\\partial \\, \\mathbf{X}} \\left( \\mathbf{I}_m \\otimes \\mathbf{b} \\right) + \\left( \\mathbf{I}_m \\otimes \\mathbf{a}^{\\text{T}} \\mathbf{X}^{-1} \\right) \\frac{\\partial \\, \\mathbf{b}}{\\partial \\, \\mathbf{X}} \\\\[5pt]\n",
    "&= \\left( \\frac{\\partial \\, \\mathbf{a}^{\\text{T}}}{\\partial \\, \\mathbf{X}} \\left( \\mathbf{I}_m \\otimes \\mathbf{X}^{-1} \\right) + \\left( \\mathbf{I}_m \\otimes \\mathbf{a}^{\\text{T}} \\right) \\frac{\\partial \\, \\mathbf{X}^{-1}}{\\partial \\, \\mathbf{X}} \\right) \\left( \\mathbf{I}_m \\otimes \\mathbf{b} \\right) \\\\[5pt]\n",
    "&= \\left( \\mathbf{I}_m \\otimes \\mathbf{a}^{\\text{T}} \\right)  \\frac{\\partial \\, \\mathbf{X}^{-1}}{\\partial \\, \\mathbf{X}} \\left( \\mathbf{I}_m \\otimes \\mathbf{b} \\right)\n",
    "\\end{align} \\tag{1}\n",
    "$$\n",
    "\n",
    "한편 $\\mathbf{X}^{-1} \\mathbf{X} = \\mathbf{I}$에서\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\,\\mathbf{X}^{-1} \\mathbf{X}}{\\partial \\, \\mathbf{X}} = \\frac{\\partial \\, \\mathbf{I}}{\\partial \\, \\mathbf{X}} \\\\[5pt]\n",
    "\\frac{\\partial \\,\\mathbf{X}^{-1} }{\\partial \\, \\mathbf{X}}\\left( \\mathbf{I}_m \\otimes \\mathbf{X} \\right) + \\left( \\mathbf{I}_m \\otimes \\mathbf{X}^{-1} \\right) \\frac{\\partial \\, \\mathbf{X}}{\\partial \\, \\mathbf{X}} = \\mathbf{0} \\\\[5pt]\n",
    "\\frac{\\partial \\,\\mathbf{X}^{-1} }{\\partial \\, \\mathbf{X}} \\left( \\mathbf{I}_m \\otimes \\mathbf{X} \\right) \\left( \\mathbf{I}_m \\otimes \\mathbf{X} \\right)^{-1} + \\left( \\mathbf{I}_m \\otimes \\mathbf{X}^{-1} \\right) \\frac{\\partial \\, \\mathbf{X}}{\\partial \\, \\mathbf{X}} \\left( \\mathbf{I}_m \\otimes \\mathbf{X} \\right)^{-1}= \\mathbf{0} \\\\[5pt]\n",
    "\\frac{\\partial \\,\\mathbf{X}^{-1} }{\\partial \\, \\mathbf{X}}  =  - \\left( \\mathbf{I}_m \\otimes \\mathbf{X}^{-1} \\right) \\frac{\\partial \\, \\mathbf{X}}{\\partial \\, \\mathbf{X}} \\left( \\mathbf{I}_m \\otimes \\mathbf{X} \\right)^{-1}\n",
    "$$\n",
    "\n",
    "이제 $ \\left( \\mathbf{A} \\otimes \\mathbf{B} \\right)^{-1} = \\mathbf{A}^{-1} \\otimes \\mathbf{B}^{-1}$를 이용하면 다음처럼 된다.\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\,\\mathbf{X}^{-1} }{\\partial \\, \\mathbf{X}}  =  - \\left( \\mathbf{I}_m \\otimes \\mathbf{X}^{-1} \\right) \\frac{\\partial \\, \\mathbf{X}}{\\partial \\, \\mathbf{X}} \\left( \\mathbf{I}_m \\otimes \\mathbf{X}^{-1} \\right) \\tag{2}\n",
    "$$\n",
    "\n",
    "(2)를 (1)에 대입하면\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\frac{\\partial \\, \\mathbf{a}^{\\text{T}} \\mathbf{X}^{-1} \\mathbf{b}}{\\partial \\, \\mathbf{X}} &=  - \\left( \\mathbf{I}_m \\otimes \\mathbf{a}^{\\text{T}} \\right) \\left( \\mathbf{I}_m \\otimes \\mathbf{X}^{-1} \\right) \\frac{\\partial \\, \\mathbf{X}}{\\partial \\, \\mathbf{X}} \\left( \\mathbf{I}_m \\otimes \\mathbf{X}^{-1} \\right) \\left( \\mathbf{I}_m \\otimes \\mathbf{b} \\right) \\\\[5pt]\n",
    "&= - \\left( \\mathbf{I}_m \\otimes \\mathbf{a}^{\\text{T}} \\mathbf{X}^{-1} \\right) \\frac{\\partial \\, \\mathbf{X}}{\\partial \\, \\mathbf{X}} \\left( \\mathbf{I}_m \\otimes \\mathbf{X}^{-1} \\mathbf{b} \\right) \\quad \\because \\left( \\mathbf{A} \\otimes \\mathbf{B} \\right)\\left( \\mathbf{C} \\otimes \\mathbf{D} \\right) = \\mathbf{A}\\mathbf{C} \\otimes  \\mathbf{B}\\mathbf{D}\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "위 식과 앞선 미분공식 \n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\frac{\\partial \\, \\mathbf{a}^{\\text{T}} \\mathbf{X} \\mathbf{b}}{\\partial \\, \\mathbf{X}} \n",
    "= \\left( \\mathbf{I}_m \\otimes \\mathbf{a}^{\\text{T}}\\right) \\frac{\\partial \\, \\mathbf{X}}{\\partial \\, \\mathbf{X}} \\left( \\mathbf{I}_n \\otimes \\mathbf{b} \\right) = \\mathbf{a} \\mathbf{b}^{\\text{T}}\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "을 보일 때의 과정을 비교하면 최종적으로 다음을 보일 수 있다.\n",
    "\n",
    "\n",
    "$$\n",
    "\\frac{\\partial \\, \\mathbf{a}^{\\text{T}} \\mathbf{X}^{-1} \\mathbf{b}}{\\partial \\, \\mathbf{X}} = - \\mathbf{X}^{-\\text{T}} \\mathbf{a} \\mathbf{b}^{\\text{T}} \\mathbf{X}^{-\\text{T}} \n",
    "$$"
   ]
  },
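  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One simple way to sanity-check the formula above is the scalar case. As a sketch, assume $m = 1$, so that $\\mathbf{X}$, $\\mathbf{a}$ and $\\mathbf{b}$ reduce to scalars $x \\neq 0$, $a$ and $b$ (these scalar names are introduced here only for illustration). The ordinary derivative then agrees with the matrix formula:\n",
    "\n",
    "$$\n",
    "\\frac{d}{dx} \\left( a \\, x^{-1} \\, b \\right) = - a \\, b \\, x^{-2} = - x^{-1} \\, a \\, b \\, x^{-1}\n",
    "$$\n",
    "\n",
    "The sign and the two factors of $x^{-1}$ on either side of $ab$ match $- \\mathbf{X}^{-\\text{T}} \\mathbf{a} \\mathbf{b}^{\\text{T}} \\mathbf{X}^{-\\text{T}}$, since the transpose has no effect on a scalar."
   ]
  },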
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 참고문헌\n",
    "\n",
    "1. COURSE NOTES: STATISTICS 550 ADVANCED MATHEMATICAL STATISTICS SPRING 2008, Robert J. Boik, Department of Mathematical Sciences\n",
    "Montana State University, 2012\n",
    "\n",
    "2. Old and New Matrix Algebra Useful for Statistics, Thomas P., Minka (December 28, 2000), MIT Media Lab note (1997; revised 12/00). Retrieved 5 February 2016.\n",
    "\n",
    "3. Linear Algebra & Matrix Calculus:https://www.slideshare.net/ssuser7e10e4/matrix-calculus, 임성빈\n",
    "\n",
    "3. The Matrix Cookbook, Kaare Brandt Petersen & Michael Syskind Pedersen, 2012\n",
    "\n",
    "5. Advanced Engineering Mathematics 7.7 & 7.8, Erwin Kreyszig, Wiley\n",
    "\n",
    "6. Adjugate_matrix:https://en.wikipedia.org/wiki/Adjugate_matrix\n",
    "\n",
    "7. Matrix_calculus:https://en.wikipedia.org/wiki/Matrix_calculus\n",
    "\n",
    "8. Differential_(infinitesimal):https://en.wikipedia.org/wiki/Differential_(infinitesimal) 주의:(infinitesimal)까지 모두 주소 '(' 앞에 _ 있음\n",
    "\n",
    "9. Derivative:https://en.wikipedia.org/wiki/Derivative\n",
    "\n",
    "10. Jacobi's_formula:https://en.wikipedia.org/wiki/Jacobi%27s_formula\n",
    "\n",
    "11. Kronecker product:https://en.wikipedia.org/wiki/Kronecker_product"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<link href='https://fonts.googleapis.com/earlyaccess/nanummyeongjo.css' rel='stylesheet' type='text/css'>\n",
       "<link href='https://fonts.googleapis.com/earlyaccess/nanumgothiccoding.css' rel='stylesheet' type='text/css'>\n",
       "<link href='https://fonts.googleapis.com/earlyaccess/notosanskr.css' rel='stylesheet' type='text/css'>\n",
       "<!--https://github.com/kattergil/NotoSerifKR-Web/stargazers-->\n",
       "<link href='https://cdn.rawgit.com/kattergil/NotoSerifKR-Web/5e08423b/stylesheet/NotoSerif-Web.css' rel='stylesheet' type='text/css'>\n",
       "<style>\n",
       "    h1     { font-family: 'Noto Sans KR' !important; color:#348ABD !important;   }\n",
       "    h2     { font-family: 'Noto Sans KR' !important; color:#467821 !important;   }\n",
       "    h3, h4 { font-family: 'Noto Sans KR' !important; color:#A60628 !important;   }\n",
       "    p:not(.navbar-text) { font-family: 'Noto Serif KR', 'Nanum Myeongjo'; font-size: 12pt; line-height: 200%;  text-indent: 10px; }\n",
       "    li:not(.dropdown){ font-family: 'Noto Serif KR', 'Nanum Myeongjo'; font-size: 12pt; line-height: 200%; }\n",
       "    table  { font-family: 'Noto Sans KR' !important;  font-size: 11pt !important; }           \n",
       "    li > p  { text-indent: 0px; }\n",
       "    li > ul { margin-top: 0px !important; }       \n",
       "    sup { font-family: 'Noto Sans KR'; font-size: 9pt; } \n",
       "    code, pre  { font-family: 'Nanum Gothic Coding', monospace !important; font-size: 12pt !important; line-height: 130% !important;}\n",
       "    .code-body { font-family: 'Nanum Gothic Coding', monospace !important; font-size: 12pt !important;}\n",
       "    .ns        { font-family: 'Noto Sans KR'; font-size: 15pt;}\n",
       "    .summary   {\n",
       "                   font-family: 'Georgia'; font-size: 12pt; line-height: 200%; \n",
       "                   border-left:3px solid #FF0000; \n",
       "                   padding-left:20px; \n",
       "                   margin-top:10px; \n",
       "               }\n",
       "    .green { color:#467821 !important; }\n",
       "    .comment { font-family: 'Noto Sans KR'; font-size: 10pt; }\n",
       "</style>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%%html\n",
    "<link href='https://fonts.googleapis.com/earlyaccess/nanummyeongjo.css' rel='stylesheet' type='text/css'>\n",
    "<link href='https://fonts.googleapis.com/earlyaccess/nanumgothiccoding.css' rel='stylesheet' type='text/css'>\n",
    "<link href='https://fonts.googleapis.com/earlyaccess/notosanskr.css' rel='stylesheet' type='text/css'>\n",
    "<!--https://github.com/kattergil/NotoSerifKR-Web/stargazers-->\n",
    "<link href='https://cdn.rawgit.com/kattergil/NotoSerifKR-Web/5e08423b/stylesheet/NotoSerif-Web.css' rel='stylesheet' type='text/css'>\n",
    "<style>\n",
    "    h1     { font-family: 'Noto Sans KR' !important; color:#348ABD !important;   }\n",
    "    h2     { font-family: 'Noto Sans KR' !important; color:#467821 !important;   }\n",
    "    h3, h4 { font-family: 'Noto Sans KR' !important; color:#A60628 !important;   }\n",
    "    p:not(.navbar-text) { font-family: 'Noto Serif KR', 'Nanum Myeongjo'; font-size: 12pt; line-height: 200%;  text-indent: 10px; }\n",
    "    li:not(.dropdown){ font-family: 'Noto Serif KR', 'Nanum Myeongjo'; font-size: 12pt; line-height: 200%; }\n",
    "    table  { font-family: 'Noto Sans KR' !important;  font-size: 11pt !important; }           \n",
    "    li > p  { text-indent: 0px; }\n",
    "    li > ul { margin-top: 0px !important; }       \n",
    "    sup { font-family: 'Noto Sans KR'; font-size: 9pt; } \n",
    "    code, pre  { font-family: 'Nanum Gothic Coding', monospace !important; font-size: 12pt !important; line-height: 130% !important;}\n",
    "    .code-body { font-family: 'Nanum Gothic Coding', monospace !important; font-size: 12pt !important;}\n",
    "    .ns        { font-family: 'Noto Sans KR'; font-size: 15pt;}\n",
    "    .summary   {\n",
    "                   font-family: 'Georgia'; font-size: 12pt; line-height: 200%; \n",
    "                   border-left:3px solid #FF0000; \n",
    "                   padding-left:20px; \n",
    "                   margin-top:10px; \n",
    "               }\n",
    "    .green { color:#467821 !important; }\n",
    "    .comment { font-family: 'Noto Sans KR'; font-size: 10pt; }\n",
    "</style>"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}