<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#-Array-operations-in-PyTorch" data-toc-modified-id="-Array-operations-in-PyTorch-1"><span class="toc-item-num">1&nbsp;&nbsp;</span> Array operations in PyTorch</a></span><ul class="toc-item"><li><span><a href="#Multiplication" data-toc-modified-id="Multiplication-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Multiplication</a></span><ul class="toc-item"><li><span><a href="#Example-1." data-toc-modified-id="Example-1.-1.1.1"><span class="toc-item-num">1.1.1&nbsp;&nbsp;</span>Example 1.</a></span></li><li><span><a href="#Example-2." data-toc-modified-id="Example-2.-1.1.2"><span class="toc-item-num">1.1.2&nbsp;&nbsp;</span>Example 2.</a></span></li><li><span><a href="#Example-3." data-toc-modified-id="Example-3.-1.1.3"><span class="toc-item-num">1.1.3&nbsp;&nbsp;</span>Example 3.</a></span></li></ul></li><li><span><a href="#Transpose" data-toc-modified-id="Transpose-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Transpose</a></span><ul class="toc-item"><li><span><a href="#Example-1." data-toc-modified-id="Example-1.-1.2.1"><span class="toc-item-num">1.2.1&nbsp;&nbsp;</span>Example 1.</a></span></li><li><span><a href="#Example-2." data-toc-modified-id="Example-2.-1.2.2"><span class="toc-item-num">1.2.2&nbsp;&nbsp;</span>Example 2.</a></span></li><li><span><a href="#Example-3." data-toc-modified-id="Example-3.-1.2.3"><span class="toc-item-num">1.2.3&nbsp;&nbsp;</span>Example 3.</a></span></li></ul></li><li><span><a href="#Gradient" data-toc-modified-id="Gradient-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Gradient</a></span><ul class="toc-item"><li><span><a href="#Example-1." data-toc-modified-id="Example-1.-1.3.1"><span class="toc-item-num">1.3.1&nbsp;&nbsp;</span>Example 1.</a></span></li><li><span><a href="#Example-2." data-toc-modified-id="Example-2.-1.3.2"><span class="toc-item-num">1.3.2&nbsp;&nbsp;</span>Example 2.</a></span></li><li><span><a href="#Example-3." 
data-toc-modified-id="Example-3.-1.3.3"><span class="toc-item-num">1.3.3&nbsp;&nbsp;</span>Example 3.</a></span></li></ul></li><li><span><a href="#Eigenvalues,-eigenvectors" data-toc-modified-id="Eigenvalues,-eigenvectors-1.4"><span class="toc-item-num">1.4&nbsp;&nbsp;</span>Eigenvalues, eigenvectors</a></span><ul class="toc-item"><li><span><a href="#Example-1." data-toc-modified-id="Example-1.-1.4.1"><span class="toc-item-num">1.4.1&nbsp;&nbsp;</span>Example 1.</a></span></li><li><span><a href="#Example-2." data-toc-modified-id="Example-2.-1.4.2"><span class="toc-item-num">1.4.2&nbsp;&nbsp;</span>Example 2.</a></span></li><li><span><a href="#Example-3." data-toc-modified-id="Example-3.-1.4.3"><span class="toc-item-num">1.4.3&nbsp;&nbsp;</span>Example 3.</a></span></li></ul></li><li><span><a href="#Least-square-norm" data-toc-modified-id="Least-square-norm-1.5"><span class="toc-item-num">1.5&nbsp;&nbsp;</span>Least square norm</a></span><ul class="toc-item"><li><span><a href="#Example-1." data-toc-modified-id="Example-1.-1.5.1"><span class="toc-item-num">1.5.1&nbsp;&nbsp;</span>Example 1.</a></span></li><li><span><a href="#Example-2." data-toc-modified-id="Example-2.-1.5.2"><span class="toc-item-num">1.5.2&nbsp;&nbsp;</span>Example 2.</a></span></li><li><span><a href="#Example-3." data-toc-modified-id="Example-3.-1.5.3"><span class="toc-item-num">1.5.3&nbsp;&nbsp;</span>Example 3.</a></span></li></ul></li><li><span><a href="#Summary" data-toc-modified-id="Summary-1.6"><span class="toc-item-num">1.6&nbsp;&nbsp;</span>Summary</a></span></li><li><span><a href="#Reference-Links" data-toc-modified-id="Reference-Links-1.7"><span class="toc-item-num">1.7&nbsp;&nbsp;</span>Reference Links</a></span></li></ul></li></ul></div>

<h1> Array operations in PyTorch</h1>
<p class="author">(Szabó Sándor, 27. May 2020)</p>

<p class="abstract">We will consider some common operations for matrices:</p>
<p>
<ul class="square">
    <li>multiplication</li>
    <li>transpose</li>
    <li>gradient</li>
    <li>eigenvalues, eigenvectors</li>
    <li>least square norm</li>
</ul>
</p> 

In [1]:
# Import torch and other required modules
import torch

<h2>Multiplication</h2>

<div class="background">
<p class="normal">
    Since matrix multiplication is <span style="font-style: italic;">not</span> commutative, the order of the factors matters.
</p>
</div>

<h3>Example 1.</h3>

In [28]:
# Example 1
A = torch.randn(2, 3)
B = torch.randn(3, 4)
torch.mm(A, B)

tensor([[ 0.1024, -0.6093,  0.2240, -0.1838],
        [-0.7241, -0.7749,  0.2628,  1.9635]])

<h3>Example 2.</h3>

<div class="background">
<p class="normal">However, if you swap the factors, the inner dimensions no longer match and the product is undefined:</p>
</div>

In [23]:
# Example 2 - breaking
torch.mm(B, A)

RuntimeError: size mismatch, m1: [3 x 4], m2: [2 x 3] at C:\Users\builder\AppData\Local\Temp\pip-req-build-9msmi1s9\aten\src\TH/generic/THTensorMath.cpp:197

<div class="remarks">
    <p class="normal">
    <span class="code">torch.mm</span> does not broadcast. 
    For broadcasting matrix products, see <span class="code">torch.matmul()</span>.
    </p>
</div>
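<div class="background">
<p class="normal">A minimal sketch of the remark above, using freshly drawn tensors (the names <span class="code">P</span>, <span class="code">Q</span> and <span class="code">batch</span> are illustrative and not part of the notebook). The swapped product becomes valid once both factors are transposed, and <span class="code">torch.matmul</span> broadcasts over a leading batch dimension, which <span class="code">torch.mm</span> does not accept.</p>
</div>

```python
import torch

A = torch.randn(2, 3)
B = torch.randn(3, 4)

# Swapping the factors works only when the inner dimensions match,
# for example after transposing both: (4 x 3) @ (3 x 2)
P = torch.mm(torch.t(B), torch.t(A))
print(P.shape)

# torch.matmul treats the leading dimension as a batch and
# multiplies each (2 x 3) slice by the same (3 x 4) matrix
batch = torch.randn(5, 2, 3)  # hypothetical batch of five matrices
Q = torch.matmul(batch, B)
print(Q.shape)
```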

<h3>Example 3.</h3>

<div class="background">
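<p class="normal">The <span class="code">@</span> operator is a shorthand for matrix multiplication. (The values differ from Example 1 because the cell was run with a different random draw of <span class="code">A</span> and <span class="code">B</span>.)</p>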
</div>

In [24]:
# Example 3
A @ B

tensor([[-0.5379, -0.1539, -0.2042, -0.5444],
        [ 0.0280, -0.8160,  0.5760,  2.8664]])

<h2>Transpose</h2>

<div class="background">
<p class="normal">When we transpose a matrix we transform rows to columns, and columns to rows.</p>
</div>

<h3>Example 1.</h3>

In [26]:
# Example 1 
torch.t(A)

tensor([[-0.7157,  1.2350],
        [ 0.0141, -1.2724],
        [ 0.1572,  0.7450]])

<h3>Example 2.</h3>

<div class="background">
<p class="normal">Again, you should be careful, because in general $(AB)^T\neq A^T B^T$. Indeed, the difference below is not zero:</p>
</div>

In [30]:
# Example 2
C = torch.randn(3, 3)
D = torch.randn(3, 3)
torch.t(C @ D) - (torch.t(C) @ torch.t(D))

tensor([[ 1.5273,  2.2196,  0.6171],
        [-0.0384, -1.0361, -1.0175],
        [ 2.1029, -0.2353, -0.4913]])

<div class="background">
<p class="normal">The correct result is $(AB)^T=B^T A^T$.</p>
</div>

<h3>Example 3.</h3>

In [31]:
# Example 3
torch.t(C @ D) - (torch.t(D) @ torch.t(C))

tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])
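<div class="background">
<p class="normal">With floating-point numbers the difference is not always printed as exact zeros, so a tolerance-based comparison is often more robust than inspecting the matrix by eye. The following sketch uses <span class="code">torch.allclose</span> for both orderings; the seed and the variable names <span class="code">reversed_order_ok</span> and <span class="code">same_order_ok</span> are arbitrary choices for illustration.</p>
</div>

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
C = torch.randn(3, 3, dtype=torch.float64)
D = torch.randn(3, 3, dtype=torch.float64)

lhs = torch.t(C @ D)

# Compare (CD)^T with both candidate right-hand sides
reversed_order_ok = torch.allclose(lhs, torch.t(D) @ torch.t(C))
same_order_ok = torch.allclose(lhs, torch.t(C) @ torch.t(D))

print("(CD)^T equals D^T C^T:", reversed_order_ok)
print("(CD)^T equals C^T D^T:", same_order_ok)
```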

<h2>Gradient</h2>

<div class="background">
    <p class="normal">In machine learning we often need to minimize various kinds 
        of error functions.</p>
    <p class="normal">
    The derivative of a multivariable vector-valued function is a matrix, the so-called 
    <a href="https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant">
    Jacobian matrix</a>. For a scalar-valued function, such as an error function, 
    this matrix has a single row, the gradient.
    </p>
</div>
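<div class="background">
<p class="normal">As a small illustration of the difference between a Jacobian matrix and a gradient, the sketch below uses <span class="code">torch.autograd.functional.jacobian</span>, assuming a PyTorch version that provides it. The functions <span class="code">f</span> and <span class="code">g</span> and the matrix entries are made up for this example.</p>
</div>

```python
import torch
from torch.autograd.functional import jacobian

A = torch.tensor([[1., 2.], [3., 4.], [5., 6.]])

def f(x):
    # Vector-valued map from R^2 to R^3
    return A @ x

def g(x):
    # Scalar-valued map from R^2 to R
    return torch.norm(A @ x)

x = torch.tensor([1., -1.])

J = jacobian(f, x)     # 3 x 2 matrix; for a linear map it equals A
grad = jacobian(g, x)  # one entry per input variable

print("Jacobian of f: \n", J)
print("Gradient of g: \n", grad)
```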

<div class="background">
<p class="normal">To have PyTorch compute the derivative with respect to a tensor, create the tensor 
    with the option <span class="code">requires_grad=True</span>.</p>
<p class="normal">In Example 1 we use the Frobenius norm, which for vectors coincides with the 
    Euclidean vector norm.</p>
</div>

<h3>Example 1.</h3>

In [122]:
# Example 1
# Create tensors.
A = torch.randn(3, 3, requires_grad=True)
x = torch.randn(3, 1, requires_grad=True)
b = torch.randn(3, 1, requires_grad=True)
y = torch.norm(A @ x - b, p='fro')
print("A: \n", A)
print("x: \n", x)
print("b: \n", b)
print("y: \n", y)

A: 
 tensor([[-0.3062, -0.5272,  1.2869],
        [-1.8781, -1.9212,  0.1393],
        [-1.0218,  1.2361, -0.1057]], requires_grad=True)
x: 
 tensor([[ 0.5530],
        [-0.6164],
        [ 0.8250]], requires_grad=True)
b: 
 tensor([[-1.0424],
        [-1.3509],
        [ 0.8377]], requires_grad=True)
y: 
 tensor(3.5743, grad_fn=<NormBackward0>)


<div class="background">
    <p class="normal">Now we calculate the derivatives $\dfrac{\partial y}{\partial A}$, 
    $\dfrac{\partial y}{\partial x}$, $\dfrac{\partial y}{\partial b}$.</p>
    <p class="normal">To compute the derivatives, we call the <span class="code">.backward()</span> 
        method on our result $y$.</p>
</div>

In [123]:
# Compute derivatives
y.backward()

# Display gradients
print('∂y/∂A: \n', A.grad)
print('∂y/∂x: \n', x.grad)
print('∂y/∂b: \n', b.grad)

∂y/∂A: 
 tensor([[ 0.3496, -0.3897,  0.5216],
        [ 0.2493, -0.2780,  0.3720],
        [-0.3484,  0.3884, -0.5198]])
∂y/∂x: 
 tensor([[-0.3966],
        [-1.9784],
        [ 0.9430]])
∂y/∂b: 
 tensor([[-0.6322],
        [-0.4509],
        [ 0.6301]])
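<div class="background">
<p class="normal">The gradients can be checked against the closed-form derivatives of the Euclidean norm. Writing $r=Ax-b$ and $u=r/\Vert r\Vert_2$, we have $\dfrac{\partial y}{\partial A}=u\,x^T$, $\dfrac{\partial y}{\partial x}=A^T u$ and $\dfrac{\partial y}{\partial b}=-u$. The sketch below compares these formulas with autograd on newly drawn tensors, so the numbers differ from the output above; the seed is arbitrary.</p>
</div>

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
A = torch.randn(3, 3, dtype=torch.float64, requires_grad=True)
x = torch.randn(3, 1, dtype=torch.float64, requires_grad=True)
b = torch.randn(3, 1, dtype=torch.float64, requires_grad=True)

r = A @ x - b
y = torch.norm(r)  # Frobenius norm, i.e. the Euclidean norm of the vector r
y.backward()

with torch.no_grad():
    u = r / y  # unit vector in the direction of the residual
    ok_A = torch.allclose(A.grad, u @ x.t())
    ok_x = torch.allclose(x.grad, A.t() @ u)
    ok_b = torch.allclose(b.grad, -u)

print("∂y/∂A matches u x^T:", ok_A)
print("∂y/∂x matches A^T u:", ok_x)
print("∂y/∂b matches -u:", ok_b)
```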


<h3>Example 2.</h3>

<div class="background">
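    <p class="normal">Instead of the Frobenius norm we can choose any $p$-norm with $1\leq p<\infty$.
    In the next example we choose $p=1$.</p>
    <p class="normal">Note that <span class="code">.backward()</span> accumulates into 
    <span class="code">.grad</span>. The gradients from Example 1 are not reset here, so the values 
    printed below are the sums of the derivatives of $y$ and of $w$, not the derivatives of $w$ alone. 
    For instance, the printed gradient with respect to $b$ equals the Example 1 gradient 
    plus $(-1,-1,1)^T$.</p>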
</div>

In [124]:
# Example 2 
w = torch.norm(A @ x - b, p=1)
              
# Compute derivatives
w.backward()

# Display gradients
print('∂w/∂A: \n', A.grad)
print('∂w/∂x: \n', x.grad)
print('∂w/∂b: \n', b.grad)

∂w/∂A: 
 tensor([[ 0.9026, -1.0062,  1.3467],
        [ 0.8023, -0.8944,  1.1970],
        [-0.9014,  1.0048, -1.3449]])
∂w/∂x: 
 tensor([[-1.5591],
        [-5.6630],
        [ 2.4750]])
∂w/∂b: 
 tensor([[-1.6322],
        [-1.4509],
        [ 1.6301]])
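<div class="background">
<p class="normal">To obtain the derivatives of the second function alone, the stored gradients have to be reset before the second call of <span class="code">.backward()</span>. The following sketch shows the accumulation and the reset on a single tensor; the names <span class="code">first</span>, <span class="code">accumulated</span> and <span class="code">clean</span> are illustrative.</p>
</div>

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
b = torch.randn(3, 1, requires_grad=True)

# First function: Euclidean norm
y = torch.norm(b, p=2)
y.backward()
first = b.grad.clone()

# Second function without a reset: the new gradient is added to the old one
w = torch.norm(b, p=1)
w.backward()
accumulated = b.grad.clone()

# Second function after a reset: only the gradient of the 1-norm remains
b.grad.zero_()
w = torch.norm(b, p=1)
w.backward()
clean = b.grad.clone()

print("accumulated: \n", accumulated)
print("after reset: \n", clean)
```

<div class="background">
<p class="normal">For the $1$-norm the gradient after the reset is the sign of each entry of <span class="code">b</span>.</p>
</div>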


<h3>Example 3.</h3>

<div class="background">
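    <p class="normal">The infinity norm $p=\infty$ is requested with <span class="code">float('inf')</span>. The bare name <span class="code">inf</span> used below is not defined in Python, so the cell fails with a <span class="code">NameError</span> before PyTorch is even called.</p>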
</div>

In [125]:
# Example 3 - breaking 
z = torch.norm(A @ x - b, p=inf)
              
# Compute derivatives
z.backward()

# Display gradients
print('∂z/∂A: \n', A.grad)
print('∂z/∂x: \n', x.grad)
print('∂z/∂b: \n', b.grad)

NameError: name 'inf' is not defined
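<div class="background">
<p class="normal">A sketch of a working variant, with <span class="code">float('inf')</span> in place of the undefined name. The small tensors are chosen by hand for this illustration so that the result is easy to follow; they are not the random tensors of the cells above.</p>
</div>

```python
import torch

A = torch.tensor([[1., 2.], [3., 4.]], requires_grad=True)
x = torch.tensor([[1.], [1.]], requires_grad=True)
b = torch.tensor([[0.], [1.]], requires_grad=True)

# The residual A @ x - b is (3, 6)^T, so its largest absolute entry is 6
z = torch.norm(A @ x - b, p=float('inf'))

# Compute derivatives
z.backward()

# Display gradients: only the entry that attains the maximum contributes
print('z: \n', z)
print('∂z/∂A: \n', A.grad)
print('∂z/∂x: \n', x.grad)
print('∂z/∂b: \n', b.grad)
```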

<h2>Eigenvalues, eigenvectors</h2>

<div class="background">
    <p class="normal">When working with linear operators (matrices), we often need to 
    determine their eigenvalues and eigenvectors. 
    To do this, we use <span class="code">torch.eig</span>.</p>
</div>

<h3>Example 1.</h3>

In [111]:
# Example 1 
(eigvalues, eigvectors) = torch.eig(A, eigenvectors=True)
for i in range(3):
    print('eigenvalue: ', eigvalues[i])
    # The eigenvectors are the columns of eigvectors, so select column i
    print('eigenvector: ', eigvectors[:, i])
    print('\n')

eigenvalue:  tensor([1.1316, 0.0000], grad_fn=<SelectBackward>)
eigenvector:  tensor([-0.9053, -0.1106, -0.4100], grad_fn=<SelectBackward>)


eigenvalue:  tensor([-0.2300,  0.0000], grad_fn=<SelectBackward>)
eigenvector:  tensor([ 0.4609, -0.8415,  0.2818], grad_fn=<SelectBackward>)


eigenvalue:  tensor([-0.9825,  0.0000], grad_fn=<SelectBackward>)
eigenvector:  tensor([ 0.0952, -0.8702, -0.4834], grad_fn=<SelectBackward>)




<div class="remarks">
    <p class="normal">Since an eigenvalue can be complex, each eigenvalue is returned as a pair: 
    the first element is the real part and the second element is the imaginary part. 
    The eigenvectors are the columns of the second returned tensor.
    </p>
</div>
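<div class="background">
<p class="normal">In more recent PyTorch releases the same computation is available as <span class="code">torch.linalg.eig</span>, which returns complex tensors instead of pairs of real and imaginary parts. The sketch below assumes such a release and uses a small hand-picked matrix <span class="code">M</span> (not from the notebook) to check that each column $v$ of the eigenvector matrix satisfies $Mv=\lambda v$.</p>
</div>

```python
import torch

M = torch.tensor([[2., 1.], [1., 2.]])  # illustrative symmetric matrix

eigvalues, eigvectors = torch.linalg.eig(M)  # both results are complex tensors

max_residual = 0.0
for i in range(M.shape[0]):
    lam = eigvalues[i]
    v = eigvectors[:, i]  # the i-th eigenvector is the i-th column
    # Largest absolute entry of M v - lambda v; close to zero for an eigenpair
    residual = (M.to(v.dtype) @ v - lam * v).abs().max().item()
    max_residual = max(max_residual, residual)
    print('eigenvalue: ', lam)
    print('eigenvector: ', v)
    print('residual: ', residual)
    print('\n')
```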

<h3>Example 2.</h3>

<div class="background">
    <p class="normal">
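        Here is an example where the matrix has a single eigenvalue, $2$, of algebraic 
        multiplicity three. Its eigenspace is only two-dimensional, so the three returned 
        eigenvectors are not linearly independent.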
    </p>
</div>

In [126]:
# Example 2 
C = torch.tensor([[0., -1., 0], [4., 4., 0], [2., 1., 2.]])

(eigvalues, eigvectors) = torch.eig(C, eigenvectors=True)
for i in range(3):
    print('eigenvalue: ', eigvalues[i])
    # The eigenvectors are the columns of eigvectors, so select column i
    print('eigenvector: ', eigvectors[:, i])
    print('\n')

eigenvalue:  tensor([2., 0.])
eigenvector:  tensor([0.0000, 0.0000, 1.0000])


eigenvalue:  tensor([2.0000, 0.0000])
eigenvector:  tensor([-0.4472,  0.8944,  0.0000])


eigenvalue:  tensor([2.0000, 0.0000])
eigenvector:  tensor([ 0.4082, -0.8165, -0.4082])
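<div class="background">
<p class="normal">The number of linearly independent eigenvectors for the eigenvalue $2$ is the dimension of the null space of $C-2I$. The sketch below counts the nonzero singular values of that matrix to obtain its rank; it assumes a PyTorch release that provides <span class="code">torch.linalg.svdvals</span>, and the threshold <span class="code">1e-8</span> is an arbitrary choice for this illustration.</p>
</div>

```python
import torch

C = torch.tensor([[0., -1., 0.], [4., 4., 0.], [2., 1., 2.]], dtype=torch.float64)
lam = 2.0

# Every row of C - 2I is a multiple of (2, 1, 0), so the rank is 1
shifted = C - lam * torch.eye(3, dtype=torch.float64)
singular_values = torch.linalg.svdvals(shifted)
rank = int((singular_values > 1e-8).sum().item())

# Dimension of the eigenspace = number of columns minus the rank
geometric_multiplicity = C.shape[0] - rank

print("rank of C - 2I: ", rank)
print("independent eigenvectors: ", geometric_multiplicity)
```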




<h3>Example 3.</h3>

<div class="background">
    <p class="normal">Eigenvalues are defined only for square matrices, so a non-square input raises an error.</p>
</div>

In [114]:
# Example 3 
D = torch.tensor([[0., -1., 0], [4., 4., 0]])

(eigvalues, eigvectors) = torch.eig(D, eigenvectors=True)

RuntimeError: invalid argument 1: A should be square at C:\Users\builder\AppData\Local\Temp\pip-req-build-9msmi1s9\aten\src\TH/generic/THTensorLapack.cpp:195

<h2>Least square norm</h2>

<div class="background">
    <p class="normal">We often need to minimize the quantity $\Vert AX-B\Vert_2$, 
        where $A,B$ are given matrices and $X$ is the unknown. 
        For this we use the <span class="code">torch.lstsq</span> function; note that its 
        arguments are passed in the order <span class="code">(B, A)</span>.
    </p>
</div>

<h3>Example 1.</h3>

In [131]:
# Example 1
A = torch.randn(3, 4)
B = torch.randn(3, 1) # a vector
print("A: \n", A)
print("B: \n", B)
print("\n")

X, _ = torch.lstsq(B, A)

error = torch.norm(A @ X - B, p=2)
print("The solution of the least square problem is: \n\n", X)
print("\n")
print("The error is: ", error)

A: 
 tensor([[-0.2769,  1.0674,  0.6662, -1.5984],
        [ 0.8249,  1.3140, -0.0192, -0.3846],
        [ 2.4818,  0.5684, -1.7736,  0.6815]])
B: 
 tensor([[ 0.4494],
        [-0.9212],
        [-1.1869]])


The solution of the least square problem is: 

 tensor([[-0.2753],
        [-0.7950],
        [-0.3150],
        [-0.8957]])


The error is:  tensor(1.8849e-07)


<h3>Example 2.</h3>

<div class="background">
    <p class="normal">
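        Now we repeat the previous example with a fresh random draw of $A$ and $B$, and measure 
        the residual in the $p=1$ norm. The solver itself still minimizes the $2$-norm.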
    </p>
</div>

In [132]:
# Example 2 
A = torch.randn(3, 4)
B = torch.randn(3, 1) # a vector
print("A: \n", A)
print("B: \n", B)
print("\n")

X, _ = torch.lstsq(B, A)

error = torch.norm(A @ X - B, p=1)
print("The solution of the least square problem is: \n\n", X)
print("\n")
print("The error is: ", error)

A: 
 tensor([[-0.0644, -0.7350,  0.9334, -0.2362],
        [-0.5515, -0.2604,  0.7796, -1.1156],
        [-1.2727,  1.4277,  1.5215,  1.3682]])
B: 
 tensor([[-1.8821],
        [-0.5725],
        [-0.3240]])


The solution of the least square problem is: 

 tensor([[-0.3108],
        [ 1.2100],
        [-1.1999],
        [-0.4541]])


The error is:  tensor(8.3447e-07)


<div class="background">
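    <p class="normal">In both examples the residual is zero up to floating-point rounding, 
    because a random system with fewer equations than unknowns can be solved exactly. 
    The two values come from different random matrices, so they are not directly comparable.</p>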
</div>
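<div class="background">
<p class="normal">To compare norms on the same problem, the residual of one solution can be measured in several norms. The sketch below does this with <span class="code">torch.linalg.lstsq</span>, assuming a recent PyTorch release that provides it and running on the CPU; note that its arguments are in the order <span class="code">(A, B)</span>. The seed and the dictionary <span class="code">errors</span> are illustrative.</p>
</div>

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
A = torch.randn(3, 4, dtype=torch.float64)
B = torch.randn(3, 1, dtype=torch.float64)  # a vector

# Solve the least squares problem once
X = torch.linalg.lstsq(A, B).solution

# Measure the same residual in the 1-, 2- and infinity norms
residual = A @ X - B
errors = {p: torch.norm(residual, p=p).item() for p in (1, 2, float('inf'))}

for p, error in errors.items():
    print("p =", p, " error: ", error)
```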

<h3>Example 3.</h3>

<div class="background">
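    <p class="normal">For an overdetermined system, where $A$ has more rows than columns, 
    <span class="code">torch.lstsq</span> returns a tensor with as many rows as $A$: only the first 
    rows (one per column of $A$) hold the solution. Multiplying $A$ by the full result therefore 
    raises a 'size mismatch' error message; the product needs <span class="code">X[:4]</span> instead.</p>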
</div>

In [133]:
# Example 3 - breaking
A = torch.randn(5, 4)
B = torch.randn(5, 1) # a vector
print("A: \n", A)
print("B: \n", B)
print("\n")

X, _ = torch.lstsq(B, A)

error = torch.norm(A @ X - B, p=1)
print("The solution of the least square problem is: \n\n", X)
print("\n")
print("The error is: ", error)

A: 
 tensor([[-0.7085, -0.5104,  0.8651, -1.5750],
        [-0.6783, -0.8917, -0.3745,  0.8802],
        [-0.1210,  0.4588,  1.7196, -1.7847],
        [ 0.5258, -0.0043,  1.3598, -0.2012],
        [-0.3974,  0.0806,  0.2112, -0.7372]])
B: 
 tensor([[ 2.1218],
        [ 2.0587],
        [ 0.4716],
        [-0.7330],
        [ 1.3271]])




RuntimeError: size mismatch, m1: [5 x 4], m2: [5 x 1] at C:\Users\builder\AppData\Local\Temp\pip-req-build-9msmi1s9\aten\src\TH/generic/THTensorMath.cpp:197
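<div class="background">
<p class="normal">A sketch of a working overdetermined example, again assuming a recent PyTorch release with <span class="code">torch.linalg.lstsq</span>, whose solution has one row per column of $A$ and needs no slicing. At the minimizer the residual is orthogonal to the columns of $A$, which the quantity <span class="code">normal_eq</span> (a name chosen for this illustration) checks.</p>
</div>

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
A = torch.randn(5, 4, dtype=torch.float64)
B = torch.randn(5, 1, dtype=torch.float64)  # a vector

# The solution has shape (4, 1), so A @ X is well defined
X = torch.linalg.lstsq(A, B).solution

residual = A @ X - B
error = torch.norm(residual, p=2)

# Normal equations: A^T (A X - B) vanishes at the least squares solution
normal_eq = torch.norm(A.t() @ residual)

print("The solution of the least square problem is: \n\n", X)
print("\n")
print("The error is: ", error)
print("Norm of A^T (A X - B): ", normal_eq)
```

<div class="background">
<p class="normal">Unlike in the underdetermined examples, the error is in general not close to zero here, since five equations in four unknowns usually have no exact solution.</p>
</div>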

<h2>Summary</h2>

<div class="background">
    <p class="normal">We considered some basic linear algebra problems that PyTorch can 
    solve efficiently, and we will use them many times later.</p>
</div>

<h2>Reference Links</h2>

<p class="bibl my-bibl">Official documentation for <span class="code">torch.Tensor</span> 
 <a href="https://pytorch.org/docs/stable/tensors.html">
   https://pytorch.org/docs/stable/tensors.html </a></p>
<p class="bibl my-bibl">Official documentation for <span class="code">torch.autograd</span> 
 <a href="https://pytorch.org/docs/stable/autograd.html">
   https://pytorch.org/docs/stable/autograd.html </a></p>

In [175]:
from IPython.core.display import HTML
import urllib.request
response = urllib.request.urlopen('https://raw.githubusercontent.com/wesszabo/Pytorch-basics/master/CSS/pytorch_basics.css')
HTML(response.read().decode("utf-8"))