---
title: Using Scala to compute the Kantorovich distance
date: 2014-06-09
--- &lead
```{r setup0, echo=FALSE}
opts_chunk$set(fig.path="assets/fig/KantorovichWithScala", tidy=FALSE)
```
Recall that we previously provided an example of calculation of a Kantorovich distance [using R](http://stla.github.io/stlapblog/posts/KantorovichWithR.html) and
[using Julia](http://stla.github.io/stlapblog/posts/KantorovichWithJulia.html).
Now it's time to do it with [Scala](http://www.scala-lang.org/) and
its [breeze library](https://github.com/scalanlp/breeze),
a [ScalaNLP project](http://www.scalanlp.org/).
Our example is a linear programming problem, given as follows.
The unknown variables $p_{ij}$ have a tabular structure
$$P=\begin{pmatrix}
p_{11} & p_{12} & p_{13} \\
p_{21} & p_{22} & p_{23} \\
p_{31} & p_{32} & p_{33}
\end{pmatrix},$$
are constrained by the following
linear equality/inequality constraints:
$$\begin{cases}
{\rm (1a) } \quad \sum_j p_{ij} = \mu(a_i) & \forall i \\
{\rm (1b) } \quad \sum_i p_{ij} = \nu(a_j) & \forall j \\
{\rm (2) } \quad p_{ij} \geq 0 & \forall i,j \\
\end{cases}$$
where $\mu$ and $\nu$ are two probability vectors, taken to be
$$ \mu=\left(\frac{1}{7},\frac{2}{7},\frac{4}{7}\right)
\qquad
\text{and } \qquad
\nu=\left(\frac{1}{4},\frac{1}{4},\frac{1}{2}\right)
$$
in our example, and the desideratum is to minimize the linear combination
$$ \sum_{i,j} D_{ij}p_{ij}$$
where $D$ is a distance matrix, taken to be
$$ D = \begin{pmatrix}
0 & 1 & 1 \\
1 & 0 & 1 \\
1 & 1 & 0
\end{pmatrix}$$
in our example.
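Note that since $D$ is the $0$-$1$ distance, the exact value can be computed by hand: in this case the Kantorovich distance reduces to the total variation distance between $\mu$ and $\nu$, namely
$$ \frac{1}{2}\sum_i |\mu_i - \nu_i| = \frac{1}{2}\left(\frac{3}{28}+\frac{1}{28}+\frac{2}{28}\right) = \frac{3}{28}. $$
This gives us a value against which the numerical result can be checked at the end.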
## Breeze expressions
The linear programming solver provided by `Breeze` is based on
[Apache's Simplex Solver](http://google-opensource.blogspot.be/2009/06/introducing-apache-commons-math.html).
Most linear programming solvers are designed to take as input the
matrix of linear constraints, and constructing this matrix is quite
tedious in our situation: for the linear equality constraints it is
$$
A=\begin{pmatrix} M_1 \\ M_2 \end{pmatrix}
$$
where
$$ M_1 = \begin{pmatrix}
1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1
\end{pmatrix} $$
defines the linear equality constraints corresponding to $\mu$
and
$$ M_2 = \begin{pmatrix}
1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1
\end{pmatrix} $$
defines the linear equality constraints corresponding to $\nu$.
See how I defined these matrices
[using R](http://stla.github.io/stlapblog/posts/KantorovichWithR.html) and
[using Julia](http://stla.github.io/stlapblog/posts/KantorovichWithJulia.html) (with the
`GLPK` library): it was a bit tricky, or at least a bit painful.
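For completeness, here is a sketch of how this equality-constraint matrix could be filled in plain Scala (shown only to illustrate the index bookkeeping that `breeze` will let us avoid below):
```{r, eval=FALSE}
// build the 2n x n^2 matrix A = [M1; M2] by hand, the variables
// p_ij being flattened row by row (p_ij sits at index n*i + j)
val n = 3
val A = Array.fill(2*n, n*n)(0.0)
for (i <- 0 until n; j <- 0 until n) {
  A(i)(n*i + j)     = 1.0  // M1: row sums of P (constraints for mu)
  A(n + i)(n*j + i) = 1.0  // M2: column sums of P (constraints for nu)
}
```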
[As we have seen](http://stla.github.io/stlapblog/posts/KantorovichWithJulia.html), the Julia
`JuMP` package provides a very convenient way to *write*, and maybe more importantly, to *read*,
the linear programming problem, using
*expressions* to define the unknown variables $p_{ij}$, the objective function and the
constraints, which in particular avoids constructing the $A$ matrix:
```{r, eval=FALSE}
using JuMP
mu = [1/7, 2/7, 4/7]
nu = [1/4, 1/4, 1/2]
n = length(mu)
m = Model()
# the unknown variables p_ij, constrained to be nonnegative
@defVar(m, p[1:n,1:n] >= 0)
# objective: sum of D_ij*p_ij, where D_ij is 1 if i != j and 0 otherwise
@setObjective(m, Min, sum{p[i,j], i in 1:n, j in 1:n; i != j})
# marginal constraints: row sums equal mu, column sums equal nu
for k in 1:n
    @addConstraint(m, sum(p[k,:]) == mu[k])
    @addConstraint(m, sum(p[:,k]) == nu[k])
end
solve(m)
```
Using the Scala `breeze` library we can also set up the problem with expressions.
It is not as concise as with the Julia `JuMP` package, and it requires
a bit of gymnastics with the Scala language, but essentially the spirit is the same:
we write the problem down using expressions.
Another Java library, [WVLPsolver](http://www.win-vector.com/blog/2012/11/yet-another-java-linear-programming-library/), allows one to encode the problem with a syntax closer to the one
of `JuMP`, but I have not tried it yet.
Anyway, there is no difficulty with `breeze` once you have a basic knowledge of the
Scala language: mainly, you only have to use a new type, `Expression`, a member of the
`lp` object.
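To give the flavor before the full example: once a linear programming object `lp` is created, arithmetic on its `Real` variables produces `Expression` values (a minimal sketch):
```{r, eval=FALSE}
val lp = new breeze.optimize.linear.LinearProgram()
val x = lp.Real()
val y = lp.Real()
// scaling a variable and adding variables both yield an lp.Expression
val e = x * 2.0 + y
```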
## Scala breeze in action
First, we define the probability measures $\mu$ and $\nu$ between which we want the Kantorovich distance to be computed, and the distance matrix $D$:
```{r, eval=FALSE}
val Mu = Array(1.0/7, 2.0/7, 4.0/7)
val Nu = Array(1.0/4, 1.0/4, 1.0/2)
val n = Mu.length
// the distance matrix D: 0 on the diagonal, 1 everywhere else
val D = Array.fill(n,n)(0)
for( i <- 0 until n )
  for( j <- 0 until n )
    if( i != j ) D(i)(j) = 1
```
To show an array in Scala, do:
```{r, eval=FALSE}
scala> println(D.deep.mkString("\n"))
Array(0, 1, 1)
Array(1, 0, 1)
Array(1, 1, 0)
```
Before introducing the $P$ matrix of variables, we need to create a new linear
programming object:
```{r, eval=FALSE}
val lp = new breeze.optimize.linear.LinearProgram()
```
and then we can define $P$ as an array of `Real` variables:
```{r, eval=FALSE}
val P = Array.fill(n,n)(lp.Real())
```
Here is our matrix of unknown variables:
```{r, eval=FALSE}
scala> println(P.deep.mkString("\n"))
Array(x_0, x_1, x_2)
Array(x_3, x_4, x_5)
Array(x_6, x_7, x_8)
```
The following code defines the objective function as an expression:
```{r, eval=FALSE}
val Objective = (
  for( i <- 0 to n-1 )
    yield (
      for( (x, a) <- P(i) zip D(i) )
        yield (x * a)
    ).reduce(_ + _)
).reduce(_ + _)
```
```{r, eval=FALSE}
scala> println(Objective)
(x_0) * 0.0 + (x_1) * 1.0 + (x_2) * 1.0 + (x_3) * 1.0 + (x_4) * 0.0 + (x_5) * 1.0 + (x_6) * 1.0 + (x_7) * 1.0 + (x_8) * 0.0
```
Let us explain the above code. The following command returns the scalar product of two vectors
`V1` and `V2`:
```{r, eval=FALSE}
( for( (x, a) <- V1 zip V2 ) yield (x * a) ).reduce(_ + _)
```
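To see this pattern in isolation, here it is on plain numeric vectors (`V1` and `V2` are just placeholder names):
```{r, eval=FALSE}
val V1 = Array(1.0, 2.0, 3.0)
val V2 = Array(4.0, 5.0, 6.0)
// zip pairs the coordinates, yield multiplies each pair, reduce sums the products
val dot = ( for( (x, a) <- V1 zip V2 ) yield (x * a) ).reduce(_ + _)
// dot is 1*4 + 2*5 + 3*6 = 32.0
```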
Our code first generates the scalar product (as an expression) of the $i$-th row `P(i)` of $P$ and the
$i$-th row `D(i)` of $D$, for every $i=0,1,2$ (indexing starts at $0$ in Scala, not at $1$),
and then it generates the sum of these three scalar products, which is the
expression of our objective function.
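As an aside, the two nested reductions could presumably be collapsed into a single for-comprehension over the flattened pairs of rows (an untested variant of the same idea):
```{r, eval=FALSE}
// same objective, built by flattening the row pairs in one comprehension
val Objective2 = (
  for( (pRow, dRow) <- P zip D; (x, a) <- pRow zip dRow )
    yield (x * a)
).reduce(_ + _)
```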
Now we define the constraints.
There are $2n$ equality constraints and $n^2$ positivity (inequality) constraints;
we will store them all in an array of `lp.Constraint`:
```{r, eval=FALSE}
val Constraints = new Array[lp.Constraint](2*n + n*n)
```
The equality constraints are about the sum of the variables in each row and each column of
the $P$ matrix. We set them as follows:
```{r, eval=FALSE}
val tP = P.transpose
for (i <- 0 to n-1) {
  // the explicit type parameter widens lp.Real to lp.Expression,
  // since `+` on the variables returns an Expression
  Constraints(i)   = P(i).reduce[lp.Expression](_ + _) =:= Mu(i)   // row sums = mu
  Constraints(i+n) = tP(i).reduce[lp.Expression](_ + _) =:= Nu(i)  // column sums = nu
}
```
And finally we set the positivity constraints as follows:
```{r, eval=FALSE}
for (i <- 0 to n-1) {
  // constraints 2n, 2n+1, ..., 2n+n^2-1: one per variable p_ij
  val k = 2*n + n*i
  for (j <- 0 to n-1) Constraints(k+j) = P(i)(j) >= 0
}
```
There's quite a bit of gymnastics in the previous code... I'll possibly simplify it in a future
version, as I'm taking my first steps with Scala `breeze`; one candidate simplification is sketched below.
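For instance, the manual index bookkeeping could presumably be avoided by collecting the three families of constraints with for-comprehensions and concatenating them (an untested sketch):
```{r, eval=FALSE}
// equality constraints from the rows, then the columns, then positivity
val rowConstraints = for (i <- 0 until n) yield P(i).reduce[lp.Expression](_ + _) =:= Mu(i)
val colConstraints = for (i <- 0 until n) yield tP(i).reduce[lp.Expression](_ + _) =:= Nu(i)
val posConstraints = for (i <- 0 until n; j <- 0 until n) yield P(i)(j) >= 0
val AllConstraints = (rowConstraints ++ colConstraints ++ posConstraints).toArray
```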
Since the library only allows maximizing the objective function, whereas we want to
minimize it, we take its opposite in the following `lpp` object, which fully
represents our linear programming problem:
```{r, eval=FALSE}
val lpp = (
  -Objective
    subjectTo( Constraints:_* )
)
val result = lp.maximize(lpp)
```
And then we get the Kantorovich distance:
```{r, eval=FALSE}
scala> println(-result.value)
0.1071428571428571
```
which is, up to a tiny floating-point error, the exact value $3/28$ derived at the beginning of the post:
```{r, eval=FALSE}
scala> println(3.0/28)
0.10714285714285714
```
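As a final remark, the solver's output also contains the optimal values of the variables themselves, so the optimal coupling $P^*$ could be recovered as well. A sketch, under the assumption (to be double-checked against your `breeze` version) that the `result.result` field holds the variable values in the same `x_0, ..., x_8` order as above:
```{r, eval=FALSE}
// assumed: result.result is the vector of optimal variable values,
// ordered like x_0, ..., x_8; reshape it into the 3 x 3 coupling matrix
val Pstar = result.result.toArray.grouped(n).toArray
println(Pstar.deep.mkString("\n"))
```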