# EasyVVUQ and Cloud Execution via Kubernetes

**Author**: Vytautas Jancauskas, LRZ (jancauskas@lrz.de)

To run the code examples in this tutorial, download it and run it on a local Jupyter notebook server, because the examples assume you have configured access to a Kubernetes cluster. If you are viewing this in our Binder, open the copy of this tutorial located in the EasyVVUQ source code under ```tutorials/kubernetes``` in your local Jupyter instance instead.

This tutorial assumes that you have access to a Kubernetes cluster, such as the ones provided by Google or Amazon. The next step is to build a Docker container for your application. I have found that most online resources don't explain this adequately in the context relevant to us, so I will outline the required steps here.

Our focus here is the Kubernetes execution method - the code executed is based on our "[Vector Quantities of Interest](./vector_qoi_tutorial.ipynb)" tutorial. If you are unfamiliar with EasyVVUQ we recommend that you read this previous tutorial before continuing with this one.

The first thing you need is a Dockerfile providing instructions on how the execution environment should be set up (which software to build and install, etc.). Here is the Dockerfile we made for EasyVVUQ. Yours will look different, but I hope you will see that it is fairly straightforward. For further information please consult this [guide](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/).

In [None]:
!cat kubernetes/Dockerfile

You will need to register for an account on [DockerHub](https://hub.docker.com/). If you don't want your Docker image to be publicly accessible, you need to look into private registries, for example [here](https://cloud.google.com/container-registry/docs/quickstart), but this will usually be provider specific. Then you need to build your container, log in to DockerHub and push the image. After that it will be available to run in your Kubernetes cluster.

```docker build -t user/imagename:tag .```

```docker login```

```docker push user/imagename:tag```
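
If you prefer to drive these three steps from the notebook, they can be assembled in Python first. This is only a minimal sketch: `docker_publish_commands` is a hypothetical helper (not part of EasyVVUQ), and `user`, `imagename` and `tag` are the same placeholders as above. It prints the commands for review rather than running them.

```python
import shlex


def docker_publish_commands(user, image, tag="latest", context="."):
    """Return the build, login and push commands as argument lists."""
    ref = f"{user}/{image}:{tag}"
    return [
        ["docker", "build", "-t", ref, context],
        ["docker", "login"],
        ["docker", "push", ref],
    ]


# Print the commands instead of executing them, so nothing is pushed by accident.
for cmd in docker_publish_commands("user", "imagename", "tag"):
    print(shlex.join(cmd))
```

To actually execute them, each argument list could be passed to `subprocess.run(cmd, check=True)`. Note that `docker login` prompts for credentials, so it is usually easier to run that step in a terminal.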

For the code examples below to work you need a valid ```~/.kube/config``` file. Details will differ based on your provider, but in the case of Google Cloud you can do it as shown below. Here ```easyvvuq``` is a cluster name. It is created from a cluster template. Again, this will be provider specific, so consult your provider's documentation.

In [None]:
!gcloud container clusters create easyvvuq
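
Before going further it can be worth checking that a configuration file is actually in place. The sketch below looks for ```~/.kube/config```. It also checks the `KUBECONFIG` environment variable, following the `kubectl` convention. Whether your setup relies on that variable is an assumption on my part, and both function names are hypothetical.

```python
import os
from pathlib import Path


def kubeconfig_candidates(environ=None):
    """Return candidate kubeconfig paths: $KUBECONFIG entries if set, else ~/.kube/config."""
    environ = os.environ if environ is None else environ
    raw = environ.get("KUBECONFIG", "")
    if raw:
        # kubectl treats KUBECONFIG as a list of paths joined by the OS path separator.
        return [Path(p).expanduser() for p in raw.split(os.pathsep) if p]
    return [Path.home() / ".kube" / "config"]


def find_kubeconfig(environ=None):
    """Return the first candidate that exists as a file, or None."""
    for path in kubeconfig_candidates(environ):
        if path.is_file():
            return path
    return None


found = find_kubeconfig()
print(found if found else "No kubeconfig found; configure cluster access first.")
```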

The code in the cells below sets up an EasyVVUQ campaign to analyse a simple epidemiological model (using the `sir` code) - for more details see the "[Vector Quantities of Interest](./vector_qoi_tutorial.ipynb)" tutorial. We will only explain the differences caused by Kubernetes execution in this document.

In [1]:
import easyvvuq as uq
import chaospy as cp
import matplotlib.pyplot as plt
from easyvvuq.actions import CreateRunDirectory, Encode, Decode, ExecuteKubernetes, Actions

In [2]:
params = {
 "S0": {"type": "float", "default": 997}, 
 "I0": {"type": "float", "default": 3}, 
 "beta": {"type": "float", "default": 0.2}, 
 "gamma": {"type": "float", "default": 0.04, "min": 0.0, "max": 1.0},
 "iterations": {"type": "integer", "default": 100},
 "outfile": {"type": "string", "default": "output.csv"}
}

In [3]:
encoder = uq.encoders.GenericEncoder(template_fname='sir.template', delimiter='$', target_filename='input.json')
decoder = uq.decoders.SimpleCSV(target_filename='output.csv', output_columns=['I'])
execute = ExecuteKubernetes(
 "orbitfold/easyvvuq:latest",
 "/EasyVVUQ/tutorials/sir /config/input.json && cat output.csv",
 output_file_name='output.csv')
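
To make the encoding step more concrete, here is a rough illustration of `$`-delimited substitution using Python's standard `string.Template`. The template text is hypothetical (the real `sir.template` is not reproduced in this tutorial) and the sampled values are made up. I am only assuming that the encoder's behaviour is comparable to this.

```python
import json
from string import Template

# Hypothetical template content; the real sir.template is not shown in this tutorial.
template_text = (
    '{"S0": $S0, "I0": $I0, "beta": $beta, "gamma": $gamma, '
    '"iterations": $iterations, "outfile": "$outfile"}'
)

# One illustrative run: defaults from `params`, with made-up samples for beta and gamma.
run_values = {
    "S0": 997,
    "I0": 3,
    "beta": 0.21,
    "gamma": 0.035,
    "iterations": 100,
    "outfile": "output.csv",
}

# Fill in the placeholders to obtain the text that would become input.json.
input_json = Template(template_text).substitute(run_values)
print(input_json)
```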

In [4]:
actions = Actions(CreateRunDirectory('/tmp'), Encode(encoder), execute, Decode(decoder))

In [5]:
campaign = uq.Campaign(name='sir', params=params, actions=actions)

In [6]:
vary = {
 "beta": cp.Uniform(0.15, 0.25),
 "gamma": cp.Normal(0.04, 0.01),
}

In [7]:
campaign.set_sampler(uq.sampling.PCESampler(vary=vary, polynomial_order=5))
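
It helps to know how many container runs this sampler will request. The sketch below assumes a full tensor-product quadrature grid with `polynomial_order + 1` nodes per varied parameter. I believe this is how the sampler behaves with the settings above, but treat it as an assumption. `tensor_grid_size` is a hypothetical helper.

```python
def tensor_grid_size(polynomial_order, n_varied):
    """Number of nodes in a full tensor grid with (order + 1) nodes per dimension."""
    return (polynomial_order + 1) ** n_varied


# Two varied parameters (beta, gamma) at polynomial order 5.
n_runs = tensor_grid_size(polynomial_order=5, n_varied=2)
print(n_runs)  # 36
```

This agrees with the progress report further down, where 28 ready plus 8 active runs add up to 36.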

The only difference between this example and the one where we run the simulation locally is the execution action we include in `Actions`. In this case we use `ExecuteKubernetes` (in contrast to `ExecuteLocalV2` employed for local execution). For this simple application we only need to specify two arguments to `ExecuteKubernetes` - the image to be pulled from DockerHub and a way to run the simulation. 

The first argument to ```ExecuteKubernetes``` is the image, identified by the tag you used in your build command in the format `user/imagename:tag` (here we use `orbitfold/easyvvuq:latest`). The input configuration is automatically transferred to the Kubernetes pod using the Kubernetes API. Input files will be stored under the ```/config``` directory, so keep this in mind when writing the command that runs the simulation. Likewise, the results are retrieved from the standard output of the pod. This output is sent directly to the Decoder, so you may need to exercise some care when designing decoders for these cases. In this case there is nothing special to be done. If your simulation produces a lot of data you might have to use a script inside the container to extract the quantities of interest and print them to ```stdout```.
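
A script of that kind could look roughly like the following. It is a minimal sketch, not something shipped with EasyVVUQ: `extract_columns` is a hypothetical helper, the sample data is made up, and I am assuming the decoder wants a CSV with a header row that contains only the columns it reads.

```python
import csv
import io
import sys


def extract_columns(src, dst, columns):
    """Copy only the named columns from a CSV stream to another stream."""
    reader = csv.DictReader(src)
    # extrasaction="ignore" silently drops every column not listed in `columns`.
    writer = csv.DictWriter(dst, fieldnames=columns, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)


# Demonstration on in-memory data; inside a container you would open the
# simulation output file and pass sys.stdout as the destination.
demo = io.StringIO("S,I,R\n997,3,0\n996,4,0\n")
extract_columns(demo, sys.stdout, ["I"])
```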

The second argument to ```ExecuteKubernetes``` is the command to be executed inside the running container. If you look at the way we have created the Docker image, EasyVVUQ is cloned to the root directory, which means the path to the simulation code is ```/EasyVVUQ/tutorials/sir```. We run the ```sir``` simulation and then we print the ```output.csv``` file to ```stdout``` using the ```cat``` command. It will be picked up by our Kubernetes backend.

In [8]:
execution = campaign.execute()

In [19]:
execution.progress()

{'ready': 28, 'active': 8, 'finished': 0, 'failed': 0}
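
The counts in this dictionary can be polled until every run has either finished or failed. The helper below is a sketch under the assumption that the four keys shown above are the only ones and that their sum is the total number of runs. `wait_for_runs` is a hypothetical name, and the demonstration uses made-up snapshots in place of a live cluster.

```python
import time


def wait_for_runs(progress_fn, poll_seconds=5.0, sleep=time.sleep):
    """Poll progress_fn until no run is ready or active; return the last status."""
    while True:
        status = progress_fn()
        total = sum(status.values())
        done = status["finished"] + status["failed"]
        print(f"{done}/{total} runs done ({status['failed']} failed)")
        if done == total:
            return status
        sleep(poll_seconds)


# Stand-in for a live progress call: replay three made-up snapshots without sleeping.
snapshots = iter([
    {"ready": 28, "active": 8, "finished": 0, "failed": 0},
    {"ready": 0, "active": 6, "finished": 30, "failed": 0},
    {"ready": 0, "active": 0, "finished": 36, "failed": 0},
])
final = wait_for_runs(lambda: next(snapshots), sleep=lambda seconds: None)
```

With a real campaign you would pass `execution.progress` as `progress_fn` and keep the default `sleep`.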

The remaining steps are exactly the same as we would have in the case of local execution.

In [None]:
result = campaign.analyse(qoi_cols=['I'])

In [None]:
result.plot_sobols_first('I', xlabel='t')

In [None]:
result.plot_moments('I', xlabel='t')