---
title: "Describe and Execute CWL Tools/Workflows in R"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_document:
    toc: true
    toc_float: true
    toc_depth: 4
    number_sections: true
    theme: "flatly"
    highlight: "textmate"
    css: "sevenbridges.css"
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{Describe and Execute CWL Tools/Workflows in R}
---

```{r include=FALSE}
knitr::opts_chunk$set(comment = "")
```

# Prerequisite

This tutorial assumes you have a basic understanding of Docker concepts.

**Note**: We currently support CWL draft 2 with the SBG extension, and plan to support CWL v1.0 soon.

# Apps, Workflows, and Tools

In our terminology, a __workflow__ is composed of one or more __tools__; to users, both are simply __apps__. Imagine raw input data going through a pipeline with many nodes, where each step performs a function on the data in the flow, and in the end you get what you want: fully processed data or a result (plot, report, action).

Here are some key ideas:

- A tool is the unit, a single node, of a workflow, so different tools can be connected into a workflow. That's how we achieve reusability of components.
- A tool is described by key components: inputs, outputs, parameters, requirements, and more. Think of the tool as a black box (container) that digests some input(s) with a specified setup and outputs another format.
- When the output files of one tool match the input of another, the two tools can be connected.
- Input is composed of files and parameters, each with a type: File, Enum, Integer, String, and so on.
- An app can be described in JSON/YAML format, following the Common Workflow Language (CWL) open-source standard.
- CWL is just a collection of logic and schema. To execute this __pure text file__, we need an executor, in the cloud or locally. With the Seven Bridges platform, you can simply execute it at scale.

This may sound full of jargon and hard to understand, so here is an example. You have a CSV table full of missing values, and you want to process it in 3 steps:

1. Replace missing values
2. Filter out rows where the column "age" is smaller than 10
3. Output 3 items: a processed table as a CSV file, a plot, and a summary report in PDF format.

You can describe each step as a single module or tool and then connect them one by one to form a flow. You can also put everything into one single "tool"; the downside is that other users cannot reuse your step 1 for their own missing value problem. So it's both an art and a science to balance flexibility and efficiency.
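As a rough illustration of that trade-off, the three steps can be written as separate plain R functions and then chained. This is only a sketch in base R, not CWL: the function names, the toy table, and the choice to fill missing ages with the column median are assumptions made for the example, and the plot and PDF report are reduced to a small numeric summary.

```r
# Toy input table with missing values (hypothetical data)
tbl <- data.frame(
  id  = 1:6,
  age = c(5, NA, 30, 8, NA, 42)
)

# Step 1: replace missing values (here: with the column median)
fill_missing <- function(df, column) {
  med <- median(df[[column]], na.rm = TRUE)
  df[[column]][is.na(df[[column]])] <- med
  df
}

# Step 2: filter out rows where "age" is smaller than 10
filter_age <- function(df, min_age = 10) {
  df[df$age >= min_age, , drop = FALSE]
}

# Step 3: produce the outputs (here only a small numeric summary)
summarise_table <- function(df) {
  c(rows = nrow(df), mean_age = mean(df$age))
}

# Chain the steps, like tools connected into a workflow
step1 <- fill_missing(tbl, "age")
step2 <- filter_age(step1)
result <- summarise_table(step2)
result
```

Because `fill_missing()` stands alone, someone else could reuse it without the other two steps, which is the same argument for keeping tools small.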

Why are we using CWL? Imagine a single file representing a tool or workflow that can be executed anywhere in a reproducible manner, without installing anything, because the environment ships as a Docker image. That is going to change the world of computational scientific research, and how we do research and publish results. In this package we try to hide CWL details as much as possible, so users can just use it like a typical R function.

# Describe Tools in R

`Tool` is the basic unit, and also the "lego brick" you usually start with. As a developer you also want to provide those "lego" pieces to users, so they can run them directly or build their own flows with them.

The main interface provided by the `sevenbridges` package is the `Tool` function, which is much more straightforward than composing a raw CWL JSON file from scratch. A "Tool" object in R can be exported to JSON or imported from a CWL JSON file.

I highly recommend going over the documentation chapter [The Tool Editor](http://docs.cancergenomicscloud.org/docs/the-tool-editor) for the Cancer Genomics Cloud to understand how it works, and even trying it on the platform with the GUI. This will help you understand our R interface better.

## Import from JSON file

Sometimes people share a Tool in pure JSON text format. You can simply load it into R with the `convert_app` function, which recognizes your JSON file class (Tool or Workflow) automatically.

```{r}
library("sevenbridges")
t1 <- system.file("extdata/app", "tool_star.json", package = "sevenbridges")
# # convert JSON file into a Tool object
t1 <- convert_app(t1)
# # try print it out
# t1
```

In this way, you can load it, revise it, use it with the API, or edit it and export it back to a JSON file. However, in this tutorial, the most important thing is that you learn how to describe it directly in R.

## Utilities for Tool object

We provide a couple of utilities to help you construct your own CWL tool quickly in R. For all available utilities, please check out `help("Tool")`.

Some utilities are particularly useful when you execute a task: you need to know the input types, the input ids, and whether each input is required, so that you can execute the task with the parameters it needs. Try playing with `input_matrix` or `input_type` as shown below.

```{r}
# get input type information
head(t1$input_type())
# get output type information
head(t1$output_type())
# return an input matrix with more information
head(t1$input_matrix())
# return only a few fields
head(t1$input_matrix(c("id", "type", "required")))
# return only required
t1$input_matrix(required = TRUE)
# return an output matrix with more information
t1$output_matrix()
# return only a few fields
t1$output_matrix(c("id", "type"))
# get required input id
t1$get_required()
# set new required input with ID, # or without #
t1$set_required(c("#reads", "winFlankNbins"))
t1$get_required()
# turn off requirements for input node #reads
t1$set_required("reads", FALSE)
t1$get_required()
# get input id
head(t1$input_id())
# get full input id with Tool name
head(t1$input_id(TRUE))
# get output id
head(t1$output_id())
# get full output id
head(t1$output_id(TRUE))
# get input and output object
t1$get_input(id = "#winFlankNbins")
t1$get_input(name = "ins")
t1$get_output(id = "#aligned_reads")
t1$get_output(name = "gene")
```

## Create your own tool in R

### Introduction

Before we continue, this is what a full tool description looks like. You don't always need to describe all those details; the following sections will walk you through from simple examples to full examples like this one.

```{r, eval = TRUE, comment=''}
fl <- system.file("docker/rnaseqGene/rabix", "generator.R", package = "sevenbridges")
cat(readLines(fl), sep = "\n")
```

Now let's break it down:

Some key arguments used in `Tool` function.

- __baseCommand__: Specifies the program to execute.
- __stdout__: Capture the command's standard output stream to a file written to the designated output directory. You don't need this, if you specify output files to collect.
- __inputs__: inputs for your command line
- __outputs__: outputs you want to collect
- __requirements__ and __hints__: in short, hints are not _required_ for execution. We currently accept the following requirement items: `cpu`, `mem`, `docker`, `fileDef`; you can easily construct them via the `requirements()` constructor. This is how you describe the resources you need to execute the tool, so the system knows what type of instances suit your case best.

To specify inputs and outputs: usually your command line interface accepts extra arguments as input, for example file(s), string, enum, int, float, boolean. To specify these in your tool, use the `input` function, then pass the result to the `inputs` argument as a list or a single item. You can even construct them as a data.frame, with less flexibility. `input()` requires the arguments `id` and `type`. `output()` requires the argument `id`, because `type` is file by default.

There are some special types: ItemArray and enum. For ItemArray, the type is an array of a single type. The most common case is an input that is a list of files: you can use `type = ItemArray("File")`, or simply `type = "File..."`, to differentiate it from a single file input. When you add the "..." suffix, R will know it's an `ItemArray`.

We also provide an __enum__ type. When you specify an enum, please pass the required name and symbols like this: `type = enum("format", c("pdf", "html"))`. In the UI on the platform, you will then be presented with a drop-down when you execute the task.

Now let's work through from the simplest case to the most flexible case.

### Using existing Docker images and command

If you already have a Docker image in mind that provides the functionality you need, you can just use it. The `baseCommand` is the command line you want to execute in that container. `stdout` specifies the file that captures the standard output, so it can be collected on the platform.

In this simple example, we know the Docker image `rocker/r-base` ships R, so we can call the function `runif` directly from the command line with `Rscript -e`. We then want the output to be captured via `stdout`, and ask the file system to collect the output files that match the pattern `*.txt`. Please pay attention to this: your tool may produce many intermediate files in the current folder, and if you don't tell which outputs you need, they will all be ignored. So make sure you collect those files via the `outputs` parameter.

```{r}
library("sevenbridges")

rbx <- Tool(
  id = "runif",
  label = "runif",
  hints = requirements(docker(pull = "rocker/r-base")),
  baseCommand = "Rscript -e 'runif(100)'",
  stdout = "output.txt",
  outputs = output(id = "random", glob = "*.txt")
)

rbx
rbx$toJSON()
```
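To get a feel for which files a pattern such as `*.txt` would pick up, you can try the pattern locally with base R's `Sys.glob()`. This is only an approximation of what happens on the platform: the file names below are made up, and the executor's own glob matching is assumed, not guaranteed, to behave the same way for a simple wildcard.

```r
# Work in a temporary directory so nothing else is touched
work_dir <- file.path(tempdir(), "glob_demo")
dir.create(work_dir, showWarnings = FALSE)

# Pretend the tool left these files behind (hypothetical names)
file.create(file.path(
  work_dir,
  c("output.txt", "notes.txt", "scratch.log", "plot.pdf")
))

# Which files would a "*.txt" pattern pick up?
matched <- basename(Sys.glob(file.path(work_dir, "*.txt")))
matched

# Everything else would be left uncollected
ignored <- setdiff(list.files(work_dir), matched)
ignored
```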

By default, the tool object is shown as YAML, but you can simply convert it to JSON and copy it to the graphical editor on the Seven Bridges platform by importing the JSON.

```{r}
rbx$toJSON()
rbx$toJSON(pretty = TRUE)
rbx$toYAML()
```

### Add customized script to existing Docker image

Now you may want to run your own R script, but you still don't want to create a new command line and a new Docker image. You just want to run your script with new input files in an existing container. It's time to introduce `fileDef`. You can either write the script directly as a string, or import an R file into `content`, and then provide it via `requirements`.

```{r}
# Make a new file
fd <- fileDef(
  name = "runif.R",
  content = "set.seed(1); runif(100)"
)

# read via readr
.srcfile <- system.file("docker/sevenbridges/src/runif.R", package = "sevenbridges")

fd <- fileDef(
  name = "runif.R",
  content = readr::read_file(.srcfile)
)

# add script to your tool
rbx <- Tool(
  id = "runif",
  label = "runif",
  hints = requirements(docker(pull = "rocker/r-base")),
  requirements = requirements(fd),
  baseCommand = "Rscript runif.R",
  stdout = "output.txt",
  outputs = output(id = "random", glob = "*.txt")
)
```

How about multiple scripts?

```{r}
# or simply readLines
.srcfile <- system.file("docker/sevenbridges/src/runif.R", package = "sevenbridges")

fd1 <- fileDef(
  name = "runif.R",
  content = readr::read_file(.srcfile)
)
fd2 <- fileDef(
  name = "runif2.R",
  content = "set.seed(1); runif(100)"
)

rbx <- Tool(
  id = "runif_twoscript",
  label = "runif_twoscript",
  hints = requirements(docker(pull = "rocker/r-base")),
  requirements = requirements(fd1, fd2),
  baseCommand = "Rscript runif.R",
  stdout = "output.txt",
  outputs = output(id = "random", glob = "*.txt")
)
```


### Create formal interface for your command line

In all the examples above, many parameters are hard-coded in your script, so you don't have the flexibility to control how many numbers to generate. Most often, your tools or command line tools expose some input arguments to users. You need a better way to describe a command line with inputs and outputs.

Now we take the example to the next level. For example, we prepared a Docker image called `RFranklin/runif` on Docker Hub. This container has an executable command called `runif.R`. You don't have to know what is inside; you only have to know that when you run the command line in that container, it looks like this:

```
runif.R --n=100 --max=100 --min=1 --seed=123
```

This command outputs two files directly, so you don't need standard output to capture the random numbers.

- output.txt
- report.html

So the goal here is to describe this command, expose all input parameters, and collect both files.

To define input, you can specify

- `id` : unique identifier to this input node.
- `description`: description, also visible on UI.
- `type`: required to specify input types, files, integer, or character.
- `label`: human readable label for this input node.
- `prefix`: the prefix in command line for this input parameter.
- `default`: default value for this input.
- `required`: is this input parameter required or not. If required, you have to provide a value for the parameter when you execute the tool.
- `cmdInclude`: included in command line or not.

Output is similar. Especially when you want to collect files, you can use `glob` for pattern matching.

```{r}
# pass an input list
in.lst <- list(
  input(
    id = "number",
    description = "number of observations",
    type = "integer",
    label = "number",
    prefix = "--n",
    default = 1,
    required = TRUE,
    cmdInclude = TRUE
  ),
  input(
    id = "min",
    description = "lower limits of the distribution",
    type = "float",
    label = "min",
    prefix = "--min",
    default = 0
  ),
  input(
    id = "max",
    description = "upper limits of the distribution",
    type = "float",
    label = "max",
    prefix = "--max",
    default = 1
  ),
  input(
    id = "seed",
    description = "seed with set.seed",
    type = "float",
    label = "seed",
    prefix = "--seed",
    default = 1
  )
)

# the same method for outputs
out.lst <- list(
  output(
    id = "random",
    type = "file",
    label = "output",
    description = "random number file",
    glob = "*.txt"
  ),
  output(
    id = "report",
    type = "file",
    label = "report",
    glob = "*.html"
  )
)

rbx <- Tool(
  id = "runif",
  label = "Random number generator",
  hints = requirements(docker(pull = "RFranklin/runif")),
  baseCommand = "runif.R",
  inputs = in.lst, # or in.df
  outputs = out.lst
)
```

Alternatively, you can use a data.frame to describe inputs and outputs, but it's less flexible.

```{r}
in.df <- data.frame(
  id = c("number", "min", "max", "seed"),
  description = c(
    "number of observation",
    "lower limits of the distribution",
    "upper limits of the distribution",
    "seed with set.seed"
  ),
  type = c("integer", "float", "float", "float"),
  label = c("number", "min", "max", "seed"),
  prefix = c("--n", "--min", "--max", "--seed"),
  default = c(1, 0, 10, 123),
  required = c(TRUE, FALSE, FALSE, FALSE)
)

out.df <- data.frame(
  id = c("random", "report"),
  type = c("file", "file"),
  glob = c("*.txt", "*.html")
)

rbx <- Tool(
  id = "runif",
  label = "Random number generator",
  hints = requirements(docker(pull = "RFranklin/runif"), cpu(1), mem(2000)),
  baseCommand = "runif.R",
  inputs = in.df, # or in.lst
  outputs = out.df
)
```

### Quick command line interface with `commandArgs` (position and named args)

Now you must be wondering: I have a Docker container with R, but I don't have any existing command line that I could use directly. Can I provide a script with a formal and quick command line interface to make an app for an existing container? The answer is yes. When you add a script to your tool, you can always use some tricks to do so. One popular trick you may have already heard of is `commandArgs`. A more formal one is called "docopt", which I will show you later.

Suppose you have an R script "runif2spin.R" with three arguments using position mapping

1. `numbers`
2. `min`
3. `max`

My base command will be something like

```
Rscript runif2spin.R 10 30 50
```

This is how you do it in your R script

```{r, eval = TRUE, comment = ""}
fl <- system.file("docker/sevenbridges/src", "runif2spin.R",
  package = "sevenbridges"
)
cat(readLines(fl), sep = "\n")
```

Ignore the comment part, I will introduce spin/stitch later.

Then describe the tool in this way, adding your script as you learned in the previous sections.

```{r}
fd <- fileDef(
  name = "runif.R",
  content = readr::read_file(fl)
)

rbx <- Tool(
  id = "runif",
  label = "runif",
  hints = requirements(docker(pull = "rocker/r-base"), cpu(1), mem(2000)),
  requirements = requirements(fd),
  baseCommand = "Rscript runif.R",
  stdout = "output.txt",
  inputs = list(
    input(
      id = "number",
      type = "integer",
      position = 1
    ),
    input(
      id = "min",
      type = "float",
      position = 2
    ),
    input(
      id = "max",
      type = "float",
      position = 3
    )
  ),
  outputs = output(id = "random", glob = "output.txt")
)
```
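For readers who have not used `commandArgs` before, positional arguments can be read roughly as follows. This is a minimal sketch and not the contents of the packaged `runif2spin.R`; the helper name `parse_positional` is hypothetical, and the arguments are passed in as a character vector only so that the example can be tried without `Rscript`.

```r
# Hypothetical helper: map positional arguments to number, min and max.
# Inside a script you would call it as parse_positional(commandArgs(TRUE)).
parse_positional <- function(args) {
  if (length(args) < 3) {
    stop("expected three arguments: number min max")
  }
  list(
    number = as.integer(args[1]),
    min    = as.numeric(args[2]),
    max    = as.numeric(args[3])
  )
}

# Same values as in `Rscript runif2spin.R 10 30 50`
p <- parse_positional(c("10", "30", "50"))
set.seed(1)
x <- runif(p$number, min = p$min, max = p$max)
x
```

The `position` values given to `input()` decide the order in which the values appear on the command line, so they need to agree with the order the script expects.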


How about named arguments? I still recommend using the "docopt" package, but here is a simple way. Suppose you want the command line to look like this

```
Rscript runif_args.R --n=10 --min=30 --max=50
```

Here is how you do it in the R script.

```{r, eval = TRUE, comment=''}
fl <- system.file("docker/sevenbridges/src", "runif_args.R", package = "sevenbridges")
cat(readLines(fl), sep = "\n")
```
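If you want to experiment with this style outside the container, arguments of the form `--key=value` can be split with a few lines of base R. The sketch below is an assumption about one reasonable way to do it, not a copy of `runif_args.R`; `parse_named` is a hypothetical helper, and it only handles numeric values.

```r
# Hypothetical helper: turn c("--n=10", "--min=30") into a named list.
# Inside a script you would call it as parse_named(commandArgs(TRUE)).
parse_named <- function(args) {
  # keep only arguments of the form --key=value
  args <- grep("^--[^=]+=", args, value = TRUE)
  keys <- sub("^--([^=]+)=.*$", "\\1", args)
  vals <- sub("^--[^=]+=", "", args)
  as.list(setNames(as.numeric(vals), keys))
}

opts <- parse_named(c("--n=10", "--min=30", "--max=50"))
set.seed(1)
x <- runif(opts$n, min = opts$min, max = opts$max)
x
```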


Then just describe my tool in this way. Note that I use `separate=FALSE` and add `=` to my prefix as a hack.

```{r}
fd <- fileDef(
  name = "runif.R",
  content = readr::read_file(fl)
)

rbx <- Tool(
  id = "runif",
  label = "runif",
  hints = requirements(docker(pull = "rocker/r-base"), cpu(1), mem(2000)),
  requirements = requirements(fd),
  baseCommand = "Rscript runif.R",
  stdout = "output.txt",
  inputs = list(
    input(
      id = "number",
      type = "integer",
      separate = FALSE,
      prefix = "--n="
    ),
    input(
      id = "min",
      type = "float",
      separate = FALSE,
      prefix = "--min="
    ),
    input(
      id = "max",
      type = "float",
      separate = FALSE,
      prefix = "--max="
    )
  ),
  outputs = output(id = "random", glob = "output.txt")
)
```
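To see why the `=` is appended to the prefix, it helps to spell out how a prefix and a value are expected to be joined. The helper below is purely illustrative and is not how the executor is implemented: `render_arg` is a hypothetical function, and it assumes that a separated binding places a space between prefix and value while a non-separated one concatenates them directly.

```r
# Hypothetical helper: join a prefix and a value the way a command line
# binding with or without `separate` would be expected to render them.
render_arg <- function(prefix, value, separate = TRUE) {
  if (separate) paste(prefix, value) else paste0(prefix, value)
}

values <- c(10, 30, 50)

# separate = FALSE with "=" appended to the prefix
joined <- render_arg(c("--n=", "--min=", "--max="), values, separate = FALSE)
cmd_joined <- paste("Rscript runif.R", paste(joined, collapse = " "))
cmd_joined

# prefix and value kept as two separate tokens
spaced <- render_arg(c("--n", "--min", "--max"), values, separate = TRUE)
cmd_spaced <- paste("Rscript runif.R", paste(spaced, collapse = " "))
cmd_spaced
```

Only the first form matches the `--n=10 --min=30 --max=50` style that the script parses.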

### docopt: a better and formal way to make command line interface

### Generate reports

**Quick report: Spin and Stitch**

You can use spin/stitch from knitr to generate a report directly from an R script with a special format. For example, let's use the example above

```{r, eval = TRUE, comment=''}
fl <- system.file("docker/sevenbridges/src", "runif_args.R", package = "sevenbridges")
cat(readLines(fl), sep = "\n")
```

Your command is something like this

```
Rscript -e "rmarkdown::render(knitr::spin('runif_args.R', FALSE))" --args --n=100 --min=30 --max=50
```

And so I describe my tool like this, with the Docker image `rocker/tidyverse`, which contains the knitr and rmarkdown packages.

```{r}
fd <- fileDef(
  name = "runif.R",
  content = readr::read_file(fl)
)

rbx <- Tool(
  id = "runif",
  label = "runif",
  hints = requirements(docker(pull = "rocker/tidyverse"), cpu(1), mem(2000)),
  requirements = requirements(fd),
  baseCommand = "Rscript -e \"rmarkdown::render(knitr::spin('runif.R', FALSE))\" --args",
  stdout = "output.txt",
  inputs = list(
    input(
      id = "number",
      type = "integer",
      separate = FALSE,
      prefix = "--n="
    ),
    input(
      id = "min",
      type = "float",
      separate = FALSE,
      prefix = "--min="
    ),
    input(
      id = "max",
      type = "float",
      separate = FALSE,
      prefix = "--max="
    )
  ),
  outputs = list(
    output(id = "stdout", type = "file", glob = "output.txt"),
    output(id = "random", type = "file", glob = "*.csv"),
    output(id = "report", type = "file", glob = "*.html")
  )
)
```


You will get a report in the end.

### Misc

**Inherit metadata and additional metadata**

Sometimes you want your output files to inherit metadata from a particular input file. To do so, use `inheritMetadataFrom` in your `output()` call and pass the input file id. If you want to add additional metadata, pass a list to `metadata` in your `output()` call. For example, I want my output report to inherit all metadata from my "bam_file" input node (which I don't have in this example, though), with two additional metadata fields.

```{r}
out.lst <- list(
  output(
    id = "random",
    type = "file",
    label = "output",
    description = "random number file",
    glob = "*.txt"
  ),
  output(
    id = "report",
    type = "file",
    label = "report",
    glob = "*.html",
    inheritMetadataFrom = "bam_file",
    metadata = list(
      author = "RFranklin",
      sample = "random"
    )
  )
)
out.lst
```

**Example with file/files as input node**

```{r, eval = TRUE, comment=''}
fl <- system.file("docker/rnaseqGene/rabix", "generator.R", package = "sevenbridges")
cat(readLines(fl), sep = "\n")
```

Note the `stageInput` example in the above script: you can set it to "copy" or "link".

**Input node batch mode**

Batch by File (the long output has been omitted here):

```{r, results = 'hide'}
f1 <- system.file("extdata/app", "flow_star.json", package = "sevenbridges")
f1 <- convert_app(f1)
f1$set_batch("sjdbGTFfile", type = "ITEM")
```

Batch by other criteria such as metadata. The following example uses `sample_id` and `library_id` (the long output has been omitted here):

```{r, results = 'hide'}
f1 <- system.file("extdata/app", "flow_star.json", package = "sevenbridges")
f1 <- convert_app(f1)
f1$set_batch("sjdbGTFfile", c("metadata.sample_id", "metadata.library_id"))
```
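Conceptually, batching by metadata groups the input files that share the same values for the chosen fields. The base R sketch below only illustrates that grouping idea with made-up file names and metadata; how the platform actually turns the groups into tasks is up to the platform and is not shown here.

```r
# Hypothetical file listing with the two metadata fields used above
files <- data.frame(
  name       = c("a_1.fastq", "a_2.fastq", "b_1.fastq", "b_2.fastq", "c_1.fastq"),
  sample_id  = c("S1", "S1", "S2", "S2", "S2"),
  library_id = c("L1", "L1", "L1", "L1", "L2"),
  stringsAsFactors = FALSE
)

# Group files that share the same sample_id and library_id
batches <- split(
  files$name,
  list(files$sample_id, files$library_id),
  drop = TRUE
)
batches
```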

When you push your app to the platform, you will see the batch option available on the task page or in the workflow editor.

# Describe Workflow in R

**Note**: The [GUI Tool Editor](https://docs.sevenbridges.com/docs/the-tool-editor) on Seven Bridges Platform is more convenient for this purpose.

## Import from a JSON file

Yes, you could use the same function `convert_app` to import JSON files.

```{r}
f1 <- system.file("extdata/app", "flow_star.json", package = "sevenbridges")
f1 <- convert_app(f1)
# show it
# f1
```

## Utilities for `Flow` objects

Just like the `Tool` object, `Flow` objects also have convenient utilities, which are especially useful when you execute a task.

```{r}
f1 <- system.file("extdata/app", "flow_star.json", package = "sevenbridges")
f1 <- convert_app(f1)
# input matrix
head(f1$input_matrix())
# by name
head(f1$input_matrix(c("id", "type", "required")))
# return only required
head(f1$input_matrix(required = TRUE))
# return everything
head(f1$input_matrix(NULL))
# return a output matrix with more information
head(f1$output_matrix())
# return only a few fields
head(f1$output_matrix(c("id", "type")))
# return everything
head(f1$output_matrix(NULL))
# flow inputs
f1$input_type()
# flow outputs
f1$output_type()
# list tools
f1$list_tool()
# f1$get_tool("STAR")
```

There are more utilities; please check the examples at `help(Flow)`.

## Create your own flow in R

### Introduction

To create a workflow, we provide a simple interface to pipe your tools into a single workflow. It works in situations like

- Simple linear tool connection and chaining
- Connect flow with one or more tools

**Note**: for complicated workflow construction, I highly recommend just using our graphical interface; there is no better way.

### Connect simple linear tools

Let's create tools from scratch to perform a simple task

1. Tool 1 outputs 100 random numbers
2. Tool 2 takes the log of them
3. Tool 3 calculates the mean of everything

```{r}
library("sevenbridges")
# A tool that generates 100 random numbers
t1 <- Tool(
  id = "runif new test 3", label = "random number",
  hints = requirements(docker(pull = "rocker/r-base")),
  baseCommand = "Rscript -e 'x = runif(100); write.csv(x, file = \"random.txt\", row.names = FALSE)'",
  outputs = output(
    id = "random",
    type = "file",
    glob = "random.txt"
  )
)

# A tool that takes log
fd <- fileDef(
  name = "log.R",
  content = "args = commandArgs(TRUE)
                         x = read.table(args[1], header = TRUE)[,'x']
                         x = log(x)
                         write.csv(x, file = 'random_log.txt', row.names = FALSE)
                         "
)

t2 <- Tool(
  id = "log new test 3", label = "get log",
  hints = requirements(docker(pull = "rocker/r-base")),
  requirements = requirements(fd),
  baseCommand = "Rscript log.R",
  inputs = input(
    id = "number",
    type = "file"
  ),
  outputs = output(
    id = "log",
    type = "file",
    glob = "*.txt"
  )
)

# A tool that computes the mean
# (a separate name keeps `fd`, the log.R definition, intact for later use)
fd.mean <- fileDef(
  name = "mean.R",
  content = "args = commandArgs(TRUE)
                         x = read.table(args[1], header = TRUE)[,'x']
                         x = mean(x)
                         write.csv(x, file = 'random_mean.txt', row.names = FALSE)"
)

t3 <- Tool(
  id = "mean new test 3", label = "get mean",
  hints = requirements(docker(pull = "rocker/r-base")),
  requirements = requirements(fd.mean),
  baseCommand = "Rscript mean.R",
  inputs = input(
    id = "number",
    type = "file"
  ),
  outputs = output(
    id = "mean",
    type = "file",
    glob = "*.txt"
  )
)

f <- t1 %>>% t2
f <- link(t1, t2, "#random", "#number")

# # you cannot directly copy-paste it
# # please push it using API, we will register each tool for you
# clipr::write_clip(jsonlite::toJSON(f, pretty = TRUE))

t2 <- Tool(
  id = "log new test 3", label = "get log",
  hints = requirements(docker(pull = "rocker/r-base")),
  requirements = requirements(fd),
  baseCommand = "Rscript log.R",
  inputs = input(
    id = "number",
    type = "file",
    secondaryFiles = sevenbridges:::set_box(".bai")
  ),
  outputs = output(
    id = "log",
    type = "file",
    glob = "*.txt"
  )
)

# clipr::write_clip(jsonlite::toJSON(t2, pretty = TRUE))
```
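Before pushing tools like these, it can be worth checking the script logic locally. The sketch below runs the same three steps in plain R inside a temporary directory, reusing the logic of `log.R` and `mean.R` from the chunk above. It does not involve Docker or CWL, and the file paths and the fixed seed are assumptions added for the example.

```r
# Run the three steps locally in a temporary directory
work_dir <- file.path(tempdir(), "flow_demo")
dir.create(work_dir, showWarnings = FALSE)
random_file <- file.path(work_dir, "random.txt")
log_file    <- file.path(work_dir, "random_log.txt")
mean_file   <- file.path(work_dir, "random_mean.txt")

# Tool 1: generate random numbers
set.seed(1)
x <- runif(100)
write.csv(x, file = random_file, row.names = FALSE)

# Tool 2: take the log (same logic as log.R)
x_log <- log(read.table(random_file, header = TRUE)[, "x"])
write.csv(x_log, file = log_file, row.names = FALSE)

# Tool 3: take the mean (same logic as mean.R)
x_mean <- mean(read.table(log_file, header = TRUE)[, "x"])
write.csv(x_mean, file = mean_file, row.names = FALSE)

x_mean
```

Each step reads the file written by the previous one, which mirrors how the output of one tool is linked to the input of the next.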

**Note**: this workflow contains tools that do not exist on the platform, so if you copy and paste the JSON directly into the GUI, it won't work properly. However, a simple way is to push your app to the platform via the API. This will add the new tools one by one to your project before adding your workflow app on the platform. Alternatively, if you connect two tools that you know already exist on the platform, you don't need to do so.

```{r, eval = FALSE}
# auto-check tool info and push new tools
p$app_add("new_flow_log", f)
```

### Connecting tools by input and output id

Now let's connect two tools

1. Unpacking compressed fastq files
2. STAR aligner

Checking potential mappings is easy with the function `link_what`, which prints matched inputs and outputs. The generic function `link` then allows you to connect two `Tool` objects.

If you don't specify which inputs/outputs to expose at the flow level for the new `Flow` object, it will expose all available ones and print a message. Otherwise, please provide the `flow_input` and `flow_output` parameters with full ids.

```{r}
t1 <- system.file("extdata/app", "tool_unpack_fastq.json",
  package = "sevenbridges"
)
t2 <- system.file("extdata/app", "tool_star.json",
  package = "sevenbridges"
)
t1 <- convert_app(t1)
t2 <- convert_app(t2)
# check possible link
link_what(t1, t2)
# link
f1 <- link(t1, t2, "output_fastq_files", "reads")
# link
t1$output_id(TRUE)
t2$input_id(TRUE)
f2 <- link(t1, t2, "output_fastq_files", "reads",
  flow_input = "#SBG_Unpack_FASTQs.input_archive_file",
  flow_output = "#STAR.log_files"
)

# clipr::write_clip(jsonlite::toJSON(f2))
```

### Connecting tool with workflow by input and output id

```{r}
tool.in <- system.file("extdata/app", "tool_unpack_fastq.json", package = "sevenbridges")
flow.in <- system.file("extdata/app", "flow_star.json", package = "sevenbridges")

t1 <- convert_app(tool.in)
f2 <- convert_app(flow.in)
# consulting link_what first
f2$link_map()
# then link

f3 <- link(t1, f2, c("output_fastq_files"), c("#SBG_FASTQ_Quality_Detector.fastq"))

link_what(f2, t1)
f4 <- link(f2, t1, c("#Picard_SortSam.sorted_bam", "#SBG_FASTQ_Quality_Detector.result"), c("#input_archive_file", "#input_archive_file"))

# # TODO
# # all outputs
# # flow + flow
# # print message when name wrong
# clipr::write_clip(jsonlite::toJSON(f4))
```

### Using pipe to construct complicated workflow

```{r}

```

# Execution

## Execute the tool and flow in the cloud

With the API functions, you can directly load your Tool into your account and run a task. For the "how-to", please check the complete guide for the API client.

Here is a quick demo:

```{r, eval = FALSE}
a <- Auth(platform = "platform_name", token = "your_token")
p <- a$project("demo")
app.runif <- p$app_add("runif555", rbx)
aid <- app.runif$id
tsk <- p$task_add(
  name = "Draft runif simple",
  description = "Description for runif",
  app = aid,
  inputs = list(min = 1, max = 10)
)
tsk$run()
```

## Execute the tool in Rabix -- test locally

**1. From CLI**

While developing tools, it is useful to test them locally first. For that we can use rabix (reproducible analyses for bioinformatics), https://github.com/rabix. To test your tool with the latest implementation of rabix in Java (called **bunny**), you can use the Docker image `RFranklin/testenv`:

```bash
docker pull RFranklin/testenv
```

Dump your rabix tool as JSON into a directory which also contains the input data: `write(rbx$toJSON(), file = "<data_dir>/<tool>.json")`. Make an **inputs.json** file to declare input parameters in the same directory (you can use relative paths from inputs.json to the data). Create the container:

```bash
docker run --privileged --name bunny -v </path/to/data_dir>:/bunny_data -dit RFranklin/testenv
```

Execute tool

```bash
docker exec bunny bash -c 'cd /opt/bunny && ./rabix.sh -e /bunny_data /bunny_data/<tool>.json /bunny_data/inputs.json'
```

You'll see the running logs from within the container, and also the output dir inside `<data_dir>` on the host system.

- **Note 1**: `RFranklin/testenv` has R, Python, Java... so many tools can work without the Docker requirement set. If, however, you set the Docker requirement, you need to pull the image inside the container first, in order to run a Docker container inside the running bunny container.
- **Note 2**: inputs.json can also be inputs.yaml if you find it easier to declare inputs in YAML.
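As a sketch of what the inputs file mentioned above might contain, the following base R snippet writes a small **inputs.json** keyed by input id. It assumes the input ids `number` and `max` from the `runif` tool described earlier; `to_flat_json` is a hypothetical helper that only handles flat lists of numbers. In practice, a JSON package such as jsonlite would be the more robust choice.

```r
# Hypothetical helper: serialise a flat list of numbers as a JSON object
to_flat_json <- function(params) {
  fields <- sprintf(
    '  "%s": %s',
    names(params),
    vapply(params, format, character(1))
  )
  paste0("{\n", paste(fields, collapse = ",\n"), "\n}")
}

# Input ids are assumed to match the tool description
params <- list(number = 3, max = 5)

# Written to a temporary directory here; use your <data_dir> instead
inputs_file <- file.path(tempdir(), "inputs.json")
writeLines(to_flat_json(params), inputs_file)
cat(readLines(inputs_file), sep = "\n")
```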

**2. From R**

```{r, eval = FALSE}
library("sevenbridges")

in.df <- data.frame(
  id = c("number", "min", "max", "seed"),
  description = c(
    "number of observation",
    "lower limits of the distribution",
    "upper limits of the distribution",
    "seed with set.seed"
  ),
  type = c("integer", "float", "float", "float"),
  label = c("number", "min", "max", "seed"),
  prefix = c("--n", "--min", "--max", "--seed"),
  default = c(1, 0, 10, 123),
  required = c(TRUE, FALSE, FALSE, FALSE)
)
out.df <- data.frame(
  id = c("random", "report"),
  type = c("file", "file"),
  glob = c("*.txt", "*.html")
)
rbx <- Tool(
  id = "runif",
  label = "Random number generator",
  hints = requirements(docker(pull = "RFranklin/runif"), cpu(1), mem(2000)),
  baseCommand = "runif.R",
  inputs = in.df, # or in.lst
  outputs = out.df
)
params <- list(number = 3, max = 5)

set_test_env("RFranklin/testenv", "mount_dir")
test_tool(rbx, params)
```