---
title: "Import workflow from template" 
fontsize: 14pt
editor_options: 
  chunk_output_type: console
  markdown: 
    wrap: 80
type: docs
weight: 2
---

## Build workflow from a template

Exactly the [same workflow](../step_interactive) can be created by importing
the steps from a template file. In SPR, we use R Markdown files as templates. As
demonstrated previously, the project must first be initialized with the
`SPRproject` function.

```{r setup, echo=TRUE, message=FALSE, warning=FALSE}
suppressPackageStartupMessages({
    library(systemPipeR)
})
```

## `importWF`

The `importWF` function scans the R Markdown file, imports its R chunks (those
marked as SPR chunks, described below), and builds the workflow instances. Each
imported R chunk is converted into a workflow step.

We have already prepared a template for you. Let's first look at its content,
which can also be read [here](https://raw.githubusercontent.com/systemPipeR/systemPipeR.github.io/main/static/en/sp/spr/sp_run/spr_simple_wf.md):

```{r comment=""}
cat(readLines(system.file("extdata", "spr_simple_wf.Rmd", package = "systemPipeR")), sep = "\n")
```


### SPR chunks 
An SPR template is no different from a normal R Markdown file, except for one
small thing: the **SPR chunks**.

To turn a normal R chunk into an SPR chunk, append the `spr=TRUE` or `spr=T`
option to the chunk header.

For example:

> *\`\`\`{r step_1, eval=TRUE, spr=TRUE}*

> *\`\`\`{r step_2, eval=FALSE, spr=TRUE}*

Note the `eval=FALSE` option here. By default, steps with this option are still
imported, but you can change this behavior with the `ignore_eval` flag of
`importWF`.
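
As a rough illustration of how chunk options mark SPR chunks, the base-R sketch
below scans a few hypothetical chunk headers and reports which ones carry
`spr=TRUE` and which have `eval=FALSE`. It is not the parser that `importWF`
uses internally; the header strings, regular expressions, and variable names are
assumptions made for this example only.

```r
# Illustration only: a simplified header scan, not importWF's internal parser.
fence <- strrep("`", 3)  # three backticks, built here to keep the example readable

# Hypothetical chunk headers, as they could appear in a template
headers <- paste0(fence, c(
  "{r step_1, eval=TRUE, spr=TRUE}",
  "{r step_2, eval=FALSE, spr=TRUE}",
  "{r plain_chunk, eval=TRUE}"
))

# Chunk label: the first token after "{r"
chunk_name <- sub("^`{3}\\{r\\s+([^, }]+).*$", "\\1", headers, perl = TRUE)

# A chunk counts as an SPR chunk when its header has spr=TRUE or spr=T
is_spr <- grepl("\\bspr\\s*=\\s*(TRUE|T)\\b", headers, perl = TRUE)

# Record whether evaluation is switched off in the header
eval_false <- grepl("\\beval\\s*=\\s*(FALSE|F)\\b", headers, perl = TRUE)

data.frame(name = chunk_name, spr = is_spr, eval_false = eval_false)
```

In this sketch `step_2` is flagged as an SPR chunk even though it has
`eval=FALSE`, which mirrors the default of still importing such steps, while
`plain_chunk` is not an SPR chunk at all.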

#### Preprocess code 
Inside an SPR chunk, the space before the actual step definition is reserved
for what we call preprocess code.

Why do we need preprocess code? When workflow steps are imported or created,
they are not executed at creation time, whether they are `sysArgs` steps or
`LineWise` steps. However, in many cases we need to connect the outputs of
previous steps to the inputs of the next one. This is easy to handle between
`sysArgs` steps through the `targets` argument. Going from a `LineWise` step to
a `sysArgs` step is trickier: since `LineWise` code is not run at the time of
step definition, no output paths are generated, so the next `sysArgs` step
cannot find its inputs.

To overcome this problem, the preprocess code feature was introduced. Defining
preprocess code is easy: write any lines of R code below the SPR chunk header
line, and right before the step definition insert a one-line comment
`###pre-end` to mark the end of the preprocess code. For example:

```{r eval=FALSE}
targetspath <- system.file("extdata", "cwl", "example", "targets_example.txt", package = "systemPipeR")

###pre-end
appendStep(sal) <- SYSargsList(
  step_name = "Example", 
  targets = targetspath, 
  wf_file = "example/example.cwl", input_file = "example/example.yml", 
  dir_path = system.file("extdata/cwl", package = "systemPipeR"), 
  inputvars = c(Message = "_STRING_", SampleName = "_SAMPLE_")
)
```

In the example above, the targets path is not passed directly but through an
intermediate variable `targetspath`. This is a simple example; other useful
actions, like path concatenation or checking file integrity before calling
expensive (slow) functions, can also be done in preprocess code. Another good
example is the
[ChIPseq](https://systempipe.org/SPchipseq/articles/SPchipseq.html) workflow.
Watch closely how the output of the `LineWise` step [merge_bams](https://systempipe.org/SPchipseq/articles/SPchipseq.html#merge-bam-files-of-replicates-prior-to-peak-calling) is predicted and written to an intermediate targets file
on the fly in the preprocess code of
[call_peaks_macs_withref](https://systempipe.org/SPchipseq/articles/SPchipseq.html#peak-calling-with-inputreference-sample),
so that it can be used as input targets when the `call_peaks_macs_withref` step
is created.


Actually, if the SPR chunk has R code before the step definition but the
`###pre-end` delimiter is not added, this code will **still** be evaluated at
the time of import. However, these lines will not be stored in the
`SYSargsList`, so later when you render the report (`renderReport`) or export
the workflow as a new template (`sal2rmd`), they will **not** be included. In
other words, these lines are **one-shot** preprocess code and are not
reproducible.
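
To make the role of the `###pre-end` delimiter concrete, the following base-R
sketch splits the lines of a hypothetical SPR chunk into a preprocess part and a
step-definition part. It only mimics the idea and is not how `importWF` is
implemented; the chunk content and all variable names are made up for
illustration.

```r
# Illustration only: the chunk body is handled as plain text and never evaluated.
chunk_body <- c(
  'targetspath <- "targets_example.txt"  # hypothetical path',
  "",
  "###pre-end",
  'appendStep(sal) <- SYSargsList(step_name = "Example", targets = targetspath)'
)

# Locate the delimiter line, ignoring surrounding whitespace
delim <- which(trimws(chunk_body) == "###pre-end")

if (length(delim) == 1) {
  # Lines above the delimiter would be kept as preprocess code
  preprocess_code <- chunk_body[seq_len(delim - 1)]
  # Lines below the delimiter hold the step definition
  step_code <- chunk_body[-seq_len(delim)]
} else {
  # No delimiter: nothing is recorded as preprocess code in this sketch
  preprocess_code <- character(0)
  step_code <- chunk_body
}

preprocess_code
step_code
```

Without the delimiter, the sketch records no preprocess code at all, which
corresponds to the one-shot situation described above.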

#### Other rules

- For SPR chunks, the last object assigned must be the `SYSargsList`, for
example a `SYSargs2` (command-line) step:

    ```{r fromFile_example_rules_cmd, eval=FALSE}
    targetspath <- system.file("extdata/cwl/example/targets_example.txt", package = "systemPipeR")
    appendStep(sal) <- SYSargsList(step_name = "Example", 
                          targets = targetspath, 
                          wf_file = "example/example.cwl", input_file = "example/example.yml", 
                          dir_path = system.file("extdata/cwl", package = "systemPipeR"), 
                          inputvars = c(Message = "_STRING_", SampleName = "_SAMPLE_"))
    ```

    Or a `LineWise` (R) step:
    
    ```{r fromFile_example_rules_r, eval=FALSE}
    appendStep(sal) <- LineWise(code = {
                                  library(systemPipeR)
                                },
                                step_name = "load_lib")
    ```

- Also note that all the files or objects required to generate a particular
step must be defined in imported R code. The reason is that when an R Markdown
file is imported, the chunks flagged with `spr = TRUE` are evaluated, appended,
and stored in the workflow control class as the `SYSargsList` object.

- The workflow object name used in the R Markdown file (e.g. `appendStep(sal)`)
needs to be the same as the one used when calling `importWF`. Usually we use the
name `sal` (short for `sysargslist`). It is important to stay consistent. If
different object names are used, you will see an error when running the
workflow, like `Error: <objectName> object not found`.

- Special in `importWF`: SPR chunk names are used as step names and take
  priority over the `step_name` argument. For example, if the chunk header is
  `{r step_1, eval=TRUE, spr=TRUE}` and the `SYSargsList` call inside is
  `SYSargsList(step_name = "step_99", ...)`, then after the import the step name
  is overwritten to `"step_1"` instead of `"step_99"`.
  

### Start to import

```{r cleaning3, eval=TRUE, include=FALSE}
if (file.exists(".SPRproject_rmd")) unlink(".SPRproject_rmd", recursive = TRUE)
```

```{r importWF_rmd, eval=TRUE, collapse=TRUE}
sal_rmd <- SPRproject(logs.dir = ".SPRproject_rmd") 

sal_rmd <- importWF(sal_rmd, file_path = system.file("extdata", "spr_simple_wf.Rmd", package = "systemPipeR"))
sal_rmd
```

We can see that 5 steps have been appended to our `sal_rmd` object.

#### Simple exploration
After import, we can explore the workflow to check the steps:

```{r importWF_details}
# list individual steps
stepsWF(sal_rmd)
# list step dependency
dependency(sal_rmd)
# list R step code
codeLine(sal_rmd)
# list step targets
targetsWF(sal_rmd)
```

## Update workflow
You may have noticed these lines in the import messages:

```{r eval=FALSE}
Template for renderReport is stored at
xxxx/.SPRproject_rmd/workflow_template.Rmd
Edit this file manually is not recommended
```

This means the import was successful and a copy of your workflow template has
been saved to this location. It will be used by `renderReport` as the skeleton.

In real data analysis, the workflow template does not always stay the same,
e.g. some text or new steps may be added to it. One way to add new steps is the
[interactive method](../step_interactive). The problem is that steps added this
way come without any text description in the final report. `renderReport` has a
smart way to insert new steps that do not exist in the template in the right
order, but it cannot create the text description for you.

Another way to import new steps or update text in the template is to use 
`importWF(..., update = TRUE)`.

### Example 1

Let's add a step and some text to 
[`spr_simple_wf.Rmd`](https://raw.githubusercontent.com/systemPipeR/systemPipeR.github.io/main/static/en/sp/spr/sp_run/spr_simple_wf.md) 
and try to update. 

- `update = TRUE` is highly interactive. It uses a Q&A style to ask users things
  like whether to update the preprocess code of certain steps or whether to
  import certain new steps. In this mode, you can always answer "no", so you can
  choose to update the template only partially.
- Rendering the webpage document is **not** interactive, so here we use
  `importWF(..., update = TRUE, confirm = TRUE)`, which confirms all the
  choices, i.e. says "yes" to everything. A partial update is therefore not an
  option here.
  
You can download the updated template [here](https://raw.githubusercontent.com/systemPipeR/systemPipeR.github.io/main/static/en/sp/spr/sp_run/spr_simple_wf_new.md).

One step, with preprocess code and some description, has been added to the end:

````markdown
## A new step
This is a new step with some simple code to demonstrate the update of `importWF`
`r ''````{r session_info, eval=TRUE, spr=TRUE}
cat("some fake preprocess code\n")
###pre-end
appendStep(sal) <- LineWise(code={
  sessionInfo()
  }, 
  step_name = "sessionInfo", 
  dependency = "stats")
`r ''````
````

```{r collapse=TRUE}
# the file has a `.md` extension, but `importWF` needs `.Rmd`,
# so we first download it and change the extension
tmp_file <- tempfile(fileext = ".rmd")
download.file(
  "https://raw.githubusercontent.com/systemPipeR/systemPipeR.github.io/main/static/en/sp/spr/sp_run/spr_simple_wf_new.md",
  tmp_file
)


sal_rmd <- importWF(sal_rmd, file_path = tmp_file, update = TRUE, confirm = TRUE)
```


We can see that in update mode `importWF` compares the old template with the
new one, finds the differences, and lists them all for the user. The update
includes:

- list all new steps
- compare step orders, update if needed
- update line number records of steps from the old template to the new template
- update preprocess code
- finally, import new steps

A new step has been successfully imported from the new template.
```{r}
sal_rmd
```
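The comparison made in update mode can be pictured with plain set operations on
step names. The sketch below is a simplified illustration, not the actual
`importWF` logic; the step names are assumed for the example (only `stats` and
`session_info` appear in the template snippet above), and the variable names are
hypothetical.

```r
# Illustration only: step names here are assumed, not read from a real template.
old_steps <- c("load_library", "export_iris", "gzip", "gunzip", "stats")
new_steps <- c("load_library", "export_iris", "gzip", "gunzip", "stats",
               "session_info")

# Steps that exist only in the new template are candidates for import
steps_to_import <- setdiff(new_steps, old_steps)

# Compare the relative order of the steps shared by both templates
shared_old <- old_steps[old_steps %in% new_steps]
shared_new <- new_steps[new_steps %in% old_steps]
order_changed <- !identical(shared_old, shared_new)

steps_to_import
order_changed
```

With these inputs, `steps_to_import` holds the single new step and
`order_changed` is `FALSE`, so only an import would be needed.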

In interactive mode, users have many more options. For example, when adding a
new step, `importWF` uses a back-tracking algorithm to automatically detect the
position where the step should be appended. However, things can go wrong and it
does not work 100% of the time. In interactive mode, the program first shows the
step after which the new step would be appended, and users can confirm whether
this is correct. If not, a new prompt lets users manually choose where to append
the new step. See the gif below.

> This does not mean you can append a step at any place. It also has to meet
> the dependency requirement. For example, if the new step depends on step 5 but
> you manually choose to append it after step 1, the import will fail.


![](../importwf_interactive.gif)
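
The dependency requirement can be sketched as a simple position check. The
helper `can_append_after()` below is hypothetical and not part of systemPipeR;
it only illustrates why appending a step before one of its dependencies has to
fail, under the assumption that steps run in the order they are listed.

```r
# Illustration only: can_append_after() is a made-up helper, not a systemPipeR function.
step_order <- c("step_1", "step_2", "step_3", "step_4", "step_5")

can_append_after <- function(step_order, after, depends_on) {
  pos <- match(after, step_order)          # position chosen by the user
  dep_pos <- match(depends_on, step_order) # positions of the dependencies
  # Unknown steps cannot be used as an anchor or as a dependency
  if (is.na(pos) || anyNA(dep_pos)) return(FALSE)
  # Every dependency must sit at or before the chosen position
  all(dep_pos <= pos)
}

can_append_after(step_order, after = "step_1", depends_on = "step_5")  # FALSE
can_append_after(step_order, after = "step_5", depends_on = "step_5")  # TRUE
```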

### Example 2
Let's see another [example](https://raw.githubusercontent.com/systemPipeR/systemPipeR.github.io/main/static/en/sp/spr/sp_run/spr_simple_wf_new_precode_changed.md)
of how `importWF` updates preprocess code and line numbers:

```{r collapse=TRUE}
tmp_file2 <- tempfile(fileext = ".rmd")
download.file(
  "https://raw.githubusercontent.com/systemPipeR/systemPipeR.github.io/main/static/en/sp/spr/sp_run/spr_simple_wf_new_precode_changed.md",
  tmp_file2
)


sal_rmd <- importWF(sal_rmd, file_path = tmp_file2, update = TRUE, confirm = TRUE)
```


> Note that existing steps and their preprocess code are not re-imported or
> re-evaluated.

![](../importwf_interactive_preprocess.gif)


### Colors
Rendering the web document is not interactive, so colors are also removed. The
code chunks above are shown in gray only, but in the actual interactive mode
multiple colors are used to indicate the status, as you have seen in the gifs.


## Advanced templates
There are quite a few preconfigured templates provided by the [systemPipeRdata](/sp/sprdata/)
package. You can also take a look at them individually [here](/spr_wf/).

## Session

```{r}
sessionInfo()
```

```{r eval=TRUE, include=FALSE}
# cleaning
try(unlink(".SPRproject_rmd", recursive = TRUE), TRUE)
try(unlink("data", recursive = TRUE), TRUE)
try(unlink("results", recursive = TRUE), TRUE)
try(unlink("param", recursive = TRUE), TRUE)
```