# Using Cache in Pipeline

In CI, pipelines are often used to perform tasks such as compiling and building. In modern languages, whether Java, Node.js, Python, or Go, dependencies must be downloaded before build tasks can run. This process often consumes a large amount of network bandwidth, slows down pipeline builds, and becomes a bottleneck in CI/CD, reducing production efficiency. The same applies to cache files generated by lint checks or by SonarQube code scans: if these processes start from scratch on every run, the caching mechanisms built into the tools themselves go unused.

The application workspace itself provides a cache mechanism based on the Kubernetes `hostPathVolume`, using a local path on the node to cache default paths such as `/root/.m2`, `/home/jenkins/go/pkg`, and `/root/.cache/pip`. However, in the multi-tenant scenario of DCE 5.0, many users want cache isolation to avoid interference and conflicts. This page introduces a cache mechanism based on the Jenkins plugin [Job Cacher](https://plugins.jenkins.io/jobcacher/). With Job Cacher, you can use AWS S3 or an S3-compatible storage system (such as MinIO) to achieve pipeline-level cache isolation.

## Preparation

1. Provide an S3 or S3-compatible storage backend. You can refer to [Create MinIO Instance - DaoCloud Enterprise](https://docs.daocloud.io/middleware/minio/user-guide/create.html) to create a MinIO instance on DCE 5.0, create a bucket, and prepare an `access key` and `secret`.

    ![Prepare S3](https://docs.daocloud.io/daocloud-docs-images/docs/amamba/images/job-cacher01.png)

2. In Jenkins, go to **Manage Jenkins** -> **Manage Plugins** and install the job-cacher plugin:

    ![Install Plugin](https://docs.daocloud.io/daocloud-docs-images/docs/amamba/images/job-cacher02.png)

3. If you want to use S3 storage, you also need to install the following plugins:

    ```yaml
    - groupId: org.jenkins-ci.plugins
      artifactId: aws-credentials
      source:
        version: 218.v1b_e9466ec5da_
    - groupId: org.jenkins-ci.plugins.aws-java-sdk
      artifactId: aws-java-sdk-minimal # dependency for aws-credentials
      source:
        version: 1.12.633-430.vf9a_e567a_244f
    - groupId: org.jenkins-ci.plugins
      artifactId: jackson2-api # dependency for other plugins
      source:
        version: 2.16.1-373.ve709c6871598
    ```

!!! note

    The Helm chart shipped with Amamba v0.3.2 and earlier corresponds to Jenkins 2.414. Testing has shown that Job Cacher 399.v12d4fa_dd3db_d on that Jenkins version cannot correctly recognize the S3 configuration. Make sure to use upgraded versions of both Jenkins and Job Cacher.

## Configuration

In the **Manage Jenkins** interface, configure the S3 parameters as follows:

![Configure Plugin](https://docs.daocloud.io/daocloud-docs-images/docs/amamba/images/job-cacher03.png)

Alternatively, you can modify the ConfigMap via CasC (Configuration as Code) to persist the configuration. An example of the corresponding YAML is as follows:

```yaml
unclassified:
  ...
  globalItemStorage:
    storage:
      nonAWSS3:
        bucketName: jenkins-cache
        credentialsId: dOOkOgwIDUEcAYxWd9cF
        endpoint: http://10.6.229.90:30404
        region: Auto
        signerVersion:
        parallelDownloads: true
        pathStyleAccess: false
```
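The `credentialsId` above refers to an AWS-style access key/secret key credential stored in Jenkins. If you manage credentials through CasC as well, that credential can be declared in the same file. The following is a minimal sketch, assuming the aws-credentials plugin's CasC support (the `aws` credential symbol); the id `minio-cache` and the key values are placeholders, not values from this guide:

```yaml
credentials:
  system:
    domainCredentials:
      - credentials:
          # AWS-style credential consumed by Job Cacher's S3 storage backend.
          # The id here is what you would reference in
          # globalItemStorage.storage.nonAWSS3.credentialsId.
          - aws:
              scope: GLOBAL
              id: minio-cache
              accessKey: "<MINIO_ACCESS_KEY>"
              secretKey: "<MINIO_SECRET_KEY>"
              description: "MinIO credentials for pipeline cache"
```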
## Usage

After completing the above configuration, you can use the `cache` step provided by Job Cacher in a Jenkinsfile. For example, in the following pipeline:

```groovy
pipeline {
    agent {
        node {
            label 'nodejs'
        }
    }
    stages {
        stage('clone') {
            steps {
                git(url: 'https://gitlab.daocloud.cn/ndx/engineering/application/amamba-test-resource.git', branch: 'main', credentialsId: 'git-amamba-test')
            }
        }
        stage('test') {
            steps {
                // Record the current commit hash; it serves as the cache validity key
                sh 'git rev-parse HEAD > .cache'
                cache(caches: [
                    arbitraryFileCache(
                        path: "pipeline-template/nodejs/node_modules",
                        includes: "**/*",
                        cacheValidityDecidingFile: ".cache",
                    )
                ]) {
                    sh 'cd pipeline-template/nodejs/ && npm install && npm run build && npm install jest jest-junit && npx jest --reporters=default --reporters=jest-junit'
                    junit 'pipeline-template/nodejs/junit.xml'
                }
            }
        }
    }
}
```

This pipeline defines two stages, clone and test. In the test stage, all files under `node_modules` are cached to avoid fetching npm packages on every run. The `.cache` file, which holds the current commit hash, is declared as the cache validity key: as soon as the current branch receives any update, the cache is invalidated and npm packages are fetched again.

After the pipeline completes, you can see that the second run takes significantly less time:

![Run Pipeline Log](https://docs.daocloud.io/daocloud-docs-images/docs/amamba/images/job-cacher04.png)

![Run Pipeline Result](https://docs.daocloud.io/daocloud-docs-images/docs/amamba/images/job-cacher05.png)

For more options, refer to the plugin documentation: [Job Cacher | Jenkins plugin](https://plugins.jenkins.io/jobcacher/)

## Other

- **About performance**: Job Cacher is implemented on top of `MasterToSlaveFileCallable`, so uploads and downloads are performed directly from the agent via remote calls, rather than going through agent -> controller -> S3.
- **About cache size**: Job Cacher supports several compression algorithms, including `ZIP`, `TARGZ`, `TARGZ_BEST_SPEED`, `TAR_ZSTD`, and `TAR`, with `TARGZ` as the default.
- **About cache cleanup**: Job Cacher supports setting `maxCacheSize` per pipeline; see the sketch after this list.
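As an illustration of the last two points, the sketch below combines a per-pipeline size limit with an explicit compression method. Parameter names follow the Job Cacher plugin documentation; the path, lockfile, and values are placeholders rather than settings from this guide:

```groovy
// Minimal sketch: cap this pipeline's cache at 512 MB and use zstd compression.
cache(maxCacheSize: 512, caches: [
    arbitraryFileCache(
        path: 'node_modules',
        cacheValidityDecidingFile: 'package-lock.json', // re-key the cache when the lockfile changes
        compressionMethod: 'TAR_ZSTD'                   // one of ZIP, TARGZ, TARGZ_BEST_SPEED, TAR_ZSTD, TAR
    )
]) {
    sh 'npm ci'
}
```

When the total cache size exceeds `maxCacheSize`, the plugin discards the cache instead of letting it grow without bound, which keeps stale dependencies from accumulating in the S3 bucket.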