Artifact download with needs in GitLab CI - Tue, Aug 9, 2022

TL;DR

Artifacts from jobs in earlier stages are usually downloaded automatically. Using needs disables this behavior, because a job with needs may start before earlier stages have finished. Artifacts from upstream jobs can then be made available either through an additional needs entry or by using dependencies.

The problem

I encountered a strange problem the other day with one of our GitLab CI/CD pipelines. A handy feature of these pipelines is the ability to store artifacts created during a pipeline run. I used exactly that feature to analyze the report of a security scan from checkov. Both the job that generates the report and the job that analyzes it relied on another job's artifact, which contained the rendered Helm chart. The jobs and stages of the pipeline look like this:

          stage: build   stage: test
                       |
 artifact: chart.yaml  |      artifact: report.json
                       |
 +----------------+    |     +-------------------+
 |                |    |     |                   |
 |  render-helm   +----+---->|  generate-report  |
 |                |    |     |                   |
 +----------------+    |     +---------+---------+
                       |               |
                       |               v
                       |      +----------------+
                       |      |                |
                       |      | analyze-report |
                       |      |                |
                       |      +----------------+

analyze-report relied on both artifacts, chart.yaml and report.json. But while generate-report was able to access chart.yaml, analyze-report was not, which made the job fail.

The failing job

The GitLab documentation states that

By default, jobs in later stages automatically download all the artifacts created by jobs in earlier stages.

So chart.yaml should be available in analyze-report. Here is an excerpt of the .gitlab-ci.yml:

render-helm:
  stage: build
  script:
    - helm-generate --work-dir $CI_PROJECT_DIR/build/
  artifacts:
    paths:
    - $CI_PROJECT_DIR/build/generated/*.yaml

generate-report:
  stage: test
  script:
    - checkov -o json --directory $CI_PROJECT_DIR/build/generated | tee checkov.json
  artifacts:
    paths:
    - checkov.json

analyze-report:
  stage: test
  script:
    - checkov-analyzer checkov.json $CI_PROJECT_DIR/build/generated
  needs:
    - generate-report

The only notable difference between generate-report and analyze-report is the needs keyword, which I added so that the two jobs in the test stage would not run in parallel. As it turned out, that was the cause of the problem.

No download of artifacts from previous stage when needs is used

While artifacts from earlier stages are normally available to jobs in later stages, that is NOT the case when using needs. From the GitLab documentation:

When a job uses needs, it no longer downloads all artifacts from previous stages by default, because jobs with needs can start before earlier stages complete.

That makes perfect sense. Although my initial intent of running analyze-report after generate-report was achieved with needs, I didn't take into account that, because the job may now start before earlier stages have finished, artifacts from those stages were no longer downloaded automatically.
The solution was simply to add the render-helm job to the needs of analyze-report:

  needs:
    - job: render-helm
      artifacts: true
    - generate-report
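
For context, here is a sketch of how the complete analyze-report job might look with that change applied. It only combines the excerpts shown in this post; both needs entries are written in the long form so that the artifact download is explicit. As far as I know, artifacts: true is the default for a needs entry, so the short form would behave the same way.

```yaml
analyze-report:
  stage: test
  script:
    - checkov-analyzer checkov.json $CI_PROJECT_DIR/build/generated
  needs:
    # Download the rendered chart (build/generated/*.yaml) from the build stage.
    - job: render-helm
      artifacts: true
    # Wait for the report job in the same stage and download checkov.json.
    - job: generate-report
      artifacts: true
```

With needs in place, only artifacts from the jobs listed here are downloaded, so every upstream artifact the script reads has to be covered by one of the entries.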

Another solution would have been to use dependencies.
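
A rough sketch of what a dependencies-based variant could look like, under some assumptions: dependencies can only list jobs from earlier stages, so this version moves analyze-report into an extra stage. The stage name analyze is hypothetical and not part of the original pipeline, and this variant is untested here.

```yaml
stages:
  - build
  - test
  - analyze   # hypothetical extra stage, not in the original pipeline

analyze-report:
  stage: analyze
  script:
    - checkov-analyzer checkov.json $CI_PROJECT_DIR/build/generated
  # No needs keyword: the job waits for the test stage to finish,
  # and dependencies limits which artifacts are downloaded.
  dependencies:
    - render-helm      # provides build/generated/*.yaml
    - generate-report  # provides checkov.json
```

The trade-off is that the ordering now comes from the stages rather than from needs, so analyze-report waits for every job in test, not just generate-report.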

Conclusion

Artifacts from jobs in earlier stages are usually downloaded automatically. Using needs disables this behavior, because a job with needs may start before earlier stages have finished. Artifacts from upstream jobs can then be made available either through an additional needs entry or by using dependencies.

© Thomas Reuhl 2020-2022