Artifact download with needs in GitLab CI - Tue, Aug 9, 2022
TL;DR
Artifacts from jobs in earlier stages are usually downloaded automatically. Using the needs keyword removes this behavior, because a job with needs may start before earlier stages have completed. Artifacts from upstream jobs can be made available either through another needs entry or by using dependencies.
The problem
I encountered a strange problem the other day with one of our GitLab CI/CD pipelines. A handy feature of these pipelines is the ability to store artifacts created during a pipeline run. I used exactly that feature to analyze the report of a security scan from checkov. Both the job that generates the report and the job that analyzes it relied on another job's artifact, which contained the rendered Helm chart. The jobs and stages of the pipeline look like this:
   stage: build                  stage: test
                      |
artifact: chart.yaml  |     artifact: report.json
                      |
+----------------+    |     +-------------------+
|                |    |     |                   |
|  render-helm   +----+---->|  generate-report  |
|                |    |     |                   |
+----------------+    |     +---------+---------+
                      |               |
                      |               v
                      |       +----------------+
                      |       |                |
                      |       | analyze-report |
                      |       |                |
                      |       +----------------+
analyze-report relied on both artifacts, chart.yaml and report.json. But while generate-report was able to access chart.yaml, analyze-report could not, which made the job fail.
The failing job
The GitLab documentation states that
By default, jobs in later stages automatically download all the artifacts created by jobs in earlier stages.
So chart.yaml should be available in analyze-report. Here is an excerpt of the .gitlab-ci.yml:
render-helm:
  stage: build
  script:
    - helm-generate --work-dir $CI_PROJECT_DIR/build/
  artifacts:
    paths:
      - $CI_PROJECT_DIR/build/generated/*.yaml

generate-report:
  stage: test
  script:
    - checkov -o json --directory $CI_PROJECT_DIR/build/generated | tee checkov.json
  artifacts:
    paths:
      - checkov.json

analyze-report:
  stage: test
  script:
    - checkov-analyzer checkov.json $CI_PROJECT_DIR/build/generated
  # Only generate-report is listed here; render-helm is missing.
  needs:
    - generate-report
The only notable difference between generate-report and analyze-report is the needs keyword, which I added so that the two jobs in the test stage would not run in parallel. As it turned out, that was the cause of the problem.
No download of artifacts from previous stage when needs is used
While artifacts from earlier stages are available to jobs in later stages by default, that is NOT the case when using needs. From the GitLab documentation:
When a job uses needs, it no longer downloads all artifacts from previous stages by default, because jobs with needs can start before earlier stages complete.
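To make the rule concrete, here is a minimal sketch of the two cases side by side. The job names and the out.txt file are made up for illustration and are not part of our pipeline; only the keywords are GitLab CI syntax.

```yaml
# Hypothetical jobs for illustration only.
stages:
  - build
  - test

producer:
  stage: build
  script:
    - echo "hello" > out.txt
  artifacts:
    paths:
      - out.txt

# No needs: the job waits for the whole build stage and
# downloads the artifacts of all jobs in it by default.
consumer-default:
  stage: test
  script:
    - cat out.txt

# With needs: only artifacts from the listed jobs are downloaded.
consumer-with-needs:
  stage: test
  needs:
    - job: producer
      artifacts: true   # the default; set to false to skip the download
  script:
    - cat out.txt
```

If producer were left out of the needs list of consumer-with-needs, out.txt would not be downloaded for that job, which is the situation analyze-report ended up in.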
That makes perfect sense. Although needs achieved my initial intent of running analyze-report after generate-report, I didn't take into account that a job with needs may run in parallel with earlier stages, so artifacts from those stages are no longer downloaded automatically.
The solution was to simply add the render-helm job to the needs of analyze-report:
  needs:
    - job: render-helm
      artifacts: true
    - generate-report
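For context, the complete job might then look like the following. This is a sketch assembled from the excerpts in this post, not a copy of the real file, and the comments are added for explanation.

```yaml
# Sketch of the complete job, assembled from the excerpts in this post.
analyze-report:
  stage: test
  script:
    - checkov-analyzer checkov.json $CI_PROJECT_DIR/build/generated
  needs:
    # Listing render-helm pulls in the rendered chart from the build stage.
    - job: render-helm
      artifacts: true   # default value, spelled out for clarity
    # generate-report provides checkov.json and enforces the job order.
    - generate-report
```

Since artifacts: true is the default for needs entries, the short form "- render-helm" should behave the same; the long form just makes the intent explicit.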
Another solution would have been to use dependencies.
Conclusion
Artifacts from jobs in earlier stages are usually downloaded automatically. Using the needs keyword removes this behavior, because a job with needs may start before earlier stages have completed. Artifacts from upstream jobs can be made available either through another needs entry or by using dependencies.