Artifact download with needs in GitLab CI - Tue, Aug 9, 2022
TL;DR
Artifacts from jobs in earlier stages are usually downloaded automatically. Using needs turns this behavior off, because a job with needs can start before earlier stages have completed. Artifacts from upstream jobs can be made available again by adding those jobs to needs. dependencies is an alternative for selecting artifacts from earlier stages when the job does not use needs.
The problem
I encountered a strange problem the other day with one of our GitLab CI/CD pipelines. A handy feature of these pipelines is that they can store artifacts created during a pipeline run. I used exactly that feature to generate and then analyze the report of a security scan from checkov. Both the job that generates the report and the job that analyzes it rely on another job's artifact, which contains the rendered Helm chart. The jobs and stages of the pipeline look like this:
stage: build               stage: test
                      |
artifact: chart.yaml  |    artifact: report.json
                      |
+----------------+    |    +-------------------+
|                |    |    |                   |
|  render-helm   +----+--->|  generate-report  |
|                |    |    |                   |
+----------------+    |    +---------+---------+
                      |              |
                      |              v
                      |      +----------------+
                      |      |                |
                      |      | analyze-report |
                      |      |                |
                      |      +----------------+
analyze-report relied on both artifacts, chart.yaml and report.json. But while generate-report was able to access chart.yaml, analyze-report could not, which made the job fail.
The failing job
The GitLab documentation states that
By default, jobs in later stages automatically download all the artifacts created by jobs in earlier stages.
So chart.yaml should have been available in analyze-report. Here is an excerpt of the .gitlab-ci.yml:
render-helm:
  stage: build
  script:
    - helm-generate --work-dir $CI_PROJECT_DIR/build/
  artifacts:
    paths:
      - $CI_PROJECT_DIR/build/generated/*.yaml

generate-report:
  stage: test
  script:
    - checkov -o json --directory $CI_PROJECT_DIR/build/generated | tee checkov.json
  artifacts:
    paths:
      - checkov.json

analyze-report:
  stage: test
  script:
    - checkov-analyzer checkov.json $CI_PROJECT_DIR/build/generated
  needs:
    - generate-report
The only notable difference between generate-report and analyze-report is the needs keyword, which I added so that the two jobs in the test stage would not run in parallel. As it turned out, that was the cause of the problem.
No download of artifacts from previous stage when needs is used
While artifacts of earlier stages are normally available to jobs in later stages, that is NOT the case when using needs. From the GitLab documentation:
When a job uses needs, it no longer downloads all artifacts from previous stages by default, because jobs with needs can start before earlier stages complete.
That makes perfect sense. My initial intent of running analyze-report after generate-report was achieved with needs, but I did not take into account that a job with needs may start before earlier stages have finished, so the artifacts of those stages are no longer downloaded automatically.
The solution was to simply add the render-helm job to the needs of analyze-report:
  needs:
    - job: render-helm
      artifacts: true
    - generate-report
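Since artifacts defaults to true for entries under needs, the same fix can also be written in the short form. A minimal sketch of the complete job under that assumption, otherwise unchanged from the excerpt above:

```yaml
analyze-report:
  stage: test
  script:
    - checkov-analyzer checkov.json $CI_PROJECT_DIR/build/generated
  needs:
    # Listing a job under needs also downloads its artifacts,
    # because needs:artifacts defaults to true.
    - render-helm       # provides build/generated/*.yaml
    - generate-report   # provides checkov.json
```

The long form with an explicit artifacts: true has the advantage of making the intent visible to the next reader.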
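Another option would have been dependencies, which selects the jobs from earlier stages whose artifacts are downloaded. The GitLab documentation advises against combining it with needs in the same job, and it cannot fetch artifacts from a job in the same stage, so this route would mean ordering generate-report and analyze-report by stage instead of by needs.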
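What the dependencies variant could look like is sketched below. It is not what I ended up using. The extra analyze stage is a hypothetical addition, needed because dependencies only refers to jobs in earlier stages. render-helm and generate-report stay as in the excerpt above.

```yaml
stages:
  - build
  - test
  - analyze   # hypothetical extra stage, not part of the original pipeline

analyze-report:
  stage: analyze
  script:
    - checkov-analyzer checkov.json $CI_PROJECT_DIR/build/generated
  # No needs here: stage order already makes this job run after generate-report.
  # dependencies limits the download to the artifacts of the listed jobs.
  dependencies:
    - render-helm
    - generate-report
```

Without needs, the job would download all artifacts from earlier stages anyway, so dependencies mainly documents and restricts which ones are fetched. The price is that analyze-report now waits for the whole test stage to finish.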
Conclusion
Artifacts from jobs in earlier stages are usually downloaded automatically. Using needs turns this behavior off, because a job with needs can start before earlier stages have completed. To get the artifacts of an upstream job back, add that job to needs as well. For jobs that do not use needs, dependencies can be used to select artifacts from earlier stages.