CI/CD result assertions - Sat, Sep 17, 2022
Why the assertion of CI/CD job results is important
As many people probably know, unit tests are only as good as their assertions. The more specific these assertions, the less likely a new bug is to be introduced. Just having green test results might not be enough.
But what about the results of other stages of a CI/CD cycle, like build or vulnerability scanning of containers? These stages may provide important results that should be looked at too. In this post I look at why having a green job might not be enough and at ways to better assert the results of CI/CD jobs.
Two practical examples of missing assertions
In one of my recent projects we used Checkov for security scanning of our Kubernetes manifests. One of its versions introduced a bug: when there was a parsing problem, it silently stopped scanning while still exiting with return code 0 (see the bug description here for details). The result was that the corresponding job of our CI/CD pipeline passed (!) although no scanning was done at all. Obviously this job was missing some result assertions. To mitigate the risk of something similar happening again, I added the following assertions in the after_script of the GitLab CI/CD pipeline:
- Check the generated report for the number of artifacts being scanned
- Check the output of the command for some keywords like "XXX manifests scanned"
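The two checks above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the script used in the project: the function name check_scan, the summary.resource_count field of the parsed report and the exact wording of the "manifests scanned" line are all hypothetical and would have to be adapted to the real report and output.

```python
import re
from typing import List


def check_scan(report: dict, output: str, min_resources: int = 1) -> List[str]:
    """Return a list of problems; an empty list means the scan looks real."""
    problems: List[str] = []

    # Assumption: the parsed report exposes a summary with a resource count.
    scanned = report.get("summary", {}).get("resource_count", 0)
    if scanned < min_resources:
        problems.append(
            f"report lists {scanned} scanned resources, expected at least {min_resources}"
        )

    # Assumption: the command prints a line such as "12 manifests scanned".
    match = re.search(r"(\d+) manifests scanned", output)
    if match is None:
        problems.append("no 'manifests scanned' line found in the command output")
    elif int(match.group(1)) == 0:
        problems.append("the command reported zero scanned manifests")

    return problems
```

A wrapper script would print the returned problems and exit with a non-zero code if the list is not empty. A caveat worth knowing: GitLab documents that commands in after_script do not affect the job's exit code, so for a failed check to turn the job red it would have to run as part of script.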
A similar problem occurred in a stage where we used Trivy for container scanning. The command for downloading the vulnerability database in the before_script part of the job failed with a warning but still exited with return code 0. As a consequence we were scanning our artifacts with an out-of-date database. Likewise, two simple assertions helped mitigate the problem:
- Check the date of the database being used in the output of the scan report
- Check for new / unusual warnings in the output of the database download command.
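A minimal sketch of these two checks, again in Python and again under assumptions: the database timestamp is assumed to have been extracted from the scan report already as a timezone-aware ISO 8601 string, and warnings are assumed to be recognizable by the text "warn" in a line. The names check_db_age and find_unexpected_warnings are made up for this example.

```python
from datetime import datetime, timedelta
from typing import Iterable, List


def check_db_age(
    updated_at: str, now: datetime, max_age: timedelta = timedelta(days=1)
) -> List[str]:
    """Flag a vulnerability database that is older than max_age.

    updated_at must be a timezone-aware ISO 8601 timestamp such as
    "2022-09-16T06:00:00+00:00", and now must be timezone-aware as well.
    """
    age = now - datetime.fromisoformat(updated_at)
    if age > max_age:
        return [f"vulnerability database is {age} old (limit: {max_age})"]
    return []


def find_unexpected_warnings(output: str, allowed: Iterable[str] = ()) -> List[str]:
    """Return warning lines that do not contain any of the allowed snippets."""
    allowed = tuple(allowed)
    unexpected: List[str] = []
    for line in output.splitlines():
        # Assumption: warning lines contain "warn" in some capitalization.
        if "WARN" in line.upper() and not any(snippet in line for snippet in allowed):
            unexpected.append(line.strip())
    return unexpected
```

Keeping an explicit list of known, harmless warnings means that every new warning fails the check until somebody has looked at it, which is exactly the situation the failed database download called for.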
Warnings are a particularly good example of why examining the output of a command is important. Although it was classified as a warning, the failed download of the database was actually an error for the CI/CD job.
Two approaches to assert
I found that there are generally two ways of asserting CI/CD pipeline results:
- Assert the results of a job as part of the job itself. That is what I did in the two examples given.
- Assert all results at the end of the pipeline. This means accumulating the results of all jobs of a pipeline and asserting them in a dedicated job.
The advantage of the first way is that it is easy to implement. A simple grep on the results might be enough. In addition, having the asserts near the actual command that produces the results makes changes to both parts easier.
Assertions in a concluding job have the disadvantage that traces and artifacts need to be captured and made accessible to that job (for example by using artifacts in GitLab). The advantage of this way is that it offers the possibility to evaluate and correlate all artifacts, which makes it more powerful.
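How such a concluding job could look is sketched below. The sketch assumes that every earlier job stored its output as a file named after the job (for example scan.log) in a shared directory that is passed on to the last job; this layout, the Check type and the function assert_pipeline are hypothetical. It only runs one check per job, but because all outputs are available in one place, checks that correlate several jobs could be added in the same script.

```python
from pathlib import Path
from typing import Callable, Dict, List

# A check receives the captured output of one job and returns a list of problems.
Check = Callable[[str], List[str]]


def assert_pipeline(results_dir: Path, checks: Dict[str, Check]) -> List[str]:
    """Run one check per job against the file <results_dir>/<job>.log."""
    problems: List[str] = []
    for job, check in checks.items():
        log_file = results_dir / f"{job}.log"
        # Missing output is itself a finding: the job may not have run at all.
        if not log_file.is_file():
            problems.append(f"{job}: no captured output found at {log_file}")
            continue
        for problem in check(log_file.read_text(encoding="utf-8")):
            problems.append(f"{job}: {problem}")
    return problems
```

The concluding job would call assert_pipeline with one check per earlier job, print the returned problems and exit with a non-zero code if there are any.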
Conclusion
Asserting the results of CI/CD pipeline jobs beyond being green is important, as the two examples showed. Assertions can be done as part of each job or at the end of a pipeline.
This post concentrated on finding hidden errors with asserts. But what about other attributes of CI/CD jobs, for example steadily increasing execution times? Wouldn’t it be nice to gain further insights into the CI/CD process beyond assertions, a kind of data analysis of the whole CI/CD process over time? That is something I am going to look at in one of my following posts.