mirror of
https://github.com/anchore/syft
synced 2024-11-10 06:14:16 +00:00
Start developer documentation (#746)
* draft outline for developing docs
* update outline
* list testing dependencies
* fix header indention
* fix title

Signed-off-by: Alex Goodman <alex.goodman@anchore.com>
parent f59af255e3
commit 86c3c1c531
1 changed files with 179 additions and 0 deletions
DEVELOPING.md
@ -0,0 +1,179 @@

# Developing

## Getting started

To test and develop in this repo you will need the following dependencies installed:
- docker
- make

After cloning, do the following:
1. run `make bootstrap` to download go mod dependencies, create the `/.tmp` dir, and download helper utilities.
2. run `make` to run linting, tests, and other verifications to make certain everything is working alright.

Check out `make help` to see what other actions you can take.

The main make tasks for common static analysis and testing are `lint`, `lint-fix`, `unit`, `integration`, and `cli`.

## Levels of testing

- `unit`: The default level of testing, distributed throughout the repo. Any `_test.go` file that
  does not reside somewhere within the `/test` directory is a unit test. Other forms of testing should be organized in
  the `/test` directory. These tests should focus on correctness of functionality in depth. Test coverage metrics
  only consider unit tests and no other forms of testing.

- `integration`: located within `test/integration`, these tests focus on the behavior surfaced by the common library
  entrypoints from the `syft` package and make light assertions about the results surfaced. Additionally, these tests
  tend to make diversity assertions for enum-like objects, ensuring that when an enum value is added to a definition,
  integration tests automatically fail if no test attempts to use that enum value. For more details see
  the "Data diversity and freshness assertions" section below.

- `cli`: located within `test/cli`, these tests verify the correctness of application behavior from a
  snapshot build. This should be used in cases where a unit or integration test will not do or if you are looking
  for in-depth testing of code in the `cmd/` package (such as testing the proper behavior of application configuration,
  CLI switches, and glue code before syft library calls).

- `acceptance`: located within `test/acceptance`, these are smoke-like tests that ensure that application packaging
  and installation works as expected. For example, during release we provide RPM packages as a download artifact. We
  also have an accompanying RPM acceptance test that installs the RPM from a snapshot build and ensures the output
  of a syft invocation matches canned expected output. New acceptance tests should be added for each release artifact
  and architecture supported (when possible).

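The directory rules above can be summarized as a small path classifier. This is only an illustrative sketch: `testLevel` is a hypothetical helper (nothing like it is described as existing in the repo), and it assumes paths are given relative to the repo root and that "the `/test` directory" means the top-level `test` directory.

```go
package main

import (
	"path/filepath"
	"strings"
)

// testLevel reports the level of testing a repo-relative path belongs to,
// following the rules in the list above. It is a hypothetical helper used
// only to illustrate the layout; it returns "" for anything else.
func testLevel(path string) string {
	p := filepath.ToSlash(filepath.Clean(path))

	// Non-unit levels are identified purely by their directory under test/.
	for _, level := range []string{"integration", "cli", "acceptance"} {
		if strings.HasPrefix(p, "test/"+level+"/") {
			return level
		}
	}

	// Any _test.go file outside of the top-level test/ directory is a unit test.
	if strings.HasSuffix(p, "_test.go") && !strings.HasPrefix(p, "test/") {
		return "unit"
	}
	return ""
}
```

For example, under these assumptions a file such as `syft/pkg/language_test.go` (a made-up name) would be classified as `unit`, while anything under `test/cli/` would be classified as `cli`.
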
### Data diversity and freshness assertions

It is important that tests against the codebase are flexible enough to begin failing when they do not cover "enough"
of the objects under test. "Cover" in this case does not mean that some percentage of the code has been executed
during testing, but instead that there is enough diversity of data input reflected in testing relative to the
definitions available.

For instance, consider an enum-like value like so:
```go
type Language string

const (
	Java       Language = "java"
	JavaScript Language = "javascript"
	Python     Language = "python"
	Ruby       Language = "ruby"
	Go         Language = "go"
)
```

Say we have a test that exercises all the languages defined today:

```go
func TestCatalogPackages(t *testing.T) {
	testTable := []struct {
		name             string
		inputFixturePath string
	}{
		// ... the set of test cases that test all languages
	}
	for _, test := range testTable {
		t.Run(test.name, func(t *testing.T) {
			// use test.inputFixturePath and assert that syft.CatalogPackages() returns the set of expected Package objects
			// ...
		})
	}
}
```

Each test case has an `inputFixturePath` that results in packages from each language. This test is
brittle since it does not directly assert that all languages were exercised, and future modifications (such as
adding a new language) won't be covered by any test cases.

To address this, the enum-like object should have a definition of all objects that can be used in testing:

```go
type Language string

// const( Java Language = ..., ... )

var AllLanguages = []Language{
	Java,
	JavaScript,
	Python,
	Ruby,
	Go,
	Rust, // a newly added language (requires a matching Rust constant in the const block)
}
```

This allows tests to fail automatically when a new language is added:

```go
func TestCatalogPackages(t *testing.T) {
	testTable := []struct {
		name             string
		inputFixturePath string
	}{
		// ... the set of test cases that (hopefully) covers all languages
	}

	// new stuff...
	observedLanguages := strset.New()

	for _, test := range testTable {
		t.Run(test.name, func(t *testing.T) {
			// use test.inputFixturePath and assert that syft.CatalogPackages() returns the set of expected Package objects
			// ...

			// new stuff...
			for _, actualPkg := range actual {
				observedLanguages.Add(string(actualPkg.Language))
			}

		})
	}

	// new stuff...
	for _, expectedLanguage := range pkg.AllLanguages {
		if !observedLanguages.Contains(string(expectedLanguage)) {
			t.Errorf("failed to test language=%q", expectedLanguage)
		}
	}
}
```

This is a better test since it fails when someone adds a new language but does not write a test case that
exercises it. This method is ideal for integration-level testing, where testing correctness in depth
is not needed (that is what unit tests are for) but instead testing in breadth to ensure that units are well integrated.

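The diversity check itself can be isolated into a small function that needs no third-party set type. The following is a minimal, self-contained sketch: `missingLanguages` is a hypothetical helper, the `Language` list is shortened for brevity, and a plain map stands in for the `strset` used above.

```go
package main

import "sort"

// Language mirrors the enum-like type from the text (shortened list).
type Language string

const (
	Java   Language = "java"
	Python Language = "python"
	Go     Language = "go"
)

// AllLanguages must list every Language constant; it is maintained by hand.
var AllLanguages = []Language{Java, Python, Go}

// missingLanguages returns, in sorted order, every entry of all that does not
// appear in observed. It is a hypothetical helper for illustration only.
func missingLanguages(all []Language, observed map[Language]struct{}) []Language {
	var missing []Language
	for _, l := range all {
		if _, ok := observed[l]; !ok {
			missing = append(missing, l)
		}
	}
	sort.Slice(missing, func(i, j int) bool { return missing[i] < missing[j] })
	return missing
}
```

In a test, each value returned by `missingLanguages(AllLanguages, observed)` would be reported with `t.Errorf`, so adding a constant to `AllLanguages` without a matching test case fails the suite.
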
A similar case can be made for data freshness: if the quality of the results would be diminished when the input data
is not kept up to date, then a test should be written (when possible) to assert that the input data is not stale.

An example of this is the static list of licenses that is stored in `internal/spdxlicense` for use by the SPDX
presenters. This list is updated and published periodically by an external group, and syft can grab and update this
list by running `go generate ./...` from the root of the repo.

An integration test has been written that grabs the latest license list version externally and compares that version
with the version generated in the codebase. If they differ, the test fails, indicating to someone that there is an
action needed to update it.

**_The key takeaway is to try and write tests that fail when data assumptions change and not just when code changes._**

### Snapshot tests

The format objects make heavy use of "snapshot" testing, where you save the expected output bytes from a call into the
git repository and, during testing, compare the actual bytes from the subject under test with the golden
copy saved in the repo. The "golden" files are stored in the `test-fixtures/snapshot` directory relative to the go
package under test and should always be updated by invoking `go test` on the specific test file with a specific CLI
update flag provided.

Many of the `Format` tests make use of this approach, where the raw SBOM report is saved in the repo and the test
compares that SBOM with what is generated from the latest presenter code. For instance, at the time of this writing
the CycloneDX presenter snapshots can be updated by running:

```bash
go test ./internal/formats -update-cyclonedx
```

These flags are defined at the top of the test files whose tests use the snapshot files.

Snapshot testing is only as good as the manual verification of the golden snapshot file saved to the repo! Be careful
and diligent when updating these files.

## Architecture

TODO: outline:
- analysis creates a static SBOM which can be encoded and decoded.
- format objects should strive to not add or enrich data in encoding that could otherwise be done during analysis
- pkg.Catalogers
- file catalogers
- source.Source
- file.Resolvers
- logger abstraction
- events / bus abstraction