* Add flag to write job reports to disk
* Fix nil pointer / non-nil interface bug
* Synchronize job report writer goroutine
* Log when the report has been written
* Implement SourceUnitEnumChunker for GitLab
* Add GitLab engine integration test
* Use a SliceReporter instead of checking for nil reporters
* Use more generic VisitorReporter
* Merge logic from getReposFromGitlab into getAllProjectRepos
* Update integration test to have a lower bound
Unfortunately, the GitLab integration test does not appear to be
deterministic. Sometimes 36390 chunks are found, sometimes 36312, or
even lower.
* Refactor UnitHook to block the scan if finished metrics aren't handled
* Log once when back-pressure is detected
* Add hook channel size metric
* Use plural "metrics" for consistency
* Replace LRU cache with map
* use diff chan
* correctly use the buffered file writer
* use value from source
* reorder fields
* add tests and update
* Fix issue with buffer slices growing
* fix test
* correctly use the buffered file writer
* use value from source
* reorder fields
* fix
* add singleton
* use shared pool
* optimize
* rename and cleanup
* add metrics
* add print
* rebase
* remove extra inc
* add metrics for checkout time
* add comment
* use microseconds
* add metrics
* add metrics pkg
* add more metrics
* revert test
* remove fields
* fix
* resize and return
* update metric name
* remove comment
* address comments
* add comment
This is a follow-up to #1912, which used the headers from the response to determine rate-limiting information, instead of using the values from RateLimitError.Rate. Although that logic seemed solid, I discovered that it did not work in some circumstances. This led to the "unexpected" path being taken more often than intended, and to periodic instances where requests were made before the rate limit had refreshed.
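
A minimal sketch of deriving the wait time from a `Rate`-style value rather than from response headers. The `rate` and `rateLimitError` types, the `retryDelay` helper, and the one-second `resetPadding` are hypothetical stand-ins for illustration, not the client library's API or the code in this change.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// rate and rateLimitError are hypothetical stand-ins for the Rate value
// carried by a client library's rate-limit error.
type rate struct {
	Remaining int
	Reset     time.Time
}

type rateLimitError struct{ Rate rate }

func (e *rateLimitError) Error() string { return "rate limit exceeded" }

// resetPadding is an assumed safety margin so that a retry is not sent
// a moment before the limit actually refreshes.
const resetPadding = time.Second

// retryDelay reports how long to wait before retrying. The second return
// value is false when err is not a rate-limit error (the "unexpected" path).
func retryDelay(err error, now time.Time) (time.Duration, bool) {
	var rlErr *rateLimitError
	if !errors.As(err, &rlErr) {
		return 0, false
	}
	wait := rlErr.Rate.Reset.Sub(now)
	if wait < 0 {
		wait = 0 // reset time already passed; only the padding applies
	}
	return wait + resetPadding, true
}

// exampleDelay wraps a rate-limit error that resets 30s after a fixed "now".
func exampleDelay() (time.Duration, bool) {
	now := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	err := fmt.Errorf("list repos: %w", &rateLimitError{
		Rate: rate{Reset: now.Add(30 * time.Second)},
	})
	return retryDelay(err, now)
}

func main() {
	d, ok := exampleDelay()
	fmt.Println(d, ok) // prints "31s true"
}
```

Passing `now` in explicitly keeps the calculation testable without sleeping.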
* correctly use the buffered file writer
* use value from source
* reorder fields
* use only the DetectorKey as a map field
* correctly use the buffered file writer
* use value from source
* reorder fields
* add tests and update
* Fix issue with buffer slices growing
* fix test
* fix
* add singleton
* use shared pool
* optimize
* rename and cleanup
* use correct calculation to grow buffer
* only grow if needed
* address comments
* remove unused
* remove
* rip out Grow
* address comment
* use 2k default buffer
* update comment, allow large buffers to be garbage collected
Waiting for the sub-command will block until all of `stdout` has been
read. In some cases, we return early due to failed chunking without
reading all of the data, and thus, get stuck waiting for the command to
finish. Closing the pipe will ensure `Wait` does not block on that I/O.
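
A minimal sketch of the pattern, assuming a Unix-like system where the `yes` command is available. The `readFirstLine` helper is hypothetical and only illustrates closing the pipe before `Wait` after an early return from reading.

```go
package main

import (
	"bufio"
	"fmt"
	"os/exec"
	"strings"
)

// readFirstLine starts the command, reads only the first line of its
// stdout, and returns without consuming the rest of the output.
func readFirstLine(name string, args ...string) (string, error) {
	cmd := exec.Command(name, args...)
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return "", err
	}
	if err := cmd.Start(); err != nil {
		return "", err
	}

	line, readErr := bufio.NewReader(stdout).ReadString('\n')

	// Close the read end before waiting. A child that is still writing
	// then fails on its next write instead of blocking on a full pipe,
	// so Wait can return.
	_ = stdout.Close()

	// The child usually exits with an error here (for example, it was
	// terminated by a broken pipe), so the Wait error is ignored in
	// this sketch.
	_ = cmd.Wait()

	if readErr != nil {
		return "", readErr
	}
	return strings.TrimSuffix(line, "\n"), nil
}

func main() {
	// `yes ok` writes "ok" forever; without the Close above, Wait
	// would never return.
	line, err := readFirstLine("yes", "ok")
	fmt.Println(line, err)
}
```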
* correctly use the buffered file writer
* use value from source
* reorder fields
* use only the DetectorKey as a map field
* address comments and use factory function
* fix optional params
* remove commented out code
* draft reverify chunks
* remove
* remove
* reduce dupe map cap
* do not verify chunk
* cli arg and use val for dupe lut
* remove counter
* skip empty results
* working on test and normalizing val for comparison
* forgot to save file
* optimize normalize
* reuse map
* remove print
* use levenshtein distance to check dupes
* forgot to leave in emptying map
* use slice
* small tweak
* comment
* use bytes
* praise
* use ctx logger
* add len check
* add comments
* use 8x concurrency for reverifier workers
* revert worker count
* use more workers
* process result directly for any collisions
* continue after decoder match for reverifying
* use map
* use map
* optimization and fix the bug
* revert worker count
* better option naming
* handle identical secrets in chunks
* update comment
* update comment
* fix test
* use DetectorKey
* rm out of scope tests and testdata
* rename all reverification elements
* don't re-write map entry
* use correct key
* rename worker, remove log val
* test likelydupe, add eq detector check in loop
* add test
* add comment
* add test
* Set verification error
* Update tests
---------
Co-authored-by: Zachary Rice <zachary.rice@trufflesec.com>
Co-authored-by: Dustin Decker <dustin@trufflesec.com>
* Write large diffs to tmp files
* address comments
* Move bufferedfilewriter to own pkg
* update test
* swallow write err
* use buffer pool
* use size vs len
* use interface
* fix test
* update comments
* fix test
* remove unused
* remove
* remove unused
* move parser and commit struct closer to where they are used
* linter change
* add more key-value pairs to error
* fix test
* update
* address comments
* remove bufferedfile writer
* address comments
* adjust interface
* fix finalize
* address comments
* lint
* remove guard
* fix
* add TODO
* updating alibaba
* updating agora
* updating aeroworkflow
* updating aha
* updating artifactory
* updating abbysale
* updating abstract
* updating abuseipdb
* updating accuweather
* updating adafruitio
* updating adzuna
* cleanup on abuseipdb
* cleanup on aha
* cleanup on abuseipdb
* cleanup on aeroworkflow
* cleanup on adzuna
* cleanup on accuweather
* cleanup/refactor
* update token pattern to be explicitly 73char (old) or 64char (new)
* comment to clarify 403 on Aha
* mocking out verified case for aha + adding inactive account test
* using contact response instead of gock
* update 403 to be determinate
* added azurefunctionkey detector
* update raw field to include url
* clean up and added prefix on key pattern
* update bench script
* update imports, snifftest, and gen proto
---------
Co-authored-by: Dustin Decker <dustin@trufflesec.com>
* added azuredevopspersonalaccesstoken detector
* fix comment
* update raw field to include all parts of the credential
---------
Co-authored-by: Dustin Decker <dustin@trufflesec.com>
* Walk directories in filesystem source enumeration
* Ignore all directories instead of just the root
* Fix bug with multiple directories
* Skip filesystem TestEnumerate
* Update filesystem enumeration test to create files and folders
* Extend memory cache to allow for configuring custom expiration and purge interval
* use any for value type
* fix test
* fix test
* address comments
* address
* make new construct more clear
* reduce duplication
* fix test
The source manager attaches some context keys, but in certain circumstances, they're already present, resulting in duplicate keys. This PR changes the attachment to be conditional. It also adds some new log messages to track source startup progress.
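
A minimal sketch of conditional attachment, using the standard library `context` package as a stand-in for the project's own context type. The `ctxKey` type, the `withValueIfAbsent` helper, and the `source_name` key are hypothetical names used for illustration only.

```go
package main

import (
	"context"
	"fmt"
)

// ctxKey is a hypothetical key type; a dedicated type avoids collisions
// with keys defined by other packages.
type ctxKey string

// withValueIfAbsent attaches val under key only when the context does not
// already carry a value for that key, so an earlier attachment is kept.
func withValueIfAbsent(ctx context.Context, key ctxKey, val any) context.Context {
	if ctx.Value(key) != nil {
		return ctx
	}
	return context.WithValue(ctx, key, val)
}

// exampleSourceName attaches the same key twice and returns what survives.
func exampleSourceName() string {
	key := ctxKey("source_name")
	ctx := withValueIfAbsent(context.Background(), key, "first")
	ctx = withValueIfAbsent(ctx, key, "second") // no-op: key already present
	return ctx.Value(key).(string)
}

func main() {
	fmt.Println(exampleSourceName()) // prints "first"
}
```

With the standard library a second `WithValue` merely shadows the first; in a logging context that appends key/value pairs, the same check is what prevents the key from appearing twice.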
* Update metabase verification to check for a valid JSON response
* added test tokens + cleanup
---------
Co-authored-by: ahmed <ahmed.zahran@trufflesec.com>
* add tempfile creation
- break PID retrieval into sep. function
* add tmpfile cleanup func
* add file cleanup to main cleanup func
* refactor file logic to only return name string
* add temp buffer naming to gcs
* add temp buffer naming to s3
* add temp buffer naming to filesystem
* add temp buffer naming to git
* consolidate cleanup functions
- have single function handle both files and dirs
- remove interface (not needed with a single func implementation)
- change calls to `New(...)` to reflect config implementation
- simplify automation in main.go
- update disk-buffer-reader dependency
* integrate changes from pr #2133
* merge main
* checkout from main to revert conflict issues
* re-add buffer logic to git
* interface no longer needed
* move string format to global const
---------
Co-authored-by: Ahrav Dutta <ahrav.dutta@trufflesec.com>
* add rotation guides to SlackWebhook tests
* begin cleaning up tests
* have slack webhook detector use malformed json
* update test secrets
---------
Co-authored-by: Ahrav Dutta <ahrav.dutta@trufflesec.com>
A previous commit (5d0196957f) added .jar/.war/.ear files to the ignored extensions list, but these are archive files that we can scan, so we shouldn't exclude them.
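
A minimal sketch of an extension-based skip check with the archive extensions left out of the ignore set. The `ignoredExtensions` contents and the `shouldSkip` helper are hypothetical; the project's actual list and function names may differ.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// ignoredExtensions is a hypothetical ignore set. Note that .jar, .war,
// and .ear are absent: they are archives that can be unpacked and scanned.
var ignoredExtensions = map[string]struct{}{
	".png": {},
	".jpg": {},
	".gif": {},
}

// shouldSkip reports whether a file is skipped based on its extension,
// compared case-insensitively.
func shouldSkip(path string) bool {
	ext := strings.ToLower(filepath.Ext(path))
	_, skip := ignoredExtensions[ext]
	return skip
}

func main() {
	fmt.Println(shouldSkip("assets/logo.PNG")) // prints "true"
	fmt.Println(shouldSkip("lib/app.jar"))     // prints "false"
}
```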