#1711 inadvertently removed the ability to match multiple custom detectors, or multiple detectors of the same type but different version, to a given keyword. (#2060 re-added support for multiple versions of detectors globally, and #2064 re-added support for multiple custom detectors globally, but neither fixed trufflehog's inability to support multiple such detectors for a given keyword match.) This PR re-adds the removed functionality (and narrows the AhoCorasickCore interface in the process.)
* Add TravisCI source
* update test to use sourcestest
* Remove jobPage loop
ListByBuild does not support pagination, so this was infinitely
repeating. https://developer.travis-ci.com/resource/jobs#find
* Continue chunking on error
* review updates
* update readme
---------
Co-authored-by: Miccah Castorina <m.castorina93@gmail.com>
The Aho-Corasick wrapper we have tracks information about whether verification should be enabled on an individual detector basis, but that functionality isn't related to the matching functionality of Aho-Corasick, and including it complicates the implementation. This PR removes it to simplify some things.
This PR removes some code that supported a potential future implementation of detector-specific verification settings, but that feature has not actually been implemented yet, so there's no loss of functionality. If we want that feature we can add it back on top of this in a more separated way.
* pre filter detectors that include the keywords in the chunk.
* Optimize the engine to prevent iterating overing all detectors.
* use sync.Map for concurrent access.
* lint.
* use correct verify.
* allow versioned detectors.
* Break apart Start.
* cleanup.
* Update benchmark.
* add comment.
* remove Engine prefix.
* update comments.
* use regular map.
* delete the pool.
* remove old code.
* refactor ahocorasickcore into own file.
* update comments
* move structs to ahocorasickcore
* update comments
* fix
* address comments
* exported some methods and constructor since it will need to be be used by the enterprise pipeline as well
* remove extra log
* Add functionality to update a source's link in the metadata with the updated line number.
* update comment.
* add logic to engine.
* only update link for non empty links.
* add tests for bb.
With the introduction of the SourceManager, the chunks channel became
private and read-only. This provides a method to write chunks into the
channel as we transition away from needing to do that.
* Add SourceManager to Engine struct
* Update Engine methods to use the SourceManager
* Fix GCS test
The original was testing that `Init()` errors weren't surfaced in
`Finish()`, but the `SourceManager` changed that behavior.
* JobProgress race fixes
* Add contextual values
* Remove unused code
* Add debug logs
* Rename WithConcurrency to WithConcurrentSources
* Always forward chunks to the output chunks channel
* Exit with non-zero exit code on chunk source error
* Exit with a non-zero exit code whenever we hit an error getting
chunks. Previously the error would be logged but trufflehog would exit
with a 0 (success) status code.
* fix gcs test
---------
Co-authored-by: Dustin Decker <dustin@trufflesec.com>
Co-authored-by: ahrav <ahravdutta02@gmail.com>
* Add ability to include and exclude detectors
* Trim space before checking for empty items
* Explicitly check for integer overflow
* Use strconv.ParseInt instead of strconv.Atoi
* Address comments
* Remove the check to filter and return only a single unverified result.
* Revert "Remove the check to filter and return only a single unverified result."
This reverts commit 494e432803.
* Add new CLI flag to filter unverified results.