#1711 inadvertently removed the ability to match multiple custom detectors, or multiple detectors of the same type but different version, to a given keyword. (#2060 re-added support for multiple versions of detectors globally, and #2064 re-added support for multiple custom detectors globally, but neither fixed trufflehog's inability to support multiple such detectors for a given keyword match.) This PR re-adds the removed functionality (and narrows the AhoCorasickCore interface in the process.)
#1711 accidentally removed the ability to support multiple custom detectors. This PR partially adds back this capability: Multiple custom detectors are now supported overall, but only one custom detector can be returned for a given keyword match.
* Add TravisCI source
* update test to use sourcestest
* Remove jobPage loop
ListByBuild does not support pagination, so this was infinitely
repeating. https://developer.travis-ci.com/resource/jobs#find
* Continue chunking on error
* review updates
* update readme
---------
Co-authored-by: Miccah Castorina <m.castorina93@gmail.com>
The Aho-Corasick wrapper we have tracks information about whether verification should be enabled on an individual detector basis, but that functionality isn't related to the matching functionality of Aho-Corasick, and including it complicates the implementation. This PR removes it to simplify some things.
This PR removes some code that supported a potential future implementation of detector-specific verification settings, but that feature has not actually been implemented yet, so there's no loss of functionality. If we want that feature we can add it back on top of this in a more separated way.
* Detector-Competition-Fix: Fix/Remove DataFire, API retired
* Detector-Competition-Fix: Depreciate Datafire Proto
* make protos for deprecating datafire
---------
Co-authored-by: āh̳̕mͭͭͨͩ̐e̘ͬ́͋ͬ̊̓͂d <13666360+0x1@users.noreply.github.com>
Co-authored-by: ahmed <ahmed.zahran@trufflesec.com>
* loggly detector
* fixed the loggly_test.go
* fixed the test file to pass the test
---------
Co-authored-by: dsingdev-rocketx <bughunter00@protonmail.com>
* pre filter detectors that include the keywords in the chunk.
* Optimize the engine to prevent iterating overing all detectors.
* use sync.Map for concurrent access.
* lint.
* use correct verify.
* allow versioned detectors.
* Break apart Start.
* cleanup.
* Update benchmark.
* add comment.
* remove Engine prefix.
* update comments.
* use regular map.
* delete the pool.
* remove old code.
* refactor ahocorasickcore into own file.
* update comments
* move structs to ahocorasickcore
* update comments
* fix
* address comments
* exported some methods and constructor since it will need to be be used by the enterprise pipeline as well
* remove extra log
* update gitlabv2 to tri-state
* updating secret to s1 to match convention
* consolidating both versions of the gitlab detector
* remove gitlabV2 references
* Delete temp.txt
delete test file (note: these are not real secrets)
* updating gitlabV1 detector to only work w/ v1 secrets, and v2 detector only w/ v2 secrets
* update package name and add to defaults
* cleanup nesting
* lowercase package names
* update v1 detector to explicitly ignore results with glpat
* nit
* update package name
* added PR and Issue body scanning; adjusted CLI args to fit
* removed print statement from debugging
* removed exclude-commits; adjusted CLI flags
* minor changes to match main branch
* fixing logic
* updating README for --issues and --prs
* Add functionality to update a source's link in the metadata with the updated line number.
* update comment.
* add logic to engine.
* only update link for non empty links.
* add tests for bb.
The previous implementation used int64 for both, which can be mixed up
easily. Using distinct types adds a layer of type safety checked by the
compiler.
* Refactor SourceManager to remove Enrollment
Initializing the Source will be the responsibility of the caller. The
SourceManager exposes a GetIDs method for getting a source and job ID.
* Update tests
* Update engine usage
* Update apiClient interface to have one GetIDs method
* Update SourceManager usage in engine
* add role assumption for s3 source
* refactor role assumption to repeatable string
user can pass array of roles to assume
* refactor s3 chunks to handle passed roleARNs
* add role-session name
use timestamp to make dynamic
* add docstring for rolearn strings()
* make sure role ars are passed into source
* refactor role assumption functionality
break s3 bucket scanning into sep. function
* add log check on assume role
* fix role iteration
- Make sure s3 struct is populated with roles
- add separate new client instantiation for role-based access
- iterates through each role
* add comment
* protobuf revert for merge
* re-run make proto
* lint cleanup
* cleanup TODOs
* drop redundant switch case in assumerole client
* use less verbose 'ctx' designator
* breakout functionality from Chunks
- separate functions for:
- enumerating buckets to scan
- scanning objects within the buckets
* remake protobuf defs
* allow scan to continue on single bucket err
* add readme docs
* minor fixups
With the introduction of the SourceManager, the chunks channel became
private and read-only. This provides a method to write chunks into the
channel as we transition away from needing to do that.
* Add SourceManager to Engine struct
* Update Engine methods to use the SourceManager
* Fix GCS test
The original was testing that `Init()` errors weren't surfaced in
`Finish()`, but the `SourceManager` changed that behavior.
* JobProgress race fixes
* Add contextual values
* Remove unused code
* Add debug logs
* Rename WithConcurrency to WithConcurrentSources
* Always forward chunks to the output chunks channel
* feat: initial support for bare repositories
* feat: use concatenation instead of formatting and os.Getenv instead of os.Environ
Signed-off-by: Savely Krasovsky <savely@krasovs.ky>
* fix: go-git update with pre-receive hooks fix
Signed-off-by: Savely Krasovsky <savely@krasovs.ky>
* fix: remove info about pre-receive hook from README.md for now
Signed-off-by: Savely Krasovsky <savely@krasovs.ky>
* fix: don't scan staged while using --bare option, fixes to make it work with the latest master
Signed-off-by: Savely Krasovsky <savely@krasovs.ky>
* fix: small refactor according to #1518
Signed-off-by: Savely Krasovsky <savely@krasovs.ky>
---------
Signed-off-by: Savely Krasovsky <savely@krasovs.ky>
* Github Oauth2 verification
* Use prefix and include RawV2
* Make gh_oauth2 a new detector
* Remove unused struct
* Remove versioner
* Remove unused code
* Refactor git source to allow ScanOptions and use source in engine
Refactor the Chunks method of the git Source to call out to two helper
methods: scanRepos and scanDirs which scans s.conn.Repositories and
s.conn.Directories respectively. The only notable change in behavior is
that a credential is no longer necessary if there are no
s.conn.Repositories to scan.
* Preserve ScanGit functionality of not cleaning up temporary files
* Implement SourceManager basics
* Rename identifiers and add a default headlessAPI implementation
* Rewrite to use SourceInitFunc
* Update variable name to accurately reflect its value