* Implemented gitlab inclusion globbing.
Included test.
* implemented two new flags for gitlab scan, includeRepo and excludeRepo to support globbing.
Apply globbing filter when repos is not provided.
* implemented integration test for inclusion globbing
remove test to check errors if globs are invalid.
* made changes to support glob compile errors.
modified changes to support glob compilation errors.
* removed unused context from few functions.
Some source use client libraries that can emit errors that contain sensitive information - in particular, git-facing libraries that embed tokens into repository URLs. This PR introduces a way of redacting them - starting with GitLab (where we've seen this most recently), but in theory extensible to other sources as needed.
This implementation uses a custom zap core; this might also be possible with a custom zap encoder, but I didn't test it out.
(The deleted core.go file was entirely unused.)
* alpha feature for scanning hidden commits on github
* improvements re: git operations
* lint updates
* updating with exec block due to no gh token
* reworked logic into new source
* fixed collisions threshold flag input
* fixed IOutil issues
* removed additions from GH config
---------
Co-authored-by: Joe Leon <joe.leon@trufflesec.com>
* Add POC analyze sub-command
* Address lint errors
* [chore] Embed scopes at compile time
* [chore] Move subcommand check up to prevent printing metrics
* added http logging to most analyzers
* Use custom RoundTripper with default http.Client
* Create framework of interfaces, structs, and protos
* Merge main
* Add AnalysisInfo to detectors.Result
* Hide analyze subcommand
* Update gen_proto.sh
* Update protos
* Make protos
* Update analyzer data types
* Rename argument to credentialInfo
---------
Co-authored-by: Joe Leon <joe.leon@trufflesec.com>
* initial spike on hf
* added in user and org enum
* adding huggingface source
* updated with lint suggestions
* updated readme
* addressing resources that require org approval to access
* removing unneeded code
* updating with new error msg for 403
* deleted unused code + added resource check in main
* Add flag to get information if trufflehog being ran from TUI
Co-authored-by: mcastorina <m.castorina93@gmail.com>
* Always use version.BuildVersion
---------
Co-authored-by: mcastorina <m.castorina93@gmail.com>
* Add stub source and elastic API funcs
* Spawn workers and ship chunks
* Now successfully detects a credential
- Added tests
- Added some documentation comments
- Threaded the passed context through to all the API requests
* Linting fixes
* Add integration tests and resolve some bugs they uncovered
* Logstash -> Elasticsearch
* Add support for --index-pattern
* Add support for --query-json
* Use structs instead of string building to construct a search body
* Support --since-timestamp
* Implement additional authentication methods
* Fix some small bugs
* Refactoring to support --best-effort-scan
* Finish implementation of --best-effort-scan
* Implement scan catch-up
* Finish connecting support for nodes CLI arg
* Add some integration tests around the catchup mechanism
* go mod tidy
* Fix some linting issues
* Remove some debugging Prints
* Move off of _doc
* Remove informational Printf and add informational logging
* Remove debugging logging
* Copy the index from the outer loop as well
* Don't burn up the ES API with rapid requests if there's no work to do in subsequent scans
* No need to export UnitOfWork.AddSearch
* Use a better name for the range query variable when building the timestamp range clause in searches
* Replace some unlocking defers with explicit unlocks to make the synchronized part of the code clearer
* found -> ok
* Remove superfluous buildElasticClient method
---------
Co-authored-by: Charlie Gunyon <charlie@spectral.energy>
This is a follow-up to #2107 and #2335. It adds a new (hidden) --results flag that allows a user to show any combination of verified, unverified, and indeterminate secrets.
This PR adds the ability to exclude buckets from S3 scans. The capability is pretty rudimentary right now, and does not support globbing. If both lists are specified the source to fail to initialize.
* Add flag to write job reports to disk
* Fix nil pointer / non-nil interface bug
* Synchronize job report writer goroutine
* Log when the report has been written
* draft reverify chunks
* remove
* remove
* reduce dupe map cap
* do not verify chunk
* cli arg and use val for dupe lut
* remove counter
* skipp empty results]
* working on test and normalizing val for comparison
* forgot to save file
* optimize normalize
* reuse map
* remove print
* use levenshtein distance to check dupes
* forgot to leave in emptying map
* use slice
* small tweak
* comment
* use bytes
* praise
* use ctx logger
* add len check
* add comments
* use 8x concurrency for reverifier workers
* revert worker count
* use more workers
* process result directly for any collisions
* continue after decoder match for reverifying
* use map
* use map
* otimization and fix the bug.
* revert worker count
* better option naming
* handle identical secrets in chunks
* update comment
* update comment
* fix test
* use DetecotrKey
* rm out of scope tests and testdata
* rename all reverification elements
* don't re-write map entry
* use correct key
* rename worker, remove log val
* test likelydupe, add eq detector check in loop
* add test
* add comment
* add test
* Set verification error
* Update tests
---------
Co-authored-by: Zachary Rice <zachary.rice@trufflesec.com>
Co-authored-by: Dustin Decker <dustin@trufflesec.com>
* add tempfile creation
- break PID retrieval into sep. function
* add tmpfile cleanup func
* add file cleanup to main cleanup func
* refactor file logic to only return name string
* add temp buffer naming to gcs
* add temp buffer naming to s3
* add temp buffer naming to filesystem
* add temp buffer naming to git
* consolidate cleanup functions
- have single function handle both files and dirs
- remove interface(not needed with a single func implementation)
- change calls to `New(...)` to reflect config implementation
- simplify automation in main.go
- update disk-buffer-reader dependency
* integrate changes from pr #2133
* merge main
* checkout from main to revert conflict issues
* re-add buffer logic to git
* interface no longer needed
* move string format to global const
---------
Co-authored-by: Ahrav Dutta <ahrav.dutta@trufflesec.com>
* Add TravisCI source
* update test to use sourcestest
* Remove jobPage loop
ListByBuild does not support pagination, so this was infinitely
repeating. https://developer.travis-ci.com/resource/jobs#find
* Continue chunking on error
* review updates
* update readme
---------
Co-authored-by: Miccah Castorina <m.castorina93@gmail.com>
The Aho-Corasick wrapper we have tracks information about whether verification should be enabled on an individual detector basis, but that functionality isn't related to the matching functionality of Aho-Corasick, and including it complicates the implementation. This PR removes it to simplify some things.
This PR removes some code that supported a potential future implementation of detector-specific verification settings, but that feature has not actually been implemented yet, so there's no loss of functionality. If we want that feature we can add it back on top of this in a more separated way.
* adds func to get scannerPIDs
* add cleanup and call to get pids
* move pid handling to git module
* remove PID logic from main
* refactor testing code to handle different exec name
* cleanup linting errors
* add better logging, fix dir if clause
* some PR fixups
* mod fixup
* add interfaces for helper funcs
* refactor cleanup into main, getPID into git
* lint and test fixups, remove fail on n<2 pids
* simplify pid sorting
* use filepath.Join
* use Args[0] for exec name, fix logger
* formatting fixup
* move functionality into cleantemp pkg
* go mod fixup
* remove redundant testing comment
* fix go.sum issues
* add 15m ticker loop for cleanup
* enclose ticker in function for goroutine defer
fix cleantemp interface
* make time more readable
* add check for non-local Trufflehog PIDs
* allow deletion even if no non-local pids found
* bundle intial cleanup into runCleanup func
* add explicit regex check for tempdir format