
75 commits

Author SHA1 Message Date
ahrav
52ffab1034
[chore] - fix import name clashes (#2143)
* fix import name clashes

* fix missing var
2023-12-01 06:53:15 -08:00
Miccah
7ecd43ab1e
[chore] Minor cleanup of source_manager.go (#2134)
2023-11-29 11:08:25 -08:00
Cody Rose
7a156330b5
Support multiple detectors per match (#2065)
#1711 inadvertently removed the ability to match multiple custom detectors, or multiple detectors of the same type but different version, to a given keyword. (#2060 re-added support for multiple versions of detectors globally, and #2064 re-added support for multiple custom detectors globally, but neither fixed trufflehog's inability to support multiple such detectors for a given keyword match.) This PR re-adds the removed functionality (and narrows the AhoCorasickCore interface in the process.)
2023-11-03 12:26:18 -04:00
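Note on #2065: the idea of letting one keyword match resolve to several detectors can be sketched as a map from keyword to a list of detectors. This is only an illustration under stated assumptions, not the AhoCorasickCore implementation: the detector and keywordIndex types and their methods are hypothetical, and a plain substring scan stands in for Aho-Corasick to keep the example short.

```go
package main

import (
	"fmt"
	"strings"
)

// detector is a hypothetical stand-in for a real detector (name plus version).
type detector struct {
	name    string
	version int
}

// keywordIndex maps each lowercase keyword to every detector registered for it,
// so one keyword can lead to several detectors (hypothetical type).
type keywordIndex map[string][]detector

// add registers d under keyword without replacing earlier registrations.
func (idx keywordIndex) add(keyword string, d detector) {
	k := strings.ToLower(keyword)
	idx[k] = append(idx[k], d)
}

// matching returns each detector whose keyword occurs in chunk, once.
// Assumption: a substring scan replaces Aho-Corasick in this sketch.
func (idx keywordIndex) matching(chunk string) []detector {
	lower := strings.ToLower(chunk)
	seen := make(map[detector]bool)
	var out []detector
	for k, ds := range idx {
		if !strings.Contains(lower, k) {
			continue
		}
		for _, d := range ds {
			if !seen[d] {
				seen[d] = true
				out = append(out, d)
			}
		}
	}
	return out
}

func main() {
	idx := keywordIndex{}
	idx.add("github", detector{"github", 1})
	idx.add("github", detector{"github", 2})
	idx.add("ghp_", detector{"github", 2})
	// Both versions match; version 2 is reported once despite two keyword hits.
	fmt.Println(len(idx.matching("token for GitHub: ghp_example"))) // prints 2
}
```

Keeping a slice per keyword, rather than a single detector, is what lets two versions of the same detector type share a keyword.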
Dustin Decker
05fae156e1
Add TravisCI source (#1877)
* Add TravisCI source

* update test to use sourcestest

* Remove jobPage loop

ListByBuild does not support pagination, so this was infinitely
repeating. https://developer.travis-ci.com/resource/jobs#find

* Continue chunking on error

* review updates

* update readme

---------

Co-authored-by: Miccah Castorina <m.castorina93@gmail.com>
2023-10-30 07:28:25 -07:00
Cody Rose
876a55821b
Remove verify flag from Aho-Corasick core (#2010)
The Aho-Corasick wrapper we have tracks information about whether verification should be enabled on an individual detector basis, but that functionality isn't related to the matching functionality of Aho-Corasick, and including it complicates the implementation. This PR removes it to simplify some things.

This PR removes some code that supported a potential future implementation of detector-specific verification settings, but that feature has not actually been implemented yet, so there's no loss of functionality. If we want that feature we can add it back on top of this in a more separated way.
2023-10-30 09:52:51 -04:00
Cody Rose
e556bdd7b2
Revert "Fix off by one (#1891)" (#1963)
This reverts commit 7f534d0bb7.
2023-10-24 08:40:44 -07:00
ahrav
0f845c8eee
export ShouldVerify (#1962)
2023-10-24 07:27:01 -07:00
ahrav
9ae114f92f
export struct (#1954)
2023-10-24 06:29:26 -07:00
ahrav
68f28a0e34
Filter unique detectors by keywords in chunk (#1711)
* pre filter detectors that include the keywords in the chunk.

* Optimize the engine to prevent iterating over all detectors.

* use sync.Map for concurrent access.

* lint.

* use correct verify.

* allow versioned detectors.

* Break apart Start.

* cleanup.

* Update benchmark.

* add comment.

* remove Engine prefix.

* update comments.

* use regular map.

* delete the pool.

* remove old code.

* refactor ahocorasickcore into own file.

* update comments

* move structs to ahocorasickcore

* update comments

* fix

* address comments

* exported some methods and constructor since they will need to be used by the enterprise pipeline as well

* remove extra log
2023-10-23 08:02:01 -07:00
Shreyas Sriram
7f534d0bb7
Fix off by one (#1891)
2023-10-17 07:02:27 -07:00
Dustin Decker
52ed87edb7
Add an option to filter unverified results using shannon entropy (#1875)
* Add an option to filter unverified results using shannon entropy

* lint

* add test, update test, and optimize
2023-10-08 19:52:28 -07:00
ahrav
6affc903e1
add line to link for azure repos. (#1801)
2023-09-21 16:07:11 -07:00
ahrav
a8c89c59b9
[bug] - fix link line (#1793)
* fix link line.

* rename.
2023-09-20 14:46:00 -07:00
ahrav
47d5ddebf2
Ability to update line number in link (#1788)
* Add functionality to update a source's link in the metadata with the updated line number.

* update comment.

* add logic to engine.

* only update link for non empty links.

* add tests for bb.
2023-09-19 15:39:13 -07:00
ahrav
22876f8381
replace interface{} with any. (#1771)
2023-09-15 04:35:15 -07:00
ahrav
fdeccf06a0
cache dupes w/ different decoders (#1754)
* only cache dupes that have different decoders.

* add test.

* remove file.

* update comment.
2023-09-11 08:18:48 -07:00
Miccah
fae54c7ffa
Add ScanChunk to allow injecting Chunks into the SourceManager's channel (#1634)
With the introduction of the SourceManager, the chunks channel became
private and read-only. This provides a method to write chunks into the
channel as we transition away from needing to do that.
2023-08-16 16:09:23 -07:00
Miccah
eae66ccf7e
Refactor FragmentLineOffset to match multiline secrets (#1612)
* Refactor FragmentLineOffset to match multiline secrets

* Add tests and benchmarks

* Use bytes.Count and fix an ignore tag edge case
2023-08-14 10:51:41 -07:00
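Note on #1612: one way to find the line on which a possibly multiline secret starts is to locate it with bytes.Index and count the newlines before it with bytes.Count. This is a minimal sketch of that idea, not the actual FragmentLineOffset function; the helper name lineOffset and its signature are hypothetical, and ignore-tag handling is left out.

```go
package main

import (
	"bytes"
	"fmt"
)

// lineOffset reports how many newlines precede the first occurrence of
// secret in data, and whether secret was found at all (hypothetical helper).
func lineOffset(data, secret []byte) (int, bool) {
	idx := bytes.Index(data, secret)
	if idx < 0 {
		return 0, false
	}
	// Counting newlines in the prefix gives a zero-based line offset.
	return bytes.Count(data[:idx], []byte("\n")), true
}

func main() {
	data := []byte("line one\nline two\nkey=BEGIN\nEND\n")
	// The secret spans two lines; the offset points at the line where it starts.
	offset, ok := lineOffset(data, []byte("BEGIN\nEND"))
	fmt.Println(offset, ok) // prints: 2 true
}
```

Searching the whole chunk instead of going line by line is what allows a secret containing newlines to be matched.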
Miccah
1cd600f70f
Use SourceManager in engine (#1586)
* Add SourceManager to Engine struct

* Update Engine methods to use the SourceManager

* Fix GCS test

The original was testing that `Init()` errors weren't surfaced in
`Finish()`, but the `SourceManager` changed that behavior.

* JobProgress race fixes

* Add contextual values

* Remove unused code

* Add debug logs

* Rename WithConcurrency to WithConcurrentSources

* Always forward chunks to the output chunks channel
2023-08-03 13:36:30 -05:00
ahrav
06d2eab204
include scan duration in output log (#1598)
* add scan duration to output log.

* fix linter.
2023-08-02 11:48:29 -07:00
ahrav
5043fc8756
[bug] - Fix unlocking an unlocked mutex (#1583)
* use correct mutex.

* remove unused fxn.
2023-07-31 14:06:41 -07:00
ahrav
5e7a6ca11c
Concurrent detection (#1580)
* Run detection on each chunk concurrently.

* Add printer functionality.

* Add logic for dedupe.

* cleanup.

* Modify number of notifier workers.

* Add comment.

* move consts into fxn.

* buffer results chan.

* fix test.

* address comments.

* return an error from Finish.

* fix test.

* fix test.

* linter.

* check err.

* address comments.
2023-07-31 11:12:08 -07:00
Dustin Decker
10b6e2898d
Increase log level of engine messages (#1576)
2023-07-28 14:30:43 -07:00
Zachary Rice
1a1977f7e6
case insensitive (#1547)
2023-07-25 17:01:15 -05:00
Zachary Rice
85f363f093
init (#1538)
2023-07-24 19:09:57 -05:00
Dustin Decker
fab80445d1
continue scanning on detector / decoder panic (#863)
2023-07-24 07:34:43 -07:00
Miccah
a613bbb979
[chore] Remove parent manipulation in context package (#1525)
The ability to set the parent allowed creating context cycles which
shouldn't be allowed, or at the very least have unintuitive behavior.
2023-07-21 13:51:51 -05:00
Miccah
e8b5e3cea3
Revert "[chore] Remove parent setting / getting in Context wrapper (#1516)" (#1519)
This reverts commit 8ec5e4916c.
This commit is somehow causing AWS verification (and possibly others) to
not work.
2023-07-20 23:31:28 -05:00
Miccah
8ec5e4916c
[chore] Remove parent setting / getting in Context wrapper (#1516)
* [chore] Remove parent setting / getting in Context wrapper

* Keep the cancellable context from errgroup
2023-07-20 13:33:09 -05:00
ahrav
a9213a1103
[chore] - Update loop to switch. (#1487)
* Update loop to switch.

* remove unused fxn.
2023-07-12 15:47:43 -07:00
Zachary Rice
b48ac24c46
Dedupe results (#1479)
* init 4 dedupin

* use raw rather than rawv2

* rm comment

* comments

* nits

* clean up and use rawv2 too

* add decoder order test
2023-07-11 15:48:00 -05:00
Zachary Rice
0bdd513d88
additional similarity check for base64 and plain (#1462)
* additional similarity check for base64 and plain

* use bytes equal

* move logic into util function
2023-07-10 10:12:59 -05:00
Zachary Rice
18a70b64bb
Introduce trufflehog:ignore tag feature (#1433)
* init ignore

* cleanup and add test

* update readme
2023-06-29 08:45:56 -05:00
Brendan Shaklovitz
da5301ea1e
Exit with non-zero exit code on chunk source error (#1286)
* Exit with non-zero exit code on chunk source error

* Exit with a non-zero exit code whenever we hit an error getting
  chunks. Previously the error would be logged but trufflehog would exit
  with a 0 (success) status code.

* fix gcs test

---------

Co-authored-by: Dustin Decker <dustin@trufflesec.com>
Co-authored-by: ahrav <ahravdutta02@gmail.com>
2023-06-26 11:39:57 -05:00
Zachary Rice
74ffbd2878
add a custom detector check for logging duplicate detector (#1394)
* add a custom detector check for logging duplicate detector

* use pb type
2023-06-13 14:49:21 -05:00
Brendan Shaklovitz
584db86031
Support line numbers in filesystem source (#1297)
2023-05-09 08:02:34 -07:00
ahrav
cec1543894
Add utf16 decoder proto. (#1276)
2023-04-20 15:25:36 -07:00
Miccah
dfc5a9f5db
[chore] Log possible duplicate detectors (#1266)
* [chore] Log possible duplicate detectors

* Fix typos
2023-04-18 10:36:00 -05:00
Zachary Rice
1c89e79c2d
Remove toLower call on decoded chunk (#1254)
* remove to lower on decoded data

* clean up
2023-04-14 07:29:32 -05:00
ahrav
0052f60090
Allow for custom verifier (#1070)
* allow for custom verifier.

* Update engine.

* use custom detectors.

* set cap.

* Update verifiers.

* Remove nil check.

* resolved nit

* handle uppercase values

* updating missing url logs

* adding more descriptive variable names

* updating logs to use correct variables

* Removing toLower for urls

* if else nits

* Adding versioning for github and gitlab

---------

Co-authored-by: ahmed <ahmed.zahran@trufflesec.com>
Co-authored-by: ah̳̕mͭͭͨͩ̐e̘ͬ́͋ͬ̊̓͂d <13666360+0x1@users.noreply.github.com>
2023-03-29 12:26:39 -07:00
Zachary Rice
f0b6b5d0d9
add a break statement when iterating through keywords (#1184)
2023-03-15 16:51:03 -05:00
Zachary Rice
4777b77ec6
Keyword optimization (#1144)
* init

* ignore trufflehog binary and added comment

* remove unused keywords in chunk, better comment

* remove keywords from engine struct
2023-03-02 11:32:37 -06:00
Miccah
dd39848709
Add ability to include and exclude detectors (#1106)
* Add ability to include and exclude detectors

* Trim space before checking for empty items

* Explicitly check for integer overflow

* Use strconv.ParseInt instead of strconv.Atoi

* Address comments
2023-02-27 16:46:45 -06:00
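Note on #1106: the bullets about trimming space, checking for overflow, and preferring strconv.ParseInt over strconv.Atoi can be illustrated with a small parser. This is a sketch under assumptions: it treats the input as a comma-separated list of numeric IDs that must fit in 32 bits, and the function name parseDetectorIDs is hypothetical. The actual flag format in trufflehog may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseDetectorIDs parses a comma-separated list of numeric IDs
// (hypothetical helper; the 32-bit limit is an assumption of this sketch).
func parseDetectorIDs(input string) ([]int32, error) {
	var ids []int32
	for _, item := range strings.Split(input, ",") {
		// Trim before the emptiness check so " " counts as empty.
		item = strings.TrimSpace(item)
		if item == "" {
			continue
		}
		// With bitSize 32, ParseInt returns a range error for values that
		// do not fit, so the conversion below cannot silently wrap.
		n, err := strconv.ParseInt(item, 10, 32)
		if err != nil {
			return nil, fmt.Errorf("invalid detector ID %q: %w", item, err)
		}
		ids = append(ids, int32(n))
	}
	return ids, nil
}

func main() {
	fmt.Println(parseDetectorIDs("1, 2, ,3"))
	fmt.Println(parseDetectorIDs("99999999999"))
}
```

strconv.Atoi parses into the platform's int size, so passing an explicit bit size to ParseInt keeps the overflow behavior the same on every platform.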
Miccah
58e8c1e4ac
[chore] Remove logrus from engine package (#1085)
2023-02-09 16:55:19 -06:00
Bill Rich
430d5c764c
Rename and export isGitSource (#1016)
2023-01-10 12:51:58 -08:00
Bill Rich
8b2e1d36cf
Copy metadata for line number aware sources (#1011)
* Copy metadata for line number aware sources

* Improve style
2023-01-10 09:35:44 -08:00
Bill Rich
335ce85ce4
Export line number code (#962)
2022-12-06 15:31:15 -08:00
Bill Rich
d3b24fa592
Replace plain decoder with utf8 (#922)
2022-11-15 09:36:01 -08:00
ahrav
fe029b1098
[THOG-793] - Return all unverified results (#856)
* Remove the check to filter and return only a single unverified result.

* Revert "Remove the check to filter and return only a single unverified result."

This reverts commit 494e432803.

* Add new CLI flag to filter unverified results.
2022-10-31 09:36:10 -07:00
Bill Rich
034ca4fb5b
Add bytes counter to scans (#876)
2022-10-27 12:54:22 -07:00