Commit graph

146 commits

Author SHA1 Message Date
Miccah
97f8a4834b
Add metrics for command invocation (#3185) 2024-08-13 08:50:36 -07:00
Miccah
7730fc826b
[analyze] Bandaid solution for occasional slow startups (#3191)
* [analyze] Bandaid solution for occasional slow startups

* Speed up shutdown

* Add link to upstream issue

---------

Co-authored-by: Dustin Decker <dustin@trufflesec.com>
2024-08-06 22:24:58 -07:00
Miccah
59fccbcf3f
Analyze TUI (#3172)
* Setup TUI entrypoint

* Setup key type selector and form pages

* Add basic confirmation component

* Add basic list selector for analyzer type

* Add form page

* Remove quit confirmation

* Add styles

* Add input text redaction

* Add log file input to form

* Fix some bugs and race conditions

* Remove unused code

* Fix filtering bug
2024-08-05 15:00:46 -07:00
ahrav
170f6ab624
enable mutex and block profiler (#3154) 2024-08-02 07:56:09 -07:00
Miccah
eccbca730d
[fix] Always configure the engine with the default detectors (#3152)
If detectors are not wanted by a user, they can be filtered out via
the `--include-detectors` or `--exclude-detectors` flag.
2024-08-02 07:48:39 -07:00
joeleonjr
7d606e2480
CFOR Commit Scanner (#3145)
* alpha feature for scanning hidden commits on github

* improvements re: git operations

* lint updates

* updating with exec block due to no gh token

* reworked logic into new source

* fixed collisions threshold flag input

* fixed IOutil issues

* removed additions from GH config

---------

Co-authored-by: Joe Leon <joe.leon@trufflesec.com>
2024-08-01 23:04:20 -04:00
ahrav
b193febab5
[chore] - move automaxprocs to init (#3143)
* move automaxprocs to init

* revert
2024-08-01 11:31:03 -07:00
ahrav
b56fffb6cd
[chore] - Set GOMAXPROCS (#3136)
* use automaxprocs

* remove newline
2024-07-31 17:10:03 -07:00
Miccah
2424683923
Analyze (#3099)
* Add POC analyze sub-command

* Address lint errors

* [chore] Embed scopes at compile time

* [chore] Move subcommand check up to prevent printing metrics

* added http logging to most analyzers

* Use custom RoundTripper with default http.Client

* Create framework of interfaces, structs, and protos

* Merge main

* Add AnalysisInfo to detectors.Result

* Hide analyze subcommand

* Update gen_proto.sh

* Update protos

* Make protos

* Update analyzer data types

* Rename argument to credentialInfo

---------

Co-authored-by: Joe Leon <joe.leon@trufflesec.com>
2024-07-25 12:06:05 -07:00
joeleonjr
01a1499600
New Source: HuggingFace (#3000)
* initial spike on hf

* added in user and org enum

* adding huggingface source

* updated with lint suggestions

* updated readme

* addressing resources that require org approval to access

* removing unneeded code

* updating with new error msg for 403

* deleted unused code + added resource check in main
2024-06-27 13:22:06 -04:00
ahrav
cb072603dc
Modularize scanning engine (#2887)
* POC: Modularize scanning engine.

* fix typo

* update interface name

* fix tests

* update test

* fix moar tests

* fix bug

* fixes.

* fix merge

* add detector verification overrides

* handle --no-verification flag

* support fp

* add test

* update name

* filter

* update test

* explicit use of detector

* updates
2024-06-13 13:47:09 -07:00
ahrav
ce1ce29b90
[feat] - Optimize detector performance by reducing data passed to regex (#2812)
* optimize maching detetors

* update method name

* updates

* update naming

* updates

* update comment

* updates

* remove testcase

* update default match len to 512

* update

* update test

* add support for multpart cred provider

* add ability to scan entire chunk

* encapsulate matches logic within FindDetectorMatches

* use []byte directly

* nil chunk data

* use []byte

* set hidden flag to true

* remove

* [refactor] - multi part detectors (#2906)

* Detectors beginning w/ a

* Detectors beginning w/ b

* Detectors beginning w/ c

* Detectors beginning w/ d

* Detectors beginning w/ e

* Detectors beginning w/ f

* Detectors beginning w/ f&g

* fix

* Detectors beginning w/ i-l

* Detectors beginning w/ m-p

* Detectors beginning w/ r-s

* Detectors beginning w/ t

* Detectors beginning w/ u-z

* revert alconst

* remaining fixes

* lint

* [feat] - Add Support for `compareDetectionStrategies` Mode (#2918)

* Detector comparison mode

* remove else

* return error if results dont match

* update default hidden flag to not scan entire chunks

* fix tests

* enhance encapsulation by including methods on DetectorMatch to handle merging and extracting

* remove space

* fix

* update detector

* updates

* remove else

* run comparison concurrently
2024-06-05 13:28:19 -07:00
Hon
c1a2019d5b
Add flag to get information if trufflehog being ran from TUI (#1644)
* Add flag to get information if trufflehog being ran from TUI

Co-authored-by: mcastorina <m.castorina93@gmail.com>

* Always use version.BuildVersion

---------

Co-authored-by: mcastorina <m.castorina93@gmail.com>
2024-06-05 10:07:50 -07:00
Dustin Decker
ef410873f2
Add Jenkins scanning (#2892)
* add jenkins

* whoops

* adding unauthenticated jenkins scanning

* update docs

---------

Co-authored-by: Joe Leon <joe.leon@trufflesec.com>
2024-06-04 07:13:14 -04:00
Charlie Gunyon
311494e86e
Elastic adapter (#2727)
* Add stub source and elastic API funcs

* Spawn workers and ship chunks

* Now successfully detects a credential

- Added tests
- Added some documentation comments
- Threaded the passed context through to all the API requests

* Linting fixes

* Add integration tests and resolve some bugs they uncovered

* Logstash -> Elasticsearch

* Add support for --index-pattern

* Add support for --query-json

* Use structs instead of string building to construct a search body

* Support --since-timestamp

* Implement additional authentication methods

* Fix some small bugs

* Refactoring to support --best-effort-scan

* Finish implementation of --best-effort-scan

* Implement scan catch-up

* Finish connecting support for nodes CLI arg

* Add some integration tests around the catchup mechanism

* go mod tidy

* Fix some linting issues

* Remove some debugging Prints

* Move off of _doc

* Remove informational Printf and add informational logging

* Remove debugging logging

* Copy the index from the outer loop as well

* Don't burn up the ES API with rapid requests if there's no work to do in subsequent scans

* No need to export UnitOfWork.AddSearch

* Use a better name for the range query variable when building the timestamp range clause in searches

* Replace some unlocking defers with explicit unlocks to make the synchronized part of the code clearer

* found -> ok

* Remove superfluous buildElasticClient method

---------

Co-authored-by: Charlie Gunyon <charlie@spectral.energy>
2024-05-24 09:38:20 -05:00
ahrav
6df147de58
[feat] - Support bearer auth for docker scans (#2848)
* Support bearer auth for docker scans

* updates

* use no auth by default if no other auth method is provided
2024-05-14 11:30:11 -07:00
Dustin Decker
9d4eb9516f
Update postman flags to be less confusing (#2755)
* Update postman flags to be less confusing

* Update readme

* fmt
2024-05-10 12:30:08 -05:00
ahrav
c7b72b9867
address linter (#2783) 2024-05-08 13:58:50 -07:00
Zachary Rice
d92289de78
adds build version to finished scanning log (#2773) 2024-05-01 11:50:54 -05:00
Dustin Decker
14e44db2be
Move detectors.IsKnownFalsePositive from the detectors and into the engine (#2643)
* Remove detectors.IsKnownFalsePositive from detectors

* Centralize false positive removal in engine

* Don't apply fp filtering on custom regex to preserve previous behavior.

* fix empty branch

* update excludes

* update filtering

* Add result flag option and exclude some detectors
2024-04-22 15:18:04 -07:00
Zachary Rice
20dd450e0b
make postman source public (#2635) 2024-03-27 15:25:55 -05:00
Zachary Rice
b11ce72338
Postman Source (#2579)
postman source

Co-authored-by: Miccah <m.castorina93@gmail.com>

---------

Co-authored-by: Joe Leon <joe.leon@trufflesec.com>
Co-authored-by: Miccah Castorina <m.castorina93@gmail.com>
2024-03-20 11:36:20 -05:00
Richard Gomez
b70b9b9b9f
fix(cli): properly parse --results (#2582) 2024-03-16 01:03:45 -07:00
Richard Gomez
f5025fd382
Add --results flag (#2372)
This is a follow-up to #2107 and #2335. It adds a new (hidden) --results flag that allows a user to show any combination of verified, unverified, and indeterminate secrets.
2024-03-15 10:19:31 -04:00
Cody Rose
28ed81f0a2
Add naive S3 ignorelist (#2536)
This PR adds the ability to exclude buckets from S3 scans. The capability is pretty rudimentary right now, and does not support globbing. If both lists are specified the source to fail to initialize.
2024-03-05 08:01:20 -05:00
Zachary Rice
bccba20d3e
concurrency uint8 to int (#2488)
* concurrency uint8 to uint16

* jk, use int

* git test fix
2024-02-20 09:35:40 -06:00
ahrav
41301bec8a
move clenaup outside the engine (#2475) 2024-02-17 08:06:24 -08:00
Miccah
9642d4c8fd
Add flag to write job reports to disk (#2298)
* Add flag to write job reports to disk

* Fix nil pointer / non-nil interface bug

* Synchronize job report writer goroutine

* Log when the report has been written
2024-02-09 12:30:28 -08:00
Cody Rose
95616b01f9
Disable GitHub wiki scanning by default (#2386)
The new functionality introduced by #2233 runs very slowly; this commits causes the new functionality to not run by default.
2024-02-05 16:59:53 -05:00
ahrav
b2074ad05d
Polite Verification (#2356)
* draft reverify chunks

* remove

* remove

* reduce dupe map cap

* do not verify chunk

* cli arg and use val for dupe lut

* remove counter

* skipp empty results]

* working on test and normalizing val for comparison

* forgot to save file

* optimize normalize

* reuse map

* remove print

* use levenshtein distance to check dupes

* forgot to leave in emptying map

* use slice

* small tweak

* comment

* use bytes

* praise

* use ctx logger

* add len check

* add comments

* use 8x concurrency for reverifier workers

* revert worker count

* use more workers

* process result directly for any collisions

* continue after decoder match for reverifying

* use map

* use map

* otimization and fix the bug.

* revert worker count

* better option naming

* handle identical secrets in chunks

* update comment

* update comment

* fix test

* use DetecotrKey

* rm out of scope tests and testdata

* rename all reverification elements

* don't re-write map entry

* use correct key

* rename worker, remove log val

* test likelydupe, add eq detector check in loop

* add test

* add comment

* add test

* Set verification error

* Update tests

---------

Co-authored-by: Zachary Rice <zachary.rice@trufflesec.com>
Co-authored-by: Dustin Decker <dustin@trufflesec.com>
2024-02-02 09:29:18 -08:00
Richard Gomez
8e90c4e669
Scan GitHub wikis #2233 2024-01-31 10:52:24 -05:00
ahrav
d0c0ba43de
[feat] - Provide CLI flag to only use custom verifiers (#2299)
* Provide CLI flag to only use custom verifiers

* address comments
2024-01-13 16:52:41 -08:00
ahrav
651beff492
[feat] - Allow for the use of include/exclude path files for filesystem scans (#2297)
* Allow for the use of include/exclude path files for filesystem scans

* remove oopsie
2024-01-11 15:41:50 -08:00
ahrav
39f0310f1f
[fixup] - Refactor to Pass Reader for Binary Diffs and Archived Data; Optimize /tmp Directory Cleanup (#2253) 2023-12-22 07:41:54 -08:00
ahrav
328a3f141f
move cleanup to run (#2245) 2023-12-18 20:02:36 -08:00
Mike Vanbuskirk
53f060a08e
Add disk buffer tempfile cleanup (#2130)
* add tempfile creation

- break PID retrieval into sep. function

* add tmpfile cleanup func

* add file cleanup to main cleanup func

* refactor file logic to only return name string

* add temp buffer naming to gcs

* add temp buffer naming to s3

* add temp buffer naming to filesystem

* add temp buffer naming to git

* consolidate cleanup functions

- have single function handle both files and dirs
- remove interface(not needed with a single func implementation)
- change calls to `New(...)` to reflect config implementation
- simplify automation in main.go
- update disk-buffer-reader dependency

* integrate changes from pr #2133

* merge main

* checkout from main to revert conflict issues

* re-add buffer logic to git

* interface no longer needed

* move string format to global const

---------

Co-authored-by: Ahrav Dutta <ahrav.dutta@trufflesec.com>
2023-12-11 18:31:50 -05:00
ahrav
52ffab1034
[chore] - fix import name clashes (#2143)
* fix import name clashes

* fix missing var
2023-12-01 06:53:15 -08:00
Dustin Decker
363ccab316
Simplify temp dir cleaning (#2133)
* Simplify temp dir cleaning

* rename vars

* add test

* update test
2023-11-28 16:42:17 -08:00
ahrav
d334b3075e
move all Git setup into Init method (#2105)
* add proto fields for git

* add uri to proto

* move all git setup into Init method

* fix logic for when to use repoPath
2023-11-16 13:59:53 -08:00
Dustin Decker
3c2270ae65
update kingpin import (#2053) 2023-10-30 10:58:38 -07:00
Dustin Decker
05fae156e1
Add TravisCI source (#1877)
* Add TravisCI source

* update test to use sourcestest

* Remove jobPage loop

ListByBuild does not support pagination, so this was infinitely
repeating. https://developer.travis-ci.com/resource/jobs#find

* Continue chunking on error

* review updates

* update readme

---------

Co-authored-by: Miccah Castorina <m.castorina93@gmail.com>
2023-10-30 07:28:25 -07:00
Cody Rose
876a55821b
Remove verify flag from Aho-Corasick core (#2010)
The Aho-Corasick wrapper we have tracks information about whether verification should be enabled on an individual detector basis, but that functionality isn't related to the matching functionality of Aho-Corasick, and including it complicates the implementation. This PR removes it to simplify some things.

This PR removes some code that supported a potential future implementation of detector-specific verification settings, but that feature has not actually been implemented yet, so there's no loss of functionality. If we want that feature we can add it back on top of this in a more separated way.
2023-10-30 09:52:51 -04:00
Mike Vanbuskirk
4636dc08f6
Add temp directory management (#1878)
* adds func to get scannerPIDs

* add cleanup and call to get pids

* move pid handling to git module

* remove PID logic from main

* refactor testing code to handle different exec name

* cleanup linting errors

* add better logging, fix dir if clause

* some PR fixups

* mod fixup

* add interfaces for helper funcs

* refactor cleanup into main, getPID into git

* lint and test fixups, remove fail on n<2 pids

* simplify pid sorting

* use filepath.Join

* use Args[0] for exec name, fix logger

* formatting fixup

* move functionality into cleantemp pkg

* go mod fixup

* remove redundant testing comment

* fix go.sum issues

* add 15m ticker loop for cleanup

* enclose ticker in function for goroutine defer

fix cleantemp interface

* make time more readable

* add check for non-local Trufflehog PIDs

* allow deletion even if no non-local pids found

* bundle intial cleanup into runCleanup func

* add explicit regex check for tempdir format
2023-10-26 12:28:56 -04:00
Damanpreet Singh
b9f49933b8
Added Support for '-h' Option for Help Documentation (#1901) 2023-10-18 06:57:05 -07:00
Dustin Decker
52ed87edb7
Add an option to filter unverified results using shannon entropy (#1875)
* Add an option to filter unverified results using shannon entropy

* lint

* add test, update test, and optimize
2023-10-08 19:52:28 -07:00
joeleonjr
699547b7d3
consolidated pr and issue descr/comment flags (#1827) 2023-09-27 15:54:02 -04:00
joeleonjr
1e42dae734
added PR and Issue body scanning (#1816)
* added PR and Issue body scanning; adjusted CLI args to fit

* removed print statement from debugging

* removed exclude-commits; adjusted CLI flags

* minor changes to match main branch

* fixing logic

* updating README for --issues and --prs
2023-09-26 12:25:48 -04:00
ahrav
fdeccf06a0
cache dupes w/ different decoders (#1754)
* only cache dupes that have different decoders.

* add test.

* remove file.

* update comment.
2023-09-11 08:18:48 -07:00
Mike Vanbuskirk
64dd49f9ce
add role assumption for s3 source (#1477)
* add role assumption for s3 source

* refactor role assumption to repeatable string

user can pass array of roles to assume

* refactor s3 chunks to handle passed roleARNs

* add role-session name

use timestamp to make dynamic

* add docstring for rolearn strings()

* make sure role ars are passed into source

* refactor role assumption functionality

break s3 bucket scanning into sep. function

* add log check on assume role

* fix role iteration

- Make sure s3 struct is populated with roles
- add separate new client instantiation for role-based access
- iterates through each role

* add comment

* protobuf revert for merge

* re-run make proto

* lint cleanup

* cleanup TODOs

* drop redundant switch case in assumerole client

* use less verbose 'ctx' designator

* breakout functionality from Chunks

- separate functions for:
- enumerating buckets to scan
- scanning objects within the buckets

* remake protobuf defs

* allow scan to continue on single bucket err

* add readme docs

* minor fixups
2023-08-17 20:30:20 -04:00
Zubair Khan
db89e345d7
correct logging output for github comments and add oss flags (#1632)
* correct logging output

* add flags

* respect oss cli flags for github comment scanning

* improve copy
2023-08-16 18:23:59 -04:00