trufflehog

mirror of https://github.com/trufflesecurity/trufflehog.git synced 2024-11-10 15:14:38 +00:00

Author	SHA1	Message	Date
ahrav	ead9dd5748	[refactor] - Create separate handler for non-archive data (#2825 ) * Remove specialized handler and archive struct and restructure handlers pkg. * Refactor RPM archive handlers to use a library instead of shelling out * make rpm handling context aware * update test * Refactor AR/deb archive handler to use an existing library instead of shelling out * Update tests * Handle non-archive data within the DefaultHandler * make structs and methods private * Remove non-archive data handling within sources * add max size check * add filename and size to context kvp * move skip file check and is binary check before opening file * fix test * preserve existing funcitonality of not handling non-archive files in HandleFile * Handle non-archive data within the DefaultHandler * rebase * Remove non-archive data handling within sources * Adjust check for rpm/deb archive type * add additional deb mime type * add gzip * move diskbuffered rereader setup into handler pkg * remove DiskBuffereReader creation logic within sources * update comment * move rewind closer * reduce log verbosity * add metrics for file handling * add metrics for errors * make defaultBufferSize a const * add metrics for file handling * add metrics for errors * fix tests * add metrics for max archive depth and skipped files * update error * skip symlinks and dirs * update err * Address incompatible reader to openArchive * remove nil check * fix err assignment * Allow git cat-file blob to complete before trying to handle the file * wrap compReader with DiskbufferReader * Allow git cat-file blob to complete before trying to handle the file * updates * use buffer writer * update * refactor * update context pkg * revert stuff * update test * fix test * remove * use correct reader * add metrics for file handling * add metrics for errors * fix tests * rebase * add metrics for errors * add metrics for max archive depth and skipped files * update error * skip symlinks and dirs * update err * fix err 
assignment * rebase * remove * Update write method in contentWriter interface * Add bufferReadSeekCloser * update name * update comment * fix lint * Remove specialized handler and archive struct and restructure handlers pkg. * Refactor RPM archive handlers to use a library instead of shelling out * make rpm handling context aware * update test * Refactor AR/deb archive handler to use an existing library instead of shelling out * Update tests * add max size check * add filename and size to context kvp * move skip file check and is binary check before opening file * fix test * preserve existing funcitonality of not handling non-archive files in HandleFile * Handle non-archive data within the DefaultHandler * rebase * Remove non-archive data handling within sources * Handle non-archive data within the DefaultHandler * add gzip * move diskbuffered rereader setup into handler pkg * remove DiskBuffereReader creation logic within sources * update comment * move rewind closer * reduce log verbosity * make defaultBufferSize a const * add metrics for file handling * add metrics for errors * fix tests * add metrics for max archive depth and skipped files * update error * skip symlinks and dirs * update err * Address incompatible reader to openArchive * remove nil check * fix err assignment * wrap compReader with DiskbufferReader * Allow git cat-file blob to complete before trying to handle the file * updates * use buffer writer * update * refactor * update context pkg * revert stuff * update test * remove * rebase * go mod tidy * lint check * update metric to ms * update metric * update comments * dont use ptr * update * fix * Remove specialized handler and archive struct and restructure handlers pkg. 
* Refactor RPM archive handlers to use a library instead of shelling out * make rpm handling context aware * update test * Refactor AR/deb archive handler to use an existing library instead of shelling out * Update tests * add max size check * add filename and size to context kvp * move skip file check and is binary check before opening file * fix test * preserve existing funcitonality of not handling non-archive files in HandleFile * Adjust check for rpm/deb archive type * add additional deb mime type * update comment * go mod tidy * update go mod * Add a buffered file reader * update comments * use Buffered File Readder * return buffer * update * fix * return * go mod tidy * merge * use a shared pool * use sync.Once * reorganzie * remove unused code * fix double init * fix stuff * nil check * reduce allocations * updates * update metrics * updates * reset buffer instead of putting it back * skip binaries * skip * concurrently process diffs * close chan * concurrently enumerate orgs * increase workers * ignore pbix and vsdx files * add metrics for gitparse's Diffchan * fix metric * update metrics * update * fix checks * fix * inc * update * reduce * Create workers to handle binary files * modify workers * updates * add check * delete code * use custom reader * rename struct * add nonarchive handler * fix break * add comments * add tests * refactor * remove log * do not scan rpm links * simplify * rename var * rename * fix benchmark * add buffer * buffer * buffer * handle panic * merge main * merge main * add recover * revert stuff * revert * revert to using reader * fixes * remove * update * fixes * linter * fix test * fix comment * update field name * fix	2024-05-15 13:40:16 -07:00
cuiyourong	ead4e8fa2d	chore: fix some typos in comments (#2851 ) Signed-off-by: cuiyourong <cuiyourong@gmail.com>	2024-05-15 07:36:21 -07:00
ahrav	570cec7565	[refactor] - Refactor Archive Handling Logic (#2703 ) * Remove specialized handler and archive struct and restructure handlers pkg. * Refactor RPM archive handlers to use a library instead of shelling out * make rpm handling context aware * update test * Refactor AR/deb archive handler to use an existing library instead of shelling out * Update tests * add max size check * add filename and size to context kvp * move skip file check and is binary check before opening file * fix test * preserve existing funcitonality of not handling non-archive files in HandleFile * Adjust check for rpm/deb archive type * add additional deb mime type * update comment * Remove specialized handler and archive struct and restructure handlers pkg. * Refactor RPM archive handlers to use a library instead of shelling out * make rpm handling context aware * update test * Refactor AR/deb archive handler to use an existing library instead of shelling out * Update tests * add max size check * add filename and size to context kvp * move skip file check and is binary check before opening file * fix test * preserve existing funcitonality of not handling non-archive files in HandleFile * Adjust check for rpm/deb archive type * add additional deb mime type * update comment * go mod tidy * update go mod * go mod tidy * add comment * update max depth check to > * go mod tidy * rename * [refactor] - Refactor Archive Handling Logic - Part 4: Non-Archive Data Handling and Cleanup (#2704) * Handle non-archive data within the DefaultHandler * make structs and methods private * Remove non-archive data handling within sources * Handle non-archive data within the DefaultHandler * rebase * Remove non-archive data handling within sources * add gzip * move diskbuffered rereader setup into handler pkg * remove DiskBuffereReader creation logic within sources * move rewind closer * reduce log verbosity * make defaultBufferSize a const * use correct reader * address comments * update test * [feat] 
- Add Prometheus Metrics for File Handlers (#2705) * add metrics for file handling * add metrics for errors * add metrics for file handling * add metrics for errors * fix tests * add metrics for max archive depth and skipped files * update error * skip symlinks and dirs * update err * fix err assignment * add metrics for file handling * add metrics for errors * fix tests * rebase * add metrics for errors * add metrics for max archive depth and skipped files * update error * skip symlinks and dirs * update err * fix err assignment * rebase * remove * update metric to ms * update comments * address comments * reduce indentations * add metrics for archive depth * [bug] - Enhanced Archive Handling to Address Interface Constraints (#2710) * add metrics for file handling * add metrics for errors * add metrics for file handling * add metrics for errors * fix tests * add metrics for max archive depth and skipped files * update error * skip symlinks and dirs * update err * Address incompatible reader to openArchive * remove nil check * fix err assignment * wrap compReader with DiskbufferReader * add metrics for file handling * add metrics for errors * fix tests * rebase * add metrics for errors * add metrics for max archive depth and skipped files * update error * skip symlinks and dirs * update err * fix err assignment * rebase * remove * update metric to ms * update comments * address comments * reduce indentations * replace diskbuffereader with bufferedfilereader * updtes * add metric back * [bug] - Fix bug and simplify git cat-file command execution and output handling (#2719) * add metrics for file handling * add metrics for errors * add metrics for file handling * add metrics for errors * fix tests * add metrics for max archive depth and skipped files * update error * skip symlinks and dirs * update err * Address incompatible reader to openArchive * remove nil check * fix err assignment * Allow git cat-file blob to complete before trying to handle the file * wrap 
compReader with DiskbufferReader * Allow git cat-file blob to complete before trying to handle the file * updates * revert stuff * update test * remove * add metrics for file handling * add metrics for errors * fix tests * rebase * add metrics for errors * add metrics for max archive depth and skipped files * update error * skip symlinks and dirs * update err * fix err assignment * rebase * remove * update metric to ms * update comments * address comments * reduce indentations * inline	2024-05-10 11:36:06 -07:00
Cody Rose	28ed81f0a2	Add naive S3 ignorelist (#2536) This PR adds the ability to exclude buckets from S3 scans. The capability is pretty rudimentary right now and does not support globbing. If both lists are specified, the source fails to initialize.	2024-03-05 08:01:20 -05:00
Mike Vanbuskirk	53f060a08e	Add disk buffer tempfile cleanup (#2130 ) * add tempfile creation - break PID retrieval into sep. function * add tmpfile cleanup func * add file cleanup to main cleanup func * refactor file logic to only return name string * add temp buffer naming to gcs * add temp buffer naming to s3 * add temp buffer naming to filesystem * add temp buffer naming to git * consolidate cleanup functions - have single function handle both files and dirs - remove interface(not needed with a single func implementation) - change calls to `New(...)` to reflect config implementation - simplify automation in main.go - update disk-buffer-reader dependency * integrate changes from pr #2133 * merge main * checkout from main to revert conflict issues * re-add buffer logic to git * interface no longer needed * move string format to global const --------- Co-authored-by: Ahrav Dutta <ahrav.dutta@trufflesec.com>	2023-12-11 18:31:50 -05:00
ahrav	52ffab1034	[chore] - fix import name clashes (#2143 ) * fix import name clashes * fix missing var	2023-12-01 06:53:15 -08:00
Miccah	52600a897a	[chore] Replace chunks channel with ChunkReporter in git based sources (#2082 ) ChunkReporter is more flexible and will allow code reuse for unit chunking. ChanReporter was added as a way to maintain the original channel functionality, so this PR should not alter existing behavior.	2023-11-01 09:22:44 -07:00
Bill Rich	c5efa870ff	Use latest dbr (#1955 )	2023-10-24 07:52:49 -07:00
Cody Rose	e9efed85c2	Use S3 credentials waterfall (#1823 ) This PR updates the S3 source to use explicitly configured credentials if they're available and follow the normal AWS credentials waterfall if they're not. This is irrespective of whether role assumption is configured. This changes the previous behavior, which was to use waterfall credentials only if role assumption was configured and explicitly configured credentials only when it was not.	2023-09-27 16:57:47 -04:00
Miccah	dbcb888063	Update Source interface to use SourceID and JobID types (#1774 ) The previous implementation used int64 for both, which can be mixed up easily. Using distinct types adds a layer of type safety checked by the compiler.	2023-09-14 11:28:24 -07:00
Miccah	72b6a9ec6b	Add a SourceType constant to all source packages (#1768 )	2023-09-12 17:23:25 -07:00
Mike Vanbuskirk	de540652cb	verbosity updates to s3 source (#1750 )	2023-09-11 14:53:43 -05:00
ahrav	2a9f34962d	Add optional param to Chunks (#1747 ) * Add interface for targeted chunking. * use optional args. * update Chunks method signature. * update tests. * fix test. * update QueryCriteria type.	2023-09-07 09:03:37 -07:00
Cody Rose	afe708519b	Validate S3 source (#1715 ) This PR adds S3 source validation. This is accomplished by factoring out common "bucket visiting" logic to be used by both scanning and validation.	2023-09-05 10:18:58 -04:00
Cody Rose	a2c0abbfd6	Unify S3 client creation logic (#1657 ) This PR unifies some code paths within the S3 source. This is being done to better support a future implementation of S3 source validation; less code that runs means less code to validate. The logical change is to move the handling of "role-less" operation down the call tree, which allows for a single code path for more of the S3 code. This PR also fixes a bug that would occur in the (rare) case that the source couldn't create a regional S3 client. Before, an error would be logged, but it would be followed by a panic. Now the bucket in question is skipped.	2023-08-30 17:49:37 -04:00
ahrav	2b1b1b5ad0	Add jobID to chunk. (#1721 )	2023-08-29 12:02:30 -07:00
Cody Rose	33eed42e17	Test S3 role assumption (#1655 ) This PR adds a test of the S3 role assumption functionality. It currently only tests role assumption within a single account.	2023-08-25 11:30:08 -04:00
Cody Rose	059ea23a72	update s3 test bucket (#1649 ) We're switching our S3 source test account over to a different one, which means we have to change the bucket name.	2023-08-22 12:43:38 -04:00
Cody Rose	dbb2c2e319	wait before finishing s3 test (#1647 ) The S3 source test verifies that chunking has completed, but it didn't actually wait for completion first, leading to non-deterministic test failures.	2023-08-21 12:36:36 -04:00
Mike Vanbuskirk	64dd49f9ce	add role assumption for s3 source (#1477) * add role assumption for s3 source * refactor role assumption to repeatable string user can pass array of roles to assume * refactor s3 chunks to handle passed roleARNs * add role-session name use timestamp to make dynamic * add docstring for rolearn strings() * make sure role ARNs are passed into source * refactor role assumption functionality break s3 bucket scanning into sep. function * add log check on assume role * fix role iteration - Make sure s3 struct is populated with roles - add separate new client instantiation for role-based access - iterates through each role * add comment * protobuf revert for merge * re-run make proto * lint cleanup * cleanup TODOs * drop redundant switch case in assumerole client * use less verbose 'ctx' designator * breakout functionality from Chunks - separate functions for: - enumerating buckets to scan - scanning objects within the buckets * remake protobuf defs * allow scan to continue on single bucket err * add readme docs * minor fixups	2023-08-17 20:30:20 -04:00
ahrav	b8bb94f2b1	[bug] - copy chunk before sending on chunksChan (#1633) * Redeclare chunk before sending on chunksChan. * add integration test. * update test.	2023-08-16 16:36:38 -07:00
ahrav	13999227b9	Use common chunk reader (#1596 ) * Add common chunker. * add comment. * use better config name. * Add common chunk reader to s3. * Add common chunk reader to git, gcs, circleci. * revert gcs. * revert gcs. * fix chunker. * revert gcs. * update cancellablewrite. * revert impl. * update to remove totalsize. * Fix my goof. * Use unified struct in chunkreader. * return err instead of logging and returning. * rename error to err. * only send single ChunkResult even if there is an error and chunkBytes. * fix logic.	2023-08-07 12:55:28 -07:00
ahrav	78d06658ca	Dont return in loop. (#1589 )	2023-08-01 10:29:01 -07:00
Brendan Shaklovitz	da5301ea1e	Exit with non-zero exit code on chunk source error (#1286 ) * Exit with non-zero exit code on chunk source error * Exit with a non-zero exit code whenever we hit an error getting chunks. Previously the error would be logged but trufflehog would exit with a 0 (success) status code. * fix gcs test --------- Co-authored-by: Dustin Decker <dustin@trufflesec.com> Co-authored-by: ahrav <ahravdutta02@gmail.com>	2023-06-26 11:39:57 -05:00
Miccah	f3152b6885	Implement SourceUnitUnmarshaller for all sources (#1416 ) * Implement CommonSourceUnitUnmarshaller * Add SourceUnitUnmarshaller to all sources using All sources, with the exception of git, will use the CommonSourceUnit as they only contain a single type of unit to scan. * Fix method comments to adhere to Go's style guide	2023-06-23 11:15:51 -05:00
Brendan Shaklovitz	10902f802a	Add max object size flag for s3 bucket scanning (#1294 ) Co-authored-by: Dustin Decker <dustin@trufflesec.com>	2023-04-26 15:39:43 -07:00
iamjpotts	b3d917f9c7	Resolve #1167 by adding support for the AWS_SESSION_TOKEN (#1170 ) * Resolve #1167 by adding support for the AWS_SESSION_TOKEN environment variable and adding a --session-token cli arg * fix error message --------- Co-authored-by: Dustin Decker <dustin@trufflesec.com>	2023-04-03 14:56:43 -07:00
Miccah	d317ddb51a	[chore] Remove logrus from circleci, filesystem, gitlab, and s3 sources (#1089 ) * [chore] Remove logrus from circleci, filesystem, gitlab, and s3 sources * Address comments	2023-02-10 11:02:55 -06:00
ahrav	8be89a593b	Handle errors in a thread safe manner (#1052 ) * Handle errors in a thread safe manner. * fix test. * fix linter. * address comments.	2023-02-02 11:05:33 -08:00
ahrav	009756dce6	add proto that was missing. (#986 )	2022-12-23 13:27:07 -08:00
Bill Rich	36ca2601e0	Add s3 object count to trace logs (#975 ) * Add s3 object count to trace logs * fix debug level	2022-12-13 16:46:09 -08:00
ahrav	26befdd1ec	[bug] - Handle error when scanning s3 bucket. (#969) * Handle error when scanning s3 bucket. * move wait outside loop. * Add logging. * revert changes. * remove. * revert.	2022-12-12 10:10:06 -08:00
Bill Rich	f1ec9e74eb	Close files to clean up tmp files (#940 )	2022-11-22 13:13:34 -08:00
Dustin Decker	28dd25beeb	S3 scanner improvements (#938 )	2022-11-21 19:15:26 -08:00
Bill Rich	ab71b93f7d	Add context to handler (#877 ) * Add context to handler * Return rather than break out of select	2022-10-28 08:57:55 -07:00
Bill Rich	958266ea84	Run chunker in pipeline (#859 ) * Run chunker in pipeline * Move ChunkSize and PeekSize to source package. * Use new Chunk and Peek size location	2022-10-24 13:57:27 -07:00
ahrav	92f40c2031	[THOG-709] - Recover from detector panics (#810 )	2022-09-22 07:01:10 -07:00
ahrav	7ba583ca40	[THOG-681] - Handle errors sources (#783 ) * Handle errors w/ github source. * Fix loop var captured by func literal. * Fix loop var captured by func literal. * Set completed progress if the scan completes with no errors. * Set progress to 100% if the scope and iteration are both 0. * Fix commentary. * Fix test. * Return after the defer to os.RemoveAll. * Fix unauth scan. * Inline range loop. * update tests for partial scan completion with errors. Ensure correct progress is set. * Update progress for all sources. * Update github test. * Address comments.	2022-09-07 19:40:37 -07:00
Dustin Decker	fa9479100e	Add common sentry recover library and add into goroutines (#738 ) * Add common sentry recover library and add into goroutines * fix nits	2022-08-29 11:45:37 -07:00
Bill Rich	0ddd49a1b8	Use file handler and common chunker (#707 )	2022-08-23 16:35:52 -07:00
ahrav	dcc102a81c	[Thog-371] Utilize config struct for engine scans (#700) * Use a config struct when scanning and engine source. * fix tests. * Move test_helpers to the sources pkg. * Handle ScanGit error in tests. * address comments. * Use functional options. * Remove temp var. * Add better var names for the setup functions for each config. * Remove unused var. * fix error logs. * fix error logs. * single line. * remove blank lines.	2022-08-10 10:11:13 -07:00
Dustin Decker	2178f1f42e	reword and fix error logging	2022-06-13 16:14:22 -07:00
ahrav	d2605354fe	[THOG-332] Remove TokenSource interface from the init method of Source. (#539) * Remove TokenSource interface from the init method of Source. * Remove proto message. * Remove proto message. * Fix tests. * Fix filesystem test.	2022-05-13 14:35:06 -07:00
ahrav	b0d79180f6	[THOG-314] Add new parameter to the Init method for the source interface. (#529 ) * Add new parameter to the Init method for the source interface. * Add Oauth Token service. * remove .test file. * remove .test file. * Fix param spelling. * fix tests with new param in init * Add missing gock lib.	2022-05-10 11:11:43 -07:00
steeeve	a770f643df	Add placeholder for encoded resume info in SetProgressComplete	2022-03-24 12:43:36 -04:00
Bill Rich	6486c18565	Add s3 support to CLI (#76 ) * Add s3 support to CLI * Clean up comments Co-authored-by: Dustin Decker <dustin@trufflesec.com>	2022-03-14 17:07:07 -07:00

46 commits
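
Commit dbcb888063 above notes that using one `int64` for both the source ID and the job ID made the two easy to mix up, and that distinct types let the compiler catch the mistake. A minimal sketch of that idea follows. The type names and the `int64` underlying type come from the commit message; the `describe` helper and the values are hypothetical and are not taken from the trufflehog code base.

```go
package main

import "fmt"

// SourceID and JobID are distinct named types over int64 (the underlying
// type is taken from the commit message). Because they are separate types,
// a value of one cannot be passed where the other is expected without an
// explicit conversion.
type SourceID int64
type JobID int64

// describe is a hypothetical helper used only to illustrate the point;
// it is not part of trufflehog.
func describe(sourceID SourceID, jobID JobID) string {
	return fmt.Sprintf("source=%d job=%d", sourceID, jobID)
}

func main() {
	var (
		src SourceID = 7
		job JobID    = 42
	)

	fmt.Println(describe(src, job))

	// Swapping the arguments is rejected at compile time:
	//   describe(job, src) // cannot use job (variable of type JobID) as SourceID value
	// With two plain int64 parameters the same swap would compile silently.
}
```

Untyped constants such as `7` still convert implicitly, so the protection applies to typed variables and function results rather than to literals.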