Commit graph

416 commits

Author SHA1 Message Date
Miccah
7ba880f47a
Add AvailableCapacity method to SourceManager (#1665) 2023-08-29 12:36:44 -07:00
ahrav
2b1b1b5ad0
Add jobID to chunk. (#1721) 2023-08-29 12:02:30 -07:00
ahrav
c51e8f8af5
buffer channel. (#1718) 2023-08-28 18:08:31 -07:00
ahrav
0932ea224b
[chore] - Prevent nil deref panic (#1709) 2023-08-26 20:39:50 -07:00
Miccah
5eb776cd61
Support cancelling a run from a JobProgressRef (#1663) 2023-08-25 10:43:33 -07:00
Cody Rose
33eed42e17
Test S3 role assumption (#1655)
This PR adds a test of the S3 role assumption functionality. It currently only tests role assumption within a single account.
2023-08-25 11:30:08 -04:00
Miccah
61977412df
Add SourceName to JobProgressRef (#1664) 2023-08-25 07:48:25 -07:00
ahrav
4f4a79f62b
Support azure git links (#1662)
* Support azure git links.

* update comment.

* update test names.
2023-08-24 14:36:52 -07:00
Miccah
f2bfcc7ac6
Capture source-reported progress in JobProgress snapshot (#1661) 2023-08-24 11:28:50 -07:00
Miccah
a4401939a8
Add ElapsedTime method to JobProgressMetrics (#1660) 2023-08-24 11:28:33 -07:00
ahrav
a2a7a2087e
[chore] - update comments and logs. (#1654)
* update comments and logs.

* Update github.go
2023-08-23 13:18:07 -07:00
ahrav
9ae72308be
Include the job ID in a chunk (#1652)
* Include the job ID in a source's chunk.

* address comments.

* address comments.
2023-08-22 14:00:27 -07:00
Zubair Khan
fd00d2b30b
add rate limit and consumption metrics for GitHub (#1651)
* add rate limit and consumption metrics

* incrment after each repo scanned

* update repo scanned label name
2023-08-22 15:01:59 -04:00
Cody Rose
059ea23a72
update s3 test bucket (#1649)
We're switching our S3 source test account over to a different one, which means we have to change the bucket name.
2023-08-22 12:43:38 -04:00
Miccah
5cfbde783f
Fix reversed ordering of arguments (#1648)
The source manager initialization function was defined as `sourceID`
followed by `jobID`, while the source initialization function is the
reverse. This is confusing and easy to mix up since the parameters are
the same type.

This commit adds a test to make sure the source manager initializes in
the correct order, but it doesn't prevent the library user to make the
same mistake. We may want to consider using different types.
2023-08-22 07:55:56 -07:00
Zubair Khan
9a13c74a35
add thog CLI support for GitHub config validate (#1626)
* add exportable validate function for github

* update validator

* use the context

* gate to prevent panic

* wrap error with context

* wrap error with context for basic auth and unauth
2023-08-22 10:22:39 -04:00
Cody Rose
dbb2c2e319
wait before finishing s3 test (#1647)
The S3 source test verifies that chunking has completed, but it didn't actually wait for completion first, leading to non-deterministic test failures.
2023-08-21 12:36:36 -04:00
ahrav
d51e3b6d83
Only scan gist comments or repo comments. (#1646) 2023-08-20 11:38:28 -07:00
Mike Vanbuskirk
64dd49f9ce
add role assumption for s3 source (#1477)
* add role assumption for s3 source

* refactor role assumption to repeatable string

user can pass array of roles to assume

* refactor s3 chunks to handle passed roleARNs

* add role-session name

use timestamp to make dynamic

* add docstring for rolearn strings()

* make sure role ars are passed into source

* refactor role assumption functionality

break s3 bucket scanning into sep. function

* add log check on assume role

* fix role iteration

- Make sure s3 struct is populated with roles
- add separate new client instantiation for role-based access
- iterates through each role

* add comment

* protobuf revert for merge

* re-run make proto

* lint cleanup

* cleanup TODOs

* drop redundant switch case in assumerole client

* use less verbose 'ctx' designator

* breakout functionality from Chunks

- separate functions for:
- enumerating buckets to scan
- scanning objects within the buckets

* remake protobuf defs

* allow scan to continue on single bucket err

* add readme docs

* minor fixups
2023-08-17 20:30:20 -04:00
ahrav
0ae8cf5d35
[bug] - handle IOOR panic (#1639)
* handle IOOR panic.

* use a better fxn name.

* increae timeout for test to compete.

* simplify code and add test.

* do it for miccah.
2023-08-17 15:47:11 -07:00
ahrav
b8bb94f2b1
[bug] - copy chunk before sending on chunksChan (#1633)
* Redclare chunk before sending on chunksChan.

* add integration test.

* update test.
2023-08-16 16:36:38 -07:00
Miccah
fae54c7ffa
Add ScanChunk to allow injecting Chunks into the SourceManager's channel (#1634)
With the introduction of the SourceManager, the chunks channel became
private and read-only. This provides a method to write chunks into the
channel as we transition away from needing to do that.
2023-08-16 16:09:23 -07:00
Zubair Khan
db89e345d7
correct logging output for github comments and add oss flags (#1632)
* correct logging output

* add flags

* respect oss cli flags for github comment scanning

* improve copy
2023-08-16 18:23:59 -04:00
joeleonjr
fa9469cfc7
Docker scanning by digest (#1615)
* added functionality to scan docker images with digests instead of tags

* cleaned import statement

* added unit test for baseAndTag parsing + remote digest scan
2023-08-11 16:53:12 -05:00
ahrav
e894540632
Use the common chunker for scanning the filesystem source (#1619)
* Use the common chunker for scanning the filesystem source.

* remove unused conts.

* add test.
2023-08-11 13:40:10 -07:00
Bill Rich
2d2595a2e3
Move commits_scanned to ScanRepo (#1610) 2023-08-07 14:28:57 -07:00
ahrav
13999227b9
Use common chunk reader (#1596)
* Add common chunker.

* add comment.

* use better config name.

* Add common chunk reader to s3.

* Add common chunk reader to git, gcs, circleci.

* revert gcs.

* revert gcs.

* fix chunker.

* revert gcs.

* update cancellablewrite.

* revert impl.

* update to remove totalsize.

* Fix my goof.

* Use unified struct in chunkreader.

* return err instead of logging and returning.

* rename error to err.

* only send single ChunkResult even if there is an error and chunkBytes.

* fix logic.
2023-08-07 12:55:28 -07:00
Miccah
1cd600f70f
Use SourceManager in engine (#1586)
* Add SourceManager to Engine struct

* Update Engine methods to use the SourceManager

* Fix GCS test

The original was testing that `Init()` errors weren't surfaced in
`Finish()`, but the `SourceManager` changed that behavior.

* JobProgress race fixes

* Add contextual values

* Remove unused code

* Add debug logs

* Rename WithConcurrency to WithConcurrentSources

* Always forward chunks to the output chunks channel
2023-08-03 13:36:30 -05:00
Miccah
e322c4b29d
Fix nil pointer dereference to git ScanOptions (#1603) 2023-08-03 12:07:24 -05:00
Savely Krasovsky
d062834997
initial support for bare repositories (#1499)
* feat: initial support for bare repositories

* feat: use concatenation instead of formatting and os.Getenv instead of os.Environ

Signed-off-by: Savely Krasovsky <savely@krasovs.ky>

* fix: go-git update with pre-receive hooks fix

Signed-off-by: Savely Krasovsky <savely@krasovs.ky>

* fix: remove info about pre-receive hook from README.md for now

Signed-off-by: Savely Krasovsky <savely@krasovs.ky>

* fix: don't scan staged while using --bare option, fixes to make it work with the latest master

Signed-off-by: Savely Krasovsky <savely@krasovs.ky>

* fix: small refactor according to #1518

Signed-off-by: Savely Krasovsky <savely@krasovs.ky>

---------

Signed-off-by: Savely Krasovsky <savely@krasovs.ky>
2023-08-03 11:23:41 -05:00
ahrav
5a5e8a607e
Common chunk reader (#1594)
* Add common chunker.

* add comment.

* use better config name.

* Add common chunk reader to s3.

* Add common chunk reader to git, gcs, circleci.

* fix chunker.

* revert gcs.

* update cancellablewrite.

* revert impl.

* update to remove totalsize.
2023-08-03 06:27:33 -07:00
Bill Rich
c995e93dcc
Add commits scanned to log (#1600)
* Add commits scanned to log

* Use atomic
2023-08-02 14:10:54 -07:00
ahrav
78d06658ca
Dont return in loop. (#1589) 2023-08-01 10:29:01 -07:00
Miccah
69021f59c5
Refactor git source to allow ScanOptions and use source in engine (#1518)
* Refactor git source to allow ScanOptions and use source in engine

Refactor the Chunks method of the git Source to call out to two helper
methods: scanRepos and scanDirs which scans s.conn.Repositories and
s.conn.Directories respectively. The only notable change in behavior is
that a credential is no longer necessary if there are no
s.conn.Repositories to scan.

* Preserve ScanGit functionality of not cleaning up temporary files
2023-08-01 09:52:02 -05:00
ahrav
5043fc8756
[bug] - Fix unlocking an unlocked mutex (#1583)
* use correct mutext.

* remove unused fxn.
2023-07-31 14:06:41 -07:00
ahrav
eb00d0d4e1
[bug] - fix data races (#1577)
* fix data race.

* Add test and fix additional data race.

* address comments.
2023-07-31 11:12:38 -07:00
Miccah
a07b6664f8
Support fatal errors in job reports (#1562)
* Support fatal errors in job reports

* WIP: JobReporter and JobInspector

* WIP: JobReportHook and JobReportRef

* Add ChunkError type and asyncRun helper method

* Rename JobReport to JobProgress

* Return a closed channel from Done when the JobProgress is nil

* Comment catchFirstFatal function
2023-07-31 11:28:30 -05:00
Cody Rose
ad57de50cd
Do not nest transports for Github installation client (#1564)
#1454 modified one of the Github enumeration code paths in a way that broke an integration test by causing one client's transport to be used for the construction of a different client, causing authentication failures. This saves the original transport for use, fixing the test.
2023-07-31 11:31:16 -04:00
Richard Gomez
e0faac8d1c
Fix runtime error when scanning Gist comments (#1552)
* fix(github): fix runtime error from gist comments

* fix(github): add flag to scan Gist comments
2023-07-31 08:57:42 -05:00
Miccah
e391e89f3e
Initial implementation of JobReport with SourceManager usage (#1557)
* Initial implementation of JobReport with SourceManager usage

* Limit concurrent units

* Only save the last JobReport per handle
2023-07-27 10:49:56 -05:00
Richard Gomez
46823f77c9
feat(github): clarify comment log statement (#1553) 2023-07-26 09:40:30 -05:00
Miccah
10f0963bc9
Add SourceManager tests for Run and Wait methods (#1530)
* Miscellaneous SourceManager updates

* Own the chunks channel instead of accepting it as an input
* Add Chunks and Wait methods
* Fix bug in Enroll so it actually returns the handle
* Add context.Context parameter to the SourceInitFunc type

* Add SourceManager tests for Run and Wait methods

* Rename man variables to mgr
2023-07-26 00:48:28 -05:00
Richard Gomez
2290954b02
fix(github): use apiEndpoint for basic or no auth (#1454) 2023-07-25 20:03:08 -07:00
Bill Rich
f39303495a
Add commitsScanned metrics (#1533)
* Add commitsScanned metrics

* Just keep commit count
2023-07-25 11:31:01 -07:00
ahrav
b5b01d3eba
[chore] - optimize chunker (#1535)
* Use chunkbytes that includes the size of peek.

* linter.

* continue.

* add TotalChunkSize const.
2023-07-24 19:30:29 -07:00
ahrav
9e0a2e9ddd
[chore] - Remove password info from log (#1528)
* Remove password info from log.

* update.

* one more.
2023-07-22 20:25:45 -07:00
Miccah
91c5472876
Implement SourceManager basics (#1515)
* Implement SourceManager basics

* Rename identifiers and add a default headlessAPI implementation

* Rewrite to use SourceInitFunc

* Update variable name to accurately reflect its value
2023-07-21 15:20:25 -05:00
Miccah
4e774d1f01
Define SourceUnit chunking interface (#1484)
* Define SourceUnit chunking interface

* Refactor to use a ChunkReporter interface

* Rename shadowed err to scanErr
2023-07-13 14:11:43 -05:00
Miccah
4b7f94dea1
Rewrite SourceUnitEnumerator to use UnitReporter instead of a channel (#1485) 2023-07-13 13:48:33 -05:00
Richard Gomez
1594fddf05
feat(git): include line in github & gitlab links (#1466) 2023-07-11 20:02:27 -07:00
Zubair Khan
4334af4d34
scan GitHub PR and issue comments (#1435)
* issue comment scanning

* save progress

* test

* test for pr comment and issue comment

* add pagination support

* linter stuff

* make linter happy

* remove debug log

* readd logging

* github issue resolved

* var const block and handle rate limit

* remove magic number

* make gitURLParse a public function to use more generally

* fix test bug

* make comment scanning OPT-IN
2023-07-11 15:13:33 -04:00
Miccah
5c0ffda618
Define SourceUnit enumeration interface (#1428)
* Add CancellableWrite helper function

* Create SourceUnitEnumerator interface and EnumerationResult struct

* Implement SourceUnitEnumerator for the filesystem Source

* Omit explicit zero values
2023-07-10 15:05:40 -05:00
Zachary Rice
452734adc8
remove head from git diff command, rename unstaged to staged (#1439) 2023-06-29 15:33:30 -05:00
Zachary Rice
4a77688097
use stringer again for now (#1430) 2023-06-26 14:33:54 -05:00
trufflesteeeve
11bff81def
Use url redaction in git (#1399)
Co-authored-by: Zachary Rice <zachary.rice@trufflesec.com>
2023-06-26 13:56:08 -05:00
Brendan Shaklovitz
da5301ea1e
Exit with non-zero exit code on chunk source error (#1286)
* Exit with non-zero exit code on chunk source error

* Exit with a non-zero exit code whenever we hit an error getting
  chunks. Previously the error would be logged but trufflehog would exit
  with a 0 (success) status code.

* fix gcs test

---------

Co-authored-by: Dustin Decker <dustin@trufflesec.com>
Co-authored-by: ahrav <ahravdutta02@gmail.com>
2023-06-26 11:39:57 -05:00
Miccah
f3152b6885
Implement SourceUnitUnmarshaller for all sources (#1416)
* Implement CommonSourceUnitUnmarshaller

* Add SourceUnitUnmarshaller to all sources using

All sources, with the exception of git, will use the CommonSourceUnit as
they only contain a single type of unit to scan.

* Fix method comments to adhere to Go's style guide
2023-06-23 11:15:51 -05:00
Dustin Decker
e856a6890d
🎉 Add Docker image scanning 🎉 (#1412)
* Add Docker source

* Add metrics

* Add test

* Add debugging, address PR comments, fix path output

* review suggestions
2023-06-22 08:02:25 -07:00
dillonstreator
648ef3b52c
fix spelling errors (#1413) 2023-06-21 07:15:28 -07:00
Miccah
e12f0f84a1
Setup SourceUnit interface (#1393)
* Test: Asymmetrical unmarshal API

* Test: Symmetric marshal API

* Revert "Test: Symmetric marshal API"

This reverts commit f51c64a797.

* Cleanup test example and add SourceUnitUnmarshaller interface

* Add CommonSourceUnit implementation

* Update comments

* Remove UnmarshalJSON
2023-06-16 10:38:28 -05:00
Bill Rich
401688d0a8
Add Validator interface and example (#1397)
* Add Validator interface and example

* Close sockets and improve error messages

* Remove duplicate error

* Use var declaration so err slice can be nil
2023-06-15 08:24:32 -07:00
Bill Rich
c2e3e7d53a
Split files instead of using ReadAll (#1387)
* Split files instead of using ReadAll

* Remove dup chunk

* Actually break out of loop
2023-06-12 14:09:05 -07:00
Dustin Decker
572cb0e5dc Loosen up version check for git 2023-06-01 12:17:48 -07:00
Dustin Decker
183037ab34
Check that git meets version requirements (#1373) 2023-06-01 09:41:06 -07:00
Dustin Decker
c8944825de
Surface missing git as an error during initialization (#1362) 2023-05-26 15:23:08 -07:00
ahrav
1da7720912
Replace context.TODO. (#1349) 2023-05-19 11:09:51 -07:00
ahrav
31844b12e3
[oc-313] - Add GitHub metrics (#1324)
* Normalize repos during enumeration.

* fix test.

* Add benchmark.

* Add benchmark.

* Add more realistic benchmark values.

* add gist mocks.

* Remove old normalize fxn.

* abstract away the repo cache.

* update test.

* increase repo count.

* increase page limnit to 100.

* move callee fxns below caller for Chunks.

* Add context to normalize.

* remove extra logic in normalize repo.

* Delete new.txt

* Delete old.txt

* Handle errors in a thread safe manner.

* fix test.'

* fix test.

* handle repos that are included by users.

* Abstract include ignore logic within repoCache.

* Add better comment around repoCache.

* Rename params.

* remove commented out code.

* use repos instead of items.

* remove commented out code.

* Use ++ instead of atomic increment.

* update to use logger var.

* use cache pkg.

* Use separate file for repo logic.

* Address comments.

* fix test.

* make less sucky test.

* Update test.

* Add logs for duration and repo size.

* fix integration test.

* address comment.
2023-05-16 08:45:28 -07:00
Dustin Decker
4250773e92
GitHub basic auth (#1337) 2023-05-15 22:04:42 -07:00
ahrav
6db770fbe5
use md5 hash for checking if key exists. (#1257) 2023-05-15 10:04:14 -07:00
ahrav
948828ba8c
[chore] - move objectManager interface (#1332)
* Relocate the objectManager interface to the consumer package as per Go
best practices.

* address comment.
2023-05-15 09:30:26 -07:00
Brendan Shaklovitz
fad34d4dc6
git worktree scanning fix for #827 (#1315)
* Fix worktree scan by setting EnableDotGitCommonDir

* Change `PlainOpenOptions` to set `EnableDotGitCommonDir` to true.
  In every current usage of this function, it is on an already-cloned
  repository, so it should always be valid to have this set. By doing
  so, it should fix some issues with worktrees.

* Remove unused go.mod replace directives

* Remove replace directives for libraries that are not in use.
2023-05-09 08:00:47 -07:00
Bill Rich
f2924f3061
Make sure context lines are properly handled (#1331)
* Make sure context lines are properly handled

* Fix git test to account for context change
2023-05-05 12:51:27 -07:00
ahrav
030c093392
Fix how we scan orgs (#1327)
* Fix how we scan orgs.

* fix integration test.
2023-05-04 08:07:11 -07:00
Brendan Shaklovitz
be4147a24e
Output git timestamps as UTC times (#1323) 2023-05-03 11:47:00 -05:00
ahrav
323c093818
Normalize GitHub repos during enumeration (#1269)
* Normalize repos during enumeration.

* fix test.

* Add benchmark.

* Add benchmark.

* Add more realistic benchmark values.

* add gist mocks.

* Remove old normalize fxn.

* abstract away the repo cache.

* update test.

* increase repo count.

* increase page limnit to 100.

* move callee fxns below caller for Chunks.

* Add context to normalize.

* remove extra logic in normalize repo.

* Delete new.txt

* Delete old.txt

* Handle errors in a thread safe manner.

* fix test.'

* fix test.

* handle repos that are included by users.

* Abstract include ignore logic within repoCache.

* Add better comment around repoCache.

* Rename params.

* remove commented out code.

* use repos instead of items.

* remove commented out code.

* Use ++ instead of atomic increment.

* update to use logger var.

* use cache pkg.

* Address comments.

* fix test.

* make less sucky test.

* Update test.
2023-05-03 08:35:53 -07:00
ahrav
67972683ea
[chore] - format log msg (#1299)
* format log msg.

* snake.

* lowercase repo.
2023-04-27 17:14:00 -07:00
ahrav
a2266b4e28
add additional logging (#1298)
* add additional logging.

* update test.

* remove continue.

* address comments.
2023-04-27 16:48:04 -07:00
Brendan Shaklovitz
10902f802a
Add max object size flag for s3 bucket scanning (#1294)
Co-authored-by: Dustin Decker <dustin@trufflesec.com>
2023-04-26 15:39:43 -07:00
Dustin Decker
97ce27153a
[]bytes were being logged as b64ed string (#1255) 2023-04-14 06:43:26 -07:00
ahrav
461f1a631e
[chore] - use hex encode vs base64 (#1256)
* use hex encode vs base64.

* fix tests.
2023-04-13 19:16:06 -07:00
ahrav
2fbf86a6ab
Use md5 hash for resuming key (#1203)
* Add in-memory caching lib, used by the GCS source.

* Use cache for tracking progress for the GCS source.

* fix merge issue.

* fix merge issue.

* fix test.

* Fix static check.

* Add test for NewWithData.

* Use cache for tracking progress for the GCS source.

* fix merge issue.

* fix merge issue.

* fix test.

* update comment.

* update comments.

* Use cache for tracking progress for the GCS source.

* fix merge issue.

* fix merge issue.

* fix test.

* Include md5 hash to the object struct.

* remove unused dep.

* address comments.

* Add exists method.

* Use cache for tracking progress for the GCS source.

* fix merge issue.

* fix merge issue.

* fix test.

* rebase.

* fix test.

* Use cache for tracking progress for the GCS source.

* fix merge issue.

* fix merge issue.

* fix test.

* rebase.

* rebase.

* split encode resume by comma.

* update comment.

add comment for shouldCache.

remove redundant return.

* use md5 instead of name.

* update tests.

* Include md5 hash to the object struct.

* use md5 instead of name.

* update tests.

* Use a persistable cache.

* fix merge.

* fix merge.

* Include md5 hash to the object struct.

* use md5 instead of name.

* update tests.

* use md5 instead of name.

* update progress tests.

* use name for log message.

* remove slice operation.
2023-04-13 18:26:45 -07:00
Dustin Decker
1db22599af
update circle test because workflows expire and need re-running (#1251) 2023-04-10 16:21:19 -07:00
ahrav
c451f9daf8
Use persistable cache for GCS progress tracking (#1204)
* Add in-memory caching lib, used by the GCS source.

* Use cache for tracking progress for the GCS source.

* fix merge issue.

* fix merge issue.

* fix test.

* Fix static check.

* Add test for NewWithData.

* Use cache for tracking progress for the GCS source.

* fix merge issue.

* fix merge issue.

* fix test.

* update comment.

* update comments.

* Use cache for tracking progress for the GCS source.

* fix merge issue.

* fix merge issue.

* fix test.

* remove unused dep.

* address comments.

* Add exists method.

* Use cache for tracking progress for the GCS source.

* fix merge issue.

* fix merge issue.

* fix test.

* rebase.

* fix test.

* Use cache for tracking progress for the GCS source.

* fix merge issue.

* fix merge issue.

* fix test.

* rebase.

* rebase.

* split encode resume by comma.

* Use a persistable cache.

* fix merge.

* fix merge.

* Add progress as part of the cache given it will be the persistence layer.

* Add test for making sure the cache doesn't persist when the increment value is not met.

* fix tests.
2023-04-10 07:55:00 -07:00
iamjpotts
b3d917f9c7
Resolve #1167 by adding support for the AWS_SESSION_TOKEN (#1170)
* Resolve #1167 by adding support for the AWS_SESSION_TOKEN environment variable and adding a --session-token cli arg

* fix error message

---------

Co-authored-by: Dustin Decker <dustin@trufflesec.com>
2023-04-03 14:56:43 -07:00
ahrav
2cf6f831d4
Use OAuth2 http client with GCS (#1220)
* Use OAuth2 http client with GCS.

* rename variable.
2023-03-29 19:40:27 -07:00
Zachary Rice
fb9ae75661
Support for exclude globs at the git log level (#1202)
* init

* seems to be working

* better comment

* rm conditional

* Add more context to exclude-globs description
2023-03-28 10:46:03 -05:00
ahrav
ac19de75bf
Delete progress tracking from GCS source (#1190)
* Add in-memory caching lib, used by the GCS source.

* Use cache for tracking progress for the GCS source.

* fix merge issue.

* fix merge issue.

* fix test.

* Fix static check.

* Add test for NewWithData.

* Use cache for tracking progress for the GCS source.

* fix merge issue.

* fix merge issue.

* fix test.

* update comment.

* update comments.

* Use cache for tracking progress for the GCS source.

* fix merge issue.

* fix merge issue.

* fix test.

* remove unused dep.

* address comments.

* Add exists method.

* Use cache for tracking progress for the GCS source.

* fix merge issue.

* fix merge issue.

* fix test.

* rebase.

* fix test.

* Use cache for tracking progress for the GCS source.

* fix merge issue.

* fix merge issue.

* fix test.

* rebase.

* rebase.

* split encode resume by comma.

* update comment.

add comment for shouldCache.

remove redundant return.

* delete old code.

* delete more code.

* update comment.
2023-03-27 10:39:16 -07:00
ahrav
03a534d59f
Use correct date format for Date posted. (#1211) 2023-03-27 10:27:28 -07:00
ahrav
ffbd9c1ead
[chore] - log enumeration duration (#1187)
* log enumeration duration.

* use defer to print enumeration duration stat.

* remove temp var.
2023-03-21 09:14:58 -07:00
ahrav
c617bd7a4e
Add resuming capability to GCS source (#1161)
* Add resuming capability to GCS source.

* Handle no auth scans.

* complete resume logic

* Use custom function type.

* remove functions.

* linter.

* fix test.

* fix test.

* Handle concurrent map writes.

* use string as CLI flag for include/exclude.

* handle emtpy buckets.

* Handle enumeration on initial job run.

* Rename stats to attributes.

* remove redundant return.

* If test fails due to 400, that is fine, it's expected.

* Add unauth GCS source type.

* comments.

* update proto.

* Use short flag.

* address comments.
2023-03-16 17:53:42 -07:00
ahrav
6193509098
add support for json service account and service account file. (#1185) 2023-03-16 13:04:36 -07:00
Miccah
ef9488c77d
[chore] Log git output on error (#1180) 2023-03-15 15:32:29 -05:00
Tim Walter
a7abd6231d
Fix git commit date string formatting (#1181) 2023-03-14 22:39:12 -05:00
Dustin Decker
585bd82d47
update integration test excludes (#1169) 2023-03-10 14:41:29 -08:00
ahrav
cbf299aa77
Add gcs scanning integration (#1153)
* Setup for GCS scanning.

* Update GCS engine w/ projectID req.

* Add concurrency field to gcsManager.

* add errgroup to gcsManager.

* Update gcs manager.

* Use defautl ADC.

* use ADC.'

* Add TOOD.

* add log to iterator completion.

* use a BinaryReader instead of concrete object for channel type.

* initial test for Chunks.

* Add tests for chunking objects.

* Add concurrency.

* update metadata to include content type and acls.

* Add object reading code.

* Add integration test.

* Add entrypoint.

* Add removed wg.Wait().

* remove dead code.

* remove build.

* Remove period from file extension.

* remove used.

* Add comment.

* Setup for GCS scanning.

* Update GCS engine w/ projectID req.

* Add concurrency field to gcsManager.

* add errgroup to gcsManager.

* Update gcs manager.

* Use defautl ADC.

* use ADC.'

* Add TOOD.

* add log to iterator completion.

* use a BinaryReader instead of concrete object for channel type.

* initial test for Chunks.

* Add tests for chunking objects.

* Add concurrency.

* update metadata to include content type and acls.

* Add object reading code.

* Add integration test.

* Add entrypoint.

* Add removed wg.Wait().

* remove dead code.

* remove build.

* remove used.

* Add file type for objects.

* Add check for file type and size.

* Add default file size.

* Add additinoal auth options and remaining CLI flags.

* Handle errors in go routines.

* Handle resuming for buckets.

* Remove redundant words in comment.

* remove ok check on bool check.

* remove extra blank line.

* Add return if handler handles chunk.

* Add comment.

* remove extra blank line.

* cleanup comment.

* Add comment.

* move up fxn.

* go mod tidy.

* Add exclusion to perf testing buckets.

* Handle blocking the channel.

* remove unused const.

* fix tests.

* fix tests.

* Handle gcs manger options better.

* update fxn name.

* Remove arg name.

* ignore buckets in gcsManager test.

* fix test.

* propulate gsManagerOpts.

* inline err check.

* Add readme.

* update readme spelling.

* fix test.
2023-03-07 17:32:04 -08:00
ahrav
aa47e5e248
Only scanned staged git changes. (#1143) 2023-03-01 08:58:36 -08:00
Miccah
6209a80ce1
[chore] Address more linter errors (#1134)
* Address lint errors in detectors

* Update deprecated ioutil call
2023-02-28 10:00:41 -06:00
Miccah
4efe5313f4
[chore] Address lint errors (#1133)
* Update strings.Title to cases.Title

* Migrate go-genproto to google-cloud-go

See: https://github.com/googleapis/google-cloud-go/blob/main/migration.md

* Check error in test

* Check error from sem.Acquire

* Remove unused code
2023-02-27 21:03:47 -06:00
Miccah
d2d03426ed
Implement String for ScanErrors (#1131)
This will concatenate all errors together into a single string. When
possible, it would be better to log the actual errors slice to take
advantage of structured logging.
2023-02-27 21:02:59 -06:00
Miccah
c5b4d6f28b
Support file scanning in filesystem source (#1030)
* Rename directories to paths

* Generate protos

* Add file scanning support to filesystem source

* Add directories back to filesystem proto

* Generate protos

* Combine paths and directories from in source

* Add filesystem filter

* Address comments
2023-02-27 12:15:05 -06:00
Miccah
161e499142
[chore] Remove logrus from trufflehog (#1095)
* [chore] Remove logrus from trufflehog

* Minor fixes

* Fix logFatal call

* Fix logrus call
2023-02-14 17:00:07 -06:00
Miccah
c6826c4574
Fix nil scan options (#1107) 2023-02-14 12:09:45 -06:00
SAYGIN Metin
f2139a7615
Github filter support for exclude and include (#1087)
* test

* Add missing head and base hash back.

---------

Co-authored-by: Ahrav Dutta <ahravdutta02@gmail.com>
2023-02-14 08:40:53 -08:00
ahrav
c5c8d10d28
[chore] - Remove monolithic config struct (#1091)
* REmove monolithic config struct.

* fix broken test.
2023-02-10 12:43:00 -08:00
Miccah
d317ddb51a
[chore] Remove logrus from circleci, filesystem, gitlab, and s3 sources (#1089)
* [chore] Remove logrus from circleci, filesystem, gitlab, and s3 sources

* Address comments
2023-02-10 11:02:55 -06:00
Miccah
0ce72ccda3
[chore] Remove logrus from github source (#1086)
* [chore] Remove logrus from github source

* Fix handleRateLimit test

* Fix tests
2023-02-09 18:02:04 -06:00
ahrav
e47cc2451f
Dont pre-allocate errors slice. (#1083) 2023-02-08 17:33:30 -08:00
Miccah
1f0fd91205
Skip repo and continue scanning when encountering an error (#1080) 2023-02-08 11:33:01 -06:00
ahrav
0d73dbe638
[chore] - Add tests for errors (#1071) 2023-02-08 04:15:44 -08:00
Bill Rich
af6e3f8fdf
Pull gitparse config options out of pkg consts (#1072)
* Pull gitparse config options out of pkg consts.

* Adjust naming
2023-02-04 13:19:23 -08:00
ahrav
8be89a593b
Handle errors in a thread safe manner (#1052)
* Handle errors in a thread safe manner.

* fix test.

* fix linter.

* address comments.
2023-02-02 11:05:33 -08:00
Alexandr Marchenko
b29b78c10d
filesystem support for exclude and include filters (2nd attemp) (#1033)
* fix filter issue - empty lines should be ignored

* filesystem support for filter exclude

Co-authored-by: Dustin Decker <dustin@trufflesec.com>
2023-01-26 09:33:45 -08:00
Bill Rich
00ebb2ed64
Full git log when targeting base merge commit (#1044)
* Full git log when targeting merge commits

* Full log is needed whenever base is specified.
2023-01-26 09:17:54 -08:00
Dustin Decker
4ef546a06b
fix github integration tests (#1042) 2023-01-25 08:57:39 -08:00
ahrav
1621403e11
Add concurrency to CircleCi source (#1029)
* Small cleanup of CircleCi source.

* Add concurrency to circleci.

* merge w/ cleanup branch.

* Rdefine loop var.

* Delete github.go

* reverge file delete.

* Add debug log for scan errors.

* make collecting scanned errors thread safe.

* pre-allocate errors slice.
2023-01-17 12:24:49 -08:00
ahrav
319ae64a02
[chore] - Small cleanup of CircleCi source (#1028)
* Small cleanup of CircleCi source.

* address comments.

* Add context to methods as first param.
2023-01-17 09:36:18 -08:00
Yassine Ilmi
d720c0c0f3
Switch to retryableHttpClient for GitHub AuthN API Client + More Logs (#995)
* Adding missing flags to Readme

* Use retryableHttpClient by default for GitHub

* Adding repoUrl for scanning time log

* Use WithField instead of WithFields

* Updating README with lasted --help output
2023-01-09 09:21:56 -08:00
Dustin Decker
5f6143f09a
Add Circle CI source (#997)
* Add Circle CI source

* remove SHA1 line

* remove trim
2023-01-05 21:44:37 -08:00
ahrav
009756dce6
add proto that was missing. (#986) 2022-12-23 13:27:07 -08:00
ahrav
936a139596
Allow using a glob for include list. (#977)
* Allow using a glob for include list.

* Update command flag.

* Make comment more clear.

* update comment.

* Allow scanning repo and org at the same time.
2022-12-16 13:28:16 -08:00
Bill Rich
36ca2601e0
Add s3 object count to trace logs (#975)
* Add s3 object count to trace logs

* fix debug level
2022-12-13 16:46:09 -08:00
Miccah
7ac7fdae44
Add more logging for git sources (#974) 2022-12-13 17:51:57 -06:00
ahrav
26befdd1ec
[bug] - Handle error when scanning s3 bucket. (#969)
* Handle error when scanning s# bucket.

* move wait outside loop.

* Add logging.

* revert changes.

* remove.

* revert.
2022-12-12 10:10:06 -08:00
Dustin Decker
7de9bdd12d
Support globbing with ignore repos (#967) 2022-12-09 12:10:42 -08:00
ahrav
a72b9feb35
Only scan org with --org flag. (#931) 2022-12-06 16:18:48 -08:00
Bill Rich
33d32d2de4
Don't scan the --since-commit target (#960) 2022-12-06 13:24:27 -08:00
Bill Rich
1a1c2e275e
Change chunker test source (#959)
* Change chunker test source

* Emit chunk if the size isn't 0
2022-12-06 12:45:08 -08:00
Bill Rich
9f99ee470d
Integration test fixes (#956)
* Adjust repo count for new app

* Fix chunk test count
2022-12-06 08:42:24 -08:00
Bill Rich
f1ec9e74eb
Close files to clean up tmp files (#940) 2022-11-22 13:13:34 -08:00
Bill Rich
79cae3b82b
Add newlines when file is split (#937) 2022-11-22 09:01:39 -08:00
Dustin Decker
28dd25beeb
S3 scanner improvements (#938) 2022-11-21 19:15:26 -08:00
Miccah
86f9e1288f
Initialize scan options if given a nil pointer (#924) 2022-11-15 17:01:59 -06:00
Jessica
3d501975e4
Add filter as scan option to gitlab module's git scan (#919) 2022-11-15 13:02:37 -08:00
ahrav
dd141fb55f
[oc-147] - Add context to all git methods (#901)
* Add context to all git methods.

* remove logrus.

* Add ctx.

* Address comments.

* Add error to clone failing.

* Return error.
2022-11-03 16:36:52 -07:00
ahrav
fe1e475a04
Prevent concurrent read and writes to visibility map. (#892) 2022-11-01 16:20:59 -07:00
Bill Rich
965279421c
Support common ssh repo format (#878)
* Try ssh repo format

* Add tests
2022-10-28 11:56:03 -07:00
Bill Rich
ab71b93f7d
Add context to handler (#877)
* Add context to handler

* Return rather than break out of select
2022-10-28 08:57:55 -07:00
Bill Rich
d7d614cc5f
Copy buffer bytes (#864) 2022-10-25 09:09:47 -07:00
Bill Rich
958266ea84
Run chunker in pipeline (#859)
* Run chunker in pipeline

* Move ChunkSize and PeekSize to source package.

* Use new Chunk and Peek size location
2022-10-24 13:57:27 -07:00
Bill Rich
3d5f697f9a
Use line aware chunking for git. (#858) 2022-10-24 13:00:03 -07:00
Dustin Decker
64ace363af Change commit to trace level logging 2022-10-24 08:59:52 -07:00
ahrav
46bc010165
Add tests for including github repos. (#854) 2022-10-21 07:56:36 -07:00
trufflesteeeve
fb56b9f713
Check rate limit when getting github user (#855)
Also, don't fetch a github user or their token when both are known. This
currently only affects the Github Token auth type. Github App
installations will continually fetch tokens every time we clone a repo.
In the future we should check the `ExpiresAt` field of the Github App
token and determine if we need to fetch a new one at that point.
2022-10-20 18:14:28 -04:00
ahrav
029519eb01
[THOG-767] ignore gitlab repos (#853)
* Add ability to ignore repos.

* use std library slices.Contains.

* Add tests.

* Remove zero values from test.
2022-10-19 13:55:44 -07:00
ahrav
2d6aadcb46
[THOG-774] - GitHub ignore repo full name (#848)
* Use github repo full name.

* fix tests.
2022-10-14 09:20:49 -07:00
ahrav
04c9bb535e
[THOG-768] - Add ability to skip scanning Github repos (#846)
* Add ability to skip scanning Github repos.

* remove old change.

* rename method.
2022-10-12 16:28:24 -07:00
Dustin Decker
785cead43e
Ignore URIs where the password is redacted (#842)
Only `*`s in the password is a redacted basic auth URI.
2022-10-11 14:18:52 -07:00
Miccah
2bc4985061
Add SSH config option for the git source (#830)
* Add SSH config option for the git source

The auth message is empty since we use the git binary underneath to
handle the SSH authentication.

* Import digitaloceanv2
2022-09-28 20:40:01 +02:00
Miccah
891996f546
Do not fail scanning if we cannot enumerate gists (#826) 2022-09-27 20:59:10 +02:00
Bill Rich
1c00014051
Include public/private in github metadata (#812)
* Include public/private in github metadata

* CR feedback

* Fix typos and naming
2022-09-26 14:55:46 -07:00
Dustin Decker
97a73710de
403 on listing user gist should not fail org scan (#822) 2022-09-26 14:37:25 -07:00
Dustin Decker
752c848640
Show clone path for git repos (#823) 2022-09-26 14:36:55 -07:00
Bill Rich
e3107ad6bb
Move head and base normalization to source (#818) 2022-09-23 08:58:45 -07:00
ahrav
92f40c2031
[THOG-709] - Recover from detector panics (#810) 2022-09-22 07:01:10 -07:00
trufflesteeeve
63fcf33ce6
Fix improper github org member pagination (#814)
I'm not sure I fully understand why this issue exists. But I think the
short version is this: When we attempted to paginate users, we would set
a variable's Page value. But that variable appears to not actually be a
pointer, despite being added as one. It probably has to do with how
struct embedding works. Either way, if we make the overall options
variable the whole thing, and update its embedded struct with our page
variable, everything works out.
2022-09-21 16:22:42 -07:00
Bill Rich
509cf8b6fa
Use headref and check empty commits for base (#815) 2022-09-21 16:04:01 -07:00
Dustin Decker
335e676caa
Provide user when during private clones with token and fix integration tests (#811) 2022-09-19 15:53:21 -07:00
Bill Rich
593f1e6754
Include apiClient in Github source (#804) 2022-09-19 14:31:48 -07:00
trufflesteeeve
945de06858
Fix include-members not working on github (#773) 2022-09-12 13:26:38 -04:00
Bill Rich
912d8e461d
Add context so to avoid splitting creds. (#791)
* Add context so to avoid splitting creds.

* Add context newlines to expected results
2022-09-09 15:00:33 -07:00
Dustin Decker
ecfdb0105b
Provide correct username for app cloning and add integration test (#786) 2022-09-08 17:41:53 -07:00
Dustin Decker
80b247286b
Improve GitHub debug logging (#784)
* close bodies early

* add more debug logging to github

* fix nil check

* Add nil checks for response
2022-09-08 12:23:40 -07:00
ahrav
7ba583ca40
[THOG-681] - Handle errors sources (#783)
* Handle errors w/ github source.

* Fix loop var captured by func literal.

* Fix loop var captured by func literal.

* Set completed progress if the scan completes with no errors.

* Set progress to 100% if the scope and iteration are both 0.

* Fix commentary.

* Fix test.

* Return after the defer to os.RemoveAll.

* Fix unauth scan.

* Inline range loop.

* update tests for partial scan completion with errors. Ensure correct progress is set.

* Update progress for all sources.

* Update github test.

* Address comments.
2022-09-07 19:40:37 -07:00
Bill Rich
41936169c7
Use gitparse for unstaged changes. (#775) 2022-09-03 18:01:36 -07:00
Bill Rich
d11ce27f33
Use correct reader in filesystem source (#756) 2022-08-30 10:24:52 -07:00
Dustin Decker
fa9479100e
Add common sentry recover library and add into goroutines (#738)
* Add common sentry recover library and add into goroutines

* fix nits
2022-08-29 11:45:37 -07:00
Bill Rich
0ddd49a1b8
Use file handler and common chunker (#707) 2022-08-23 16:35:52 -07:00
Haz
4cc3529bc5
Added support for SSH URIs (#725) 2022-08-23 16:34:34 -07:00
Bill Rich
a0d44a39f1
Use trufflesec git parser (#729)
* Use trufflesec git parser.

* wip

* Fix line numbers and linter feedback
2022-08-23 13:29:20 -07:00
Bill Rich
5ad3bbde37
Use pointer to config (#715) 2022-08-16 09:15:25 -07:00
ahrav
73f9d3f0a0
[chore] - Use config struct instead of pointer for engine scans. (#709)
* Use a config struct instead of pointer when scanning engine sources.

* use config.
2022-08-12 09:56:24 -07:00
Bill Rich
4a93e49eea
Support scanning binary files in git sources (#684)
* Scan binary files for git sources

* Create data chunks in for loop

* Linter feedback and newline commit result

* Use disk buffered reader and chunker function
2022-08-10 16:10:45 -07:00
Bill Rich
a473b9aa99
Use re-readable reader and common chunker (#703)
* Use re-readable reader and common chunker

* Linter feedback

* Break on error
2022-08-10 15:32:49 -07:00
ahrav
dcc102a81c
[Thog-371] Utilize config struct for engine scans (#700)
* Use a config struct when scanning and engine source.

* fix tests.

* Move test_helpers to the sources pkg.

* Handle ScanGit error in tests.

* adderss comments.

* Use functional options.

* Remove temp var.

* Add better var names for the setup functions for each config.

* Remove unused var.

* fix error logs.

* fix error logs.

* single line.

* remove blank lines.
2022-08-10 10:11:13 -07:00
ahrav
30ebe84e3e
[THOG-608] - Fix linter errors. (#701)
* Fix linter errors.

* Fix gist adding test.

* Update test string for mock JSON reply.

* Remove if.
2022-08-09 19:20:02 -07:00
Bill Rich
7273dc9058
Archive decoder (#683)
* Archive decoder

* Fix reader handling

* Seek error handling

* Add tests

* Fix extra empty chunk

* Sync chunk size
2022-08-02 20:36:21 -07:00
ahrav
21e1ff4a8a
Fix the order to correctly match the params in NewGit. (#676) 2022-07-28 13:23:45 -07:00
trufflesteeeve
176552b07a
Fix commit attribution, git tests, and run make protos (#667)
* Update dependency to fix commit attribution, fix git tests

* Run make protos to match code with current proto definitions
2022-07-25 11:44:15 -04:00
trufflesteeeve
96106563a9
Remove git fragment trace (#656)
The fragment trace was a bit too verbose even at the trace level. We may
want to trace the file being chunked or something like that, but not the
entire diff.
2022-07-14 13:13:23 -04:00
trufflesteeeve
e793f4a5e6
Properly count the number of repos after a github scan resume (#625) 2022-06-17 16:21:22 -04:00
trufflesteeeve
10f4d02c31
Allow gitlab to resume from encoded resume info (#611) 2022-06-17 11:45:17 -04:00
Dustin Decker
2178f1f42e reword and fix error logging 2022-06-13 16:14:22 -07:00
trufflesteeeve
e123e9f177
Cleanup individual repositories after scanning (#614) 2022-06-10 14:00:50 -04:00
Dustin Decker
9bcddbc45a
Change GHE org enum to use since ID instead of pages (#618)
* Change GHE org enum to use since ID instead of pages

* fix logging
2022-06-09 15:09:13 -07:00
Dustin Decker
8051b03bbf
improve debug logging for GHE enum (#615) 2022-06-08 13:56:07 -07:00
Dustin Decker
1a12a25f4d
Enumerate all visible orgs in GHE (#612) 2022-06-07 09:24:31 -07:00
Dustin Decker
e3bbf293e2
Fix NPD on mutex (#609)
* Fix NPD on mutex

* fix test
2022-06-06 17:20:27 -07:00
Miccah
9074006695
Fix bug in GitHub unit test mocking (#608) 2022-06-06 16:58:34 -07:00
trufflesteeeve
fd79a367f1
Allow github to resume from encoded resume info (#601) 2022-06-06 12:08:57 -04:00
Miccah
fc18a5ae0c
Bug fix and add authentication in shallow clone (#595) 2022-05-31 20:45:28 -05:00
Miccah
67ad2f2247
Shallow clone if --since-commit is provided (#564)
* Shallow clone if --since-commit is provided

* Set the user before constructing args

* Fix vbout detector

* Address PR comments

* Use a better name for timestamp
* Use net.URL.String method for the remote path
2022-05-24 10:49:03 -05:00
ahrav
2051fe14ff
remove profililing. (#567) 2022-05-23 11:05:39 -07:00
ahrav
d2605354fe
[THOG-332 ]Remove TokenSource interface from the init method of Source. (#539)
* Remove TokenSource interface from the init method of Source.

* Remove proto message.

* Remove proto message.

* Fix tests.

* Fix filesystem test.
2022-05-13 14:35:06 -07:00
ahrav
b0d79180f6
[THOG-314] Add new parameter to the Init method for the source interface. (#529)
* Add new parameter to the Init method for the source interface.

* Add Oauth Token service.

* remove .test file.

* remove .test file.

* Fix param spelling.

* fix tests with new param in init

* Add missing gock lib.
2022-05-10 11:11:43 -07:00
ahrav
e12432cef8
[THOG-315] Replace bytes.buffer with strings.builder. (#533)
* Replace bytes.buffer with string.builder.

* Remove profiling.

* Remove detector changes.

* ignore .test files.

* fix detectors removed.
2022-05-09 17:02:46 -07:00
Miccah
edaf1e1fd3
Move GitHub integration tests behind a build flag and add unit tests (#527)
* Add unit tests and refactor some logic

* Move integration tests to a separate file behind a build flag

* Fix bugs in normalizeRepos

* Address lint errors

* Sort slices before comparing because order doesn't matter
2022-05-09 08:31:00 -07:00
Miccah
85208606bb
Reorganize GitHub source (#517)
* Reorganize GitHub source

This breaks up the Chunks method into smaller sub-method calls to help
organize and better understand the logic flow. No logic has been
modified (except one obvious bug), just shuffling code around.

* Check errors and revert bug fix
2022-05-06 05:00:46 -07:00
Bill Rich
212aa9ba1e
Disable tests that take too long (#524) 2022-05-04 16:37:37 -07:00
Bill Rich
c78120e56f
Syslog source (#500)
* Add syslog source

* only load cert/key with tls

* Cleanup

* Linting

Co-authored-by: Bill Rich <bill.rich@trufflesec.com>
2022-05-04 15:08:11 -07:00
Miccah
71442320ec
Chunk orgs the same when authenticated as unauthenticated (#501)
Also debug log the amount of forks we find in addReposByOrg.
2022-05-02 17:26:01 -07:00