Commit graph

55 commits

Author SHA1 Message Date
ahrav
fd257350dd
[chore] - address linter (#3133)
* addres linter

* fix
2024-07-31 17:30:51 -07:00
Richard Gomez
11e5febeee
feat(git): scan commit metadata (#2754)
This is a follow-up to #2713 that fixes the strange test error.

As suspected, the failure was caused by additional diffs not being included in the test's expected data.
2024-04-29 16:58:45 -04:00
Miccah
fadf9c6286
[chore] Remove broken test (#2748)
This wasn't actually testing the fix, which is more difficult to
orchestrate than is worth.

See: https://github.com/trufflesecurity/trufflehog/pull/2742
2024-04-25 11:27:17 -07:00
ahrav
b430dae83e
[refactor] - lazy buffer retrieval (#2745)
* only create the contentWriter once

* update test

* Lazily fetch buffer from the pool

* fix tests

* fix test

* remove ctx
2024-04-25 08:27:15 -07:00
ahrav
8ceeb5d5a1
[bug] - Refactor newDiff constructor to avoid double initialization of contentWriter (#2742)
* only create the contentWriter once

* update test

* correclty use mock

* remove deprecated pkg
2024-04-25 08:01:38 -07:00
Cody Rose
11452e8a57
Revert "feat(git): scan commit metadata (#2713)" (#2747)
This reverts commit 81a9c813a1.
2024-04-25 10:56:48 -04:00
Richard Gomez
81a9c813a1
feat(git): scan commit metadata (#2713)
This fixes #2683. It scans the commit author, committer (which is typically GitHub <noreply@github.com> for GitHub, but can be different), and message.

It also scans Git notes.
2024-04-25 10:13:09 -04:00
ahrav
4a5fbf8417
[refactor] - Update Write method signature in contentWriter interface (#2721)
* Update write method in contentWriter interface

* fix lint
2024-04-23 08:47:53 -07:00
Richard Gomez
baf7ea1458
feat(gitparse): avoid uneeded calls to strconv.Unquote (#2605) 2024-03-22 08:35:10 -07:00
Richard Gomez
aa862e46bb
fix(git): decode unicode paths (#2585) 2024-03-19 08:50:27 -07:00
Miccah
88c1bb3289
[chore] Increase TestMaxDiffSize timeout (#2472) 2024-02-16 11:09:25 -08:00
ahrav
40bbab8add
[cleanup] - Extract buffer logic (#2409)
* extract the buffer logic into it's own package

* address comments
2024-02-15 11:40:34 -08:00
ahrav
e8006f1bee
2396 since commit stopped working (#2402)
* Ensure we handle commits with no diffs correctly.

* cleanup

* add nil check

* address comments

* move comment

* revert

* add comment
2024-02-13 07:21:22 -08:00
Richard Gomez
3b40c4fa63
Update GitParse to handle quoted binary filenames (#2391)
* fix(gitparse): quoted binary files

* fix(gitparse): use bytes.Cut instead of regexp

* fix lint warning

---------

Co-authored-by: Zachary Rice <zachary.rice@trufflesec.com>
2024-02-08 09:25:04 -06:00
ahrav
7b492a690a
[feat] - use diff chan (#2387)
* use diff chan

* address comments

* add comment

* address comments

* use old ordering

* add correct author line

* Add required *Commit arg to newDiff

* address comments
2024-02-06 10:06:10 -08:00
ahrav
843334222c
[not-fixup] - Reduce memory consumption for Buffered File Writer (#2377)
* correctly use the buffered file writer

* use value from source

* reorder fields

* use only the DetectorKey as a map field

* correctly use the buffered file writer

* use value from source

* reorder fields

* add tests and update

* Fix issue with buffer slices growing

* fix test

* fix

* add singleton

* use shared pool

* optimize

* rename and cleanup

* use correct calculation to grow buffer

* only grow if needed

* address comments

* remove unused

* remove

* rip out Grow

* address coment

* use 2k default buffer

* update comment allow large buffers to be garbage collected
2024-02-06 09:22:25 -08:00
Miccah
01c9ac7b59
Fix binary file hanging bug in git sources (#2388)
Waiting for the sub-command will block until all of `stdout` has been
read. In some cases, we return early due to failed chunking without
reading all of the data, and thus, get stuck waiting for the command to
finish. Closing the pipe will ensure `Wait` does not block on that I/O.
2024-02-05 15:28:49 -08:00
ahrav
135cc3eb69
[fixup] - correctly use the buffered file writer (#2373)
* correctly use the buffered file writer

* use value from source

* reorder fields

* use only the DetectorKey as a map field

* address comments and use factory function

* fix optional params

* remove commented out code
2024-02-05 10:43:55 -08:00
ahrav
9867ce8eb8
Allow for configuring the buffered file writer (#2319)
* Write large diffs to tmp files

* address comments

* Move bufferedfilewriter to own pkg

* update test

* swallow write err

* use buffer pool

* use size vs len

* use interface

* fix test

* update comments

* fix test

* Allow for configuring the buffered file writer

* remove unused

* add missing method

* remove

* remove unused

* move parser and commit struct closer to where they are used

* linter change

* fix snifftest

* address comments

* add more kvp pairs to error

* fix test

* update

* add back missing metadata fields

* address comments

* remove bufferedfile writer

* fix

* address comments

* use unint8

* update interface

* adjust interface

* fix tests

* make linter happy

* fix finalize

* address comments

* update test

* address comments

* lint

* remove guard

* fix test

* fix

* add TODO

* fix tests
2024-01-30 12:51:58 -08:00
ahrav
7c59ff95d5
[feat] - tmp file diffs (#2306)
* Write large diffs to tmp files

* address comments

* Move bufferedfilewriter to own pkg

* update test

* swallow write err

* use buffer pool

* use size vs len

* use interface

* fix test

* update comments

* fix test

* remove unused

* remove

* remove unused

* move parser and commit struct closer to where they are used

* linter change

* add more kvp pairs to error

* fix test

* update

* address comments

* remove bufferedfile writer

* address comments

* adjust interface

* fix finalize

* address comments

* lint

* remove guard

* fix

* add TODO
2024-01-30 12:30:51 -08:00
ahrav
383f8a1f67
[chore] - reduce test time (#2321)
* reduce test time

* remove commented out code
2024-01-22 09:40:32 -08:00
Richard Gomez
241e153dfb
fix(gitparse): handle fromFileLine edge case (#2206) 2024-01-04 14:53:08 -08:00
Richard Gomez
e72fdb62e4
fix(gitparse): don't trim filename (#2201) 2023-12-14 08:29:46 -08:00
Bill Rich
00a00ef651
Fix binary handling (#1999) 2023-10-26 10:07:02 -07:00
Savely Krasovsky
d062834997
initial support for bare repositories (#1499)
* feat: initial support for bare repositories

* feat: use concatenation instead of formatting and os.Getenv instead of os.Environ

Signed-off-by: Savely Krasovsky <savely@krasovs.ky>

* fix: go-git update with pre-receive hooks fix

Signed-off-by: Savely Krasovsky <savely@krasovs.ky>

* fix: remove info about pre-receive hook from README.md for now

Signed-off-by: Savely Krasovsky <savely@krasovs.ky>

* fix: don't scan staged while using --bare option, fixes to make it work with the latest master

Signed-off-by: Savely Krasovsky <savely@krasovs.ky>

* fix: small refactor according to #1518

Signed-off-by: Savely Krasovsky <savely@krasovs.ky>

---------

Signed-off-by: Savely Krasovsky <savely@krasovs.ky>
2023-08-03 11:23:41 -05:00
Miccah
b54683acb9
gitparse: Use an object for currentDiff (#1573)
* gitparse: Use an object for currentDiff instead of a pointer

* gitparse: Use an object for currentCommit instead of a pointer

* Revert "gitparse: Use an object for currentCommit instead of a pointer"

This reverts commit c5f0708b4a.
2023-07-31 11:39:14 -05:00
Miccah
6bd48583ae
Fix gitparse from panicking on a nil-pointer (#1570) 2023-07-28 11:15:02 -05:00
Zachary Rice
3897454dbb
add merge support (#1561) 2023-07-27 09:24:49 -05:00
Richard Gomez
f48a635c34
feat: update gitparse logic (#1486) 2023-07-25 17:52:34 -05:00
Zachary Rice
452734adc8
remove head from git diff command, rename unstaged to staged (#1439) 2023-06-29 15:33:30 -05:00
Zachary Rice
c28c70b399
fix new git file plus plus plus bug (#1386) 2023-06-08 18:29:11 -05:00
ahrav
1da7720912
Replace context.TODO. (#1349) 2023-05-19 11:09:51 -07:00
Bill Rich
f2924f3061
Make sure context lines are properly handled (#1331)
* Make sure context lines are properly handled

* Fix git test to account for context change
2023-05-05 12:51:27 -07:00
ahrav
714c480931
Add log to track git log size (#1325)
* Add log to track git log size.

* Add calc for large commits and last commit.
2023-05-02 16:36:39 -07:00
Zachary Rice
458c79165a
fix extra log messages (#1253)
* fix extra log messages

* add small test, move flag to isindex
2023-04-13 09:53:21 -05:00
Dustin Decker
8f10938bf7
forager requires direct access to gitparse.FromReader (#1233) 2023-04-02 17:54:43 -07:00
Zachary Rice
fb9ae75661
Support for exclude globs at the git log level (#1202)
* init

* seems to be working

* better comment

* rm conditional

* Add more context to exclude-globs description
2023-03-28 10:46:03 -05:00
ahrav
aa47e5e248
Only scanned staged git changes. (#1143) 2023-03-01 08:58:36 -08:00
Bill Rich
ae2d510ced
Gitparse message fix (#1125)
* Fix messages being reused

* Add comment about change.
2023-02-23 15:20:54 -08:00
Bill Rich
f1582aafa9
Drop tabs for filenames with spaces (#1115) 2023-02-16 17:15:32 -08:00
Bill Rich
9158dcaa80
Correctly parse most filenames with ' and ' (#1113) 2023-02-16 14:11:35 -08:00
Miccah
161e499142
[chore] Remove logrus from trufflehog (#1095)
* [chore] Remove logrus from trufflehog

* Minor fixes

* Fix logFatal call

* Fix logrus call
2023-02-14 17:00:07 -06:00
Bill Rich
b37080e6a5
Add max commit size (#1079)
* Add max commit size

* Use common.IsDone

* Use breaks instead of return
2023-02-07 15:25:00 -08:00
Bill Rich
af6e3f8fdf
Pull gitparse config options out of pkg consts (#1072)
* Pull gitparse config options out of pkg consts.

* Adjust naming
2023-02-04 13:19:23 -08:00
Bill Rich
00ebb2ed64
Full git log when targeting base merge commit (#1044)
* Full git log when targeting merge commits

* Full log is needed whenever base is specified.
2023-01-26 09:17:54 -08:00
Bill Rich
ac1dd23d37
Limit diff size to prevent out of control memory use. (#1035)
* Limit diff size to prevent out of control memory use.

* Group consts
2023-01-23 10:14:10 -08:00
Miccah
4aab7b7276
Buffer commit log processing (#845)
Some very large commits take a lot of time to process, which we can make
progress on while we are scanning the contents of other commits.
2022-10-12 14:55:08 -05:00
ahrav
92f40c2031
[THOG-709] - Recover from detector panics (#810) 2022-09-22 07:01:10 -07:00
Bill Rich
912d8e461d
Add context so to avoid splitting creds. (#791)
* Add context so to avoid splitting creds.

* Add context newlines to expected results
2022-09-09 15:00:33 -07:00
Bill Rich
3fe916fe1e
add tests (#785) 2022-09-08 21:46:12 -07:00