Commit graph

2289 commits

Author SHA1 Message Date
Miccah
c60443891b
Add Display method to SourceUnit and Kind member to the CommonSourceUnit (#2450)
* Add Display method to SourceUnit and Kind member to the CommonSourceUnit

* Make SourceUnitID return the ID and a kind

These two values together uniquely represent a unit.
2024-02-20 11:24:13 -08:00
Zachary Rice
bccba20d3e
concurrency uint8 to int (#2488)
* concurrency uint8 to uint16

* jk, use int

* git test fix
2024-02-20 09:35:40 -06:00
ahrav
5290023c2d
use read full (#2474) 2024-02-20 07:21:16 -08:00
ahrav
afccf2cf5f
[chore] - upgrade lru cache version (#2487) 2024-02-19 18:07:31 -08:00
ahrav
41301bec8a
move clenaup outside the engine (#2475) 2024-02-17 08:06:24 -08:00
ahrav
5c313c14db
tighten keyword match (#2473) 2024-02-16 13:38:07 -08:00
Miccah
88c1bb3289
[chore] Increase TestMaxDiffSize timeout (#2472) 2024-02-16 11:09:25 -08:00
Zachary Rice
834163acf5
add lazy quantifier to prefixregex (#2466) 2024-02-15 17:08:27 -06:00
ahrav
40bbab8add
[cleanup] - Extract buffer logic (#2409)
* extract the buffer logic into it's own package

* address comments
2024-02-15 11:40:34 -08:00
ahrav
5568b2e0a6
update gitlab proto (#2469)
* update gitlab proto

* update protos
2024-02-15 10:23:41 -08:00
Zachary Rice
bd729ce48e
add missing prefixregex (#2468) 2024-02-15 07:13:57 -06:00
Dustin Decker
a9817a3292
Remove some noisy / less useful detectors (#2467) 2024-02-14 15:27:03 -08:00
Miccah
216a29d7cf
[chore] Add some doc comments to source manager (#2434) 2024-02-13 07:54:48 -08:00
ahrav
e8006f1bee
2396 since commit stopped working (#2402)
* Ensure we handle commits with no diffs correctly.

* cleanup

* add nil check

* address comments

* move comment

* revert

* add comment
2024-02-13 07:21:22 -08:00
Richard Gomez
9572628dc6
chore(gcp): ignore known test creds (#2413) 2024-02-12 10:29:00 -06:00
Miccah
74f1553e06
[fix] Add unit information to error returned by ChunkUnit (#2410) 2024-02-12 08:24:31 -08:00
renovate[bot]
af6099665f
fix(deps): update module github.com/charmbracelet/bubbletea to v0.25.0 (#2326)
* fix(deps): update module github.com/charmbracelet/bubbletea to v0.25.0

* Remove deprecated and unused mouse events

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Miccah Castorina <m.castorina93@gmail.com>
2024-02-11 12:11:46 -08:00
Miccah
4acf3ccb80
[chore] Ensure Postgres detector respects context deadline (#2408) 2024-02-10 23:32:05 -08:00
Miccah
8f01326468
[chore] Rename file to legacy_reporters.go (#2406) 2024-02-09 18:17:46 -08:00
Miccah
9642d4c8fd
Add flag to write job reports to disk (#2298)
* Add flag to write job reports to disk

* Fix nil pointer / non-nil interface bug

* Synchronize job report writer goroutine

* Log when the report has been written
2024-02-09 12:30:28 -08:00
Miccah
aace92b64d
Implement SourceUnitEnumChunker for GitLab (#2367)
* Implement SourceUnitEnumChunker for GitLab

* Add GitLab engine integration test

* Use a SliceReporter instead of checking for nil reporters

* Use more generic VisitorReporter

* Merge logic from getReposFromGitlab into getAllProjectRepos

* Update integration test to have a lower bound

Unfortunately, the GitLab integration test does not appear to be
deterministic. Sometimes 36390 chunks are found, sometimes 36312, or
even lower.
2024-02-09 11:06:31 -08:00
Miccah
dd4d4a8a96
Refactor UnitHook to block the scan if finished metrics aren't handled (#2309)
* Refactor UnitHook to block the scan if finished metrics aren't handled

* Log once when back-pressure is detected

* Add hook channel size metric

* Use plural "metrics" for consistency

* Replace LRU cache with map
2024-02-08 14:50:58 -08:00
ahrav
6557b3b321
[feat] - buffered file writer metrics (#2395)
* use diff chan

* correctly use the buffered file writer

* use value from source

* reorder fields

* add tests and update

* Fix issue with buffer slices growing

* fix test

* correctly use the buffered file writer

* use value from source

* reorder fields

* fix

* add singleton

* use shared pool

* optimize

* rename and cleanup

* add metrics

* add print

* rebase

* remove extra inc

* add metrics for checkout time

* add comment

* use microseconds

* add metrics

* add metrics pkg

* add more metrics

* rever test

* remove fields

* fix

* resize and return

* update metric name

* remove comment

* address comments

* add comment
2024-02-08 07:38:40 -08:00
Richard Gomez
3b40c4fa63
Update GitParse to handle quoted binary filenames (#2391)
* fix(gitparse): quoted binary files

* fix(gitparse): use bytes.Cut instead of regexp

* fix lint warning

---------

Co-authored-by: Zachary Rice <zachary.rice@trufflesec.com>
2024-02-08 09:25:04 -06:00
Dustin Decker
a00ffe9522
Allow multiple domains for Forager (#2400) 2024-02-08 07:08:30 -08:00
ahrav
bbf1decb39
prevent concurrent map writes (#2399) 2024-02-07 17:45:06 -08:00
Ryan Jacobchick
7296bcdc5d
Allow CLI version pinning in GHA (#2397) (#2398)
* Allow CLI version pinning in GHA (#2397)

* prevent segfault in test-community
2024-02-07 16:58:04 -06:00
Richard Gomez
b3ff12d1e9
Fix handling of GitHub ratelimit information (#2041)
This is a follow-up to #1912, which used the headers from the response to determine rate-limiting information, instead of using the values from RateLimitError.Rate. Although that logic seemed solid, I discovered that it did not work in some circumstances. This lead to the "unexpected" path more often than intended, and periodic instances where requests would be made before the ratelimit was refreshed.
2024-02-07 09:11:12 -05:00
ahrav
7b492a690a
[feat] - use diff chan (#2387)
* use diff chan

* address comments

* add comment

* address comments

* use old ordering

* add correct author line

* Add required *Commit arg to newDiff

* address comments
2024-02-06 10:06:10 -08:00
ahrav
843334222c
[not-fixup] - Reduce memory consumption for Buffered File Writer (#2377)
* correctly use the buffered file writer

* use value from source

* reorder fields

* use only the DetectorKey as a map field

* correctly use the buffered file writer

* use value from source

* reorder fields

* add tests and update

* Fix issue with buffer slices growing

* fix test

* fix

* add singleton

* use shared pool

* optimize

* rename and cleanup

* use correct calculation to grow buffer

* only grow if needed

* address comments

* remove unused

* remove

* rip out Grow

* address coment

* use 2k default buffer

* update comment allow large buffers to be garbage collected
2024-02-06 09:22:25 -08:00
Richard Gomez
8104611d6e
fix: case-insensitive ext check (#2383) 2024-02-06 10:13:53 -05:00
dylanTruffle
901c851698
tightening opsgenie detection and verification (#2389)
Co-authored-by: Dylan Ayrey <dylan@Dylans-MacBook-Pro.local>
2024-02-05 17:31:09 -08:00
Miccah
01c9ac7b59
Fix binary file hanging bug in git sources (#2388)
Waiting for the sub-command will block until all of `stdout` has been
read. In some cases, we return early due to failed chunking without
reading all of the data, and thus, get stuck waiting for the command to
finish. Closing the pipe will ensure `Wait` does not block on that I/O.
2024-02-05 15:28:49 -08:00
ahrav
135cc3eb69
[fixup] - correctly use the buffered file writer (#2373)
* correctly use the buffered file writer

* use value from source

* reorder fields

* use only the DetectorKey as a map field

* address comments and use factory function

* fix optional params

* remove commented out code
2024-02-05 10:43:55 -08:00
ahrav
28d079bdad
use only the DetectorKey as a map field (#2374) 2024-02-05 06:53:08 -08:00
ahrav
a22874f9f0
[feat] - concurently scan the filesystem source (#2364)
* concurently scan the filesystem source

Co-authored-by: Miccah Castorina <m.castorina93@gmail.com>

* fix test

* update test

* remove return

* use error not info

* address comment

---------

Co-authored-by: Miccah Castorina <m.castorina93@gmail.com>
2024-02-03 10:49:14 -08:00
Miccah
27b30e65ed
[chore] Cleanup GitLab source errors (#2345)
* [chore] Cleanup GitLab source errors

* Ungroup compile time interface checks and revert error message
2024-02-02 20:00:34 -08:00
ahrav
382990a6bd
[bug] - use DetectorKey as the key in the detectorKeysWithResults map (#2366)
* use DetectorKey as the key in the map

* nil check

* update comment
2024-02-02 13:43:56 -08:00
Mike Vanbuskirk
f6546ffaf5
Add s3 credential validation (#2362)
* add string non-empty validation to AWS creds

* clean up import spacing

* syntax fixup

* change to non-empty validation only

* convert to lower snake_case

- https://protobuf.dev/programming-guides/style/#message-field-names
2024-02-02 12:49:46 -05:00
ahrav
b2074ad05d
Polite Verification (#2356)
* draft reverify chunks

* remove

* remove

* reduce dupe map cap

* do not verify chunk

* cli arg and use val for dupe lut

* remove counter

* skipp empty results]

* working on test and normalizing val for comparison

* forgot to save file

* optimize normalize

* reuse map

* remove print

* use levenshtein distance to check dupes

* forgot to leave in emptying map

* use slice

* small tweak

* comment

* use bytes

* praise

* use ctx logger

* add len check

* add comments

* use 8x concurrency for reverifier workers

* revert worker count

* use more workers

* process result directly for any collisions

* continue after decoder match for reverifying

* use map

* use map

* otimization and fix the bug.

* revert worker count

* better option naming

* handle identical secrets in chunks

* update comment

* update comment

* fix test

* use DetecotrKey

* rm out of scope tests and testdata

* rename all reverification elements

* don't re-write map entry

* use correct key

* rename worker, remove log val

* test likelydupe, add eq detector check in loop

* add test

* add comment

* add test

* Set verification error

* Update tests

---------

Co-authored-by: Zachary Rice <zachary.rice@trufflesec.com>
Co-authored-by: Dustin Decker <dustin@trufflesec.com>
2024-02-02 09:29:18 -08:00
Dustin Decker
c2ae31d060
Make AzureDevopsPersonalAccessToken verification more robust (#2359)
* Make AzureDevopsPersonalAccessToken verification more robust

* fix snifftest
2024-02-01 08:40:44 -08:00
roxanne-tampus
143e275272
update azure test files to check rawV2 (#2353) 2024-01-31 08:36:52 -08:00
Miccah
24d0680f5c
[chore] Add filesystem integration test (#2358) 2024-01-31 08:27:57 -08:00
Richard Gomez
8e90c4e669
Scan GitHub wikis #2233 2024-01-31 10:52:24 -05:00
Marlon
91d6496a76
added flyio protos (#2357)
* added flyio protos

* added builtwith proto

---------

Co-authored-by: root <root@ubuntutruffle.myguest.virtualbox.org>
2024-01-31 07:02:06 -08:00
ahrav
9867ce8eb8
Allow for configuring the buffered file writer (#2319)
* Write large diffs to tmp files

* address comments

* Move bufferedfilewriter to own pkg

* update test

* swallow write err

* use buffer pool

* use size vs len

* use interface

* fix test

* update comments

* fix test

* Allow for configuring the buffered file writer

* remove unused

* add missing method

* remove

* remove unused

* move parser and commit struct closer to where they are used

* linter change

* fix snifftest

* address comments

* add more kvp pairs to error

* fix test

* update

* add back missing metadata fields

* address comments

* remove bufferedfile writer

* fix

* address comments

* use unint8

* update interface

* adjust interface

* fix tests

* make linter happy

* fix finalize

* address comments

* update test

* address comments

* lint

* remove guard

* fix test

* fix

* add TODO

* fix tests
2024-01-30 12:51:58 -08:00
ahrav
7c59ff95d5
[feat] - tmp file diffs (#2306)
* Write large diffs to tmp files

* address comments

* Move bufferedfilewriter to own pkg

* update test

* swallow write err

* use buffer pool

* use size vs len

* use interface

* fix test

* update comments

* fix test

* remove unused

* remove

* remove unused

* move parser and commit struct closer to where they are used

* linter change

* add more kvp pairs to error

* fix test

* update

* address comments

* remove bufferedfile writer

* address comments

* adjust interface

* fix finalize

* address comments

* lint

* remove guard

* fix

* add TODO
2024-01-30 12:30:51 -08:00
Miccah
6824eb41ea
Fix filesystem enumeration ignore paths bug (#2355) 2024-01-30 12:21:37 -08:00
āh̳̕mͭͭͨͩ̐e̘ͬ́͋ͬ̊̓͂d
7ece4c3e66
Detectors Updates 1 for Tristate Verification (#2187)
* updating alibaba

* updating agora

* updating aeroworkflow

* updating aha

* updating artifactory

* updating abbysale

* updating abstract

* updating abuseipdb

* updating accuweather

* updating adafruitio

* updating adzuna

* cleanup on abuseipdb

* cleanup on aha

* cleanup on abuseipdb

* cleanup on aeroworkflow

* cleanup on adzuna

* cleanup on accuweather

* cleanup/refactor

* update token pattern to be explicitly 73char (old) or 64char (new)

* comment to clarify 403 on Aha

* mocking out verified case for aha + adding inactive account test

* using contact response instead of gock

* update 403 to be determinate
2024-01-30 12:20:56 -05:00
Richard Gomez
232032410c
feat(detectors): update template (#2342) 2024-01-29 21:21:23 -08:00