Commit graph

43 commits

Author SHA1 Message Date
ahrav
570cec7565
[refactor] - Refactor Archive Handling Logic (#2703)
* Remove specialized handler and archive struct and restructure handlers pkg.

* Refactor RPM archive handlers to use a library instead of shelling out

* make rpm handling context aware

* update test

* Refactor AR/deb archive handler to use an existing library instead of shelling out

* Update tests

* add max size check

* add filename and size to context kvp

* move skip file check and is binary check before opening file

* fix test

* preserve existing funcitonality of not handling non-archive files in HandleFile

* Adjust check for rpm/deb archive type

* add additional deb mime type

* update comment

* Remove specialized handler and archive struct and restructure handlers pkg.

* Refactor RPM archive handlers to use a library instead of shelling out

* make rpm handling context aware

* update test

* Refactor AR/deb archive handler to use an existing library instead of shelling out

* Update tests

* add max size check

* add filename and size to context kvp

* move skip file check and is binary check before opening file

* fix test

* preserve existing funcitonality of not handling non-archive files in HandleFile

* Adjust check for rpm/deb archive type

* add additional deb mime type

* update comment

* go mod tidy

* update go mod

* go mod tidy

* add comment

* update max depth check to >

* go mod tidy

* rename

* [refactor] - Refactor Archive Handling Logic - Part 4: Non-Archive Data Handling and Cleanup (#2704)

* Handle non-archive data within the DefaultHandler

* make structs and methods private

* Remove non-archive data handling within sources

* Handle non-archive data within the DefaultHandler

* rebase

* Remove non-archive data handling within sources

* add gzip

* move diskbuffered rereader setup into handler pkg

* remove DiskBuffereReader creation logic within sources

* move rewind closer

* reduce log verbosity

* make defaultBufferSize a const

* use correct reader

* address comments

* update test

* [feat] - Add Prometheus Metrics for File Handlers (#2705)

* add metrics for file handling

* add metrics for errors

* add metrics for file handling

* add metrics for errors

* fix tests

* add metrics for max archive depth and skipped files

* update error

* skip symlinks and dirs

* update err

* fix err assignment

* add metrics for file handling

* add metrics for errors

* fix tests

* rebase

* add metrics for errors

* add metrics for max archive depth and skipped files

* update error

* skip symlinks and dirs

* update err

* fix err assignment

* rebase

* remove

* update metric to ms

* update comments

* address comments

* reduce indentations

* add metrics for archive depth

* [bug] - Enhanced Archive Handling to Address Interface Constraints (#2710)

* add metrics for file handling

* add metrics for errors

* add metrics for file handling

* add metrics for errors

* fix tests

* add metrics for max archive depth and skipped files

* update error

* skip symlinks and dirs

* update err

* Address incompatible reader to openArchive

* remove nil check

* fix err assignment

* wrap compReader with DiskbufferReader

* add metrics for file handling

* add metrics for errors

* fix tests

* rebase

* add metrics for errors

* add metrics for max archive depth and skipped files

* update error

* skip symlinks and dirs

* update err

* fix err assignment

* rebase

* remove

* update metric to ms

* update comments

* address comments

* reduce indentations

* replace diskbuffereader with bufferedfilereader

* updtes

* add metric back

* [bug] -  Fix bug and simplify git cat-file command execution and output handling (#2719)

* add metrics for file handling

* add metrics for errors

* add metrics for file handling

* add metrics for errors

* fix tests

* add metrics for max archive depth and skipped files

* update error

* skip symlinks and dirs

* update err

* Address incompatible reader to openArchive

* remove nil check

* fix err assignment

* Allow git cat-file blob to complete before trying to handle the file

* wrap compReader with DiskbufferReader

* Allow git cat-file blob to complete before trying to handle the file

* updates

* revert stuff

* update test

* remove

* add metrics for file handling

* add metrics for errors

* fix tests

* rebase

* add metrics for errors

* add metrics for max archive depth and skipped files

* update error

* skip symlinks and dirs

* update err

* fix err assignment

* rebase

* remove

* update metric to ms

* update comments

* address comments

* reduce indentations

* inline
2024-05-10 11:36:06 -07:00
Dustin Decker
612ff1a0f1
Use Lstat to identify non-regular files in filesystem source (#2628)
* Use Lstat to identify non-regular files in filesystem source

* fix test
2024-03-26 15:22:42 -07:00
Miccah
c60443891b
Add Display method to SourceUnit and Kind member to the CommonSourceUnit (#2450)
* Add Display method to SourceUnit and Kind member to the CommonSourceUnit

* Make SourceUnitID return the ID and a kind

These two values together uniquely represent a unit.
2024-02-20 11:24:13 -08:00
ahrav
a22874f9f0
[feat] - concurently scan the filesystem source (#2364)
* concurently scan the filesystem source

Co-authored-by: Miccah Castorina <m.castorina93@gmail.com>

* fix test

* update test

* remove return

* use error not info

* address comment

---------

Co-authored-by: Miccah Castorina <m.castorina93@gmail.com>
2024-02-03 10:49:14 -08:00
Miccah
6824eb41ea
Fix filesystem enumeration ignore paths bug (#2355) 2024-01-30 12:21:37 -08:00
Miccah
4c698fc1e8
Walk directories in filesystem source enumeration (#2313)
* Walk directories in filesystem source enumeration

* Ignore all directories instead of just the root

* Fix bug with multiple directories

* Skip filesystem TestEnumerate

* Update filesystem enumeration test to create files and folders
2024-01-23 14:57:38 -08:00
ahrav
651beff492
[feat] - Allow for the use of include/exclude path files for filesystem scans (#2297)
* Allow for the use of include/exclude path files for filesystem scans

* remove oopsie
2024-01-11 15:41:50 -08:00
Mike Vanbuskirk
53f060a08e
Add disk buffer tempfile cleanup (#2130)
* add tempfile creation

- break PID retrieval into sep. function

* add tmpfile cleanup func

* add file cleanup to main cleanup func

* refactor file logic to only return name string

* add temp buffer naming to gcs

* add temp buffer naming to s3

* add temp buffer naming to filesystem

* add temp buffer naming to git

* consolidate cleanup functions

- have single function handle both files and dirs
- remove interface(not needed with a single func implementation)
- change calls to `New(...)` to reflect config implementation
- simplify automation in main.go
- update disk-buffer-reader dependency

* integrate changes from pr #2133

* merge main

* checkout from main to revert conflict issues

* re-add buffer logic to git

* interface no longer needed

* move string format to global const

---------

Co-authored-by: Ahrav Dutta <ahrav.dutta@trufflesec.com>
2023-12-11 18:31:50 -05:00
Miccah
52600a897a
[chore] Replace chunks channel with ChunkReporter in git based sources (#2082)
ChunkReporter is more flexible and will allow code reuse for unit
chunking. ChanReporter was added as a way to maintain the original
channel functionality, so this PR should not alter existing behavior.
2023-11-01 09:22:44 -07:00
Bill Rich
c5efa870ff
Use latest dbr (#1955) 2023-10-24 07:52:49 -07:00
Miccah
03dc7cb68d
[chore] Add SourceUnitEnumChunker filesystem tests (#1873)
* [chore] Add SourceUnitEnumChunker filesystem tests

* Ensure reported units are exactly what is expected
2023-10-16 10:42:18 -07:00
Miccah
dbcb888063
Update Source interface to use SourceID and JobID types (#1774)
The previous implementation used int64 for both, which can be mixed up
easily. Using distinct types adds a layer of type safety checked by the
compiler.
2023-09-14 11:28:24 -07:00
Miccah
72b6a9ec6b
Add a SourceType constant to all source packages (#1768) 2023-09-12 17:23:25 -07:00
ahrav
2a9f34962d
Add optional param to Chunks (#1747)
* Add interface for targeted chunking.

* use optional args.

* update Chunks method signature.

* update tests.

* fix test.

* update QueryCriteria type.
2023-09-07 09:03:37 -07:00
ahrav
2b1b1b5ad0
Add jobID to chunk. (#1721) 2023-08-29 12:02:30 -07:00
ahrav
e894540632
Use the common chunker for scanning the filesystem source (#1619)
* Use the common chunker for scanning the filesystem source.

* remove unused conts.

* add test.
2023-08-11 13:40:10 -07:00
Miccah
4e774d1f01
Define SourceUnit chunking interface (#1484)
* Define SourceUnit chunking interface

* Refactor to use a ChunkReporter interface

* Rename shadowed err to scanErr
2023-07-13 14:11:43 -05:00
Miccah
4b7f94dea1
Rewrite SourceUnitEnumerator to use UnitReporter instead of a channel (#1485) 2023-07-13 13:48:33 -05:00
Miccah
5c0ffda618
Define SourceUnit enumeration interface (#1428)
* Add CancellableWrite helper function

* Create SourceUnitEnumerator interface and EnumerationResult struct

* Implement SourceUnitEnumerator for the filesystem Source

* Omit explicit zero values
2023-07-10 15:05:40 -05:00
Miccah
f3152b6885
Implement SourceUnitUnmarshaller for all sources (#1416)
* Implement CommonSourceUnitUnmarshaller

* Add SourceUnitUnmarshaller to all sources using

All sources, with the exception of git, will use the CommonSourceUnit as
they only contain a single type of unit to scan.

* Fix method comments to adhere to Go's style guide
2023-06-23 11:15:51 -05:00
Bill Rich
c2e3e7d53a
Split files instead of using ReadAll (#1387)
* Split files instead of using ReadAll

* Remove dup chunk

* Actually break out of loop
2023-06-12 14:09:05 -07:00
Miccah
c5b4d6f28b
Support file scanning in filesystem source (#1030)
* Rename directories to paths

* Generate protos

* Add file scanning support to filesystem source

* Add directories back to filesystem proto

* Generate protos

* Combine paths and directories from in source

* Add filesystem filter

* Address comments
2023-02-27 12:15:05 -06:00
Miccah
d317ddb51a
[chore] Remove logrus from circleci, filesystem, gitlab, and s3 sources (#1089)
* [chore] Remove logrus from circleci, filesystem, gitlab, and s3 sources

* Address comments
2023-02-10 11:02:55 -06:00
Alexandr Marchenko
b29b78c10d
filesystem support for exclude and include filters (2nd attemp) (#1033)
* fix filter issue - empty lines should be ignored

* filesystem support for filter exclude

Co-authored-by: Dustin Decker <dustin@trufflesec.com>
2023-01-26 09:33:45 -08:00
ahrav
009756dce6
add proto that was missing. (#986) 2022-12-23 13:27:07 -08:00
Bill Rich
ab71b93f7d
Add context to handler (#877)
* Add context to handler

* Return rather than break out of select
2022-10-28 08:57:55 -07:00
Bill Rich
958266ea84
Run chunker in pipeline (#859)
* Run chunker in pipeline

* Move ChunkSize and PeekSize to source package.

* Use new Chunk and Peek size location
2022-10-24 13:57:27 -07:00
Dustin Decker
785cead43e
Ignore URIs where the password is redacted (#842)
Only `*`s in the password is a redacted basic auth URI.
2022-10-11 14:18:52 -07:00
Bill Rich
d11ce27f33
Use correct reader in filesystem source (#756) 2022-08-30 10:24:52 -07:00
Dustin Decker
fa9479100e
Add common sentry recover library and add into goroutines (#738)
* Add common sentry recover library and add into goroutines

* fix nits
2022-08-29 11:45:37 -07:00
Bill Rich
a473b9aa99
Use re-readable reader and common chunker (#703)
* Use re-readable reader and common chunker

* Linter feedback

* Break on error
2022-08-10 15:32:49 -07:00
ahrav
dcc102a81c
[Thog-371] Utilize config struct for engine scans (#700)
* Use a config struct when scanning and engine source.

* fix tests.

* Move test_helpers to the sources pkg.

* Handle ScanGit error in tests.

* adderss comments.

* Use functional options.

* Remove temp var.

* Add better var names for the setup functions for each config.

* Remove unused var.

* fix error logs.

* fix error logs.

* single line.

* remove blank lines.
2022-08-10 10:11:13 -07:00
ahrav
30ebe84e3e
[THOG-608] - Fix linter errors. (#701)
* Fix linter errors.

* Fix gist adding test.

* Update test string for mock JSON reply.

* Remove if.
2022-08-09 19:20:02 -07:00
Bill Rich
7273dc9058
Archive decoder (#683)
* Archive decoder

* Fix reader handling

* Seek error handling

* Add tests

* Fix extra empty chunk

* Sync chunk size
2022-08-02 20:36:21 -07:00
ahrav
d2605354fe
[THOG-332 ]Remove TokenSource interface from the init method of Source. (#539)
* Remove TokenSource interface from the init method of Source.

* Remove proto message.

* Remove proto message.

* Fix tests.

* Fix filesystem test.
2022-05-13 14:35:06 -07:00
ahrav
b0d79180f6
[THOG-314] Add new parameter to the Init method for the source interface. (#529)
* Add new parameter to the Init method for the source interface.

* Add Oauth Token service.

* remove .test file.

* remove .test file.

* Fix param spelling.

* fix tests with new param in init

* Add missing gock lib.
2022-05-10 11:11:43 -07:00
Bill Rich
d78c929385
Actually skip file (#299) 2022-04-06 09:48:40 -07:00
Bill Rich
33aa6f9cab
Log error and skip file when stat fails (#296) 2022-04-05 18:58:05 -07:00
steeeve
a770f643df Add placeholder for encoded resume info in SetProgressComplete 2022-03-24 12:43:36 -04:00
Bill Rich
5ab5c6f9d9
Only scan regular files (#87)
* Only scan regular files

* Remove IsDirectory func
2022-03-16 16:04:10 -07:00
Dustin Decker
c20e9f4732 improvements 2022-03-04 08:39:17 -08:00
Dustin Decker
77418fb3f8 module v3 2022-02-15 18:54:47 -08:00
Dustin Decker
4218c39d99
Initial CLI w/ partially implemented Git source and demo detector (#1) 2022-01-13 12:02:24 -08:00