Commit graph

18 commits

Author SHA1 Message Date
Miccah
3db9ed7c74
[chore] Fix lint errors (#3218)
* [chore] Fix lint errors under analyzer package

* Fix lint error in source manager test

* Use Sprint instead of Sprintf where appropriate
2024-08-14 13:49:24 -07:00
Cody Rose
a317897d66
increase test chan size (#2797)
This test has a race condition. This change makes it less likely to cause a test failure, and is a stopgap measure to de-flake the test while we investigate the underlying issue.
2024-05-07 11:11:11 -04:00
Miccah
c60443891b
Add Display method to SourceUnit and Kind member to the CommonSourceUnit (#2450)
* Add Display method to SourceUnit and Kind member to the CommonSourceUnit

* Make SourceUnitID return the ID and a kind

These two values together uniquely represent a unit.
2024-02-20 11:24:13 -08:00
Miccah
dd4d4a8a96
Refactor UnitHook to block the scan if finished metrics aren't handled (#2309)
* Refactor UnitHook to block the scan if finished metrics aren't handled

* Log once when back-pressure is detected

* Add hook channel size metric

* Use plural "metrics" for consistency

* Replace LRU cache with map
2024-02-08 14:50:58 -08:00
Miccah
57203a56cd
[chore] Fix SourceManager flaky test (#2059)
* [chore] Fix SourceManager flaky test

Sorting by EndTime is not deterministic, however sorting by StartTime
should be. StartTime is set in a goroutine that's limited by
WithConcurrentUnits, so it should happen in order that the units are
received.

* Sort by unit ID
2023-10-30 19:16:55 -07:00
Miccah
0b16142d4f
Add UnitHook and NoopHook implementations (#1930)
* Add UnitHook and NoopHook implementations

The UnitHook tracks metrics per unit of a job, and emits them on a
channel once finished. It should work even if the Source does not
support source units.

* Refactor channel to use an LRU cache instead

An LRU cache has a more favorable failure mode than the channel. With
the channel, if the consumer stopped consuming metrics, scanning would
block. With the LRU cache, metrics will be dropped when space runs out
and a log message emitted.
2023-10-23 14:27:01 -07:00
Miccah
dbcb888063
Update Source interface to use SourceID and JobID types (#1774)
The previous implementation used int64 for both, which can be mixed up
easily. Using distinct types adds a layer of type safety checked by the
compiler.
2023-09-14 11:28:24 -07:00
Miccah
be4d0bcb41
Refactor SourceManager to remove Enrollment (#1740)
* Refactor SourceManager to remove Enrollment

Initializing the Source will be the responsibility of the caller. The
SourceManager exposes a GetIDs method for getting a source and job ID.

* Update tests

* Update engine usage

* Update apiClient interface to have one GetIDs method

* Update SourceManager usage in engine
2023-09-12 16:58:38 -07:00
ahrav
2a9f34962d
Add optional param to Chunks (#1747)
* Add interface for targeted chunking.

* use optional args.

* update Chunks method signature.

* update tests.

* fix test.

* update QueryCriteria type.
2023-09-07 09:03:37 -07:00
Miccah
522b2fab29
Add a cancel cause to job cancellation (#1728) 2023-08-30 12:00:44 -07:00
Miccah
7ba880f47a
Add AvailableCapacity method to SourceManager (#1665) 2023-08-29 12:36:44 -07:00
Miccah
5eb776cd61
Support cancelling a run from a JobProgressRef (#1663) 2023-08-25 10:43:33 -07:00
Miccah
61977412df
Add SourceName to JobProgressRef (#1664) 2023-08-25 07:48:25 -07:00
Miccah
5cfbde783f
Fix reversed ordering of arguments (#1648)
The source manager initialization function was defined as `sourceID`
followed by `jobID`, while the source initialization function is the
reverse. This is confusing and easy to mix up since the parameters are
the same type.

This commit adds a test to make sure the source manager initializes in
the correct order, but it doesn't prevent the library user to make the
same mistake. We may want to consider using different types.
2023-08-22 07:55:56 -07:00
Miccah
1cd600f70f
Use SourceManager in engine (#1586)
* Add SourceManager to Engine struct

* Update Engine methods to use the SourceManager

* Fix GCS test

The original was testing that `Init()` errors weren't surfaced in
`Finish()`, but the `SourceManager` changed that behavior.

* JobProgress race fixes

* Add contextual values

* Remove unused code

* Add debug logs

* Rename WithConcurrency to WithConcurrentSources

* Always forward chunks to the output chunks channel
2023-08-03 13:36:30 -05:00
Miccah
a07b6664f8
Support fatal errors in job reports (#1562)
* Support fatal errors in job reports

* WIP: JobReporter and JobInspector

* WIP: JobReportHook and JobReportRef

* Add ChunkError type and asyncRun helper method

* Rename JobReport to JobProgress

* Return a closed channel from Done when the JobProgress is nil

* Comment catchFirstFatal function
2023-07-31 11:28:30 -05:00
Miccah
e391e89f3e
Initial implementation of JobReport with SourceManager usage (#1557)
* Initial implementation of JobReport with SourceManager usage

* Limit concurrent units

* Only save the last JobReport per handle
2023-07-27 10:49:56 -05:00
Miccah
10f0963bc9
Add SourceManager tests for Run and Wait methods (#1530)
* Miscellaneous SourceManager updates

* Own the chunks channel instead of accepting it as an input
* Add Chunks and Wait methods
* Fix bug in Enroll so it actually returns the handle
* Add context.Context parameter to the SourceInitFunc type

* Add SourceManager tests for Run and Wait methods

* Rename man variables to mgr
2023-07-26 00:48:28 -05:00