* Refactor UnitHook to block the scan if finished metrics aren't handled
* Log once when back-pressure is detected
* Add hook channel size metric
* Use plural "metrics" for consistency
* Replace LRU cache with map
The source manager attaches some context keys, but in certain circumstances, they're already present, resulting in duplicate keys. This PR changes the attachment to be conditional. It also adds some new log messages to track source startup progress.
* Add ability to dynamically scale concurrently running sources
Refactor SourceManager to use a counting semaphore to allow for
dynamically changing limits. This complicates `Wait() error`, which needs
to return the first error encountered. We previously got that for free
from `errgroup.Group`, but now we need to handle it ourselves.
`Wait()` needs to return an error so the engine can set the
correct exit code.
* Group third-party imports together
The previous implementation used int64 for both the source ID and the
job ID, which are easy to mix up. Using distinct types adds a layer of
type safety checked by the compiler.
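A minimal sketch of the idea, assuming the two values in question are the source ID and the job ID. The type names `SourceID` and `JobID` and the function `initSource` are hypothetical and may not match the names used in the code.

```go
package main

import "fmt"

// SourceID and JobID are hypothetical distinct types over int64. Because
// they are separate named types, the compiler rejects passing one where
// the other is expected.
type SourceID int64
type JobID int64

// initSource is a hypothetical initialization function that takes the job
// ID first and the source ID second.
func initSource(jobID JobID, sourceID SourceID) string {
	return fmt.Sprintf("job=%d source=%d", jobID, sourceID)
}

func main() {
	sourceID, jobID := SourceID(7), JobID(42)
	fmt.Println(initSource(jobID, sourceID))

	// Swapping the arguments no longer compiles:
	//   initSource(sourceID, jobID)
	// fails with a type mismatch instead of silently mixing up the IDs.
}
```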
* Refactor SourceManager to remove Enrollment
Initializing the Source will be the responsibility of the caller. The
SourceManager exposes a GetIDs method for getting a source and job ID.
* Update tests
* Update engine usage
* Update apiClient interface to have one GetIDs method
* Update SourceManager usage in engine
The source manager initialization function was defined as `sourceID`
followed by `jobID`, while the source initialization function is the
reverse. This is confusing and easy to mix up since the parameters are
the same type.
This commit adds a test to make sure the source manager initializes in
the correct order, but it doesn't prevent the library user from making
the same mistake. We may want to consider using different types.
With the introduction of the SourceManager, the chunks channel became
private and read-only. This provides a method to write chunks into the
channel as we transition away from needing to do that.
* Add SourceManager to Engine struct
* Update Engine methods to use the SourceManager
* Fix GCS test
The original was testing that `Init()` errors weren't surfaced in
`Finish()`, but the `SourceManager` changed that behavior.
* JobProgress race fixes
* Add contextual values
* Remove unused code
* Add debug logs
* Rename WithConcurrency to WithConcurrentSources
* Always forward chunks to the output chunks channel
* Support fatal errors in job reports
* WIP: JobReporter and JobInspector
* WIP: JobReportHook and JobReportRef
* Add ChunkError type and asyncRun helper method
* Rename JobReport to JobProgress
* Return a closed channel from Done when the JobProgress is nil
* Comment catchFirstFatal function
* Miscellaneous SourceManager updates
* Own the chunks channel instead of accepting it as an input
* Add Chunks and Wait methods
* Fix bug in Enroll so it actually returns the handle
* Add context.Context parameter to the SourceInitFunc type
* Add SourceManager tests for Run and Wait methods
* Rename man variables to mgr
* Implement SourceManager basics
* Rename identifiers and add a default headlessAPI implementation
* Rewrite to use SourceInitFunc
* Update variable name to accurately reflect its value
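
For the "Return a closed channel from Done when the JobProgress is nil"
item above, a minimal sketch of the nil-receiver pattern follows. The `done`
field and the `closedCh` variable are assumptions for illustration; the real
JobProgress type has other fields and may be implemented differently.

```go
package main

import "fmt"

// JobProgress is reduced here to a single hypothetical field.
type JobProgress struct {
	done chan struct{}
}

// closedCh is a shared, already-closed channel returned for nil receivers.
var closedCh = func() chan struct{} {
	ch := make(chan struct{})
	close(ch)
	return ch
}()

// Done returns a channel that is closed when the job finishes. Calling it
// on a nil *JobProgress returns a closed channel, so receivers never block.
func (jp *JobProgress) Done() <-chan struct{} {
	if jp == nil {
		return closedCh
	}
	return jp.done
}

func main() {
	var jp *JobProgress // nil: there is no job to wait for
	select {
	case <-jp.Done():
		fmt.Println("nil progress reports done immediately")
	default:
		fmt.Println("still running")
	}
}
```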