Nick Sweeting
0acd388c02
fix imports and deps
2024-11-18 18:07:34 -08:00
Nick Sweeting
6b83b4c995
leave archivebox running when in archivebox update
2024-11-18 04:27:38 -08:00
Nick Sweeting
eeb2671e4d
API improvements
2024-11-18 04:27:38 -08:00
Nick Sweeting
c7bd9449d5
better jobs dashboard with faster refresh
2024-11-18 04:27:38 -08:00
Nick Sweeting
eb53145e4e
working state machine flow yay
2024-11-18 04:27:38 -08:00
Nick Sweeting
9adfe0e2e6
add code to log all SQL queries for DEBUG
2024-11-18 04:27:38 -08:00
Nick Sweeting
385ccaa14d
extend core models with ModelWithOutputDir
2024-11-18 04:27:38 -08:00
Nick Sweeting
1e3ce67834
fix API and CLI calls
2024-11-18 04:27:38 -08:00
Nick Sweeting
f65c2b40f8
tweak dashboard UI css
2024-11-18 04:27:38 -08:00
Nick Sweeting
f5727c7da2
rename actors to workers
2024-11-18 04:27:37 -08:00
Nick Sweeting
9b8cf7b4f0
simplify actor and orchestrator by removing threading code, fixing bugs
2024-11-18 04:27:37 -08:00
Nick Sweeting
af21c3428b
add ModelWithOutputDir base model to manage output directories and index writing
2024-11-18 04:27:37 -08:00
Nick Sweeting
c8b830b8dc
add ABIDModel.update_for_workers to update-in-place + bump retry_at time
2024-11-18 04:27:37 -08:00
Nick Sweeting
b852442efc
add crawls app back to django admin
2024-11-18 04:27:37 -08:00
Nick Sweeting
1ec2753664
fix statemachine create_root_snapshot and retry timing
2024-11-18 04:27:37 -08:00
Nick Sweeting
67c22b2df0
fix config set not working with constants
2024-11-18 04:27:37 -08:00
Nick Sweeting
2a66bb9a57
flip queue processing order to do most recent first
2024-11-18 04:27:37 -08:00
Nick Sweeting
148ea907bd
fix serious bug with Actor.get_next updating all rows instead of only top row
2024-11-18 04:27:37 -08:00
Nick Sweeting
c206056f07
add better docstrings to abx package
2024-11-17 20:26:56 -08:00
Nick Sweeting
2f30a35d2b
add extractors files to favicon and title plugins
2024-11-17 20:11:43 -08:00
Nick Sweeting
1b8bafdb56
add abx-spec-abx-pkg pkg
2024-11-17 20:10:33 -08:00
Nick Sweeting
36d24cd8d7
add jobs dashboard
2024-11-17 20:09:55 -08:00
Nick Sweeting
fb82fdae16
make actor pending include all obj with retry_at in the past
2024-11-17 20:09:38 -08:00
Nick Sweeting
8f8fbbb7a2
API fixes and add actors endpoints
2024-11-17 20:09:06 -08:00
Nick Sweeting
c8e186f21b
fix plugin loading order, admin, abx-pkg
2024-11-16 06:44:12 -08:00
Nick Sweeting
210fd935d7
make orchestrator run as long as any tasks are pending
2024-11-16 06:42:04 -08:00
Nick Sweeting
b7df1ca3a7
add start orchestrator management command
2024-11-16 02:49:01 -08:00
Nick Sweeting
2291f02147
setup seed model
2024-11-16 02:48:17 -08:00
Nick Sweeting
8cd285e273
add Seed admin
2024-11-16 02:48:06 -08:00
Nick Sweeting
c2add7119c
make supervisord start orchestrator on startup
2024-11-16 02:47:51 -08:00
Nick Sweeting
ba26d75079
add notes and label fields, fix model getters
2024-11-16 02:47:35 -08:00
Nick Sweeting
227fd4e1c6
fix statemachine progression for Snapshot, Crawl, and ArchiveResult
2024-11-16 02:46:45 -08:00
Nick Sweeting
684a394cba
add HOSTNAME to config.permissions
2024-11-16 02:45:58 -08:00
Nick Sweeting
b4a5da3ffd
update archivebox add CLI command to use new actor system
2024-11-16 02:45:37 -08:00
Nick Sweeting
43514da0d0
add crawl and seed endpoints to REST API
2024-11-16 02:45:11 -08:00
Nick Sweeting
48bb634b75
fix orchestrator startup and add exit_on_idle option
2024-11-16 02:44:57 -08:00
Nick Sweeting
c3d692b5d5
fix minor actor errors around CLAIM_ATOMIC
2024-11-16 02:44:33 -08:00
Nick Sweeting
7c0e3dcc21
load crawls,seeds,actors apps as pluggy plugins
2024-11-16 02:44:11 -08:00
Nick Sweeting
ed43f1d027
better docstrings and comments
2024-11-15 23:21:40 -08:00
Nick Sweeting
ec100bfe29
fix docs build for vendored pkgs
2024-11-12 23:53:34 -08:00
Nick Sweeting
57852fd89e
fix sphinx docs build
2024-11-12 22:20:11 -08:00
Nick Sweeting
a9a3b153b1
more StateMachine, Actor, and Orchestrator improvements
2024-11-04 07:08:39 -08:00
Nick Sweeting
1148cadd7a
Update __init__.py
2024-11-03 16:12:29 -05:00
Nick Sweeting
b6ab4e298e
merge dev
2024-11-03 12:56:44 -08:00
Nick Sweeting
758c0c6774
add user providable PLAYWRIGHT cache dir
2024-11-03 12:54:04 -08:00
Nick Sweeting
50a85ec97b
Update archivebox/plugins_pkg/playwright/binproviders.py
2024-11-03 15:47:00 -05:00
Andrew Dunham
49c520914c
playwright: support PLAYWRIGHT_BROWSERS_PATH environment variable
This environment variable is used by Playwright to configure where to
install browsers. If specified, the given directory is used instead of
the OS-specific cache folder. For compatibility with existing Playwright
installations, and better control over where these binaries are
installed, check the same environment variable in PlaywrightBinProvider.
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
2024-11-03 13:59:20 -05:00
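A minimal sketch of the lookup described in commit 49c520914c above: use PLAYWRIGHT_BROWSERS_PATH when it is set, otherwise fall back to an OS-specific cache folder. The function name, its parameters, and the fallback paths are assumptions for illustration (the fallbacks follow Playwright's documented default cache locations); this is not the actual PlaywrightBinProvider code.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def resolve_playwright_cache_dir(
    env: Optional[Mapping[str, str]] = None,
    platform: str = sys.platform,
    home: Optional[Path] = None,
) -> Path:
    """Hypothetical helper: pick the directory where Playwright browsers live.

    Special values of PLAYWRIGHT_BROWSERS_PATH are not handled here;
    any non-empty value is treated as a plain directory path.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else home

    # A user-provided directory takes priority over the OS default.
    override = env.get("PLAYWRIGHT_BROWSERS_PATH", "").strip()
    if override:
        return Path(override).expanduser()

    # Assumed OS-specific defaults, used only when the variable is unset.
    if platform == "darwin":
        return home / "Library" / "Caches" / "ms-playwright"
    if platform.startswith("win"):
        return home / "AppData" / "Local" / "ms-playwright"
    return home / ".cache" / "ms-playwright"
```

Passing `env`, `platform`, and `home` explicitly keeps the helper easy to test without touching the real environment; called with no arguments it reads `os.environ` and the current platform.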
Nick Sweeting
48f8416762
add new core and crawls statemachine manager
2024-11-03 00:41:11 -07:00
Nick Sweeting
41efd010f0
add wip crawl actor spec
2024-11-02 19:54:37 -07:00
Nick Sweeting
2337f874ad
better actor atomic claim
2024-11-02 19:54:25 -07:00