Commit graph

1961 commits

Author SHA1 Message Date
Nick Sweeting
9adfe0e2e6
add code to log all SQL queries for DEBUG 2024-11-18 04:27:38 -08:00
Nick Sweeting
385ccaa14d
extend core models with ModelWithOutputDir 2024-11-18 04:27:38 -08:00
Nick Sweeting
1e3ce67834
fix API and CLI calls 2024-11-18 04:27:38 -08:00
Nick Sweeting
f65c2b40f8
tweak dashboard UI css 2024-11-18 04:27:38 -08:00
Nick Sweeting
f5727c7da2
rename actors to workers 2024-11-18 04:27:37 -08:00
Nick Sweeting
9b8cf7b4f0
simplify actor and orchestrator by removing threading code, fixing bugs 2024-11-18 04:27:37 -08:00
Nick Sweeting
af21c3428b
add ModelWithOutputDir base model to manage output directories and index writing 2024-11-18 04:27:37 -08:00
Nick Sweeting
c8b830b8dc
add ABIDModel.update_for_workers to update-in-place + bump retry_at time 2024-11-18 04:27:37 -08:00
Nick Sweeting
b852442efc
add crawls app back to django admin 2024-11-18 04:27:37 -08:00
Nick Sweeting
1ec2753664
fix statemachine create_root_snapshot and retry timing 2024-11-18 04:27:37 -08:00
Nick Sweeting
67c22b2df0
fix config set not working with constants 2024-11-18 04:27:37 -08:00
Nick Sweeting
2a66bb9a57
flip queue processing order to do most recent first 2024-11-18 04:27:37 -08:00
Nick Sweeting
148ea907bd
fix serious bug with Actor.get_next updating all rows instead of only top row 2024-11-18 04:27:37 -08:00
Nick Sweeting
c206056f07
add better docstrings to abx package 2024-11-17 20:26:56 -08:00
Nick Sweeting
2f30a35d2b
add extractors files to favicon and title plugins 2024-11-17 20:11:43 -08:00
Nick Sweeting
1b8bafdb56
add abx-spec-abx-pkg pkg 2024-11-17 20:10:33 -08:00
Nick Sweeting
36d24cd8d7
add jobs dashboard 2024-11-17 20:09:55 -08:00
Nick Sweeting
fb82fdae16
make actor pending include all obj with retry_at in the past 2024-11-17 20:09:38 -08:00
Nick Sweeting
8f8fbbb7a2
API fixes and add actors endpoints 2024-11-17 20:09:06 -08:00
Nick Sweeting
c8e186f21b
fix plugin loading order, admin, abx-pkg 2024-11-16 06:44:12 -08:00
Nick Sweeting
210fd935d7
make orchestrator run as long as any tasks are pending 2024-11-16 06:42:04 -08:00
Nick Sweeting
b7df1ca3a7
add start orchestrator management command 2024-11-16 02:49:01 -08:00
Nick Sweeting
2291f02147
setup seed model 2024-11-16 02:48:17 -08:00
Nick Sweeting
8cd285e273
add Seed admin 2024-11-16 02:48:06 -08:00
Nick Sweeting
c2add7119c
make supervisord start orchestrator on startup 2024-11-16 02:47:51 -08:00
Nick Sweeting
ba26d75079
add notes and label fields, fix model getters 2024-11-16 02:47:35 -08:00
Nick Sweeting
227fd4e1c6
fix statemachine progression for Snapshot, Crawl, and ArchiveResult 2024-11-16 02:46:45 -08:00
Nick Sweeting
684a394cba
add HOSTNAME to config.permissions 2024-11-16 02:45:58 -08:00
Nick Sweeting
b4a5da3ffd
update archivebox add CLI command to use new actor system 2024-11-16 02:45:37 -08:00
Nick Sweeting
43514da0d0
add crawl and seed endpoints to REST API 2024-11-16 02:45:11 -08:00
Nick Sweeting
48bb634b75
fix orchestrator startup and add exit_on_idle option 2024-11-16 02:44:57 -08:00
Nick Sweeting
c3d692b5d5
fix minor actor errors around CLAIM_ATOMIC 2024-11-16 02:44:33 -08:00
Nick Sweeting
7c0e3dcc21
load crawls,seeds,actors apps as pluggy plugins 2024-11-16 02:44:11 -08:00
Nick Sweeting
ed43f1d027
better docstrings and comments 2024-11-15 23:21:40 -08:00
Nick Sweeting
ec100bfe29
fix docs build for vendored pkgs 2024-11-12 23:53:34 -08:00
Nick Sweeting
57852fd89e
fix sphinx docs build 2024-11-12 22:20:11 -08:00
Nick Sweeting
a9a3b153b1
more StateMachine, Actor, and Orchestrator improvements 2024-11-04 07:08:39 -08:00
Nick Sweeting
1148cadd7a
Update __init__.py 2024-11-03 16:12:29 -05:00
Nick Sweeting
b6ab4e298e
merge dev 2024-11-03 12:56:44 -08:00
Nick Sweeting
758c0c6774
add user providable PLAYWRIGHT cache dir 2024-11-03 12:54:04 -08:00
Nick Sweeting
50a85ec97b
Update archivebox/plugins_pkg/playwright/binproviders.py 2024-11-03 15:47:00 -05:00
Andrew Dunham
49c520914c
playwright: support PLAYWRIGHT_BROWSERS_PATH environment variable
This environment variable is used by Playwright to configure where to
install browsers. If specified, the given directory is used instead of
the OS-specific cache folder. For compatibility with existing Playwright
installations, and better control over where these binaries are
installed, check the same environment variable in PlaywrightBinProvider.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
2024-11-03 13:59:20 -05:00
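The lookup described in the commit message above can be sketched roughly as follows. This is not the code from PlaywrightBinProvider; `playwright_browsers_dir` and its parameters are hypothetical names used for illustration. The fallback directories are the OS-specific cache locations Playwright documents, and the special value `0` that Playwright also accepts for this variable is not handled here.

```python
import os
import sys
from pathlib import Path


def playwright_browsers_dir(environ=None, platform=None, home=None):
    """Hypothetical helper: return the directory Playwright browsers are installed in.

    Prefers PLAYWRIGHT_BROWSERS_PATH when it is set, otherwise falls back to
    the OS-specific cache folder that Playwright uses by default.
    """
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform
    home = Path.home() if home is None else Path(home)

    # An explicit override wins over any default location.
    override = environ.get("PLAYWRIGHT_BROWSERS_PATH")
    if override:
        return Path(override).expanduser()

    # Default cache folders per platform.
    if platform == "darwin":
        return home / "Library" / "Caches" / "ms-playwright"
    if platform.startswith("win"):
        return home / "AppData" / "Local" / "ms-playwright"
    return home / ".cache" / "ms-playwright"
```

Passing `environ`, `platform`, and `home` explicitly keeps the sketch easy to check without touching the real environment; called with no arguments it reads `os.environ`, `sys.platform`, and the current user's home directory.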
Nick Sweeting
48f8416762
add new core and crawls statemachine manager 2024-11-03 00:41:11 -07:00
Nick Sweeting
41efd010f0
add wip crawl actor spec 2024-11-02 19:54:37 -07:00
Nick Sweeting
2337f874ad
better actor atomic claim 2024-11-02 19:54:25 -07:00
Nick Sweeting
9b24fe7390
merge dev 2024-11-02 17:34:33 -07:00
Nick Sweeting
dbe5c0bc07
more orchestrator and actor improvements 2024-11-02 17:25:51 -07:00
Nick Sweeting
721427a484
hide progress bar on startup 2024-10-31 07:11:15 -07:00
Nick Sweeting
17faa5a507
improvements to new actor and orchestrators 2024-10-31 07:11:15 -07:00
Nick Sweeting
9c2eac4e47
add new actors and orchestrators 2024-10-31 07:11:14 -07:00