Commit graph

1961 commits

Author SHA1 Message Date
Nick Sweeting
0b4cbb6415
clear out stale supervisord state in between runs 2024-10-08 18:59:48 -07:00
Nick Sweeting
584abe8548
never attempt to create system venv, install ldap in lib automatically, and setup binproviders before bins 2024-10-08 18:52:02 -07:00
Nick Sweeting
3e4a846488
fix more installer bugs 2024-10-08 18:06:57 -07:00
Nick Sweeting
4b34b729ab
fuck it go back to nested lib and tmp dirs with supervisord sock workaround 2024-10-08 17:48:59 -07:00
Nick Sweeting
1888691ee8
try creating shared libs as 777 when running as root 2024-10-08 17:10:56 -07:00
Nick Sweeting
35c7019772
handle failure on tmp_dir and lib_dir detection better 2024-10-08 16:56:25 -07:00
Nick Sweeting
a33da44492
more attempts to fix euid permissions issues on ubuntu 2024-10-08 16:56:24 -07:00
Nick Sweeting
216e885b85
bump pydantic-pkgr 2024-10-08 03:53:41 -07:00
Nick Sweeting
8d32508581
only show data dir info if one is active 2024-10-08 03:04:38 -07:00
Nick Sweeting
de2ab43f7f
switch .is_dir and .exists for os.access to avoid PermissionError on startup 2024-10-08 03:02:34 -07:00
Nick Sweeting
611a2b7c1b
fix a few small nits 2024-10-08 02:10:08 -07:00
Nick Sweeting
3fb5b6eb94
exit archivebox version with failure status if any subdependencies are not installed 2024-10-08 01:52:04 -07:00
Nick Sweeting
5e351f6ba6
more docker dependency tweaks 2024-10-08 01:47:38 -07:00
Nick Sweeting
46c0463539
safer import handling 2024-10-08 00:51:58 -07:00
Nick Sweeting
d9fef4cd84
fix import order 2024-10-08 00:15:53 -07:00
Nick Sweeting
397ae1a99b
fix Docker build and import issues 2024-10-08 00:12:09 -07:00
Nick Sweeting
cf1ea8f80f
improve config loading of TMP_DIR, LIB_DIR, move to separate files 2024-10-07 23:45:11 -07:00
Nick Sweeting
7a895d9285
make sure monkey patches and vendor libs come before first constants import 2024-10-05 16:42:51 -07:00
Nick Sweeting
430e974719
add pydantic-pkgr back as vendored lib submodule for now 2024-10-05 16:39:20 -07:00
Nick Sweeting
ccdc3e1c47
add pydantic-pkgr back as vendored lib because pypi is misbehaving 2024-10-05 16:38:55 -07:00
Nick Sweeting
55e286972d
fix timeout check showing regardless of value 2024-10-05 04:24:07 -07:00
Nick Sweeting
db10a2142e
remove extra files from repo root and move package.json into etc 2024-10-05 03:53:23 -07:00
Nick Sweeting
66a785bb35
only use system tmp dirs because of socket path length restrictions 2024-10-05 03:16:27 -07:00
Nick Sweeting
35446ce742
include sonic-client by default and allow ldap to be installed at runtime 2024-10-05 03:11:48 -07:00
Nick Sweeting
ce2e19a429
switch to uv builds and rc1 versioning system 2024-10-04 23:48:25 -07:00
Nick Sweeting
0876cc78d9
remove no longer needed vendored libs 2024-10-04 23:35:34 -07:00
Nick Sweeting
80e052b166
fix pip binary loading 2024-10-04 23:26:49 -07:00
Nick Sweeting
beefe69b74
fix CHROME_TIMEOUT causing hanging on some platforms 2024-10-04 21:49:09 -07:00
Nick Sweeting
ac96cc62fc
fix CUSTOM_TEMPLATES_DIR loading 2024-10-04 21:40:36 -07:00
Nick Sweeting
0c7d7a2225
fix archivebox init colors and dir status checking 2024-10-04 21:34:19 -07:00
Nick Sweeting
5323953f94
handle Ctrl+C more gracefully 2024-10-04 21:33:46 -07:00
Nick Sweeting
026169a8e2
fix rich colors 2024-10-04 21:09:29 -07:00
Nick Sweeting
d747cf7f31
fix SYSTEM_TMP_DIR and SYSTEM_LIB_DIR in docker 2024-10-04 21:03:02 -07:00
Nick Sweeting
811f9a8d93
move queue db name into constants and fix file detection at startup 2024-10-04 19:38:36 -07:00
Nick Sweeting
3f986f09cc
fix relative dir calculation in extractor hook 2024-10-04 19:24:01 -07:00
Nick Sweeting
73e69ccb8b
fixes for docs generation 2024-10-04 19:16:46 -07:00
Nick Sweeting
f76bdc4332
fix old wsgi.py 2024-10-04 16:08:57 -07:00
Nick Sweeting
da274fd8e8
remove dead code 2024-10-04 14:48:20 -07:00
Nick Sweeting
396a7ffcd8
move tmp dir to machine-id scoped dir 2024-10-04 03:24:15 -07:00
Nick Sweeting
12f32c4690
fix tmp data dir resolution when running help or version outside data dir 2024-10-04 01:40:41 -07:00
Nick Sweeting
f321d25f4c
fallback to reading binaries from filesystem when theres no db 2024-10-04 01:00:09 -07:00
Nick Sweeting
152b530249
scope LIB_DIR by os, arch, and docker status 2024-10-04 00:08:44 -07:00
Nick Sweeting
c84ea81c5a
more ldap lib optimization 2024-10-03 18:25:35 -07:00
Nick Sweeting
89a066da0b
remove django-url-tools in favor of core_tags snippet 2024-10-03 18:25:20 -07:00
Nick Sweeting
3f1a19dd35
fix type imports 2024-10-03 18:24:57 -07:00
Nick Sweeting
c5da3c1f22
fix docker build 2024-10-03 18:24:27 -07:00
Nick Sweeting
0619750ffa
add django-url-tools to fix pagination and search on public index 2024-10-03 17:39:55 -07:00
Nick Sweeting
1492c02bfa
lazy-load loadfire and ldap lib for faster startup time 2024-10-03 17:39:39 -07:00
Nick Sweeting
563e4de678
unwinding circular dependencies 2024-10-03 17:05:49 -07:00
Nick Sweeting
9241a45bb8
Update base_configset.py 2024-10-03 11:42:27 -04:00
Nick Sweeting
de09867f87
ignore lib/bin symlinking errors 2024-10-03 04:10:52 -07:00
Nick Sweeting
aae9deccc2
add machine migrations 2024-10-03 04:06:28 -07:00
Nick Sweeting
b072fd8ef4
load all binaries from cache by default 2024-10-03 04:06:17 -07:00
Nick Sweeting
0f37abb657
fix symlinking to lib when conflicting file already exists 2024-10-03 03:56:45 -07:00
Nick Sweeting
490e5ba11d
fallback to localhost if detecting dnsserver fails 2024-10-03 03:53:50 -07:00
Nick Sweeting
b36e89d086
relocate LIB_DIR and TMP_DIR inside docker so it doesnt clash with outside docker 2024-10-03 03:43:02 -07:00
Nick Sweeting
f4f1d7893c
fix CUSTOM_TEMPLATES_DIR config load and chrome symlinking 2024-10-03 03:20:25 -07:00
Nick Sweeting
29fc14dff4
dont build docker container twice during release 2024-10-03 03:12:18 -07:00
Nick Sweeting
9728d81fee
add puppeteer to docker requirements for easier browser fetching 2024-10-03 03:12:06 -07:00
Nick Sweeting
697d0a3566
nicer version and help pretty printing with rich 2024-10-03 03:11:23 -07:00
Nick Sweeting
161afc7297
add health stats counters to machine models 2024-10-03 03:11:04 -07:00
Nick Sweeting
3b9e48ead8
show deprecation warning for archivebox setup command 2024-10-03 03:10:36 -07:00
Nick Sweeting
e315905721
add new InstalledBinary model to cache binaries on host machine 2024-10-03 03:10:22 -07:00
Nick Sweeting
32167de936
add daemonize flag to archivebox server 2024-10-02 19:46:48 -07:00
Nick Sweeting
035a14b6ea
better help text output 2024-10-02 19:46:31 -07:00
Nick Sweeting
968adf64da
small easter eggs 2024-10-02 14:17:28 -07:00
Nick Sweeting
95043e5a07
bump js subdeps versions 2024-10-01 21:48:07 -07:00
Nick Sweeting
295c5c46e0
add new crawl model 2024-10-01 21:47:16 -07:00
Nick Sweeting
f46d62a114
add py-machineid lib for new machine app 2024-10-01 21:46:35 -07:00
Nick Sweeting
4a19051f4a
change BaseExtractor to use new extract hookspec 2024-10-01 21:45:18 -07:00
Nick Sweeting
276a505cae
fix extractor path calculation 2024-10-01 21:44:56 -07:00
Nick Sweeting
8498ca5c64
add abx.archivebox extract hookspec 2024-10-01 21:44:19 -07:00
Nick Sweeting
81d16e96fd
fix toml_util circular import in abx 2024-10-01 21:43:35 -07:00
Nick Sweeting
716df44b7a
fix created_by_id access error on abid creation 2024-10-01 21:43:11 -07:00
Nick Sweeting
5697ecefad
fix index SEARCH_BACKENDS import_backend to load via ABX instead of settings 2024-10-01 00:19:19 -07:00
Nick Sweeting
dac134dfca
improve default chrome cli launch args used for archiving 2024-10-01 00:18:57 -07:00
Nick Sweeting
94123ca68c
fix archive_dot_org response parsing bytes vs str bug 2024-10-01 00:18:38 -07:00
Nick Sweeting
71d215367b
add types to abx use api 2024-10-01 00:18:21 -07:00
Nick Sweeting
8fbfa10df3
fix missing check_migrations import 2024-09-30 23:52:48 -07:00
Nick Sweeting
18474f452b
move config moved out of legacy files and better version output 2024-09-30 23:52:00 -07:00
Nick Sweeting
d21bc86075
finish migrating almost all config to new system 2024-09-30 23:21:34 -07:00
Nick Sweeting
4b6a2a3e50
add git plugin 2024-09-30 23:20:03 -07:00
Nick Sweeting
8c3342afe5
rename archivebox setup to archivebox install 2024-09-30 23:19:46 -07:00
Nick Sweeting
4334c74548
change archivebox setup to install ALL binaries by default 2024-09-30 21:44:23 -07:00
Nick Sweeting
f6176ae05e
move curl into plugin 2024-09-30 21:43:54 -07:00
Nick Sweeting
69522da4bb
move wget and mercury into plugins 2024-09-30 21:43:45 -07:00
Nick Sweeting
c4e040f11a
add WgetPlugin with WgetExtractor, WarcExtractor, WgetBinary 2024-09-30 19:33:30 -07:00
Nick Sweeting
2a1645ba27
fix import errors 2024-09-30 19:32:57 -07:00
Nick Sweeting
51fe4c38c2
speed up version command by checking if quiet is passed 2024-09-30 18:33:43 -07:00
Nick Sweeting
31ce490321
fix help command output docstrings and more CLI log coloring 2024-09-30 18:29:17 -07:00
Nick Sweeting
7489663ff3
use pretty printing for config CLI output 2024-09-30 18:14:43 -07:00
Nick Sweeting
c909c00123
improve archivebox version cli output 2024-09-30 18:13:05 -07:00
Nick Sweeting
66cd711df9
improve version detection 2024-09-30 18:12:48 -07:00
Nick Sweeting
b913e6f426
rename OUTPUT_DIR to DATA_DIR 2024-09-30 17:44:18 -07:00
Nick Sweeting
363a499289
move util.py into misc folder 2024-09-30 17:25:15 -07:00
Nick Sweeting
dfca4b13b2
move system.py into misc folder 2024-09-30 17:13:55 -07:00
Nick Sweeting
7a41b6ae46
remove ConfigSectionName and add type hints to CONSTANTS 2024-09-30 16:50:36 -07:00
Nick Sweeting
3e5b6ddeae
move config into dedicated global app 2024-09-30 15:59:05 -07:00
Nick Sweeting
ee7f73bd7b
bump version to 0.8.5 2024-09-27 01:26:12 -07:00
Nick Sweeting
8d3f45b720
merge plugantic and abx, all praise be to praise our glorious pluggy gods 2024-09-27 01:26:12 -07:00
Nick Sweeting
4f42eb0313
move ini_to_toml into misc 2024-09-27 01:26:11 -07:00
Nick Sweeting
6f7b6c6bde
move unused ansible folder 2024-09-27 01:26:11 -07:00
Nick Sweeting
7b6a491ae0
exclude dunder vars from constants 2024-09-27 01:26:10 -07:00
Nick Sweeting
0589ff2b5d
move loading of vendor libs import archivebox init 2024-09-27 01:26:10 -07:00
Nick Sweeting
8ed3155ec5
migrate plugin loading process to new pluggy-powered system based on djp 2024-09-26 02:43:12 -07:00
Nick Sweeting
efd341d8ad
add DIR_OUTPUT_PERMISSIONS to STORAGE_CONFIG and fix ripgrep constants import 2024-09-26 02:42:50 -07:00
Nick Sweeting
7b85ba7fd8
fix log line view in admin data when bytes are not utf8 2024-09-26 02:41:45 -07:00
Nick Sweeting
0cfcabf6f4
fix admin data view configs type rendering 2024-09-26 02:41:22 -07:00
Nick Sweeting
ed45f58758
use constants in more places 2024-09-26 02:41:09 -07:00
Nick Sweeting
eb360f188a
remove old insecure index.json url serving from root 2024-09-26 02:38:59 -07:00
Nick Sweeting
d8a9dca0f6
use constants in more places 2024-09-26 02:38:45 -07:00
Nick Sweeting
24a9f432c9
fix archivebox manage command not passing args correctly 2024-09-26 02:37:44 -07:00
Nick Sweeting
6ec5925b7f
fix readability plugin name 2024-09-26 02:37:26 -07:00
Nick Sweeting
45736036e0
simplify archivebox.constants to just use benedict instead of kludgy NamedTuple 2024-09-26 02:36:59 -07:00
Nick Sweeting
80d3def206
improve archivebox.__init__ and load vendor libs at very beginning 2024-09-26 02:36:34 -07:00
Nick Sweeting
446b38dc41
add favicon and archivedotorg plugins 2024-09-26 02:32:10 -07:00
Nick Sweeting
c950271bc3
fix more constants / config loading 2024-09-25 05:12:34 -07:00
Nick Sweeting
bb65b2dbec
move almost all config into new archivebox.CONSTANTS 2024-09-25 05:10:09 -07:00
Nick Sweeting
f5e8d99fdf
update archivebox setup to use new binprovider install methods 2024-09-25 01:15:15 -07:00
Nick Sweeting
bc08bb04a2
archivebox version show when binary is not loaded correctly 2024-09-25 01:15:00 -07:00
Nick Sweeting
0ef3a0913b
check python encoding in SHELL_CONFIG validation 2024-09-25 01:14:48 -07:00
Nick Sweeting
e0eb3119b7
bump pydantic-pkgr to 0.3.7 2024-09-25 01:13:34 -07:00
Nick Sweeting
a5ffd4e9d3
move pdf, screenshot, dom, singlefile, and ytdlp extractor config to new plugin system 2024-09-25 00:42:26 -07:00
Nick Sweeting
a2a586e369
fix system.run not using text arg 2024-09-25 00:41:55 -07:00
Nick Sweeting
5b6cf68d98
move system startup checks to pip and plugins_sys config model validation 2024-09-25 00:41:24 -07:00
Nick Sweeting
2fd837f254
setup rich tracebacks width properly in monkey patched exception handler 2024-09-25 00:40:37 -07:00
Nick Sweeting
ee5bec6a10
flip link_archive exception throw order so real exception is easier to read at the bottom 2024-09-25 00:39:49 -07:00
Nick Sweeting
6742888278
setup rich tracebacks width properly 2024-09-25 00:39:27 -07:00
Nick Sweeting
5e4b78d9e0
change supervisord to always start non-daemonized by default 2024-09-24 22:22:03 -07:00
Nick Sweeting
de2ba890ea
add ArchiveBox binary 2024-09-24 22:01:28 -07:00
Nick Sweeting
3dacec3f5b
prevent redundant supervisord starts 2024-09-24 22:01:18 -07:00
Nick Sweeting
b117484de7
add new Snapshot.archive method powered by huey task 2024-09-24 21:17:51 -07:00
Nick Sweeting
e99260feb2
fix rich logging issues 2024-09-24 21:17:07 -07:00
Nick Sweeting
0dffbf1bb4
fix rich autodetection of TTY, USE_COLOR, SHOW_PROGRESS 2024-09-24 19:37:29 -07:00
Nick Sweeting
bde0bf8263
load ipython rich extension in archivebox shell 2024-09-24 19:37:05 -07:00
Nick Sweeting
07eff77c0a
bump pydantic-pkgr submodule 2024-09-24 19:05:09 -07:00
Nick Sweeting
7c363bffc6
add ini_to_toml test 2024-09-24 19:04:54 -07:00
Nick Sweeting
64c7100cf9
speed up startup time, add rich startup progressbar, split logging and checks into misc, fix search index import backend bug 2024-09-24 19:04:38 -07:00
Nick Sweeting
7ffb81f61b
delete dead code 2024-09-24 15:26:43 -07:00
Nick Sweeting
97695bda5e
more settings loading tweaks and improvements 2024-09-24 15:13:54 -07:00
Nick Sweeting
fbfd16e195
fully migrate all search backends to new plugin system 2024-09-24 03:05:43 -07:00
Nick Sweeting
c9c163efed
begin migrating search backends to new plugin system 2024-09-24 02:13:01 -07:00
Nick Sweeting
2d19317e3f
rename plugins_sys base to config 2024-09-24 02:12:30 -07:00
Nick Sweeting
e8f1264954
rename plugins dirs 2024-09-24 01:34:27 -07:00
Nick Sweeting
8713091e73
remove redundant import 2024-09-24 01:32:01 -07:00
Nick Sweeting
77d3990535
temporarily add prints on plugin setup for easier debugging 2024-09-24 01:26:16 -07:00
Nick Sweeting
a9a97c013d
split plugin dirs, created new cleaner import path for plugin config in settings.py 2024-09-24 01:25:55 -07:00
Nick Sweeting
1a58967e8c
first example of plugin config based on another plugin config 2024-09-23 21:10:19 -07:00
Nick Sweeting
8df9480824
make sure hooks have the object identity everywhere in the codebase by avoiding pydantics usual deepcopy on every validation 2024-09-23 21:04:23 -07:00
Nick Sweeting
4eb1c14139
handle ConfigSet default value factories that dont take any args 2024-09-23 21:03:16 -07:00
Nick Sweeting
1f4cded152
use benedict in old config instead of AttrDict 2024-09-23 21:02:51 -07:00
Nick Sweeting
e992a84b80
add custom TOML encoder to work around issues with dumping toml of lots of different types 2024-09-23 21:02:33 -07:00
Nick Sweeting
b6cfeb8d40
add new pydantic_settings based loader for ConfigSets 2024-09-22 19:30:24 -07:00
Nick Sweeting
c8ff8f2b86
add header to generated TOML file warning its been converted from INI 2024-09-22 19:27:33 -07:00
Nick Sweeting
7f05026022
change is_registered and is_ready into private model fields 2024-09-22 19:27:00 -07:00
Nick Sweeting
8f38f70e4a
define PACKAGE_DIR and DATA_DIR in settings.py directly 2024-09-22 19:26:26 -07:00
Nick Sweeting
8c8c64d90f
swap AttrDict for benedict everywhere 2024-09-22 19:26:05 -07:00
Nick Sweeting
b611c0114c
add pydantic_settings mockup 2024-09-22 16:48:28 -07:00
Nick Sweeting
3b0a25950d
add minor pydantic pkgr fix 2024-09-22 16:28:48 -07:00
Nick Sweeting
d89b6ce419
add SQLite semaphore mockup 2024-09-22 16:28:30 -07:00
Nick Sweeting
a2d827afd6
bump pydantic pkgr to 0.3.5 2024-09-22 15:41:21 -07:00
Nick Sweeting
f8c6ff88ad
add clickable host link back to archivebox server output 2024-09-22 15:41:21 -07:00
Nick Sweeting
2d99f184d3
add mockup for new config loading process 2024-09-22 15:41:21 -07:00
Nick Sweeting
ab0087e106
cleanup chrome and playwright symlink and app names 2024-09-22 15:41:20 -07:00
Nick Sweeting
8945475f8d
bump pydantic-pkgr submodule to 0.3.4 2024-09-21 04:12:59 -07:00
Nick Sweeting
99dd812e3b
bump pydantic-pkgr version to 0.3.4 2024-09-21 04:12:34 -07:00
Nick Sweeting
541cd6c5a1
split puppeteer plugin into Puppeteer, Playwright, and Chrome 2024-09-21 04:12:34 -07:00
Nick Sweeting
33fd7fe439
fix log_list_view trying to seek past end of file on short logs 2024-09-21 04:12:34 -07:00
Nick Sweeting
aa21c56ddd
add timeout limit to bin_version loading in config 2024-09-21 04:12:34 -07:00
Nick Sweeting
575105006d
add LIB_DIR and BIN_DIR to config 2024-09-21 04:12:34 -07:00
Nick Sweeting
6096fb1427
update puppeteer plugin to create a PuppeteerBinProvider for installing browsers 2024-09-21 04:12:34 -07:00
Nick Sweeting
6c39d27ccb
update singlefile plugin to use new npm binprovider and support installing 2024-09-21 04:12:33 -07:00
Nick Sweeting
dd6d7e4975
fix npm and pip binprovider setup and paths search 2024-09-21 04:12:33 -07:00
Nick Sweeting
30def925e7
move all ansible files into plugantic folder for now 2024-09-21 04:12:33 -07:00
Nick Sweeting
11f369ee2d
bump subdependency versions 2024-09-21 04:12:33 -07:00
Nick Sweeting
61df9ea059
fix duplicate when 2024-09-17 02:04:41 -07:00
Nick Sweeting
2c8779736a
change default node version to 21 2024-09-17 02:03:28 -07:00
Nick Sweeting
f29f72f383
fix os checking for npm install 2024-09-17 01:57:06 -07:00
Nick Sweeting
a5cefb5464
install nodesource first 2024-09-17 01:46:02 -07:00
Nick Sweeting
19c7b9c24e
install nodejs and npm packages properly in npm ansible 2024-09-17 01:42:06 -07:00
Nick Sweeting
7ab1a0b873
fix singlefile and puppeteer ansible install 2024-09-17 01:33:32 -07:00
Nick Sweeting
5c0aa6fe59
more ansible fixes 2024-09-17 01:12:49 -07:00
Nick Sweeting
c55cd46ecb
consolidate ansible setup into roles dir 2024-09-17 00:48:47 -07:00
Nick Sweeting
25db6826ec
ignore lib dirs 2024-09-17 00:47:55 -07:00
Nick Sweeting
8d69469887
silence ansible errors about implicit localhost 2024-09-15 20:31:11 -07:00
Nick Sweeting
fab80632b7
add ansible runner code to get facts after execution and benedict 2024-09-15 20:29:02 -07:00
Nick Sweeting
e9ddac0219
fix ansible installed_packages and cacheable facts 2024-09-15 20:28:35 -07:00
Nick Sweeting
56b851ea1b
more ansible playbooks improvements 2024-09-13 04:55:40 -07:00
Nick Sweeting
8557e77a70
add ansible playbooks 2024-09-13 03:27:38 -07:00
Nick Sweeting
3bbf8f69ab
cleanup settings.py sqlite settings more 2024-09-13 03:27:38 -07:00
Nick Sweeting
c887af0278
minor ruff fixes 2024-09-12 02:00:07 -07:00
Nick Sweeting
c00afce71f
upgrade dependency versions to django 5.1 minimum 2024-09-11 17:08:10 -07:00
Nick Sweeting
eae11cba19
add recommended SQLite db connection settings to avoid single-writer lock contention 2024-09-11 16:50:44 -07:00
Nick Sweeting
0187c8b6cb
bump version to 0.8.4 2024-09-10 03:10:30 -07:00
Nick Sweeting
a13f71a86c
allow supervisord to start if pid file is stale 2024-09-10 03:10:10 -07:00
Nick Sweeting
cecca8d169
allow deleting results from list page 2024-09-10 03:09:43 -07:00
Nick Sweeting
f5c878b267
point select2 js resources to local staticfiles 2024-09-10 01:51:08 -07:00
Nick Sweeting
0640018426
bump packages 2024-09-10 01:50:49 -07:00
Nick Sweeting
0bd678c30f
fix init 2024-09-10 00:37:01 -07:00
Nick Sweeting
d680c48942
avoid auto-starting all supervisord workers on startup 2024-09-10 00:19:32 -07:00