Commit graph

205 commits

Author | SHA1 | Message | Date
Nick Sweeting | de2ab43f7f | switch .is_dir and .exists for os.access to avoid PermissionError on startup | 2024-10-08 03:02:34 -07:00
Nick Sweeting | cf1ea8f80f | improve config loading of TMP_DIR, LIB_DIR, move to separate files | 2024-10-07 23:45:11 -07:00
Nick Sweeting | 18474f452b | move config moved out of legacy files and better version output | 2024-09-30 23:52:00 -07:00
Nick Sweeting | b913e6f426 | rename OUTPUT_DIR to DATA_DIR | 2024-09-30 17:44:18 -07:00
Nick Sweeting | 363a499289 | move util.py into misc folder | 2024-09-30 17:25:15 -07:00
Nick Sweeting | dfca4b13b2 | move system.py into misc folder | 2024-09-30 17:13:55 -07:00
Nick Sweeting | 3e5b6ddeae | move config into dedicated global app | 2024-09-30 15:59:05 -07:00
Nick Sweeting | ed45f58758 | use constants in more places | 2024-09-26 02:41:09 -07:00
Nick Sweeting | bb65b2dbec | move almost all config into new archivebox.CONSTANTS | 2024-09-25 05:10:09 -07:00
Nick Sweeting | fbfd16e195 | fully migrate all search backends to new plugin system | 2024-09-24 03:05:43 -07:00
Nick Sweeting | c9c163efed | begin migrating search backends to new plugin system | 2024-09-24 02:13:01 -07:00
Nick Sweeting | 60154fba5f | add django_huey, huey_monitor, and replace Threads with huey tasks | 2024-09-10 00:05:45 -07:00
Nick Sweeting | cbf2a8fdc3 | rename datetime fields to _at, massively improve ABID generation safety and determinism | 2024-09-04 23:42:36 -07:00
Nick Sweeting | 68a39b7392 | remove .old_id entirely and make ABID generation only happen once on initial save | 2024-09-04 16:40:15 -07:00
Nick Sweeting | d060eaa499 | abid gradual improvements, some regrets | 2024-09-04 00:08:14 -07:00
Nick Sweeting | 6456cb1727 | fix NOT NULL constraint failed: core_snapshot.created_by_id | 2024-08-28 00:51:16 -07:00
Nick Sweeting | d0fefc0279 | add chunk_size=500 to more iterator calls | 2024-08-27 19:28:00 -07:00
Nick Sweeting | 24fe958ff3 | massively improve Snapshot admin list view query performance | 2024-08-26 20:16:43 -07:00
Nick Sweeting | 09553d8340 | hardcode EXTRACTOR_CHOICES to prevent nondeterministic migrations | 2024-08-22 15:36:02 -07:00
Nick Sweeting | 9b1659c72f | make created_by_id autoapply to any ArchiveResults created under Snapshot | 2024-08-20 19:43:07 -07:00
Nick Sweeting | c30ae1d2cb | add created_by_id to all Snapshot creation functions | 2024-08-20 19:28:28 -07:00
Nick Sweeting | 0285aa52a0 | config and attr access improvements | 2024-08-20 18:31:21 -07:00
Nick Sweeting | 57d31b2b14 | fix snapshot uuid | 2024-08-18 01:07:21 -07:00
Nick Sweeting | 8c50257fe9 | move snapshot id to old_id | 2024-08-18 00:24:38 -07:00
Nick Sweeting | f574d34357 | wrap migrations maker in try catch | 2024-06-03 03:02:00 -07:00
Nick Sweeting | 774ce3fda7 | fix singlefile extractor exception when result is none | 2024-05-17 20:12:18 -07:00
Nick Sweeting | 241a7c6ab2 | add created, modified, updated, created_by and update django admin | 2024-05-13 07:50:07 -07:00
Nick Sweeting | 0420662174 | switch everywhere to use Snapshot.pk and ArchiveResult.pk instead of id | 2024-05-13 05:12:12 -07:00
Nick Sweeting | 457c42bf84 | load EXTRACTORS dynamically using importlib.import_module | 2024-05-11 22:28:59 -07:00
Nick Sweeting | c7fc9c004f | add django-signal-webhooks | 2024-05-06 06:58:03 -07:00
Nick Sweeting | ac73fb5129 | merge fixes | 2024-03-26 15:22:40 -07:00
Nick Sweeting | 6a4e568d1b | new archivebox update speed improvements | 2024-02-22 04:50:22 -08:00
Nick Sweeting | 1773146833 | include more output file locations when considering whether snapshot.is_archived | 2024-01-19 03:47:38 -08:00
Nick Sweeting | c1fd2cfa42 | tag URLs immediately once added instead of waiting until archival completes | 2024-01-03 20:31:46 -08:00
Nick Sweeting | a680724367 | Merge branch 'dev' into search_index_extract_html_text | 2023-10-27 23:09:28 -07:00
Ross Williams | 310b4d1242 | Add htmltotext extractor. Saves HTML text nodes and selected element attributes in `htmltotext.txt` for each Snapshot. Primarily intended to be used for search indexing. | 2023-10-23 21:42:32 -04:00
Nick Sweeting | 63ad43f46c | Merge branch 'dev' into method_allow_deny | 2023-10-20 04:25:44 -07:00
DanielBatteryStapler | 94dacc49c7 | Fix archive_org icon "exists" | 2023-08-15 23:49:54 -04:00
Ross Williams | 46e80dd509 | Rename URL_(WHITE|BLACK)LIST to URL_(ALLOW|DENY)LIST. Retain aliases for old configuration files. | 2023-08-02 09:31:48 -04:00
Micah R Ledbetter | 1e50ca243e | Add FAVICON_PROVIDER option for custom favicon service | 2023-05-05 20:42:36 -05:00
Nick Sweeting | 8ebf3e2f93 | add config option PREVIEW_ORIGINALS to hide original iframes in snapshot detail pages | 2022-05-09 19:31:41 -07:00
hannah98 | fc3d2bb4dc | rename TAG_SEPARATORS to TAG_SEPARATOR_PATTERN | 2022-01-06 14:14:41 +00:00
hannah98 | 049f88def9 | Added TAG_SEPARATORS option to supply a regex of characters to use when splitting tags | 2021-12-30 20:19:48 +00:00
Nick Sweeting | d7f01922f3 | fix direct assignment of tags to many-to-many set | 2021-12-23 12:29:17 -05:00
Nick Sweeting | b1b7ee2b85 | Update sql.py | 2021-12-23 12:17:55 -05:00
hannah98 | 4b8962b60b | Fix #725 - correctly parse tags on json import | 2021-12-20 08:58:58 -06:00
Nick Sweeting | 5a2c78e14b | add proper support for URL_WHITELIST instead of using negation regexes | 2021-07-06 23:42:00 -04:00
Nick Sweeting | e4974d3536 | support negation patterns by checking both re.search and re.match | 2021-07-06 23:17:05 -04:00
Nick Sweeting | 2c6f0a96bf | fix extra arg | 2021-04-13 02:21:51 -04:00
Nick Sweeting | 50b341baab | bail out if old index.json is found during init but doesnt contain links | 2021-04-12 16:51:45 -04:00