Commit graph

197 commits

Author SHA1 Message Date
Nick Sweeting
bb65b2dbec
move almost all config into new archivebox.CONSTANTS 2024-09-25 05:10:09 -07:00
Nick Sweeting
fbfd16e195
fully migrate all search backends to new plugin system 2024-09-24 03:05:43 -07:00
Nick Sweeting
c9c163efed
begin migrating search backends to new plugin system 2024-09-24 02:13:01 -07:00
Nick Sweeting
60154fba5f
add django_huey, huey_monitor, and replace Threads with huey tasks 2024-09-10 00:05:45 -07:00
Nick Sweeting
cbf2a8fdc3
rename datetime fields to _at, massively improve ABID generation safety and determinism 2024-09-04 23:42:36 -07:00
Nick Sweeting
68a39b7392
remove .old_id entirely and make ABID generation only happen once on initial save 2024-09-04 16:40:15 -07:00
Nick Sweeting
d060eaa499
abid gradual improvements, some regrets 2024-09-04 00:08:14 -07:00
Nick Sweeting
6456cb1727
fix NOT NULL constraint failed: core_snapshot.created_by_id 2024-08-28 00:51:16 -07:00
Nick Sweeting
d0fefc0279
add chunk_size=500 to more iterator calls 2024-08-27 19:28:00 -07:00
Nick Sweeting
24fe958ff3
massively improve Snapshot admin list view query performance 2024-08-26 20:16:43 -07:00
Nick Sweeting
09553d8340
hardcode EXTRACTOR_CHOICES to prevent nondeterministic migrations 2024-08-22 15:36:02 -07:00
Nick Sweeting
9b1659c72f
make created_by_id autoapply to any ArchiveResults created under Snapshot 2024-08-20 19:43:07 -07:00
Nick Sweeting
c30ae1d2cb
add created_by_id to all Snapshot creation functions 2024-08-20 19:28:28 -07:00
Nick Sweeting
0285aa52a0
config and attr access improvements 2024-08-20 18:31:21 -07:00
Nick Sweeting
57d31b2b14
fix snapshot uuid 2024-08-18 01:07:21 -07:00
Nick Sweeting
8c50257fe9
move snapshot id to old_id 2024-08-18 00:24:38 -07:00
Nick Sweeting
f574d34357
wrap migrations maker in try catch 2024-06-03 03:02:00 -07:00
Nick Sweeting
774ce3fda7
fix singlefile extractor exception when result is none 2024-05-17 20:12:18 -07:00
Nick Sweeting
241a7c6ab2
add created, modified, updated, created_by and update django admin 2024-05-13 07:50:07 -07:00
Nick Sweeting
0420662174
switch everywhere to use Snapshot.pk and ArchiveResult.pk instead of id 2024-05-13 05:12:12 -07:00
Nick Sweeting
457c42bf84
load EXTRACTORS dynamically using importlib.import_module 2024-05-11 22:28:59 -07:00
Nick Sweeting
c7fc9c004f
add django-signal-webhooks 2024-05-06 06:58:03 -07:00
Nick Sweeting
ac73fb5129 merge fixes 2024-03-26 15:22:40 -07:00
Nick Sweeting
6a4e568d1b new archivebox update speed improvements 2024-02-22 04:50:22 -08:00
Nick Sweeting
1773146833 include more output file locations when considering whether snapshot.is_archived 2024-01-19 03:47:38 -08:00
Nick Sweeting
c1fd2cfa42 tag URLs immediately once added instead of waiting until archival completes 2024-01-03 20:31:46 -08:00
Nick Sweeting
a680724367
Merge branch 'dev' into search_index_extract_html_text 2023-10-27 23:09:28 -07:00
Ross Williams
310b4d1242 Add htmltotext extractor
Saves HTML text nodes and selected element attributes in
`htmltotext.txt` for each Snapshot. Primarily intended to be used
for search indexing.
2023-10-23 21:42:32 -04:00
Nick Sweeting
63ad43f46c
Merge branch 'dev' into method_allow_deny 2023-10-20 04:25:44 -07:00
DanielBatteryStapler
94dacc49c7
Fix archive_org icon "exists" 2023-08-15 23:49:54 -04:00
Ross Williams
46e80dd509 Rename URL_(WHITE|BLACK)LIST to URL_(ALLOW|DENY)LIST
Retain aliases for old configuration files
2023-08-02 09:31:48 -04:00
Micah R Ledbetter
1e50ca243e Add FAVICON_PROVIDER option for custom favicon service 2023-05-05 20:42:36 -05:00
Nick Sweeting
8ebf3e2f93 add config option PREVIEW_ORIGINALS to hide original iframes in snapshot detail pages 2022-05-09 19:31:41 -07:00
hannah98
fc3d2bb4dc rename TAG_SEPARATORS to TAG_SEPARATOR_PATTERN 2022-01-06 14:14:41 +00:00
hannah98
049f88def9 Added TAG_SEPARATORS option to supply a regex of characters to use when splitting tags 2021-12-30 20:19:48 +00:00
Nick Sweeting
d7f01922f3
fix direct assignment of tags to many-to-many set 2021-12-23 12:29:17 -05:00
Nick Sweeting
b1b7ee2b85
Update sql.py 2021-12-23 12:17:55 -05:00
hannah98
4b8962b60b Fix #725 - correctly parse tags on json import 2021-12-20 08:58:58 -06:00
Nick Sweeting
5a2c78e14b add proper support for URL_WHITELIST instead of using negation regexes 2021-07-06 23:42:00 -04:00
Nick Sweeting
e4974d3536 support negation patterns by checking both re.search and re.match 2021-07-06 23:17:05 -04:00
Nick Sweeting
2c6f0a96bf fix extra arg 2021-04-13 02:21:51 -04:00
Nick Sweeting
50b341baab bail out if old index.json is found during init but doesnt contain links 2021-04-12 16:51:45 -04:00
Nick Sweeting
a9986f1f05 add timezone support, tons of CSS and layout improvements, more detailed snapshot admin form info, ability to sort by recently updated, better grid view styling, better table layouts, better dark mode support 2021-04-10 04:21:36 -04:00
Nick Sweeting
59d5423483 fix snapshot icon caching and ordering 2021-04-01 02:22:15 -04:00
Nick Sweeting
36f0646501
Merge pull request #669 from FliegendeWurst/fix-issue-235
add command: --parser option (fixes #235)
2021-03-31 00:53:47 -04:00
FliegendeWurst
60bd9a902e add command: --parser option 2021-03-28 10:09:11 +02:00
Nick Sweeting
1cabde3ccd remove atomic transactions 2021-02-28 22:54:40 -05:00
Nick Sweeting
46a4197514 fix tests 2021-02-18 04:26:56 -05:00
Nick Sweeting
75e1bfd0a9 create_or_update ArchiveResults from history instead of get_or_create 2021-02-18 02:34:20 -05:00
Nick Sweeting
265bcc0264 fix lint errors2 2021-02-16 16:29:41 -05:00