Commit graph

3 commits

Author SHA1 Message Date
Nick Sweeting
457c42bf84
load EXTRACTORS dynamically using importlib.import_module 2024-05-11 22:28:59 -07:00
Nick Sweeting
6a4e568d1b new archivebox update speed improvements 2024-02-22 04:50:22 -08:00
Ross Williams
310b4d1242 Add htmltotext extractor
Saves HTML text nodes and selected element attributes in
`htmltotext.txt` for each Snapshot. Primarily intended to be used
for search indexing.
2023-10-23 21:42:32 -04:00