ArchiveBox/archivebox/extractors
2023-08-28 17:27:03 +02:00
__init__.py just use out_dir 2023-05-29 10:03:49 +02:00
archive_org.py enforce utf8 on literally all file operations because windows sucks 2021-03-27 01:16:29 -04:00
dom.py After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file. 2023-08-28 17:27:03 +02:00
favicon.py Add FAVICON_PROVIDER option for custom favicon service 2023-05-05 20:42:36 -05:00
git.py Refactor should_save_extractor methods to accept overwrite parameter 2021-01-21 15:56:32 -06:00
headers.py Refactor should_save_extractor methods to accept overwrite parameter 2021-01-21 15:56:32 -06:00
media.py Don't be strict on unicode errors 2022-09-12 20:40:45 +00:00
mercury.py improve readability and mercury error handling and fix output path to be relative 2021-02-16 15:53:11 -05:00
pdf.py After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file. 2023-08-28 17:27:03 +02:00
readability.py remove unused import 2022-02-09 10:48:51 +08:00
screenshot.py After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file. 2023-08-28 17:27:03 +02:00
singlefile.py add CHROME_TIMEOUT args 2023-03-14 20:29:41 +09:00
title.py improve title extractor 2022-02-08 23:17:52 +08:00
wget.py add timezone support, tons of CSS and layout improvements, more detailed snapshot admin form info, ability to sort by recently updated, better grid view styling, better table layouts, better dark mode support 2021-04-10 04:21:36 -04:00
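
The commit message on dom.py, pdf.py and screenshot.py describes removing Chrome's leftover SingletonLock when an extractor times out. The snippet below is a minimal sketch of that idea, not the code from those files: the function name run_with_lock_cleanup and its parameters are hypothetical, and it assumes the lock file sits directly inside the Chrome user data directory passed in.

```python
import subprocess
from pathlib import Path


def run_with_lock_cleanup(cmd, user_data_dir, timeout):
    """Run cmd; on timeout, delete a leftover SingletonLock and re-raise.

    Hypothetical helper for illustration only. Assumes the lock file is
    located at <user_data_dir>/SingletonLock.
    """
    try:
        return subprocess.run(cmd, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        lock = Path(user_data_dir) / "SingletonLock"
        # unlink() removes the entry itself (including a symlink);
        # missing_ok avoids an error if no lock was left behind (Python 3.8+).
        lock.unlink(missing_ok=True)
        raise
```

The exception is re-raised so the caller can still record the extractor run as failed; only the stale lock is cleaned up.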