ArchiveBox

mirror of https://github.com/ArchiveBox/ArchiveBox synced 2024-11-23 04:33:11 +00:00

Nick Sweeting 99bb02cd6c add fallback to check wget output dir with port stripped		2024-01-19 03:34:07 -08:00
__init__.py	config.py lint fixes	2023-11-14 02:07:35 -08:00
archive_org.py	enforce utf8 on literally all file operations because windows sucks	2021-03-27 01:16:29 -04:00
dom.py	After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file.	2023-08-28 17:27:03 +02:00
favicon.py	Add FAVICON_PROVIDER option for custom favicon service	2023-05-05 20:42:36 -05:00
git.py	Refactor `should_save_extractor` methods to accept `overwrite` parameter	2021-01-21 15:56:32 -06:00
headers.py	Refactor `should_save_extractor` methods to accept `overwrite` parameter	2021-01-21 15:56:32 -06:00
htmltotext.py	Add htmltotext extractor	2023-10-23 21:42:32 -04:00
media.py	Don't be strict on unicode errors	2022-09-12 20:40:45 +00:00
mercury.py	improve readability and mercury error handling and fix output path to be relative	2021-02-16 15:53:11 -05:00
pdf.py	After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file.	2023-08-28 17:27:03 +02:00
readability.py	tag URLs immediately once added instead of waiting until archival completes	2024-01-03 20:31:46 -08:00
screenshot.py	After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file.	2023-08-28 17:27:03 +02:00
singlefile.py	add CHROME_TIMEOUT args	2023-03-14 20:29:41 +09:00
title.py	prefer dom dump to singlefile for generating readability output	2024-01-03 20:11:06 -08:00
wget.py	add fallback to check wget output dir with port stripped	2024-01-19 03:34:07 -08:00
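
The commit message on dom.py, pdf.py, and screenshot.py describes removing Chrome's leftover SingletonLock after an extractor times out. A minimal sketch of that idea follows. It is not the repository's actual code: the function name `run_with_lock_cleanup` and its parameters are hypothetical, and it assumes the lock sits directly inside the Chrome user data directory.

```python
import subprocess
from pathlib import Path


def run_with_lock_cleanup(cmd, timeout, user_data_dir):
    """Run cmd; on timeout, delete <user_data_dir>/SingletonLock and re-raise.

    Hypothetical helper for illustration only.
    """
    try:
        return subprocess.run(cmd, timeout=timeout, capture_output=True)
    except subprocess.TimeoutExpired:
        lock = Path(user_data_dir) / "SingletonLock"
        # unlink() removes the entry itself, even when it is a dangling
        # symlink; missing_ok avoids an error if no lock was ever created.
        lock.unlink(missing_ok=True)
        raise
```

Calling `unlink(missing_ok=True)` directly, rather than checking `exists()` first, matters if the lock is a symlink whose target is gone, since `exists()` would report False and the stale entry would be left in place.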