papersnake | de8e22efb7 | improve title extractor | 2022-02-08 23:17:52 +08:00
Nick Sweeting | 4715ace7dd | ignore BaseException lgtm errors | 2021-05-31 20:59:05 -04:00
Nick Sweeting | 62078a77f8 | show run duration after each archived link in cli output | 2021-04-10 07:52:01 -04:00
Nick Sweeting | a9986f1f05 | add timezone support, tons of CSS and layout improvements, more detailed snapshot admin form info, ability to sort by recently updated, better grid view styling, better table layouts, better dark mode support | 2021-04-10 04:21:36 -04:00
Nick Sweeting | 084cf7ff51 | add more explanation about snapshot.save timestamp bump | 2021-02-17 13:34:46 -05:00
Nick Sweeting | c95698e608 | bump Snapshot.updated time after each extractor, change extractor order | 2021-02-16 15:52:18 -05:00
Dan Arnfield | 5420903102 | Refactor should_save_extractor methods to accept overwrite parameter | 2021-01-21 15:56:32 -06:00
Cristian | 275ad22db7 | refactor: Remove skip_index from archive related functions | 2020-12-08 18:42:25 -05:00
Cristian | f6c73f9aeb | fix: Issue with oneshot command | 2020-12-08 18:42:25 -05:00
JDC | 7903db6dfb | Add ArchiveResult Manager and sorted indexable filter | 2020-12-06 01:13:39 +02:00
JDC | b1f70b2197 | Initial implementation | 2020-12-06 01:12:45 +02:00
Cristian | 33182fd53c | fix: Add missing assignation | 2020-11-04 15:07:45 -05:00
Cristian | d064a3eeff | fix: Handle case when update tries to re-add a link that is not in the sql index | 2020-11-04 15:02:54 -05:00
Cristian | f292cface2 | fix: Add condition for oneshot when archiving links | 2020-11-04 14:40:44 -05:00
Cristian | 4484491fb7 | feat: Create ArchiveResult after finishing an extractor process | 2020-11-04 11:22:55 -05:00
Angel Rey | ce71747538 | replaced os.path in init extractors | 2020-10-02 15:46:39 -05:00
Cristian | 7d3767b882 | fix: oneshot command not running extractors | 2020-09-24 12:56:16 -05:00
Angel Rey | 852e3c9cff | Added headers extractor | 2020-09-23 11:07:00 -05:00
ttimasdf | 357b677363 | fix: add mercury-parser to extractors list | 2020-09-22 18:44:12 -05:00
Cristian | b18bbf8874 | test: Fix tests post-rebase | 2020-09-17 09:09:52 -05:00
Cristian | 50f3f16203 | lint: Remove unused import | 2020-09-15 08:05:46 -05:00
Cristian | 0a83392cbf | fix: Replace any typing with Union[Iterable[Link], QuerySet] in archive_links | 2020-09-15 08:05:46 -05:00
Cristian | 018bd91745 | refactor: Remove get_iter lambda from archive_links | 2020-09-15 08:05:46 -05:00
Cristian | 01fb44fd40 | refactor: Change archive_links check to focus on queryset, so it allows other iterables and not just lists | 2020-09-15 08:05:46 -05:00
Cristian | fe9604a772 | feat: Add tests for remove command | 2020-09-15 08:05:46 -05:00
Cristian | be520d137a | feat: Refactor add method to use querysets | 2020-09-15 08:05:46 -05:00
Cristian | 874403e667 | feat: Remove patch_main_index | 2020-09-15 08:05:46 -05:00
Cristian | 31343c1367 | feat: Update extractors and add command to use sql index as source of truth | 2020-09-15 08:05:46 -05:00
Nick Sweeting | e87f1d57a3 | fix linters | 2020-08-18 09:22:12 -04:00
Nick Sweeting | c9b3bab84d | fix pull title not working | 2020-08-18 08:49:26 -04:00
Nick Sweeting | b0c0a676f8 | re-enable readability and singlefile by default now that its less noisy | 2020-08-18 08:29:46 -04:00
Nick Sweeting | d7d53cfb12 | dont show skipped extractors to reduce visual noise | 2020-08-18 08:13:35 -04:00
Nick Sweeting | b681a477ae | add overwrite flag to add command to force re-archiving | 2020-08-18 04:37:54 -04:00
Nick Sweeting | 58e928520a | tweak log output for skipped methods | 2020-08-14 13:12:50 -04:00
Cristian | b7aa3df8d2 | feat: Disable singlefile and readability by default | 2020-08-12 14:42:21 -05:00
Cristian | 0ec747f64e | feat: Look in wget, singlefile or dom outputs before attempting to download the information again | 2020-08-11 08:37:12 -05:00
Cristian | 7e2b249388 | feat: Initial version of readability extractor | 2020-08-11 08:37:12 -05:00
Cristian | 853685668c | feat: Add initial support for singlefile extractor | 2020-08-03 13:22:06 -05:00
Cristian | e6c571beb2 | fix: Remove title from extractors for oneshot | 2020-07-31 10:24:58 -05:00
Cristian | 8bcb171e74 | fix: Remove support for multiple urls in oneshot command | 2020-07-31 09:05:40 -05:00
Cristian | 3afb2401bc | fix: Add condition to avoid breaking the add command | 2020-07-29 11:53:49 -05:00
Cristian | c073ea141d | feat: Initial oneshot command proposal | 2020-07-29 11:19:06 -05:00
Nick Sweeting | 2e0b751376 | accept methods argument to filder archive_link | 2020-07-28 05:58:38 -04:00
Nick Sweeting | af9084ee95 | update Snapshot.title to latest_title after fetching | 2020-07-28 05:55:09 -04:00
Nick Sweeting | 943453a9a8 | pass overwrite properly | 2020-07-28 05:54:42 -04:00
Cristian | a5550b2105 | fix: Rename logging folder to avoid naming conflicts (and circular import issues) | 2020-07-22 11:02:13 -05:00
Cristian | f4d1b5121e | refactor: Move logging.py to main module to avoid circular import issues | 2020-07-17 18:00:04 -05:00
Nick Sweeting | b4ce20cbe5 | write link details json before and after archiving | 2020-07-13 11:41:27 -04:00
Nick Sweeting | d3bfa98a91 | fix depth flag and tweak logging | 2020-07-13 11:26:34 -04:00
Nick Sweeting | 602e141f08 | fix config file atomic writing bugs | 2020-06-30 02:04:16 -04:00