37 commits

Author SHA1 Message Date
Dima Gerasimov
75c062f33e Add script to remove entries from index 2018-11-09 20:12:37 +00:00
Nick Sweeting
a2f5fa8ba6 Use a more appropriate coding style from @pirate.
Co-Authored-By: f0086 <mail@aaron-fischer.net>
2018-10-24 21:10:41 +02:00
Aaron Fischer
ebc327bb89 Turn the O(n^2) loop into an O(n) one. 2018-10-21 22:36:32 +02:00
Aaron Fischer
b1b6be4f13 merge_links() used wrong index
Because merge_links() uses the index, we need to get the new_links() _before_ we manipulate the index with write_links_index(). This has the negative side effect that "Adding X new links ..." is output twice (because we execute merge_links() twice). To avoid that, we only output it when only_new is not set.
2018-10-19 22:35:08 +02:00
Aaron Fischer
69c007ce85 Optionally import only new links
When importing a huge list of links periodically (from a big dump of
links from a bookmark service for example) with a lot of broken links,
these links will always be rechecked. To skip this, the environment
variable ONLY_NEW can be used to only import new links and skip the rest
altogether. This partially fixes #95.
2018-10-19 21:34:57 +02:00
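A minimal sketch of the behaviour this commit describes, not the project's actual code. The `ONLY_NEW` variable name comes from the commit message; the helpers `only_new_enabled` and `new_links`, the accepted truthy spellings, and the `url` key on each link are assumptions made for illustration. Building a set of known URLs first keeps the filter linear in the number of links.

```python
import os


def only_new_enabled(environ=os.environ):
    """Return True when ONLY_NEW holds a truthy value.

    The accepted spellings ("1", "true", "yes") are an assumption.
    """
    return environ.get("ONLY_NEW", "").strip().lower() in ("1", "true", "yes")


def new_links(imported_links, existing_links):
    """Return the imported links whose URL is not already in the index.

    Hypothetical helper: each link is assumed to be a dict with a 'url' key.
    """
    # A set gives constant-time membership checks, so the whole filter is O(n).
    known_urls = {link["url"] for link in existing_links}
    return [link for link in imported_links if link["url"] not in known_urls]
```

With a filter like this, links already present in the index are skipped entirely, so broken links from earlier imports are not rechecked on every run.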
William Esz
a59d609571 Fix archive_dot_org submit_url
It was removing functional query parameters. (e.g., https://news.ycombinator.com/item?id=18216459)
2018-10-15 13:09:31 +02:00
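To illustrate the kind of problem this commit fixes, the sketch below drops only the `#fragment` from a URL and keeps its query string when building a submit URL. The helper name `submit_url_for` and the default `prefix` value are assumptions, not taken from the commit.

```python
from urllib.parse import urldefrag


def submit_url_for(url, prefix="https://web.archive.org/save/"):
    """Build a submit URL that drops the #fragment but keeps ?query parameters.

    Hypothetical helper; the prefix is an assumed endpoint.
    """
    without_fragment, _fragment = urldefrag(url)
    return prefix + without_fragment
```

Applied to the example from the commit message, the `?id=18216459` parameter survives, which matters because it identifies the page being archived.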
William Esz
8b850393df Fix archive_dot_org CMD
`curl -I {url}` returns 404
2018-10-15 13:07:20 +02:00
Nick Sweeting
46ad4fd163 fix python io encoding 2018-10-13 22:12:31 -04:00
Nick Sweeting
6c6bdaa3d7 add chrome sandbox option 2018-10-13 22:12:26 -04:00
Nick Sweeting
a6650dfca0 move requirements down a level 2018-10-12 23:48:15 -04:00
Christian Kollmann
fbc90b4279 Enable importing files from wallabag 2018-10-08 18:45:51 +02:00
Pig Monkey
7ed4f8deed support a configurable output directory
Closes #94
2018-09-21 17:41:11 -07:00
Florian Tham
5450afd18b fixes unstable sorting between consecutive runs 2018-09-15 00:08:59 +02:00
Nick Sweeting
8a23358fc8 create robots.txt in output dir 2018-09-12 19:26:00 -04:00
Nick Sweeting
ff10253cef fix user agent breaking all wgets 2018-09-10 22:31:19 -04:00
Nick Sweeting
735f530516 hide scrollbars in screenshots 2018-06-17 19:09:09 -04:00
Nick Sweeting
738513ead8 rejigger wget exception handling order 2018-06-17 19:09:01 -04:00
Nick Sweeting
70530060c2 make ts naming consistent 2018-06-17 18:35:09 -04:00
Nick Sweeting
062d2ddc98 fix archive_url broken on first run 2018-06-17 18:32:52 -04:00
Nick Sweeting
47c3d563b2 tweak index columns and footer links 2018-06-17 18:32:42 -04:00
Nick Sweeting
16b6e0b428 flip collapse and return to archive buttons 2018-06-17 18:16:12 -04:00
Nick Sweeting
c4e0af84e7 add default wget user agent 2018-06-17 18:05:25 -04:00
Nick Sweeting
cad622a137 re-arrange index columns 2018-06-17 17:51:28 -04:00
Nick Sweeting
aa5a674a17 add new migrate_data step to move old folder 2018-06-10 23:01:56 -04:00
Nick Sweeting
755845c69a use latest instead of deriving wget path 2018-06-10 22:43:01 -04:00
Nick Sweeting
5498822a97 fix parsing of chrome and ff histories 2018-06-10 22:13:56 -04:00
Nick Sweeting
9ec1f81bd5 add author and version 2018-06-10 22:02:33 -04:00
Nick Sweeting
b5e2ed1d46 pretty_path the source and index paths in stdout 2018-06-10 22:00:31 -04:00
Nick Sweeting
d6354ac93f rearrange files again 2018-06-10 21:58:48 -04:00
Nick Sweeting
19ade54668 move examples to tests 2018-06-10 21:50:09 -04:00
Nick Sweeting
e74227569e fix timestamp parsing edgecase 2018-06-10 21:27:23 -04:00
Nick Sweeting
c90f4bfd5b cleanup ARCHIVE_DIR paths 2018-06-10 21:26:11 -04:00
Nick Sweeting
46ea65d4f2 remove DS_Store files 2018-06-10 21:22:19 -04:00
Nick Sweeting
d2d1b977fe log wget 500 errors 2018-06-10 21:14:46 -04:00
Nick Sweeting
c1c689cb94 log wget 404 and 403 errors 2018-06-10 21:13:07 -04:00
Nick Sweeting
a287900345 fix template locations 2018-06-10 21:12:55 -04:00
Nick Sweeting
d0f2e693b3 re-arrange and cleanup directory structure 2018-06-10 20:52:15 -04:00