Dima Gerasimov
75c062f33e
Add script to remove entries from index
2018-11-09 20:12:37 +00:00
Nick Sweeting
a2f5fa8ba6
Use a more appropriate coding style from @pirate.
...
Co-Authored-By: f0086 <mail@aaron-fischer.net>
2018-10-24 21:10:41 +02:00
Aaron Fischer
ebc327bb89
Turn the O(n^2) loop into an O(n) one.
2018-10-21 22:36:32 +02:00
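
The commit above does not say which loop was changed. As a minimal sketch of the general technique, assuming the loop compared every imported link against every existing link, the nested scan can be replaced by a single pass over a set of known URLs. The function name `find_new_links` and the `'url'` key are hypothetical, chosen only for illustration.

```python
def find_new_links(imported_links, existing_links):
    """Return the imported links whose URL is not already known.

    Hypothetical helper: links are assumed to be dicts with a 'url' key.
    """
    # Build the lookup set once: O(n) over the existing links.
    existing_urls = {link['url'] for link in existing_links}
    # Each membership test is O(1) on average, so this pass is O(n) overall
    # instead of O(n^2) for a list-in-list comparison.
    return [link for link in imported_links if link['url'] not in existing_urls]
```

The order of the imported links is preserved, which matters if the result is later printed or sorted by import time.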
Aaron Fischer
b1b6be4f13
merge_links() used wrong index
...
Because merge_links() uses the index, we need to get the new_links() _before_ we modify the index with write_links_index(). This has the negative side effect that the "Adding X new links ..." message is printed twice (because we execute merge_links() twice). To avoid that, we only print output when only_new is not set.
2018-10-19 22:35:08 +02:00
Aaron Fischer
69c007ce85
Optionally import only new links
...
When a huge list of links is imported periodically (for example, from a
big dump of links from a bookmark service) and it contains a lot of
broken links, these links are rechecked on every run. To skip this, the
environment variable ONLY_NEW can be set to import only new links and
skip the rest altogether. This partially fixes #95.
2018-10-19 21:34:57 +02:00
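
A rough sketch of how an ONLY_NEW switch like the one described above could work. The accepted values ("true", "1", "yes"), the function names, and the `'url'` key are assumptions made for illustration; the commit message only names the environment variable.

```python
import os


def only_new_enabled(environ=os.environ):
    """Return True when ONLY_NEW is set to a truthy value.

    Assumption: "true", "1" and "yes" (case-insensitive) count as set.
    """
    return environ.get('ONLY_NEW', 'False').strip().lower() in ('true', '1', 'yes')


def links_to_archive(imported_links, existing_links, environ=os.environ):
    """Pick the links to process on this run (hypothetical helper)."""
    if not only_new_enabled(environ):
        # Default behaviour: every imported link is checked again.
        return list(imported_links)
    # ONLY_NEW: skip anything whose URL is already in the index.
    known_urls = {link['url'] for link in existing_links}
    return [link for link in imported_links if link['url'] not in known_urls]
```

Under these assumptions, running the importer with `ONLY_NEW=True` in the environment would skip previously seen links, including broken ones, instead of rechecking them.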
William Esz
a59d609571
Fix archive_dot_org submit_url
...
It was removing functional query parameters (e.g., https://news.ycombinator.com/item?id=18216459).
2018-10-15 13:09:31 +02:00
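
To illustrate the kind of bug described above: cutting a URL at the first `?` discards parameters that identify the page. The sketch below contrasts the two behaviours. The `https://web.archive.org/save/` prefix and both function names are assumptions, since the commit message does not show the code.

```python
# Assumed submit endpoint prefix; not stated in the commit message.
SAVE_PREFIX = 'https://web.archive.org/save/'


def submit_url_stripped(url):
    """Problematic variant: drops everything after the first '?'."""
    return SAVE_PREFIX + url.split('?', 1)[0]


def submit_url_full(url):
    """Variant that keeps the query string intact."""
    return SAVE_PREFIX + url


url = 'https://news.ycombinator.com/item?id=18216459'
stripped = submit_url_stripped(url)  # loses "?id=18216459", so it points at a different page
full = submit_url_full(url)          # still refers to the specific item
```

For a URL like the one in the commit message, the `id` parameter selects the page itself, so removing it submits the wrong address.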
William Esz
8b850393df
Fix archive_dot_org CMD
...
`curl -I {url}` returns 404
2018-10-15 13:07:20 +02:00
Nick Sweeting
46ad4fd163
fix python io encoding
2018-10-13 22:12:31 -04:00
Nick Sweeting
6c6bdaa3d7
add chrome sandbox option
2018-10-13 22:12:26 -04:00
Nick Sweeting
a6650dfca0
move requirements down a level
2018-10-12 23:48:15 -04:00
Christian Kollmann
fbc90b4279
Enable importing files from wallabag
2018-10-08 18:45:51 +02:00
Pig Monkey
7ed4f8deed
support a configurable output directory
...
Closes #94
2018-09-21 17:41:11 -07:00
Florian Tham
5450afd18b
fixes unstable sorting between consecutive runs
2018-09-15 00:08:59 +02:00
Nick Sweeting
8a23358fc8
create robots.txt in output dir
2018-09-12 19:26:00 -04:00
Nick Sweeting
ff10253cef
fix user agent breaking all wgets
2018-09-10 22:31:19 -04:00
Nick Sweeting
735f530516
hide scrollbars in screenshots
2018-06-17 19:09:09 -04:00
Nick Sweeting
738513ead8
rejigger wget exception handling order
2018-06-17 19:09:01 -04:00
Nick Sweeting
70530060c2
make ts naming consistent
2018-06-17 18:35:09 -04:00
Nick Sweeting
062d2ddc98
fix archive_url broken on first run
2018-06-17 18:32:52 -04:00
Nick Sweeting
47c3d563b2
tweak index columns and footer links
2018-06-17 18:32:42 -04:00
Nick Sweeting
16b6e0b428
flip collapse and return to archive buttons
2018-06-17 18:16:12 -04:00
Nick Sweeting
c4e0af84e7
add default wget user agent
2018-06-17 18:05:25 -04:00
Nick Sweeting
cad622a137
re-arrange index columns
2018-06-17 17:51:28 -04:00
Nick Sweeting
aa5a674a17
add new migrate_data step to move old folder
2018-06-10 23:01:56 -04:00
Nick Sweeting
755845c69a
use latest instead of deriving wget path
2018-06-10 22:43:01 -04:00
Nick Sweeting
5498822a97
fix parsing of chrome and ff histories
2018-06-10 22:13:56 -04:00
Nick Sweeting
9ec1f81bd5
add author and version
2018-06-10 22:02:33 -04:00
Nick Sweeting
b5e2ed1d46
pretty_path the source and index paths in stdout
2018-06-10 22:00:31 -04:00
Nick Sweeting
d6354ac93f
rearrange files again
2018-06-10 21:58:48 -04:00
Nick Sweeting
19ade54668
move examples to tests
2018-06-10 21:50:09 -04:00
Nick Sweeting
e74227569e
fix timestamp parsing edge case
2018-06-10 21:27:23 -04:00
Nick Sweeting
c90f4bfd5b
cleanup ARCHIVE_DIR paths
2018-06-10 21:26:11 -04:00
Nick Sweeting
46ea65d4f2
remove DS_Store files
2018-06-10 21:22:19 -04:00
Nick Sweeting
d2d1b977fe
log wget 500 errors
2018-06-10 21:14:46 -04:00
Nick Sweeting
c1c689cb94
log wget 404 and 403 errors
2018-06-10 21:13:07 -04:00
Nick Sweeting
a287900345
fix template locations
2018-06-10 21:12:55 -04:00
Nick Sweeting
d0f2e693b3
re-arrange and cleanup directory structure
2018-06-10 20:52:15 -04:00