
2351 commits

Author SHA1 Message Date
Nick Sweeting
734c99cfdb fix parser bailing out with IndexError 2017-10-30 03:03:31 -05:00
Nick Sweeting
965130815d check if folder exists before trying to cleanup 2017-10-30 02:55:39 -05:00
Nick Sweeting
318b9ae1db finished manual link merging logic to fix folder conflicts 2017-10-30 02:50:37 -05:00
Nick Sweeting
db26ab9aa9 add basic html and fancy link index options 2017-10-30 02:49:59 -05:00
Nick Sweeting
a95912679e refactoring and fancy new link index 2017-10-23 04:58:41 -05:00
Nick Sweeting
1249493fcd better config system 2017-10-23 04:57:34 -05:00
Nick Sweeting
77ab8ebda6 cleanup archive.py 2017-10-23 04:56:21 -05:00
Nick Sweeting
3f3531ed4c fix timestamp uniquification 2017-10-18 19:33:31 -05:00
Nick Sweeting
7e4ca8a6ac link docstring 2017-10-18 19:08:33 -05:00
Nick Sweeting
d74394fa45 encoding bugfixes 2017-10-18 17:47:19 -05:00
Nick Sweeting
eb47155a12 major refactor + ability to handle http downloads 2017-10-18 17:39:09 -05:00
Nick Sweeting
f31da350c5 Update README.md 2017-09-07 17:06:37 -05:00
Nick Sweeting
e0a8ca91b4 add link to reddit exporter 2017-09-07 16:56:35 -05:00
Nick Sweeting
d558f58bb8 add Wallabag instructions 2017-09-04 17:11:28 -05:00
Nick Sweeting
01149fa6ac Merge pull request #45 from klauern/patch-1
ensure encoding check is case-insensitive
2017-09-04 16:40:56 -05:00
Nick Klauer
6bc2be4c70 change case
goodness me...:(
2017-09-04 16:38:43 -05:00
Nick Klauer
e6e3bfead1 ensure encoding check is case-insensitive
The encoding check fails unless the value is exactly 'UTF-8', so an encoding of 'utf-8' fails too. This normalizes the case before comparing.
2017-09-04 09:47:52 -05:00
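The commit message above does not include the changed code, so the following is only a minimal sketch of a case-insensitive encoding check. The helper name `is_utf8` is hypothetical and is not taken from the repository.

```python
def is_utf8(encoding):
    """Return True if the encoding name is UTF-8, ignoring letter case.

    `encoding` may be None (for example when a stream reports no encoding),
    in which case the check fails instead of raising.
    """
    # Assumption: normalizing with upper() before comparing is enough;
    # aliases such as 'utf8' (no hyphen) are deliberately not handled here.
    return (encoding or '').upper() == 'UTF-8'


print(is_utf8('utf-8'))
print(is_utf8('UTF-8'))
print(is_utf8('latin-1'))
```

A stricter alternative would be to resolve the name with `codecs.lookup`, which also accepts aliases, but that goes beyond what the commit describes.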
Nick Sweeting
4799998884 Update README.md 2017-08-27 20:10:21 -05:00
Nick Sweeting
9d3e3c5c65 add Sheetsee-pocket link 2017-08-27 20:00:49 -05:00
Nick Sweeting
a0c39bc725 Merge pull request #43 from m-rossi/custom_archive_dir
Add custom archive directory as configuration option.
2017-08-22 01:31:21 -04:00
Marco Rossi
17d07239f0 Add custom archive directory as configuration option. 2017-08-20 17:37:49 +02:00
Nick Sweeting
0a75dee583 Merge pull request #42 from nodiscc/patch-2
clarify TIMEOUT option purpose + archive options
2017-08-03 11:43:18 -05:00
Nick Sweeting
2d4a9c08df Merge pull request #37 from nodiscc/patch-1
README: clarify chrome/ium dependency,
2017-08-03 11:42:22 -05:00
nodiscc
cb6fcf4a2f README: clarify archive method options 2017-08-03 16:29:56 +02:00
nodiscc
e499b23543 clarify TIMEOUT option purpose
This option does not set a timeout for establishing connections (non-responding hosts):
instead it is the maximum allowed time for a page download.
If the page is large, setting a low timeout value may cause the transfer to abort, even if the transfer speed is good enough.
2017-08-03 16:24:41 +02:00
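To illustrate the distinction drawn in the message above, here is a hedged sketch of a deadline on a whole download process, as opposed to a connection timeout. It assumes the limit is enforced through the `timeout` argument of `subprocess.run`; the log does not confirm how the project applies it. The helper `run_with_deadline` and the stand-in commands are hypothetical.

```python
import subprocess
import sys


def run_with_deadline(cmd, timeout):
    """Run cmd and return True if it finished within `timeout` seconds.

    The deadline covers the total runtime of the child process, so a large
    but healthy transfer is aborted just like a stalled one.
    """
    try:
        subprocess.run(cmd, timeout=timeout, check=False)
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return False


# Stand-ins for a quick and a long-running download (not real wget calls).
fast = [sys.executable, '-c', 'pass']
slow = [sys.executable, '-c', 'import time; time.sleep(3)']

print(run_with_deadline(fast, 30))
print(run_with_deadline(slow, 0.5))
```

With a deadline of this kind, a low value can cut off a large page even when the transfer speed is fine, which is the behaviour the commit message warns about.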
nodiscc
5dcfa3e42c README: clarify chrome/ium dependency,
Note that the chromium dependency is only required for screenshots and PDF output to work.
Users only wanting to copy the website using `wget` are not required to install it.
Fix Shaarli export link
2017-08-03 14:58:49 +02:00
Nick Sweeting
1e0bb2c854 Update README.md 2017-07-25 23:37:30 -05:00
Nick Sweeting
2c40249fba Update README.md 2017-07-25 14:15:46 -05:00
Nick Sweeting
50a06f5e52 correct mistake about chrome headless sessions 2017-07-22 02:34:17 -05:00
Nick Sweeting
ea1d3541b2 remove jekyll config 2017-07-11 19:43:45 -05:00
Nick Sweeting
700eb63a74 Set theme jekyll-theme-merlot 2017-07-11 19:41:42 -05:00
Nick Sweeting
d8fa4508e8 add google keywords 2017-07-11 19:40:50 -05:00
Nick Sweeting
58de8cb964 Update README.md 2017-07-08 18:42:10 -05:00
Nick Sweeting
e37401d74b Update fetch.py 2017-07-08 12:51:06 -05:00
Nick Sweeting
3d261c4734 Merge pull request #36 from tscs37/patch-1
Quick Fix for Wrong Argument Splitting on wget
2017-07-08 12:49:17 -05:00
Tim Schuster
ba8307c27e Quick Fix for Wrong Argument Splitting on wget
When setting the user agent, the resulting CLI arguments for wget were `['wget', '--timestamping', '--adjust-extension', '--no-parent', '--page-requisites', '--convert-links', '-', '-', 'u', 's', 'e', 'r', '-', 'a', 'g', 'e', 'n', 't', '=', '"', 'L', 'y', 'n', 'x', '"', 'https://example.org']`; with this fix they become `['wget', '--timestamping', '--adjust-extension', '--no-parent', '--page-requisites', '--convert-links', '--user-agent="Lynx"', '', 'https://example.org']`.
2017-07-08 15:19:43 +02:00
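The character-by-character splitting described above is what Python produces when a string is unpacked with `*` inside a list literal. The sketch below reproduces that pattern under that assumption; it is not the code from fetch.py, and the variable names are illustrative only.

```python
# Hypothetical reconstruction of the bug pattern, not the project's code.
user_agent = 'Lynx'
flag = '--user-agent="{}"'.format(user_agent)

# Bug pattern: unpacking a string iterates over its characters,
# so every character becomes a separate command-line argument.
broken = ['wget', '--convert-links', *flag, 'https://example.org']

# Fix pattern: wrap the string in a list so it unpacks as one argument.
fixed = ['wget', '--convert-links', *[flag], 'https://example.org']

print(broken[2:6])
print(fixed)
```

Keeping the optional flag in a list (empty when the option is unset) lets the same unpacking expression cover both cases without emitting stray arguments.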
Nick Sweeting
4bc3698ee3 Set theme jekyll-theme-minimal 2017-07-07 14:10:59 -05:00
Nick Sweeting
31924c091e add bookmarks output to gitignore 2017-07-06 17:33:57 -05:00
Nick Sweeting
a7d1213459 add note about logged-in sites 2017-07-06 16:58:53 -05:00
Nick Sweeting
7196486766 add docs for changing WGET_USER_AGENT option 2017-07-06 16:54:07 -05:00
Nick Sweeting
acf59faa06 add custom WGET_USER_AGENT override option 2017-07-06 16:51:41 -05:00
Nick Sweeting
83306391ed enforce utf-8 stdout encoding 2017-07-06 16:44:04 -05:00
Nick Sweeting
648223fc6c wonky exception 2017-07-05 17:30:00 -05:00
Nick Sweeting
5122fa0738 better format wget output 2017-07-05 17:28:09 -05:00
Nick Sweeting
02f711b8cb fix CHROME_BINARY and TIMEOUT configs not being used 2017-07-05 17:26:36 -05:00
Nick Sweeting
0d4ebe9418 show full bookmarked time in tooltip 2017-07-05 17:16:10 -05:00
Nick Sweeting
2265f2aaf0 properly handle querystrings for wget .html appended links 2017-07-05 17:15:56 -05:00
Nick Sweeting
6bb91fbb45 dont hide real exceptions 2017-07-05 17:15:39 -05:00
Nick Sweeting
4db30779a3 remove term Star from vocab 2017-07-05 17:15:17 -05:00
Nick Sweeting
b894e0ff92 properly handle Archive.org denied by robots.txt 2017-07-05 16:57:19 -05:00