add bit about live updating

Nick Sweeting 2017-05-05 06:56:12 -04:00 committed by GitHub
parent 031a9ec176
commit b12d7e53bc

@@ -8,20 +8,22 @@ Save an archived copy of all websites you star using Pocket, indexed in an html
 ## Quickstart
-**Runtime:** I've found it takes about an hour to download 1000 articles, and they'll take up roughly 1GB.
-Those numbers are from running it on my i5 4-core machine with 50mbps down. YMMV.
-**Dependencies:** Google Chrome headless, wget
+`archive.py` is a script that takes a [Pocket](https://getpocket.com/export) export, and turns it into a browsable html archive that you can store locally or host online.
+**Runtime:** I've found it takes about an hour to download 1000 articles, and they'll take up roughly 1GB.
+Those numbers are from running it single-threaded on my i5 machine with 50mbps down. YMMV.
+**Dependencies:** Google Chrome headless, wget, python3
 ```bash
 brew install Caskroom/versions/google-chrome-canary
-brew install wget
+brew install wget python3
 # OR on linux
 wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
 sudo sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list'
-apt update; apt install google-chrome-beta
+apt update; apt install google-chrome-beta python3 wget
 ```
 **Archiving:**
@@ -42,6 +44,14 @@ You can tweak parameters like screenshot size, file paths, timeouts, etc. in `ar
 You can also tweak the outputted html index in `index_template.html`. It just uses python
 format strings (not a proper templating engine like jinja2), which is why the CSS is double-bracketed `{{...}}`.
+**Live Updating:** (coming soon)
+It's possible to pull links via the pocket API instead of downloading an html export.
+Once I write a script to do that, we can stick this in `cron` and have it auto-update on its own.
+For now you just have to download `ril_export.html` and run `archive.py` each time it updates. The script
+will run fast subsequent times because it only downloads new links that haven't been archived already.
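
To make the "only downloads new links" behavior concrete, here is a minimal Python sketch of the idea. It is not the code in `archive.py`: the helper names `parse_export` and `new_links` are hypothetical, and it assumes the export stores each link as an `<a href="..." time_added="...">` tag and that you already have a set of URLs that were archived on earlier runs.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (url, time_added) pairs from <a> tags in an HTML export."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        # Assumption: each exported link carries href and time_added attributes.
        if attrs.get("href"):
            self.links.append((attrs["href"], attrs.get("time_added", "")))


def parse_export(html_text):
    """Hypothetical helper: return every link found in the export text."""
    parser = LinkCollector()
    parser.feed(html_text)
    return parser.links


def new_links(links, archived_urls):
    """Hypothetical helper: keep only links whose URL is not archived yet."""
    seen = set(archived_urls)
    fresh = []
    for url, added in links:
        if url not in seen:
            seen.add(url)  # also drops duplicates within the same export
            fresh.append((url, added))
    return fresh
```

On a repeat run, something like `new_links(parse_export(open("ril_export.html").read()), already_archived)` would return just the links that still need downloading, which is why later runs finish quickly.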
 ## Publishing Your Archive
 The pocket archive is suitable for serving on your personal server, you can upload the pocket