Update README.md

Nick Sweeting 2021-04-08 07:53:20 -04:00 committed by GitHub
parent d37aad4045
commit 1224cd197e
GPG key ID: 4AEE18F83AFDEB23

@@ -50,30 +50,29 @@ At the end of the day, the goal is to sleep soundly knowing the part of the inte
 <br/>
-**📦&nbsp; First, get ArchiveBox using [Docker Compose (recommended)](#Quickstart), or Docker, Apt, Brew, Pip (see below for [instructions for each OS](#Quickstart)).**
+**📦&nbsp; First, get ArchiveBox using [Docker Compose (recommended)](#Quickstart), or Docker, Apt, Brew, Pip ([see the instructions below for your OS](#Quickstart)).**
-*No matter which install method you choose, they all roughly follow this process and all provide the same CLI, Web UI, and data folder layout.*
+*No matter which setup method you choose, they all follow this basic process and provide the same CLI, Web UI, and on-disk data layout.*
-1. Once you have ArchiveBox, run this in a new empty folder to get started
+1. Run this in a new empty folder to get started
 ```bash
 archivebox init --setup # create a new collection in the current directory
 ```
 2. Add some URLs you want to archive
 ```bash
-archivebox add 'https://example.com' # add URLs one at a time via args or piped stdin
-archivebox schedule --every=day --depth=1 https://example.com/rss.xml # or have it import URLs on a schedule
+archivebox add 'https://example.com' # add URLs one at a time via args / piped stdin
+archivebox schedule --every=day --depth=1 https://example.com/rss.xml # or pull in URLs on a schedule
 ```
-<sup>For each URL added, ArchiveBox saves several types of HTML snapshot (wget, Chrome headless, singlefile), a PDF, a screenshot, a WARC archive, git repositories, images, audio, video, subtitles, article text, and more.</sup>
+<sup>ArchiveBox will save HTML snapshots (w/ wget, Chrome headless, singlefile), a PDF, a screenshot, a WARC archive, article text, images, audio/video, subtitles, git repos, and more.</sup>
 3. Then view your archived pages
 ```bash
 archivebox server 0.0.0.0:8000 # use the interactive web UI
 archivebox list 'https://example.com' # use the CLI commands (--help for more)
 ls ./archive/*/index.json # or browse directly via the filesystem
 ```
 **⤵️ See the [Quickstart](#Quickstart) below for more...**
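
The `archivebox add` comment in this change mentions piped stdin as an alternative to passing URLs as arguments. Below is a minimal sketch of that workflow, not part of the commit itself. The names `urls_file`, `url_count`, and `DRY_RUN` are hypothetical and exist only for illustration, and the script assumes it is run inside a folder already set up with `archivebox init --setup`. It defaults to a dry run, so nothing is archived unless `DRY_RUN=0` is set.

```bash
set -eu

# Hypothetical URL list for illustration: one URL per line.
urls_file="$(mktemp)"
printf '%s\n' \
    'https://example.com' \
    'https://example.com/rss.xml' > "$urls_file"

# Count the non-empty lines so the dry run can report what would be sent.
url_count="$(grep -c . "$urls_file")"

if [ "${DRY_RUN:-1}" = "0" ]; then
    # Feed the URLs to ArchiveBox on stdin (the "piped stdin" form from the README comment).
    archivebox add < "$urls_file"
else
    # Dry run (the default in this sketch): only describe the action.
    echo "Would pipe ${url_count} URLs into: archivebox add"
fi
```

Keeping the URLs in a plain file makes the list easy to review before anything is archived, and the same file can be reused later.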