![Logo](https://i.imgur.com/PVO88AZ.png)
# ArchiveBox <br/> The open source self-hosted web archive <img src="https://nicksweeting.com/images/archive.png" height="22px"/> [![Github Stars](https://img.shields.io/github/stars/pirate/bookmark-archiver.svg)](https://github.com/pirate/ArchiveBox) [![Twitter URL](https://img.shields.io/twitter/url/http/shields.io.svg?style=social)](https://twitter.com/thesquashSH)
### (Recently [renamed](https://github.com/pirate/ArchiveBox/issues/108) from `Bookmark Archiver`)
"Your own personal Way-Back Machine"

💻 [Demo](https://archive.sweeting.me) | [Website](https://archivebox.io/) | [Source](https://github.com/pirate/ArchiveBox/tree/master) | [Changelog](https://github.com/pirate/ArchiveBox/wiki/Changelog) | [Roadmap](https://github.com/pirate/ArchiveBox/wiki/Roadmap)

▶️ [Quickstart](https://github.com/pirate/ArchiveBox/wiki/Quickstart) | [Details](https://github.com/pirate/ArchiveBox/wiki) | [Configuration](https://github.com/pirate/ArchiveBox/wiki/Configuration) | [Troubleshooting](https://github.com/pirate/ArchiveBox/wiki/Troubleshooting)

---

ArchiveBox saves an archived copy of the websites you visit into a local browsable folder (the actual *content* of each site, not just the list of links). It can archive your entire browsing history, or import links from bookmark managers, RSS feeds, text files, and more.

### Can import links from:
- <img src="https://nicksweeting.com/images/bookmarks.png" height="22px"/> Browser history or bookmarks (Chrome, Firefox, Safari, IE, Opera)
- <img src="https://getpocket.com/favicon.ico" height="22px"/> Pocket
- <img src="https://pinboard.in/favicon.ico" height="22px"/> Pinboard
- <img src="https://nicksweeting.com/images/rss.svg" height="22px"/> RSS or plain text lists
- Shaarli, Delicious, Instapaper, Reddit Saved Posts, Wallabag, Unmark.it, and more!

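For the plain text option, a hand-maintained list of links can be tidied with standard shell tools before importing it. The sketch below is only an illustration: `clean_url_list` and the file names are hypothetical, and it assumes one URL per line with `#` used for comments.

```bash
# Hypothetical helper, not part of ArchiveBox.
# Reads lines on stdin, drops blank lines and '#' comment lines,
# then sorts the remaining lines and removes exact duplicates.
clean_url_list() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' | sort -u
}

# Example use (file names are placeholders):
#   clean_url_list < my_links.txt > urls.txt
#   ./archive urls.txt
```

Deduplicating up front is optional; it just keeps the input list short and easy to review.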
### Can save these things for each site:
- Favicon
- Browsable static HTML archive (wget)
- PDF (Chrome headless)
- Screenshot (Chrome headless)
- HTML dump after 2s of JS running in Chrome headless
- Git repo download (git clone)
- Media download (youtube-dl: video, audio, subtitles, including playlists)
- WARC archive (wget warc)
- Submits URL to archive.org
- Index summary pages: index.html & index.json

The archiving is additive, so you can schedule `./archive` to run regularly and pull new links into the index.
All the saved content is static and indexed with JSON files, so it lives forever, is easily parseable, and requires no always-running backend.

[DEMO: archive.sweeting.me](https://archive.sweeting.me) (website archive / crawler)

```bash
git clone https://github.com/pirate/ArchiveBox.git
cd ArchiveBox
./setup

# Export your bookmarks, then run the archive command to start archiving!
./archive ~/Downloads/firefox_bookmarks.html

# Or to add just one page to your archive
echo 'https://example.com' | ./archive
```
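Because archiving is additive, the commands above can be wrapped in a small script for scheduled runs. This is a minimal sketch under stated assumptions, not an official ArchiveBox script: `run_archive`, `ARCHIVEBOX_DIR`, and `ARCHIVE_CMD` are hypothetical names, and the checkout location is a guess you would adjust.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper for scheduled runs. ARCHIVEBOX_DIR and ARCHIVE_CMD are
# illustrative variables, not settings defined by ArchiveBox.
ARCHIVEBOX_DIR="${ARCHIVEBOX_DIR:-$HOME/ArchiveBox}"
ARCHIVE_CMD="${ARCHIVE_CMD:-./archive}"

run_archive() {
    # $1: absolute path to an exported bookmarks or links file
    if [ ! -f "$1" ]; then
        echo "skipping: $1 not found" >&2
        return 1
    fi
    # Run from the checkout so the relative ./archive path resolves
    (cd "$ARCHIVEBOX_DIR" && "$ARCHIVE_CMD" "$1")
}

# Only run when a file argument is given, e.g. from a cron entry
if [ "$#" -gt 0 ]; then
    run_archive "$1"
fi
```

Saved as, say, `run_archive.sh` (a placeholder name), it could be called from a crontab entry such as `0 3 * * * /path/to/run_archive.sh /path/to/bookmarks_export.html` to pull new links in nightly.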
# Documentation
We use the [GitHub wiki system](https://github.com/pirate/ArchiveBox/wiki) for documentation.

You can also access the docs locally by looking in the [`ArchiveBox/docs/`](https://github.com/pirate/ArchiveBox/wiki/Home) folder.

## Getting Started
- [Details & Motivation](https://github.com/pirate/ArchiveBox/wiki)
- [Quickstart](https://github.com/pirate/ArchiveBox/wiki/Quickstart)
- [Install](https://github.com/pirate/ArchiveBox/wiki/Install)

## Reference
- [Usage](https://github.com/pirate/ArchiveBox/wiki/Usage)
- [Configuration](https://github.com/pirate/ArchiveBox/wiki/Configuration)
- [Chromium Install](https://github.com/pirate/ArchiveBox/wiki/Chromium-Install)
- [Publishing Your Archive](https://github.com/pirate/ArchiveBox/wiki/Publishing-Your-Archive)
- [Troubleshooting](https://github.com/pirate/ArchiveBox/wiki/Troubleshooting)

## More Info
- [Roadmap](https://github.com/pirate/ArchiveBox/wiki/Roadmap)
- [Changelog](https://github.com/pirate/ArchiveBox/wiki/Changelog)
- [Donations](https://github.com/pirate/ArchiveBox/wiki/Donations)
- [Web Archiving Community](https://github.com/pirate/ArchiveBox/wiki/Web-Archiving-Community)

# Screenshots

<img src="https://i.imgur.com/q3Oz9wN.png" width="75%" alt="Desktop Screenshot" align="top"><img src="https://i.imgur.com/TG0fGVo.png" width="25%" alt="Mobile Screenshot" align="top"><br/>
<img src="https://i.imgur.com/3tBL7PU.png" width="100%" alt="CLI Screenshot">

---

[![](https://img.shields.io/badge/Donate-Patreon-%23DD5D76.svg)](https://www.patreon.com/theSquashSH) [![Twitter URL](https://img.shields.io/twitter/url/http/shields.io.svg?style=social)](https://twitter.com/thesquashSH)