| mlazana | 4d10568477 | exclude links that are in blacklist | 2019-03-24 14:40:26 +02:00 |
| mlazana | 417ee9e302 | add env variable URL_BLACKLIST | 2019-03-23 21:27:41 +02:00 |
| Nick Sweeting | f7a0568a6c | better chrome options loading | 2019-03-22 23:00:53 -04:00 |
| Nick Sweeting | 4c499d77b6 | move dependency checking into config file | 2019-03-22 22:05:45 -04:00 |
| Nick Sweeting | 69f837bbf6 | simplify chrome_user_data_dir default | 2019-03-22 21:37:02 -04:00 |
| Nick Sweeting | 8f73fdbe09 | fix chrome profile precedence order to be equal | 2019-03-22 21:31:55 -04:00 |
| noncetonic | 28758cf16c | Adds CHROME_USER_AGENT | 2019-03-19 10:15:52 -07:00 |
| Nick Sweeting | 1c1bc76ac1 | add chrome headless option and improve default data dir finding | 2019-03-12 17:50:10 -04:00 |
| Nick Sweeting | 8319ccf064 | add docs link to config.py | 2019-03-12 12:48:46 -04:00 |
| Nick Sweeting | 32c39d0fd0 | cleaner output dir spec in config | 2019-03-08 17:51:49 -05:00 |
| Nick Sweeting | 2e10f57f6e | fix relative links from index files | 2019-03-08 17:46:14 -05:00 |
| Nick Sweeting | a74d8410f4 | also check for macOS binary defaults | 2019-03-08 16:25:42 -05:00 |
| Nick Sweeting | d689264365 | add new config and dependency options | 2019-02-21 15:47:15 -05:00 |
| Nick Sweeting | d52c9c5304 | allow passing COOKIES_FILE to wget | 2019-02-21 12:58:51 -05:00 |
| Nick Sweeting | 5a7d00a639 | fetch page title during archiving process | 2019-02-19 01:44:54 -05:00 |
| Nick Sweeting | 9eb79258bb | check chrome version on startup if using chrome | 2019-01-29 17:08:15 -08:00 |
| Nick Sweeting | e60070dbb2 | add youtubedl to help str | 2019-01-25 17:38:47 -08:00 |
| Nick Sweeting | 20de451515 | add linux config example | 2019-01-23 01:48:04 -05:00 |
| Nick Sweeting | 95301c9306 | better default config | 2019-01-23 01:42:55 -05:00 |
| Nick Sweeting | e1be96e597 | working docker-compose with google chrome | 2019-01-23 01:08:23 -05:00 |
| Nick Sweeting | f25be8bc24 | add chrome user agent example in config | 2019-01-20 14:08:33 -05:00 |
| Nick Sweeting | cc8611de83 | Update config.py | 2019-01-14 22:46:35 -05:00 |
| Nick Sweeting | c42fcd42d7 | fetch warc file inline with wget instead of as separate step | 2019-01-14 22:43:14 -05:00 |
| Nick Sweeting | 300b5c6182 | put ArchiveBox and wget version in user agent | 2019-01-14 18:17:30 -05:00 |
| Nick Sweeting | cb60bad1d7 | disable WARC by default | 2019-01-12 20:19:17 -05:00 |
| Nick Sweeting | b650c663a0 | re-enable checking SSL by default | 2019-01-11 22:48:09 -05:00 |
| Nick Sweeting | f83750c545 | cleanup options and make cli flags better for chrome headless timeouts | 2019-01-11 22:38:50 -05:00 |
| Nick Sweeting | e8808b0a1f | add WARC downloading | 2019-01-11 07:02:49 -05:00 |
| Nick Sweeting | a15a331798 | fix media download with longer timeout | 2019-01-11 06:33:35 -05:00 |
| Nick Sweeting | c33f7ba91c | add ability to fetch media | 2019-01-11 05:52:29 -05:00 |
| Nick Sweeting | 0df098717a | make git domains configurable | 2019-01-11 05:27:25 -05:00 |
| Nick Sweeting | 827e15b31a | add git downloading | 2019-01-11 05:18:49 -05:00 |
| Nick Sweeting | 5b6c768a47 | autodetect common chrome binary locations | 2019-01-11 04:34:16 -05:00 |
| Nick Sweeting | 57d42339a4 | rename pip dir archive to archivebox | 2018-12-31 20:53:01 -05:00 |