| Author | Commit | Message | Date |
|---|---|---|---|
| Nick Sweeting | 5a2c78e14b | add proper support for URL_WHITELIST instead of using negation regexes | 2021-07-06 23:42:00 -04:00 |
| Nick Sweeting | e4974d3536 | support negation patterns by checking both re.search and re.match | 2021-07-06 23:17:05 -04:00 |
| Nick Sweeting | 36f0646501 | Merge pull request #669 from FliegendeWurst/fix-issue-235: add command: --parser option (fixes #235) | 2021-03-31 00:53:47 -04:00 |
| FliegendeWurst | 60bd9a902e | add command: --parser option | 2021-03-28 10:09:11 +02:00 |
| Nick Sweeting | 33d180afe7 | allow filtering snapshots by timestamp in list, update, and remove cmds | 2021-02-15 20:48:35 -05:00 |
| Nick Sweeting | d6de04a83a | fix lgtm errors | 2021-01-30 06:07:35 -05:00 |
| Tim Gates | 7bf63d91ff | docs: fix simple typo, timstamp -> timestamp. There is a small typo in archivebox/index/__init__.py. Should read `timestamp` rather than `timstamp`. | 2021-01-06 20:03:40 +02:00 |
| Cristian | ce53b0220c | refactor: Remove setup_django from index | 2020-12-11 17:36:31 -05:00 |
| Cristian | a28547cbca | refactor: Remove get_empty_snapshot queryset function and generate it directly | 2020-12-11 16:27:15 -05:00 |
| JDC | 4eeedae815 | Exception handling for indexing and searching | 2020-12-06 01:13:39 +02:00 |
| JDC | 0f7dba07df | feat: add search filter-type to list command | 2020-12-06 01:13:37 +02:00 |
| Nick Sweeting | 7bc13204e6 | Merge branch 'master' into v0.5.0 | 2020-12-05 17:45:16 -05:00 |
| Hawken Rives | 7299b1f5ae | fix "inconsisntencies" typo in error message | 2020-12-02 16:28:26 -06:00 |
| Cristian | fa5de72f9f | refactor: Move indexing logic out of logging module | 2020-11-28 12:34:40 -05:00 |
| Nick Sweeting | c9162a6d09 | remove finished/not finished spinners | 2020-11-28 01:07:02 -05:00 |
| JDC | cbb3d04c12 | Allow list filtering by tag name | 2020-11-13 12:06:12 -05:00 |
| Cristian | 572b46cecf | lint: Remove unused imports | 2020-10-23 06:45:56 -05:00 |
| Cristian | ae1484b8bf | feat: Remove index.json and index.html generation from the regular process | 2020-10-23 06:45:56 -05:00 |
| Angel Rey | ad04fb5300 | Replaced os.path in init index | 2020-10-02 15:46:39 -05:00 |
| Cristian | b18bbf8874 | test: Fix tests post-rebase | 2020-09-17 09:09:52 -05:00 |
| apkallum | b99784b919 | pathlib with / syntax for config, index | 2020-09-17 09:09:52 -05:00 |
| apkallum | 594d9e49ce | first attempt to migrate to Pathlib | 2020-09-17 09:09:52 -05:00 |
| Cristian Vargas | 5e9b3099c6 | Update fix_duplicate_links_in_index docstring (Co-authored-by: Nick Sweeting <git@sweeting.me>) | 2020-09-15 08:05:46 -05:00 |
| Cristian | f55153eab3 | feat: Update update command to work with querysets | 2020-09-15 08:05:46 -05:00 |
| Cristian | fe9604a772 | feat: Add tests for remove command | 2020-09-15 08:05:46 -05:00 |
| Cristian | a8ed72501d | feat: Refactor remove command to use querysets | 2020-09-15 08:05:46 -05:00 |
| Cristian | be520d137a | feat: Refactor add method to use querysets | 2020-09-15 08:05:46 -05:00 |
| Cristian | be0dff8126 | feat: Add tests to refactored init command | 2020-09-15 08:05:46 -05:00 |
| Cristian | 404f333e17 | feat: Refactor get_invalid_folders to work with a queryset instead of a list of links | 2020-09-15 08:05:46 -05:00 |
| Cristian | 6b4b7127b4 | feat: Remove unused imports | 2020-09-15 08:05:46 -05:00 |
| Cristian | b8585dd92e | feat: load_main_index returns a queryset now | 2020-09-15 08:05:46 -05:00 |
| Cristian | c16fdf1b47 | feat: Update data folder check | 2020-09-15 08:05:46 -05:00 |
| Cristian | 874403e667 | feat: Remove patch_main_index | 2020-09-15 08:05:46 -05:00 |
| Cristian | 31343c1367 | feat: Update extractors and add command to use sql index as source of truth | 2020-09-15 08:05:46 -05:00 |
| Cristian | 02f36b2096 | feat: Replace index.json with index.sql as the main index in init | 2020-09-15 08:05:46 -05:00 |
| Nick Sweeting | e87f1d57a3 | fix linters | 2020-08-18 09:22:12 -04:00 |
| Nick Sweeting | f18d92570e | wip attempt to fix timestamp unique constraint errors | 2020-08-18 08:30:09 -04:00 |
| Nick Sweeting | 15efb2d5ed | new generic_html parser for extracting hrefs | 2020-08-18 08:29:05 -04:00 |
| Nick Sweeting | 5f84a7bc6e | better handle the case where json index lags behind sql index | 2020-08-18 08:13:13 -04:00 |
| Nick Sweeting | 77d2f08a5c | show more info in merge conflict error message | 2020-08-18 08:12:35 -04:00 |
| Nick Sweeting | f371032b71 | show warning when killing archivebox during index writing | 2020-08-18 04:38:29 -04:00 |
| Nick Sweeting | 225b63b732 | skip invalid urls at all stages | 2020-08-17 03:12:17 -04:00 |
| Cristian | c073ea141d | feat: Initial oneshot command proposal | 2020-07-29 11:19:06 -05:00 |
| Cristian | 6006b4f93b | refactor: Organize code to remove flake8 issues | 2020-07-24 12:25:25 -05:00 |
| Cristian | 100fa5d1f5 | fix: Guess timestamps and add placeholders to support older indices | 2020-07-24 09:24:52 -05:00 |
| Nick Sweeting | 02a2fefbba | Merge pull request #385 from apkallum/origin/output-permissions | 2020-07-23 11:52:31 -04:00 |
| apkallum | 0ed2a23670 | ensure correct permissions for output folder | 2020-07-23 10:28:10 -04:00 |
| Cristian | 71f5f03a20 | fix: Add notice for issues with index detail | 2020-07-22 17:08:32 -05:00 |
| Cristian | a5550b2105 | fix: Rename logging folder to avoid naming conflicts (and circular import issues) | 2020-07-22 11:02:13 -05:00 |
| Cristian | f4d1b5121e | refactor: Move logging.py to main module to avoid circular import issues | 2020-07-17 18:00:04 -05:00 |