Commit graph

72 commits

Author SHA1 Message Date
Sascha Ißbrücker
7572aa5bc9
Fix auto-tagging when URL includes port (#820) 2024-09-10 21:19:20 +02:00
Sascha Ißbrücker
749bc1ef63
Generate fallback URLs for web archive links (#804)
* generate fallback web archive URL if none exists

* remove fallback web archive snapshot creation

* fix test
2024-08-29 22:45:10 +02:00
Viacheslav Slinko
fa5f78cf71
Automatically add tags to bookmarks based on URL pattern (#736)
* [WIP] DSL

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* dsl2

* full feature

* upd

* upd

* upd

* upd

* rename to auto_tagging_rules

* update migration after rebase

* add REST API tests

* improve settings view

---------

Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@gmail.com>
2024-05-17 09:39:46 +02:00
Sascha Ißbrücker
0f9ba57fef
Load missing thumbnails after enabling the feature (#725) 2024-05-10 09:50:19 +02:00
Viacheslav Slinko
b4376a9ff1
Load bookmark thumbnails after import (#724)
* Update thumbnails after import

* Safer way to download thumbnails

* small test improvements

* add missing tests

---------

Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@gmail.com>
2024-05-10 09:19:00 +02:00
Viacheslav Slinko
87cd4061cb
Add support for bookmark thumbnails (#721)
* Preview Image

* fix tests

* add test

* download preview image

* relative path

* gst

* details view

* fix tests

* Improve preview image styles

* Remove preview image URL from model

* Revert form changes

* update tests

* make it work in uwsgi

---------

Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@gmail.com>
2024-05-07 18:58:52 +02:00
Sascha Ißbrücker
5d2acca122
Allow uploading custom files for bookmarks (#713) 2024-04-20 12:14:11 +02:00
Sascha Ißbrücker
df9f0095cc
Add button for creating missing HTML snapshots (#696)
* add button for creating missing HTML snapshots

* refactor messages in settings view

* show alternative text when there are no missing snapshots
2024-04-14 13:21:15 +02:00
Sascha Ißbrücker
25470edb2c
Remove ads and cookie banners from HTML snapshots (#695)
* integrate ublock with single-file

* reuse chromium profile
2024-04-14 13:09:46 +02:00
pettijohn
2b342c0d56
Add option for passing arguments to single-file command (#691)
* Promoting singlefile timeout to env variable

* Promoting singlefile timeout to env variable

* add tests

* Add LD_SINGLEFILE_OPTIONS support

* add tests

---------

Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@gmail.com>
2024-04-09 20:22:14 +02:00
pettijohn
2d22d6871e
Add option for customizing single-file timeout (#688)
* Promoting singlefile timeout to env variable

* Promoting singlefile timeout to env variable

* add tests

---------

Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@gmail.com>
2024-04-07 20:21:59 +02:00
Sascha Ißbrücker
5e8f5b2c58
Truncate snapshot filename for long URLs (#687) 2024-04-07 18:13:28 +02:00
Sascha Ißbrücker
a6f35119cd
Replace django-background-tasks with huey (#657)
* Replace django-background-tasks with huey

* Add command for migrating tasks

* Add custom admin view

* fix dockerfile

* fix tests

* fix tests in CI

* fix task tests

* implement retries

* improve config

* workaround to avoid running singlefile tasks in parallel

* properly kill single-file sub-processes

* use periodic task for HTML snapshots

* clear task lock when starting task consumer

* remove obsolete cleanup task command
2024-04-07 11:11:14 +02:00
Sascha Ißbrücker
4280ab40c6
Archive snapshots of websites locally (#672)
* Add basic HTML snapshots

* Implement asset list

* Add snapshot creation tests

* Add deletion tests

* Show file size

* Remove snapshots

* Create new snapshots

* Switch to single-file

* CSS tweak

* Remove auto refresh

* Show delete link when there is no file yet

* Add current date to display name

* Add flag for snapshot support

* Add option for disabling automatic snapshots

* Make snapshots sharable

* Document image variants

* Update README.md

* Add migrations

* Fix tests
2024-04-01 15:19:38 +02:00
Sascha Ißbrücker
98b9a9c1a0
Add black code formatter 2024-01-27 11:29:16 +01:00
Jonathan Sundqvist
150dfecc6f
Support Open Graph description (#602)
* Support pytest for running tests

* Support extracting description from meta og:description property

* Revert changes to TOC

* Add test

---------

Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@gmail.com>
2024-01-27 10:28:46 +01:00
Sascha Ißbrücker
935189ecc2
Improve bulk tag performance (#612) 2024-01-27 09:13:21 +01:00
Sascha Ißbrücker
a9512b2333
Include archived bookmarks in export (#579) 2023-11-24 09:21:23 +01:00
Sascha Ißbrücker
28acf3299c
Add support for exporting/importing bookmark notes (#532) 2023-09-10 23:37:37 +02:00
Sascha Ißbrücker
e2e5930985
Allow bulk editing unread and shared state of bookmarks (#517)
* Move bulk actions into select

* Update tests

* Implement bulk read / unread actions

* Implement bulk share/unshare actions

* Show correct archiving actions

* Allow selecting bookmarks across pages

* Dynamically update select across checkbox

* Filter available bulk actions

* Refactor tag autocomplete toggling
2023-08-25 13:54:23 +02:00
Sascha Ißbrücker
8206705876
Add support for PRIVATE flag in import and export (#505)
* Add support for PRIVATE attribute in import

* Add support for PRIVATE attribute in export

* Update import sync tests
2023-08-20 11:44:53 +02:00
Sascha Ißbrücker
5d9e487ec1
Various improvements to favicons (#504)
* Update default favicon provider

* Add domain placeholder for favicon providers

* Fix favicon loader to handle streaming response

* Handle different mime types for favicons

* Use 32px size by default

* Update documentation

* Skip mime-type test for now

* Manually configure image/x-icon mime type
2023-08-15 16:49:58 +02:00
Sascha Ißbrücker
4220ea0b4c
Fix website loader content encoding detection (#482) 2023-05-30 22:04:54 +02:00
Sascha Ißbrücker
43115fd8f2
Add notes to bookmarks (#472)
* Add basic bookmark notes

* Add bookmark list JS to shared bookmarks page

* Allow testing through ngrok

* Improve CSS

* Set notes through API

* Improve notes editing

* Improve notes icon

* Remove transitions for now

* Update keyboard shortcut

* Add bookmark list tests

* Add setting for showing notes permanently

* Add test for toggling notes

* Update API docs

* Allow searching for notes content

* Skip test
2023-05-20 11:54:26 +02:00
Sascha Ißbrücker
74134d3896
Escape texts in exported HTML (#429) 2023-02-18 18:25:54 +01:00
Sascha Ißbrücker
6b4664117b
Fix favicon being cleared by web archive snapshot task (#405) 2023-01-22 14:07:06 +01:00
Sascha Ißbrücker
814401be2e
Add option for showing bookmark favicons (#390)
* Implement favicon loader

* Implement load favicon task

* Show favicons in bookmark list

* Add missing migration

* Load missing favicons on import

* Automatically refresh favicons

* Add enable favicon setting

* Update uwsgi config to host favicons

* Improve settings wording

* Fix favicon loader test setup

* Document LD_FAVICON_PROVIDER setting

* Add refresh favicons button
2023-01-21 16:36:10 +01:00
Sascha Ißbrücker
30da1880a5
Cache website metadata to avoid duplicate scraping (#401)
* Cache website metadata to avoid duplicate scraping

* fix test setup
2023-01-20 22:28:44 +01:00
Sascha Ißbrücker
021d1cd673
Fix bookmark website metadata not being updated when URL changes (#400) 2023-01-20 20:59:09 +01:00
Sascha Ißbrücker
43d52642a6
Fix website loader test 2023-01-14 12:26:04 +01:00
Sascha Ißbrücker
4f9170c48d
Improve website loader logging 2023-01-14 11:24:09 +01:00
Luca
c2d8cde86b
Trim website metadata title and description (#383)
* feat: trim fetched metadata placeholders

* feat: implement trimming serverside

* Add website loader tests

* Address review comments

Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@gmail.com>
2023-01-12 21:06:36 +01:00
Sascha Ißbrücker
2fd7704816
Limit document size for website scraper (#354)
Limits the size of scraped HTML documents to prevent out of memory errors. The scraper will stop reading from the response when it encounters the closing head tag, or if the read content's size exceeds a max limit.

Fixes #345
2022-10-07 21:18:18 +02:00
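The stopping rule described in the commit above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function name `read_limited`, the constant `MAX_CONTENT_LIMIT` and its value are hypothetical, and `chunks` stands in for whatever streamed response body the scraper reads.

```python
from typing import Iterable

# Hypothetical limit; the commit does not state the real value.
MAX_CONTENT_LIMIT = 5_000_000
END_OF_HEAD = b"</head>"


def read_limited(chunks: Iterable[bytes], max_bytes: int = MAX_CONTENT_LIMIT) -> bytes:
    """Collect streamed chunks until </head> is seen or max_bytes is exceeded."""
    buf = bytearray()
    for chunk in chunks:
        # Re-scan a few bytes before the new chunk so a tag that is split
        # across two chunks is still found.
        start = max(0, len(buf) - len(END_OF_HEAD) + 1)
        buf.extend(chunk)
        # Compare case-insensitively so </HEAD> also matches.
        if END_OF_HEAD in bytes(buf[start:]).lower():
            break
        if len(buf) > max_bytes:
            break
    return bytes(buf)
```

Because the function only pulls from an iterator, it stops consuming the response as soon as either condition is met, which is what keeps memory use bounded for very large documents.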
Sascha Ißbrücker
b94eaee833
Setup logging for background tasks 2022-09-11 07:50:08 +02:00
Sascha Ißbrücker
1b35d5b5ef
Prevent rate limit errors in wayback machine API (#339)
The Wayback Machine Save API only allows a limited number of requests within a timespan. This introduces several changes to avoid rate limit errors:
- There will be max. 1 attempt to create a new snapshot
- If a new snapshot could not be created, then attempt to use the latest existing snapshot
- Bulk snapshot updates (bookmark import, load missing snapshots after login) will only attempt to load the latest snapshot instead of creating new ones
2022-09-10 20:43:15 +02:00
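The fallback order listed in the commit above could look roughly like this. It is a sketch only: `resolve_snapshot_url`, `create_snapshot`, `find_latest_snapshot` and `allow_create` are hypothetical names, and the two callables are placeholders for whatever client talks to the Wayback Machine, not a real library API.

```python
from typing import Callable, Optional


def resolve_snapshot_url(
    url: str,
    create_snapshot: Callable[[str], str],
    find_latest_snapshot: Callable[[str], Optional[str]],
    allow_create: bool = True,
) -> Optional[str]:
    """Return a snapshot URL while making at most one save request."""
    if allow_create:
        try:
            # Exactly one attempt; no retry loop that could hit the rate limit.
            return create_snapshot(url)
        except Exception:
            # Creation failed (for example due to rate limiting): fall through
            # and try the latest existing snapshot instead.
            pass
    return find_latest_snapshot(url)
```

Under this sketch, a bulk operation such as an import would call the function with `allow_create=False`, so it only looks up existing snapshots and never issues save requests.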
Sascha Ißbrücker
1b67081773
Skip updating website metadata on edit unless URL has changed (#318)
* Skip updating website metadata on edit unless URL has changed

* Prevent form fetching metadata when editing existing bookmark
2022-08-13 11:21:26 +02:00
Sascha Ißbrücker
fec966f687
Add bookmark sharing (#311)
* Allow marking bookmarks as shared

* Add basic share view

* Ensure tag names in tag cloud are unique

* Filter shared bookmarks by user

* Add link for filtering by user

* Prevent n+1 queries when rendering bookmark list

* Prevent empty query params in return URL

* Fix user select template tag name

* Create shared bookmarks through API

* List shared bookmarks through API

* Show bookmark suggestions for shared view

* Show unique tags in search suggestions

* Sort user options

* Add bookmark sharing feature flag

* Add test for share setting default

* Simplify settings view
2022-08-04 19:37:16 +02:00
Sascha Ißbrücker
e6718be53b
Update unread flag when saving duplicate URL (#306)
Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@gmail.com>
2022-07-26 00:13:41 +02:00
Sascha Ißbrücker
13ff9ac4f8
Add read it later functionality (#304)
* Allow marking bookmarks as unread

* Restructure navigation to include preset filters

* Add mark as read action

* Improve description

* Highlight unread bookmarks visually

* Mark bookmarks as read by default

* Add tests

* Implement toread flag in importer

* Implement admin actions

* Add query tests

* Remove untagged link

* Update api docs

* Reduce height of description textarea

Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@gmail.com>
2022-07-23 22:17:20 +02:00
Sascha Ißbrücker
b618a8b10b
Do not associate tags if bookmark was not imported 2022-07-03 14:44:16 +02:00
Dustin Blackman
b53bd9f112
Bump waybackpy to 3.0.6 (#281)
* fix wayback

* fix tests

* Reuse user agent from website loader

Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@gmail.com>
2022-07-03 06:26:16 +02:00
wahlm
0829d00e5f
no duplication of imported tags (#289)
* no duplication of imported tags (#287)

* Add importer test

* Revert settings test

Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@gmail.com>
2022-07-03 05:34:40 +02:00
Sascha Ißbrücker
e08bf9fd03
Fake request headers to reduce bot detection (#263)
Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@gmail.com>
2022-05-21 13:25:32 +02:00
Sascha Ißbrücker
f4e3d724f0
Improve import performance (#261)
* Run import in batches, cache tags

* Use bulk operations for bookmarks and assigning tags

* Improve naming

* Restore bookmark validation

* Add logging

* Bulk create tags

* Use HTMLParser for parsing bookmarks

* add parser tests

* Add more importer tests

* Add more importer tests

* Remove pyparsing dependency

Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@gmail.com>
2022-05-21 09:27:30 +02:00
Sascha Ißbrücker
f92c3dd403
Make Internet Archive integration opt-in (#250)
* Make web archive integration opt-in

* Add toast message about web archive integration opt-in

* Improve wording for web archive setting

* Add toast admin

* Fix toast clear button visited styles

* Add test for redirect

* Improve wording

* Ensure redirects to same domain

* Improve wording

* Fix snapshot test

Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@gmail.com>
2022-05-14 09:46:51 +02:00
Sascha Ißbrücker
82b4268a26
Ensure tag names don't contain spaces (#184) 2021-12-12 22:54:22 +01:00
Sascha Ißbrücker
d87dde6bae
Create snapshots on web.archive.org for bookmarks (#150)
* Implement initial background tasks concept

* fix property reference

* update requirements.txt

* simplify bookmark null check

* improve web archive url display

* add background tasks test

* add basic supervisor setup

* schedule missing snapshot creation on login

* remove task locks and clear task history before starting background task processor

* batch create snapshots after import

* fix script reference in supervisord.conf

* add option to disable background tasks

* restructure feature overview
2021-09-04 22:31:04 +02:00
Sascha Ißbrücker
e47c00bd07
Add support for micro-, nanosecond timestamps in importer (#151) 2021-08-26 12:33:54 +02:00
Taku Izumi
937858cf58
Fix website scraper decoding content incorrectly (#126)
* Avoid stall on web scraping

This patch fixes stall on web scraping.
I encountered a stall (scraping never ends) when adding
a bookmark of some site.
To avoid this case, adding a timeout parameter at requests.get()
function is a solution.

Signed-off-by: Taku Izumi <admin@orz-style.com>

* Avoid character corruption of scraping some Japanese sites

This patch fixes character corruption of scraping some Japanese
sites. To avoid character corruption, I use r.content instead
of r.text in load_page function.

The cause of the character corruption is an encoding problem, I think.
r.text decodes the body using the encoding that requests guesses from
the HTTP headers, so if the site's actual charset differs, character
corruption occurs. r.content returns the raw bytes, so we can avoid the
encoding problem.

Signed-off-by: Taku Izumi <admin@orz-style.com>

* use charset_normalizer to determine response encoding

Co-authored-by: Taku Izumi <admin@orz-style.com>
Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@googlemail.com>
2021-08-25 10:16:23 +02:00
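The encoding problem described in the commit above can be reproduced without any network access. The sketch below is a stdlib-only stand-in: the final change in that commit uses charset_normalizer to detect the encoding, whereas this example simply tries a fixed list of candidates in order. The function name `decode_body` and the candidate list are assumptions made for illustration.

```python
from typing import Sequence

# Hypothetical candidate list; a real detector inspects the bytes statistically.
CANDIDATE_ENCODINGS = ("utf-8", "shift_jis", "euc_jp")


def decode_body(raw: bytes, candidates: Sequence[str] = CANDIDATE_ENCODINGS) -> str:
    """Decode raw response bytes with the first candidate encoding that fits."""
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Nothing matched: keep what can be read and mark the rest as replaced.
    return raw.decode("utf-8", errors="replace")


# A Shift_JIS page decoded with the wrong charset yields mojibake,
# while working from the raw bytes recovers the original text.
raw = "日本語のサイト".encode("shift_jis")
corrupted = raw.decode("iso-8859-1")
recovered = decode_body(raw)
```

Trying encodings in a fixed order is cruder than real detection, since some byte sequences are valid in more than one encoding; it is shown here only to make the difference between decoded text and raw bytes concrete.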
Sascha Ißbrücker
8047ba6c63
Fix importer not validating bookmark models (#149) 2021-08-25 09:20:01 +02:00