6 commits

Author SHA1 Message Date
Sascha Ißbrücker
98b9a9c1a0 Add black code formatter 2024-01-27 11:29:16 +01:00
Jonathan Sundqvist
150dfecc6f
Support Open Graph description (#602)
* Support pytest for running tests

* Support extracting description from meta og:description property

* Revert changes to TOC

* Add test

---------

Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@gmail.com>
2024-01-27 10:28:46 +01:00
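The Open Graph fallback named in this commit could look roughly like the sketch below. It is not the project's actual code: the `DescriptionParser` class, the `extract_description` helper, and the choice to prefer the regular meta description over `og:description` are assumptions made for illustration, using only the standard library's `html.parser`.

```python
from html.parser import HTMLParser
from typing import Optional


class DescriptionParser(HTMLParser):
    """Hypothetical parser that collects description candidates from <meta> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.description: Optional[str] = None
        self.og_description: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        content = attr.get("content")
        if content is None:
            return
        # <meta name="description" content="...">
        if (attr.get("name") or "").lower() == "description":
            if self.description is None:
                self.description = content
        # <meta property="og:description" content="...">
        elif attr.get("property") == "og:description":
            if self.og_description is None:
                self.og_description = content


def extract_description(html: str) -> Optional[str]:
    """Return the meta description, falling back to og:description (assumed order)."""
    parser = DescriptionParser()
    parser.feed(html)
    return parser.description or parser.og_description
```

With this sketch, a page that only declares `og:description` still yields a description, while a page declaring both keeps the regular meta description.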
Sascha Ißbrücker
4220ea0b4c
Fix website loader content encoding detection (#482) 2023-05-30 22:04:54 +02:00
Sascha Ißbrücker
30da1880a5
Cache website metadata to avoid duplicate scraping (#401)
* Cache website metadata to avoid duplicate scraping

* fix test setup
2023-01-20 22:28:44 +01:00
Luca
c2d8cde86b
Trim website metadata title and description (#383)
* feat: trim fetched metadata placeholders

* feat: implement trimming serverside

* Add website loader tests

* Address review comments

Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@gmail.com>
2023-01-12 21:06:36 +01:00
Sascha Ißbrücker
2fd7704816
Limit document size for website scraper (#354)
Limits the size of scraped HTML documents to prevent out-of-memory errors. The scraper stops reading from the response when it encounters the closing head tag, or when the size of the content read so far exceeds a maximum limit.

Fixes #345
2022-10-07 21:18:18 +02:00
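The two stop conditions described in this commit (closing head tag found, or size limit exceeded) can be sketched as follows. This is a minimal illustration, not the project's implementation: the `read_limited` function, the `MAX_CONTENT_LIMIT` value, and the decision to truncate the result right after the closing tag are all assumptions. The input is any iterable of byte chunks, such as a streamed HTTP response body.

```python
from typing import Iterable

MAX_CONTENT_LIMIT = 5_000_000  # hypothetical cap in bytes, not taken from the source
END_OF_HEAD = b"</head>"


def read_limited(chunks: Iterable[bytes], max_bytes: int = MAX_CONTENT_LIMIT) -> bytes:
    """Accumulate chunks until the closing head tag appears or the cap is exceeded."""
    content = b""
    for chunk in chunks:
        content += chunk
        # Search the accumulated content (case-insensitively) so a tag that is
        # split across two chunks is still detected.
        end = content.lower().find(END_OF_HEAD)
        if end != -1:
            # Assumption: keep everything up to and including the closing tag.
            return content[: end + len(END_OF_HEAD)]
        if len(content) > max_bytes:
            # Give up on oversized documents instead of reading them fully.
            break
    return content
```

Because the function returns as soon as either condition is met, any remaining chunks are never pulled from the iterable, which is what keeps memory use bounded for very large pages.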