Mirror of https://github.com/sissbruecker/linkding
synced 2024-11-22 03:13:02 +00:00
937858cf58
* Avoid stall on web scraping

  This patch fixes a stall during web scraping. I encountered a stall (the scrape never finished) when adding a bookmark for a certain site. Passing a timeout parameter to requests.get() avoids this.

  Signed-off-by: Taku Izumi <admin@orz-style.com>

* Avoid character corruption when scraping some Japanese sites

  This patch fixes character corruption when scraping some Japanese sites by using r.content instead of r.text in the load_page function. I think the cause is an encoding problem: r.text returns the body already decoded to Unicode text, so if the site's charset is not the one assumed during decoding, the characters are corrupted. r.content returns the raw bytes, which avoids the encoding problem.

  Signed-off-by: Taku Izumi <admin@orz-style.com>

* Use charset_normalizer to determine response encoding

Co-authored-by: Taku Izumi <admin@orz-style.com>
Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@googlemail.com>

| ||
---|---|---|
.. | ||
api | ||
components | ||
management/commands | ||
migrations | ||
services | ||
static | ||
styles | ||
templates | ||
templatetags | ||
tests | ||
views | ||
__init__.py | ||
admin.py | ||
apps.py | ||
models.py | ||
queries.py | ||
urls.py | ||
utils.py | ||
validators.py |
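
The commit message above describes two fixes to page scraping: passing a timeout to requests.get() and detecting the response encoding from the raw bytes with charset_normalizer. The following is a minimal sketch of that idea, not the project's actual code. The helper name `decode_body`, the constant `DEFAULT_TIMEOUT_SECONDS`, and the 10-second value are assumptions made for illustration; only `load_page`, `requests.get()`, `r.content`, and `charset_normalizer` are named in the commit message.

```python
from typing import Optional

try:
    import requests
    from charset_normalizer import from_bytes
except ImportError:
    # Lets the decoding helper run without the third-party packages;
    # it then falls back to plain UTF-8 decoding.
    requests = None
    from_bytes = None

# Hypothetical value; the commit message does not state the timeout used.
DEFAULT_TIMEOUT_SECONDS = 10


def decode_body(content: bytes) -> str:
    """Decode raw response bytes, guessing the charset from the bytes themselves."""
    if from_bytes is not None:
        best = from_bytes(content).best()  # best CharsetMatch, or None
        if best is not None:
            return str(best)  # str() on a CharsetMatch yields the decoded text
    # Fallback when detection is unavailable or finds nothing.
    return content.decode("utf-8", errors="replace")


def load_page(url: str, timeout: Optional[float] = DEFAULT_TIMEOUT_SECONDS) -> str:
    """Fetch a page without hanging on an unresponsive server."""
    if requests is None:
        raise RuntimeError("the 'requests' package is required to fetch pages")
    # The timeout bounds the connection attempt and each wait for data,
    # not the total download time.
    response = requests.get(url, timeout=timeout)
    # Use the raw bytes (response.content) rather than response.text, so the
    # charset is detected from the body instead of assumed from the headers.
    return decode_body(response.content)
```

Working from `response.content` keeps the decoding decision in one place, so a page served in a charset such as Shift_JIS without a matching header can still be decoded from its bytes.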