ArchiveBox/archivebox/parsers
jim winstead 9f462a87a8 Use feedparser for RSS parsing in generic_rss and pinboard_rss parsers
The feedparser package has 20 years of history and is very good at parsing
RSS and Atom, so use that instead of ad-hoc regex and XML parsing.

The medium_rss and shaarli_rss parsers weren't touched because they are
probably unnecessary. (The special parser for pinboard is only needed because
of how tags work.)

Doesn't include tests because I haven't figured out how to run them in the
docker development setup.

Fixes #1171
2024-03-01 11:25:45 -08:00
__init__.py Merge pull request #1168 from mAAdhaTTah/add-readwise-reader 2023-09-03 21:24:49 -07:00
generic_html.py add timezone support, tons of CSS and layout improvements, more detailed snapshot admin form info, ability to sort by recently updated, better grid view styling, better table layouts, better dark mode support 2021-04-10 04:21:36 -04:00
generic_json.py Handle list of tags in JSON, and be more clever about comma vs. space 2024-02-28 17:38:49 -08:00
generic_rss.py Use feedparser for RSS parsing in generic_rss and pinboard_rss parsers 2024-03-01 11:25:45 -08:00
generic_txt.py add timezone support, tons of CSS and layout improvements, more detailed snapshot admin form info, ability to sort by recently updated, better grid view styling, better table layouts, better dark mode support 2021-04-10 04:21:36 -04:00
medium_rss.py use KEY, NAME, and PARSER to define parsers instead of hardcoding in init 2021-03-31 01:05:49 -04:00
netscape_html.py use KEY, NAME, and PARSER to define parsers instead of hardcoding in init 2021-03-31 01:05:49 -04:00
pinboard_rss.py Use feedparser for RSS parsing in generic_rss and pinboard_rss parsers 2024-03-01 11:25:45 -08:00
pocket_api.py fix typo in pocket_api articl variable name 2021-11-12 19:23:47 -05:00
pocket_html.py use KEY, NAME, and PARSER to define parsers instead of hardcoding in init 2021-03-31 01:05:49 -04:00
readwise_reader_api.py Fix readwise token 2023-10-29 17:27:04 -04:00
shaarli_rss.py use KEY, NAME, and PARSER to define parsers instead of hardcoding in init 2021-03-31 01:05:49 -04:00
url_list.py add timezone support, tons of CSS and layout improvements, more detailed snapshot admin form info, ability to sort by recently updated, better grid view styling, better table layouts, better dark mode support 2021-04-10 04:21:36 -04:00
wallabag_atom.py handle new wallabag export format with newlines mid-tag attributes 2022-05-09 19:07:48 -07:00