[37] Fixes imdb_list and imdb_chart (#2092)

meisnate12 2024-05-31 15:30:14 -04:00 committed by GitHub Action
parent 9a768a77bf
commit 27a6c48b48
8 changed files with 228 additions and 199 deletions

View file

@ -252,6 +252,7 @@ SciFi
ScudLee ScudLee
SDTV SDTV
SemVer SemVer
setuptools
ShawShank ShawShank
Skywalker Skywalker
Sohjiro Sohjiro

View file

@ -1,5 +1,9 @@
# Requirements Update (requirements will need to be reinstalled) # Requirements Update (requirements will need to be reinstalled)
Updated PlexAPI requirement to 4.15.13 Updated PlexAPI requirement to 4.15.13
Update lxml requirement to 5.2.2
Update requests requirement to 2.32.3
Update schedule requirement to 1.2.2
Update setuptools requirement to 70.0.0
# Removed Features # Removed Features
@ -7,6 +11,8 @@ Updated PlexAPI requirement to 4.15.13
Checks requirement versions to print a message if one needs to be updated Checks requirement versions to print a message if one needs to be updated
Added the `mass_added_at_update` operation to mass set the Added At field of Movies and Shows. Added the `mass_added_at_update` operation to mass set the Added At field of Movies and Shows.
Add automated Anime Aggregations for AniDB matching Add automated Anime Aggregations for AniDB matching
Added `top_tamil`, `top_telugu`, `top_malayalam`, `trending_india`, `trending_tamil`, and `trending_telugu` as options for `imdb_chart`
Adds the `sort_by` attribute to `imdb_list`
# Updates # Updates
Changed the `overlay_artwork_filetype` Setting to accept `webp_lossy` and `webp_lossless` while the old attribute `webp` will be treated as `webp_lossy`. Changed the `overlay_artwork_filetype` Setting to accept `webp_lossy` and `webp_lossless` while the old attribute `webp` will be treated as `webp_lossy`.
@ -19,5 +25,7 @@ Fixes #2034 `anilist_userlist` `score` attribute wasn't being validated correctl
Fixes #1367 Error when trying to symlink the logs folder Fixes #1367 Error when trying to symlink the logs folder
Fixes #2028 TMDb IDs were being ignored on the report Fixes #2028 TMDb IDs were being ignored on the report
Fixes a bug when parsing a comma-separated string of ints Fixes a bug when parsing a comma-separated string of ints
Fixes `imdb_chart` only getting 25 results
Fixes `imdb_list` not returning items
Various other Minor Fixes Various other Minor Fixes

View file

@ -1 +1 @@
2.0.1-build36 2.0.1-build37

View file

@ -31,16 +31,22 @@ The expected input are the options below. Multiple values are supported as eithe
The `sync_mode: sync` and `collection_order: custom` Setting are recommended since the lists are continuously updated and in a specific order. The `sync_mode: sync` and `collection_order: custom` Setting are recommended since the lists are continuously updated and in a specific order.
| Name | Attribute | Works with Movies | Works with Shows | | Name | Attribute | Works with Movies | Works with Shows |
|:-------------------------------------------------------------------------------|:-----------------|:------------------------------------------:|:------------------------------------------:| |:-------------------------------------------------------------------------------------|:------------------|:------------------------------------------:|:------------------------------------------:|
| [Box Office](https://www.imdb.com/chart/boxoffice) | `box_office` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-xmark:{ .red } | | [Box Office](https://www.imdb.com/chart/boxoffice) | `box_office` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-xmark:{ .red } |
| [Most Popular Movies](https://www.imdb.com/chart/moviemeter) | `popular_movies` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-xmark:{ .red } | | [Most Popular Movies](https://www.imdb.com/chart/moviemeter) | `popular_movies` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-xmark:{ .red } |
| [Top 250 Movies](https://www.imdb.com/chart/top) | `top_movies` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-xmark:{ .red } | | [Top 250 Movies](https://www.imdb.com/chart/top) | `top_movies` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-xmark:{ .red } |
| [Top Rated English Movies](https://www.imdb.com/chart/top-english-movies) | `top_english` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-xmark:{ .red } | | [Top Rated English Movies](https://www.imdb.com/chart/top-english-movies) | `top_english` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-xmark:{ .red } |
| [Most Popular TV Shows](https://www.imdb.com/chart/tvmeter) | `popular_shows` | :fontawesome-solid-circle-xmark:{ .red } | :fontawesome-solid-circle-check:{ .green } | | [Most Popular TV Shows](https://www.imdb.com/chart/tvmeter) | `popular_shows` | :fontawesome-solid-circle-xmark:{ .red } | :fontawesome-solid-circle-check:{ .green } |
| [Top 250 TV Shows](https://www.imdb.com/chart/toptv) | `top_shows` | :fontawesome-solid-circle-xmark:{ .red } | :fontawesome-solid-circle-check:{ .green } | | [Top 250 TV Shows](https://www.imdb.com/chart/toptv) | `top_shows` | :fontawesome-solid-circle-xmark:{ .red } | :fontawesome-solid-circle-check:{ .green } |
| [Top Rated Indian Movies](https://www.imdb.com/india/top-rated-indian-movies/) | `top_indian` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-xmark:{ .red } | | [Lowest Rated Movies](https://www.imdb.com/chart/bottom) | `lowest_rated` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-xmark:{ .red } |
| [Lowest Rated Movies](https://www.imdb.com/chart/bottom) | `lowest_rated` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-xmark:{ .red } | | [Top Rated Indian Movies](https://www.imdb.com/india/top-rated-indian-movies/) | `top_indian` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-xmark:{ .red } |
| [Top Rated Tamil Movies](https://www.imdb.com/india/top-rated-tamil-movies/) | `top_tamil` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-xmark:{ .red } |
| [Top Rated Telugu Movies](https://www.imdb.com/india/top-rated-telugu-movies/) | `top_telugu` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-xmark:{ .red } |
| [Top Rated Malayalam Movies](https://www.imdb.com/india/top-rated-malayalam-movies/) | `top_malayalam` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-xmark:{ .red } |
| [Trending Indian Movies & Shows](https://www.imdb.com/india/upcoming/) | `trending_india` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-check:{ .green } |
| [Trending Tamil Movies](https://www.imdb.com/india/tamil/) | `trending_tamil` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-xmark:{ .red } |
| [Trending Telugu Movies](https://www.imdb.com/india/telugu/) | `trending_telugu` | :fontawesome-solid-circle-check:{ .green } | :fontawesome-solid-circle-xmark:{ .red } |
```yaml ```yaml
collections: collections:
@ -62,34 +68,48 @@ collections:
Finds every item in an IMDb List. Finds every item in an IMDb List.
The expected input is an IMDb List URL. Multiple values are supported as a list only a comma-separated string will not work. | List Parameter | Description |
|:---------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `list_id` | Specify the IMDb List ID. **This attribute is required.**<br>**Options:** The ID that starts with `ls` found in the URL of the list. (ex. `ls005526372`) |
| `limit` | Specify how many items you want returned by the query.<br>**Options:** Any Integer `0` or greater where `0` gets all items.<br>**Default:** `0` |
| `sort_by` | Choose from one of the many available sort options.<br>**Options:** `custom.asc`, `custom.desc`, `title.asc`, `title.desc`, `rating.asc`, `rating.desc`, `popularity.asc`, `popularity.desc`, `votes.asc`, `votes.desc`, `release.asc`, `release.desc`, `runtime.asc`, `runtime.desc`, `added.asc`, `added.desc`<br>**Default:** `custom.asc` |
Multiple values are supported as a list only; a comma-separated string will not work.
The `sync_mode: sync` and `collection_order: custom` Setting are recommended since the lists are continuously updated and in a specific order. The `sync_mode: sync` and `collection_order: custom` Setting are recommended since the lists are continuously updated and in a specific order.
```yaml ```yaml
collections: collections:
James Bonds: James Bonds:
imdb_list: https://www.imdb.com/list/ls006405458 imdb_list:
list_id: ls006405458
limit: 100
sort_by: rating.asc
collection_order: custom collection_order: custom
sync_mode: sync sync_mode: sync
``` ```
You can also limit the number of items to search for by using the `limit` and `url` parameters under `imdb_list`. You can search multiple lists in one collection by using a list.
```yaml ```yaml
collections: collections:
Christmas: Christmas:
imdb_list: imdb_list:
- url: https://www.imdb.com/list/ls025976544/ - list_id: ls025976544
limit: 10 limit: 10
- url: https://www.imdb.com/list/ls003863000/ sort_by: rating.asc
- list_id: ls003863000
limit: 10 limit: 10
- url: https://www.imdb.com/list/ls027454200/ sort_by: rating.asc
- list_id: ls027454200
limit: 10 limit: 10
- url: https://www.imdb.com/list/ls027886673/ sort_by: rating.asc
- list_id: ls027886673
limit: 10 limit: 10
- url: https://www.imdb.com/list/ls097998599/ sort_by: rating.asc
- list_id: ls097998599
limit: 10 limit: 10
sort_by: rating.asc
sync_mode: sync sync_mode: sync
collection_order: alpha collection_order: alpha
``` ```
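The `list_id` option is described as the ID starting with `ls` found in the list URL. A minimal sketch of that extraction follows, using the same `(ls\d+)` pattern that appears in `validate_imdb_lists` in this commit. The helper name `extract_list_id` is hypothetical and is not part of the project.

```python
import re

# Same pattern the commit uses to pull the ID out of the configured value.
LIST_ID_PATTERN = re.compile(r"(ls\d+)")


def extract_list_id(value):
    """Return the `ls...` ID found in a bare ID or an IMDb list URL, else None.

    Hypothetical helper for illustration only.
    """
    match = LIST_ID_PATTERN.search(str(value).strip())
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_list_id("https://www.imdb.com/list/ls006405458/"))
    print(extract_list_id("ls005526372"))
    print(extract_list_id("not-a-list"))
```

Because the pattern is searched rather than anchored, an old-style full URL and a bare ID should both resolve to the same `list_id`.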
@ -184,7 +204,7 @@ The `sync_mode: sync` and `collection_order: custom` Setting are recommended sin
| Search Parameter | Description | | Search Parameter | Description |
|:------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| |:------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `limit` | Specify how items you want returned by the query.<br>**Options:** Any Integer greater than `0`<br>**Default:** `100` | | `limit` | Specify how items you want returned by the query.<br>**Options:** Any Integer `0` or greater where `0` get all items.<br>**Default:** `100` |
| `sort_by` | Choose from one of the many available sort options.<br>**Options:** `popularity.asc`, `popularity.desc`, `title.asc`, `title.desc`, `rating.asc`, `rating.desc`, `votes.asc`, `votes.desc`, `box_office.asc`, `box_office.desc`, `runtime.asc`, `runtime.desc`, `year.asc`, `year.desc`, `release.asc`, `release.desc`<br>**Default:** `popularity.asc` | | `sort_by` | Choose from one of the many available sort options.<br>**Options:** `popularity.asc`, `popularity.desc`, `title.asc`, `title.desc`, `rating.asc`, `rating.desc`, `votes.asc`, `votes.desc`, `box_office.asc`, `box_office.desc`, `runtime.asc`, `runtime.desc`, `year.asc`, `year.desc`, `release.asc`, `release.desc`<br>**Default:** `popularity.asc` |
| `title` | Search by title name.<br>**Options:** Any String | | `title` | Search by title name.<br>**Options:** Any String |
| `type` | Item must match at least one given type. Can be a comma-separated list.<br>**Options:** `movie`, `tv_series`, `short`, `tv_episode`, `tv_mini_series`, `tv_movie`, `tv_special`, `tv_short`, `video_game`, `video`, `music_video`, `podcast_series`, `podcast_episode` | | `type` | Item must match at least one given type. Can be a comma-separated list.<br>**Options:** `movie`, `tv_series`, `short`, `tv_episode`, `tv_mini_series`, `tv_movie`, `tv_special`, `tv_short`, `video_game`, `video`, `music_video`, `podcast_series`, `podcast_episode` |

View file

@ -241,7 +241,8 @@ collections:
```yaml ```yaml
collections: collections:
Daniel Craig only James Bonds: Daniel Craig only James Bonds:
imdb_list: https://www.imdb.com/list/ls006405458/ imdb_list:
list_id: ls006405458
filters: filters:
actor: Daniel Craig actor: Daniel Craig
``` ```

View file

@ -1479,7 +1479,7 @@ class CollectionBuilder:
raise Failed(f"{self.Type} Error: imdb_id {value} must begin with tt") raise Failed(f"{self.Type} Error: imdb_id {value} must begin with tt")
elif method_name == "imdb_list": elif method_name == "imdb_list":
try: try:
for imdb_dict in self.config.IMDb.validate_imdb_lists(self.Type, method_data, self.language): for imdb_dict in self.config.IMDb.validate_imdb_lists(self.Type, method_data):
self.builders.append((method_name, imdb_dict)) self.builders.append((method_name, imdb_dict))
except Failed as e: except Failed as e:
logger.error(e) logger.error(e)
@ -1739,9 +1739,8 @@ class CollectionBuilder:
final_attributes["letter"] = util.parse(self.Type, "prefix", dict_data, methods=dict_methods, parent=method_name) final_attributes["letter"] = util.parse(self.Type, "prefix", dict_data, methods=dict_methods, parent=method_name)
final_text += f"\nPrefix: {final_attributes['letter']}" final_text += f"\nPrefix: {final_attributes['letter']}"
if "type" in dict_methods: if "type" in dict_methods:
type_list = util.parse(self.Type, "type", dict_data, datatype="commalist", methods=dict_methods, parent=method_name, options=mal.search_types) final_attributes["type"] = util.parse(self.Type, "type", dict_data, methods=dict_methods, parent=method_name, options=mal.search_types)
final_attributes["type"] = ",".join(type_list) final_text += f"\nType: {final_attributes['type']}"
final_text += f"\nType: {' or '.join(type_list)}"
if "status" in dict_methods: if "status" in dict_methods:
final_attributes["status"] = util.parse(self.Type, "status", dict_data, methods=dict_methods, parent=method_name, options=mal.search_status) final_attributes["status"] = util.parse(self.Type, "status", dict_data, methods=dict_methods, parent=method_name, options=mal.search_status)
final_text += f"\nStatus: {final_attributes['status']}" final_text += f"\nStatus: {final_attributes['status']}"

View file

@ -6,8 +6,11 @@ from modules.util import Failed
logger = util.logger logger = util.logger
builders = ["imdb_list", "imdb_id", "imdb_chart", "imdb_watchlist", "imdb_search", "imdb_award"] builders = ["imdb_list", "imdb_id", "imdb_chart", "imdb_watchlist", "imdb_search", "imdb_award"]
movie_charts = ["box_office", "popular_movies", "top_movies", "top_english", "top_indian", "lowest_rated"] movie_charts = [
show_charts = ["popular_shows", "top_shows"] "box_office", "popular_movies", "top_movies", "top_english", "lowest_rated",
"top_indian", "top_tamil", "top_telugu", "top_malayalam", "trending_india", "trending_tamil", "trending_telugu"
]
show_charts = ["popular_shows", "top_shows", "trending_india"]
charts = { charts = {
"box_office": "Box Office", "box_office": "Box Office",
"popular_movies": "Most Popular Movies", "popular_movies": "Most Popular Movies",
@ -15,8 +18,30 @@ charts = {
"top_movies": "Top 250 Movies", "top_movies": "Top 250 Movies",
"top_shows": "Top 250 TV Shows", "top_shows": "Top 250 TV Shows",
"top_english": "Top Rated English Movies", "top_english": "Top Rated English Movies",
"lowest_rated": "Lowest Rated Movies",
"top_tamil": "Top Rated Tamil Movies",
"top_telugu": "Top Rated Telugu Movies",
"top_malayalam": "Top Rated Malayalam Movies",
"trending_india": "Trending Indian Movies & Shows",
"trending_tamil": "Trending Tamil Movies",
"trending_telugu": "Trending Telugu Movies",
"top_indian": "Top Rated Indian Movies", "top_indian": "Top Rated Indian Movies",
"lowest_rated": "Lowest Rated Movies" }
chart_urls = {
"box_office": "chart/boxoffice",
"popular_movies": "chart/moviemeter",
"popular_shows": "chart/tvmeter",
"top_movies": "chart/top",
"top_shows": "chart/toptv",
"top_english": "chart/top-english-movies",
"lowest_rated": "chart/bottom",
"top_indian": "india/top-rated-indian-movies",
"top_tamil": "india/top-rated-tamil-movies",
"top_telugu": "india/top-rated-telugu-movies",
"top_malayalam": "india/top-rated-malayalam-movies",
"trending_india": "india/upcoming",
"trending_tamil": "india/tamil",
"trending_telugu": "india/telugu",
} }
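The `chart_urls` table maps each `imdb_chart` attribute to a path under `base_url`. How the commit joins the two is not shown in this excerpt, so the following is only a sketch under that assumption; `chart_page_url` is a hypothetical name, and the dictionary is a subset copied from the diff.

```python
base_url = "https://www.imdb.com"

# Subset of the chart_urls mapping from the diff.
chart_urls = {
    "top_movies": "chart/top",
    "lowest_rated": "chart/bottom",
    "top_tamil": "india/top-rated-tamil-movies",
    "trending_india": "india/upcoming",
}


def chart_page_url(chart_id):
    """Build the page URL for a chart attribute (hypothetical helper)."""
    if chart_id not in chart_urls:
        raise ValueError(f"unknown imdb_chart option: {chart_id}")
    return f"{base_url}/{chart_urls[chart_id]}"


if __name__ == "__main__":
    for name in chart_urls:
        print(name, "->", chart_page_url(name))
```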
imdb_search_attributes = [ imdb_search_attributes = [
"limit", "sort_by", "title", "type", "type.not", "release.after", "release.before", "rating.gte", "rating.lte", "limit", "sort_by", "title", "type", "type.not", "release.after", "release.before", "rating.gte", "rating.lte",
@ -40,6 +65,17 @@ sort_by_options = {
"release": "RELEASE_DATE", "release": "RELEASE_DATE",
} }
sort_options = [f"{a}.{d}"for a in sort_by_options for d in ["asc", "desc"]] sort_options = [f"{a}.{d}"for a in sort_by_options for d in ["asc", "desc"]]
list_sort_by_options = {
"custom": "LIST_ORDER",
"popularity": "POPULARITY",
"title": "TITLE_REGIONAL",
"rating": "USER_RATING",
"votes": "USER_RATING_COUNT",
"runtime": "RUNTIME",
"added": "DATE_ADDED",
"release": "RELEASE_DATE",
}
list_sort_options = [f"{a}.{d}" for a in list_sort_by_options for d in ["asc", "desc"]]
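A short sketch of how a `sort_by` string such as `rating.desc` can be split into a sort field and a sort order, in the way `_graphql_json` splits its `sort` value. The dictionary is copied from the diff; `split_list_sort` is a hypothetical helper, and whether the list query consumes these exact constants is an assumption.

```python
# Copied from the diff: user-facing sort key -> constant name.
list_sort_by_options = {
    "custom": "LIST_ORDER",
    "popularity": "POPULARITY",
    "title": "TITLE_REGIONAL",
    "rating": "USER_RATING",
    "votes": "USER_RATING_COUNT",
    "runtime": "RUNTIME",
    "added": "DATE_ADDED",
    "release": "RELEASE_DATE",
}

# Every key combined with both directions, e.g. "custom.asc", "custom.desc".
list_sort_options = [f"{a}.{d}" for a in list_sort_by_options for d in ("asc", "desc")]


def split_list_sort(sort="custom.asc"):
    """Return (field constant, upper-cased order) for a sort string (hypothetical)."""
    if sort not in list_sort_options:
        raise ValueError(f"sort_by must be one of: {', '.join(list_sort_options)}")
    sort_by, sort_order = sort.split(".")
    return list_sort_by_options[sort_by], sort_order.upper()


if __name__ == "__main__":
    print(split_list_sort("rating.desc"))
    print(split_list_sort())
```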
title_type_options = { title_type_options = {
"movie": "movie", "tv_series": "tvSeries", "short": "short", "tv_episode": "tvEpisode", "tv_mini_series": "tvMiniSeries", "movie": "movie", "tv_series": "tvSeries", "short": "short", "tv_episode": "tvEpisode", "tv_mini_series": "tvMiniSeries",
"tv_movie": "tvMovie", "tv_special": "tvSpecial", "tv_short": "tvShort", "video_game": "videoGame", "video": "video", "tv_movie": "tvMovie", "tv_special": "tvSpecial", "tv_short": "tvShort", "video_game": "videoGame", "video": "video",
@ -89,7 +125,8 @@ event_options = {
} }
base_url = "https://www.imdb.com" base_url = "https://www.imdb.com"
git_base = "https://raw.githubusercontent.com/Kometa-Team/IMDb-Awards/master" git_base = "https://raw.githubusercontent.com/Kometa-Team/IMDb-Awards/master"
hash_url = "https://raw.githubusercontent.com/Kometa-Team/IMDb-Hash/master/HASH" search_hash_url = "https://raw.githubusercontent.com/Kometa-Team/IMDb-Hash/master/HASH"
list_hash_url = "https://raw.githubusercontent.com/Kometa-Team/IMDb-Hash/master/LIST_HASH"
graphql_url = "https://api.graphql.imdb.com/" graphql_url = "https://api.graphql.imdb.com/"
list_url = f"{base_url}/list/ls" list_url = f"{base_url}/list/ls"
@ -103,7 +140,8 @@ class IMDb:
self._episode_ratings = None self._episode_ratings = None
self._events_validation = None self._events_validation = None
self._events = {} self._events = {}
self._hash = None self._search_hash = None
self._list_hash = None
self.event_url_validation = {} self.event_url_validation = {}
def _request(self, url, language=None, xpath=None, params=None): def _request(self, url, language=None, xpath=None, params=None):
@ -117,10 +155,16 @@ class IMDb:
return self.requests.post_json(graphql_url, headers={"content-type": "application/json"}, json=json_data) return self.requests.post_json(graphql_url, headers={"content-type": "application/json"}, json=json_data)
@property @property
def hash(self): def search_hash(self):
if self._hash is None: if self._search_hash is None:
self._hash = self.requests.get(hash_url).text.strip() self._search_hash = self.requests.get(search_hash_url).text.strip()
return self._hash return self._search_hash
@property
def list_hash(self):
if self._list_hash is None:
self._list_hash = self.requests.get(list_hash_url).text.strip()
return self._list_hash
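Both hash properties follow the same lazy pattern: fetch once on first access, then reuse the cached value. The sketch below isolates that pattern. `HashCache` and the `fetch` callable are hypothetical stand-ins for the `IMDb` class and its `self.requests.get(...).text` call.

```python
class HashCache:
    """Illustrative stand-in for the lazy hash properties (not the project's class)."""

    def __init__(self, fetch):
        # `fetch` is a hypothetical zero-argument callable returning the raw hash text.
        self._fetch = fetch
        self._list_hash = None

    @property
    def list_hash(self):
        # Only call the fetcher the first time the property is read.
        if self._list_hash is None:
            self._list_hash = self._fetch().strip()
        return self._list_hash


if __name__ == "__main__":
    calls = []

    def fake_fetch():
        calls.append(1)
        return "abc123\n"

    cache = HashCache(fake_fetch)
    print(cache.list_hash, cache.list_hash, len(calls))
```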
@property @property
def events_validation(self): def events_validation(self):
@ -133,26 +177,29 @@ class IMDb:
self._events[event_id] = self.requests.get_yaml(f"{git_base}/events/{event_id}.yml").data self._events[event_id] = self.requests.get_yaml(f"{git_base}/events/{event_id}.yml").data
return self._events[event_id] return self._events[event_id]
def validate_imdb_lists(self, err_type, imdb_lists, language): def validate_imdb_lists(self, err_type, imdb_lists):
valid_lists = [] valid_lists = []
for imdb_dict in util.get_list(imdb_lists, split=False): for imdb_dict in util.get_list(imdb_lists, split=False):
if not isinstance(imdb_dict, dict): if not isinstance(imdb_dict, dict):
imdb_dict = {"url": imdb_dict} imdb_dict = {"list_id": imdb_dict}
if "url" in imdb_dict and "list_id" not in imdb_dict:
imdb_dict["list_id"] = imdb_dict["url"]
dict_methods = {dm.lower(): dm for dm in imdb_dict} dict_methods = {dm.lower(): dm for dm in imdb_dict}
if "url" not in dict_methods: if "list_id" not in dict_methods:
raise Failed(f"{err_type} Error: imdb_list url attribute not found") raise Failed(f"{err_type} Error: imdb_list list_id attribute not found")
elif imdb_dict[dict_methods["url"]] is None: elif imdb_dict[dict_methods["list_id"]] is None:
raise Failed(f"{err_type} Error: imdb_list url attribute is blank") raise Failed(f"{err_type} Error: imdb_list list_id attribute is blank")
else: else:
imdb_url = imdb_dict[dict_methods["url"]].strip() imdb_url = imdb_dict[dict_methods["list_id"]].strip()
if imdb_url.startswith(f"{base_url}/search/"): if imdb_url.startswith(f"{base_url}/search/"):
raise Failed("IMDb Error: URLs with https://www.imdb.com/search/ no longer works with imdb_list use imdb_search.") raise Failed("IMDb Error: URLs with https://www.imdb.com/search/ no longer works with imdb_list use imdb_search.")
if imdb_url.startswith(f"{base_url}/filmosearch/"): if imdb_url.startswith(f"{base_url}/filmosearch/"):
raise Failed("IMDb Error: URLs with https://www.imdb.com/filmosearch/ no longer works with imdb_list use imdb_search.") raise Failed("IMDb Error: URLs with https://www.imdb.com/filmosearch/ no longer works with imdb_list use imdb_search.")
if not imdb_url.startswith(list_url): search = re.search(r"(ls\d+)", imdb_url)
raise Failed(f"IMDb Error: imdb_list URLs must begin with {list_url}") if not search:
self._total(imdb_url, language) raise Failed("IMDb Error: imdb_list list_id must begin with ls (ex. ls005526372)")
list_count = None new_dict = {"list_id": search.group(1)}
if "limit" in dict_methods: if "limit" in dict_methods:
if imdb_dict[dict_methods["limit"]] is None: if imdb_dict[dict_methods["limit"]] is None:
logger.warning(f"{err_type} Warning: imdb_list limit attribute is blank using 0 as default") logger.warning(f"{err_type} Warning: imdb_list limit attribute is blank using 0 as default")
@ -160,14 +207,18 @@ class IMDb:
try: try:
value = int(str(imdb_dict[dict_methods["limit"]])) value = int(str(imdb_dict[dict_methods["limit"]]))
if 0 <= value: if 0 <= value:
list_count = value new_dict["limit"] = value
except ValueError: except ValueError:
pass pass
if list_count is None: if "limit" not in new_dict:
logger.warning(f"{err_type} Warning: imdb_list limit attribute must be an integer 0 or greater using 0 as default") logger.warning(f"{err_type} Warning: imdb_list limit attribute: {imdb_dict[dict_methods['limit']]} must be an integer 0 or greater using 0 as default")
if list_count is None: if "limit" not in new_dict:
list_count = 0 new_dict["limit"] = 0
valid_lists.append({"url": imdb_url, "limit": list_count})
if "sort_by" in dict_methods:
new_dict["sort_by"] = util.parse(err_type, "sort_by", imdb_dict, methods=dict_methods, parent="imdb_list", default="custom.asc", options=list_sort_options)
valid_lists.append(new_dict)
return valid_lists return valid_lists
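The `limit` handling above falls back to `0` (all items) whenever the value is blank, not an integer, or negative. A standalone sketch of that rule follows; `parse_limit` is a hypothetical name, and the warnings logged by the real method are omitted.

```python
def parse_limit(raw):
    """Return `raw` as an int >= 0, or 0 when it is blank or invalid.

    Hypothetical helper mirroring the fallback rule; 0 means "all items".
    """
    if raw is None:
        return 0
    try:
        value = int(str(raw))
    except ValueError:
        return 0
    return value if value >= 0 else 0


if __name__ == "__main__":
    for sample in (10, "25", None, "abc", -5):
        print(repr(sample), "->", parse_limit(sample))
```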
def validate_imdb_watchlists(self, err_type, users, language): def validate_imdb_watchlists(self, err_type, users, language):
@ -220,63 +271,12 @@ class IMDb:
return [f for f in json.loads(jsonline[jsonline.find('{'):-2])["starbars"]] return [f for f in json.loads(jsonline[jsonline.find('{'):-2])["starbars"]]
raise Failed(f"IMDb Error: Failed to parse URL: {imdb_url}") raise Failed(f"IMDb Error: Failed to parse URL: {imdb_url}")
def _total(self, imdb_url, language): def _graphql_json(self, data, search=True):
xpath_total = "//div[@class='desc lister-total-num-results']/text()" page_limit = 250 if search else 100
per_page = 100
results = self._request(imdb_url, language=language, xpath=xpath_total)
total = 0
for result in results:
if "title" in result:
try:
total = int(re.findall("(\\d+) title", result.replace(",", ""))[0])
break
except IndexError:
pass
if total > 0:
return total, per_page
raise Failed(f"IMDb Error: Failed to parse URL: {imdb_url}")
def _ids_from_url(self, imdb_url, language, limit):
total, item_count = self._total(imdb_url, language)
imdb_ids = []
parsed_url = urlparse(imdb_url)
params = parse_qs(parsed_url.query)
imdb_base = parsed_url._replace(query=None).geturl() # noqa
params.pop("start", None) # noqa
params.pop("count", None) # noqa
params.pop("page", None) # noqa
logger.trace(f"URL: {imdb_base}")
logger.trace(f"Params: {params}")
if limit < 1 or total < limit:
limit = total
remainder = limit % item_count
if remainder == 0:
remainder = item_count
num_of_pages = math.ceil(int(limit) / item_count)
for i in range(1, num_of_pages + 1):
start_num = (i - 1) * item_count + 1
logger.ghost(f"Parsing Page {i}/{num_of_pages} {start_num}-{limit if i == num_of_pages else i * item_count}")
params["page"] = i # noqa
ids_found = self._request(imdb_base, language=language, xpath="//div[contains(@class, 'lister-item-image')]//a/img//@data-tconst", params=params)
if i == num_of_pages:
ids_found = ids_found[:remainder]
imdb_ids.extend(ids_found)
time.sleep(2)
logger.exorcise()
if len(imdb_ids) > 0:
return imdb_ids
raise Failed(f"IMDb Error: No IMDb IDs Found at {imdb_url}")
def _search_json(self, data):
out = { out = {
"locale": "en-US", "locale": "en-US",
"first": data["limit"] if "limit" in data and 0 < data["limit"] < 250 else 250, "first": data["limit"] if "limit" in data and 0 < data["limit"] < page_limit else page_limit,
"titleTypeConstraint": {"anyTitleTypeIds": [title_type_options[t] for t in data["type"]] if "type" in data else []},
} }
sort = data["sort_by"] if "sort_by" in data else "popularity.asc"
sort_by, sort_order = sort.split(".")
out["sortBy"] = sort_by_options[sort_by]
out["sortOrder"] = sort_order.upper()
def check_constraint(bases, mods, constraint, lower="", translation=None, range_name=None): def check_constraint(bases, mods, constraint, lower="", translation=None, range_name=None):
if not isinstance(bases, list): if not isinstance(bases, list):
@ -302,84 +302,96 @@ class IMDb:
if range_data: if range_data:
out[constraint][range_name[i]] = range_data out[constraint][range_name[i]] = range_data
check_constraint("type", [("not", "excludeTitleTypeIds")], "titleTypeConstraint", translation=title_type_options) sort = data["sort_by"] if "sort_by" in data else "popularity.asc" if search else "custom.asc"
check_constraint("release", [("after", "start"), ("before", "end")], "releaseDateConstraint", range_name="releaseDateRange") sort_by, sort_order = sort.split(".")
check_constraint("title", [("", "searchTerm")], "titleTextConstraint")
check_constraint(["rating", "votes"], [("gte", "min"), ("lte", "max")], "userRatingsConstraint", range_name=["aggregateRatingRange", "ratingsCountRange"])
check_constraint("genre", [("", "all"), ("any", "any"), ("not", "exclude")], "genreConstraint", lower="GenreIds", translation=genre_options)
check_constraint("topic", [("", "all"), ("any", "any"), ("not", "no")], "withTitleDataConstraint", lower="DataAvailable", translation=topic_options)
check_constraint("alternate_version", [("", "all"), ("any", "any")], "alternateVersionMatchingConstraint", lower="AlternateVersionTextTerms")
check_constraint("crazy_credit", [("", "all"), ("any", "any")], "crazyCreditMatchingConstraint", lower="CrazyCreditTextTerms")
check_constraint("location", [("", "all"), ("any", "any")], "filmingLocationConstraint", lower="Locations")
check_constraint("goof", [("", "all"), ("any", "any")], "goofMatchingConstraint", lower="GoofTextTerms")
check_constraint("plot", [("", "all"), ("any", "any")], "plotMatchingConstraint", lower="PlotTextTerms")
check_constraint("quote", [("", "all"), ("any", "any")], "quoteMatchingConstraint", lower="QuoteTextTerms")
check_constraint("soundtrack", [("", "all"), ("any", "any")], "soundtrackMatchingConstraint", lower="SoundtrackTextTerms")
check_constraint("trivia", [("", "all"), ("any", "any")], "triviaMatchingConstraint", lower="TriviaTextTerms")
if "event" in data or "event.winning" in data: if search:
input_list = [] out["titleTypeConstraint"] = {"anyTitleTypeIds": [title_type_options[t] for t in data["type"]] if "type" in data else []}
if "event" in data: out["sortBy"] = sort_by_options[sort_by]
input_list.extend([event_options[a] if a in event_options else {"eventId": a} for a in data["event"]]) out["sortOrder"] = sort_order.upper()
if "event.winning" in data:
for a in data["event.winning"]:
award_dict = event_options[a] if a in event_options else {"eventId": a}
award_dict["winnerFilter"] = "WINNER_ONLY"
input_list.append(award_dict)
out["awardConstraint"] = {"allEventNominations": input_list}
if any([a in data for a in ["imdb_top", "imdb_bottom", "popularity.gte", "popularity.lte"]]): check_constraint("type", [("not", "excludeTitleTypeIds")], "titleTypeConstraint", translation=title_type_options)
ranges = [] check_constraint("release", [("after", "start"), ("before", "end")], "releaseDateConstraint", range_name="releaseDateRange")
if "imdb_top" in data: check_constraint("title", [("", "searchTerm")], "titleTextConstraint")
ranges.append({"rankRange": {"max": data["imdb_top"]}, "rankedTitleListType": "TOP_RATED_MOVIES"}) check_constraint(["rating", "votes"], [("gte", "min"), ("lte", "max")], "userRatingsConstraint", range_name=["aggregateRatingRange", "ratingsCountRange"])
if "imdb_bottom" in data: check_constraint("genre", [("", "all"), ("any", "any"), ("not", "exclude")], "genreConstraint", lower="GenreIds", translation=genre_options)
ranges.append({"rankRange": {"max": data["imdb_bottom"]}, "rankedTitleListType": "LOWEST_RATED_MOVIES"}) check_constraint("topic", [("", "all"), ("any", "any"), ("not", "no")], "withTitleDataConstraint", lower="DataAvailable", translation=topic_options)
if "popularity.gte" in data or "popularity.lte" in data: check_constraint("alternate_version", [("", "all"), ("any", "any")], "alternateVersionMatchingConstraint", lower="AlternateVersionTextTerms")
num_range = {} check_constraint("crazy_credit", [("", "all"), ("any", "any")], "crazyCreditMatchingConstraint", lower="CrazyCreditTextTerms")
if "popularity.lte" in data: check_constraint("location", [("", "all"), ("any", "any")], "filmingLocationConstraint", lower="Locations")
num_range["max"] = data["popularity.lte"] check_constraint("goof", [("", "all"), ("any", "any")], "goofMatchingConstraint", lower="GoofTextTerms")
if "popularity.gte" in data: check_constraint("plot", [("", "all"), ("any", "any")], "plotMatchingConstraint", lower="PlotTextTerms")
num_range["min"] = data["popularity.gte"] check_constraint("quote", [("", "all"), ("any", "any")], "quoteMatchingConstraint", lower="QuoteTextTerms")
ranges.append({"rankRange": num_range, "rankedTitleListType": "TITLE_METER"}) check_constraint("soundtrack", [("", "all"), ("any", "any")], "soundtrackMatchingConstraint", lower="SoundtrackTextTerms")
out["rankedTitleListConstraint"] = {"allRankedTitleLists": ranges} check_constraint("trivia", [("", "all"), ("any", "any")], "triviaMatchingConstraint", lower="TriviaTextTerms")
check_constraint("series", [("", "any"), ("not", "exclude")], "episodicConstraint", lower="SeriesIds") if "event" in data or "event.winning" in data:
check_constraint("list", [("", "inAllLists"), ("any", "inAnyList"), ("not", "notInAnyList")], "listConstraint") input_list = []
if "event" in data:
input_list.extend([event_options[a] if a in event_options else {"eventId": a} for a in data["event"]])
if "event.winning" in data:
for a in data["event.winning"]:
award_dict = event_options[a] if a in event_options else {"eventId": a}
award_dict["winnerFilter"] = "WINNER_ONLY"
input_list.append(award_dict)
out["awardConstraint"] = {"allEventNominations": input_list}
if "company" in data: if any([a in data for a in ["imdb_top", "imdb_bottom", "popularity.gte", "popularity.lte"]]):
company_ids = [] ranges = []
for c in data["company"]: if "imdb_top" in data:
if c in company_options: ranges.append({"rankRange": {"max": data["imdb_top"]}, "rankedTitleListType": "TOP_RATED_MOVIES"})
company_ids.extend(company_options[c]) if "imdb_bottom" in data:
else: ranges.append({"rankRange": {"max": data["imdb_bottom"]}, "rankedTitleListType": "LOWEST_RATED_MOVIES"})
company_ids.append(c) if "popularity.gte" in data or "popularity.lte" in data:
out["creditedCompanyConstraint"] = {"anyCompanyIds": company_ids} num_range = {}
if "popularity.lte" in data:
num_range["max"] = data["popularity.lte"]
if "popularity.gte" in data:
num_range["min"] = data["popularity.gte"]
ranges.append({"rankRange": num_range, "rankedTitleListType": "TITLE_METER"})
out["rankedTitleListConstraint"] = {"allRankedTitleLists": ranges}
check_constraint("content_rating", [("", "anyRegionCertificateRatings")], "certificateConstraint") check_constraint("series", [("", "any"), ("not", "exclude")], "episodicConstraint", lower="SeriesIds")
check_constraint("country", [("", "all"), ("any", "any"), ("not", "exclude"), ("origin", "anyPrimary")], "originCountryConstraint", lower="Countries") check_constraint("list", [("", "inAllLists"), ("any", "inAnyList"), ("not", "notInAnyList")], "listConstraint")
check_constraint("keyword", [("", "all"), ("any", "any"), ("not", "exclude")], "keywordConstraint", lower="Keywords", translation=(" ", "-"))
check_constraint("language", [("", "all"), ("any", "any"), ("not", "exclude"), ("primary", "anyPrimary")], "languageConstraint", lower="Languages")
check_constraint("cast", [("", "all"), ("any", "any"), ("not", "exclude")], "creditedNameConstraint", lower="NameIds")
check_constraint("runtime", [("gte", "min"), ("lte", "max")], "runtimeConstraint", range_name="runtimeRangeMinutes")
if "adult" in data and data["adult"]: if "company" in data:
out["explicitContentConstraint"] = {"explicitContentFilter": "INCLUDE_ADULT"} company_ids = []
for c in data["company"]:
if c in company_options:
company_ids.extend(company_options[c])
else:
company_ids.append(c)
out["creditedCompanyConstraint"] = {"anyCompanyIds": company_ids}
check_constraint("content_rating", [("", "anyRegionCertificateRatings")], "certificateConstraint")
check_constraint("country", [("", "all"), ("any", "any"), ("not", "exclude"), ("origin", "anyPrimary")], "originCountryConstraint", lower="Countries")
check_constraint("keyword", [("", "all"), ("any", "any"), ("not", "exclude")], "keywordConstraint", lower="Keywords", translation=(" ", "-"))
check_constraint("language", [("", "all"), ("any", "any"), ("not", "exclude"), ("primary", "anyPrimary")], "languageConstraint", lower="Languages")
check_constraint("cast", [("", "all"), ("any", "any"), ("not", "exclude")], "creditedNameConstraint", lower="NameIds")
check_constraint("runtime", [("gte", "min"), ("lte", "max")], "runtimeConstraint", range_name="runtimeRangeMinutes")
if "adult" in data and data["adult"]:
out["explicitContentConstraint"] = {"explicitContentFilter": "INCLUDE_ADULT"}
else:
out["lsConst"] = data["list_id"]
out["sort"] = {"by": list_sort_by_options[sort_by], "order": sort_order.upper()}
logger.trace(out) logger.trace(out)
return { return {
"operationName": "AdvancedTitleSearch", "operationName": "AdvancedTitleSearch" if search else "TitleListMainPage",
"variables": out, "variables": out,
"extensions": {"persistedQuery": {"version": 1, "sha256Hash": self.hash}} "extensions": {"persistedQuery": {"version": 1, "sha256Hash": self.search_hash if search else self.list_hash}}
} }
def _search(self, data): def _pagination(self, data, search=True):
json_obj = self._search_json(data) json_obj = self._graphql_json(data, search=search)
item_count = 250 item_count = 250 if search else 100
imdb_ids = [] imdb_ids = []
logger.ghost("Parsing Page 1") logger.ghost("Parsing Page 1")
response_json = self._graph_request(json_obj) response_json = self._graph_request(json_obj)
try: try:
total = response_json["data"]["advancedTitleSearch"]["total"] search_data = response_json["data"]["advancedTitleSearch"] if search else response_json["data"]["list"]["titleListItemSearch"]
total = search_data["total"]
limit = data["limit"] limit = data["limit"]
if limit < 1 or total < limit: if limit < 1 or total < limit:
limit = total limit = total
@ -387,16 +399,17 @@ class IMDb:
if remainder == 0: if remainder == 0:
remainder = item_count remainder = item_count
num_of_pages = math.ceil(int(limit) / item_count) num_of_pages = math.ceil(int(limit) / item_count)
end_cursor = response_json["data"]["advancedTitleSearch"]["pageInfo"]["endCursor"] end_cursor = search_data["pageInfo"]["endCursor"]
imdb_ids.extend([n["node"]["title"]["id"] for n in response_json["data"]["advancedTitleSearch"]["edges"]]) imdb_ids.extend([n["node"]["title"]["id"] if search else n["listItem"]["id"] for n in search_data["edges"]])
if num_of_pages > 1: if num_of_pages > 1:
for i in range(2, num_of_pages + 1): for i in range(2, num_of_pages + 1):
start_num = (i - 1) * item_count + 1 start_num = (i - 1) * item_count + 1
logger.ghost(f"Parsing Page {i}/{num_of_pages} {start_num}-{limit if i == num_of_pages else i * item_count}") logger.ghost(f"Parsing Page {i}/{num_of_pages} {start_num}-{limit if i == num_of_pages else i * item_count}")
json_obj["variables"]["after"] = end_cursor json_obj["variables"]["after"] = end_cursor
response_json = self._graph_request(json_obj) response_json = self._graph_request(json_obj)
end_cursor = response_json["data"]["advancedTitleSearch"]["pageInfo"]["endCursor"] search_data = response_json["data"]["advancedTitleSearch"] if search else response_json["data"]["list"]["titleListItemSearch"]
ids_found = [n["node"]["title"]["id"] for n in response_json["data"]["advancedTitleSearch"]["edges"]] end_cursor = search_data["pageInfo"]["endCursor"]
ids_found = [n["node"]["title"]["id"] if search else n["listItem"]["id"] for n in search_data["edges"]]
if i == num_of_pages: if i == num_of_pages:
ids_found = ids_found[:remainder] ids_found = ids_found[:remainder]
imdb_ids.extend(ids_found) imdb_ids.extend(ids_found)
@ -489,35 +502,22 @@ class IMDb:
return parental_dict return parental_dict
def _ids_from_chart(self, chart, language): def _ids_from_chart(self, chart, language):
if chart == "box_office": if chart not in chart_urls:
url = "chart/boxoffice"
elif chart == "popular_movies":
url = "chart/moviemeter"
elif chart == "popular_shows":
url = "chart/tvmeter"
elif chart == "top_movies":
url = "chart/top"
elif chart == "top_shows":
url = "chart/toptv"
elif chart == "top_english":
url = "chart/top-english-movies"
elif chart == "top_indian":
url = "india/top-rated-indian-movies"
elif chart == "lowest_rated":
url = "chart/bottom"
else:
raise Failed(f"IMDb Error: chart: {chart} not ") raise Failed(f"IMDb Error: chart: {chart} not ")
links = self._request(f"{base_url}/{url}", language=language, xpath="//li//a[@class='ipc-title-link-wrapper']/@href") script_data = self._request(f"{base_url}/{chart_urls[chart]}", language=language, xpath="//script[@id='__NEXT_DATA__']/text()")[0]
return [re.search("(tt\\d+)", link).group(1) for link in links] return [x.group(1) for x in re.finditer(r'"(tt\d+)"', script_data)]
def get_imdb_ids(self, method, data, language): def get_imdb_ids(self, method, data, language):
if method == "imdb_id": if method == "imdb_id":
logger.info(f"Processing IMDb ID: {data}") logger.info(f"Processing IMDb ID: {data}")
return [(data, "imdb")] return [(data, "imdb")]
elif method == "imdb_list": elif method == "imdb_list":
status = f"{data['limit']} Items at " if data['limit'] > 0 else '' logger.info(f"Processing IMDb List: {data['list_id']}")
logger.info(f"Processing IMDb List: {status}{data['url']}") if data["limit"] > 0:
return [(i, "imdb") for i in self._ids_from_url(data["url"], language, data["limit"])] logger.info(f" Limit: {data['limit']}")
if "sort_by" in data:
logger.info(f" Sort By: {data['sort_by']}")
return [(i, "imdb") for i in self._pagination(data, search=False)]
elif method == "imdb_chart": elif method == "imdb_chart":
logger.info(f"Processing IMDb Chart: {charts[data]}") logger.info(f"Processing IMDb Chart: {charts[data]}")
return [(_i, "imdb") for _i in self._ids_from_chart(data, language)] return [(_i, "imdb") for _i in self._ids_from_chart(data, language)]
@ -538,7 +538,7 @@ class IMDb:
logger.info(f"Processing IMDb Search:") logger.info(f"Processing IMDb Search:")
for k, v in data.items(): for k, v in data.items():
logger.info(f" {k}: {v}") logger.info(f" {k}: {v}")
return [(_i, "imdb") for _i in self._search(data)] return [(_i, "imdb") for _i in self._pagination(data)]
else: else:
raise Failed(f"IMDb Error: Method {method} not supported") raise Failed(f"IMDb Error: Method {method} not supported")

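Review note on `_pagination`: the page arithmetic in the hunk above can be sketched in isolation. This is a minimal sketch, not the project's implementation. `plan_pages`, `collect_ids`, and `fetch_page` are hypothetical names, and the `limit % item_count` step is an assumption, because the line that computes `remainder` falls outside the displayed hunk. The sketch also trims the final page even when it is the only page, which is a simplification.

```python
import math


def plan_pages(total, limit, item_count):
    """Return (effective_limit, num_of_pages, remainder) for a paged fetch."""
    # A limit below 1 means "no limit"; a limit above the total is capped.
    if limit < 1 or total < limit:
        limit = total
    # Assumption: remainder is the limit modulo the page size
    # (the line that computes it is not visible in the diff).
    remainder = limit % item_count
    if remainder == 0:
        remainder = item_count
    num_of_pages = math.ceil(limit / item_count)
    return limit, num_of_pages, remainder


def collect_ids(fetch_page, total, limit, item_count):
    """Walk cursor-based pages and keep at most `limit` IDs.

    `fetch_page(cursor)` is a hypothetical callable that returns
    (ids_on_page, next_cursor); the first call receives None.
    """
    limit, num_of_pages, remainder = plan_pages(total, limit, item_count)
    ids = []
    cursor = None
    for page in range(1, num_of_pages + 1):
        page_ids, cursor = fetch_page(cursor)
        if page == num_of_pages:
            # The last page may hold more items than are still needed.
            page_ids = page_ids[:remainder]
        ids.extend(page_ids)
    return ids
```

With a page size of 250 (search) and a limit of 600, this plans three pages and keeps 100 IDs from the last one. The same arithmetic applies with a page size of 100 for lists.
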
@ -56,7 +56,7 @@ class Letterboxd:
def get_list_description(self, list_url, language): def get_list_description(self, list_url, language):
logger.trace(f"URL: {list_url}") logger.trace(f"URL: {list_url}")
response = self.requests.get_html(list_url, language=language) response = self.requests.get_html(list_url, language=language)
descriptions = response.xpath("//meta[@property='og:description']/@content") descriptions = response.xpath("//meta[@name='description']/@content")
if len(descriptions) > 0 and len(descriptions[0]) > 0 and "About this list: " in descriptions[0]: if len(descriptions) > 0 and len(descriptions[0]) > 0 and "About this list: " in descriptions[0]:
return str(descriptions[0]).split("About this list: ")[1] return str(descriptions[0]).split("About this list: ")[1]
return None return None