Mirror of https://github.com/sherlock-project/sherlock (synced 2025-02-17 13:08:27 +00:00)
Limit Number Of Parallel Requests To 20 (Instead Of Number Of Sites)
The previous code allocated one worker per site. The problem is that as the number of sites has grown, there is no longer enough memory to allocate all of those requests. In reality, running all of the requests in parallel does not really speed up processing: on my computer, a query across all of the sites took 1 minute 10 seconds before the change and 1 minute 9 seconds after it. Limiting the number of workers to 10 did increase the query time to 1 minute 17 seconds. I am not sure whether that is just inconsistency in network traffic, but I will leave the limit at 20 for now. Note that with the limit of 20, my query detected more sites than it did previously. It appears that some of the requests had been failing on my computer for memory reasons (as opposed to actual detection on the site).
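As a rough sketch of the approach described above, the snippet below caps a requests-futures FuturesSession (the base class of Sherlock's ElapsedFuturesSession) at 20 workers. The site names and URLs are hypothetical placeholders, not Sherlock's actual site list.

# Sketch only: bound the thread pool at 20 workers instead of one per site.
# Assumes requests-futures is installed; site_data here is a made-up placeholder.
from requests_futures.sessions import FuturesSession

site_data = {
    "ExampleA": "https://example.com/a",
    "ExampleB": "https://example.com/b",
    "ExampleC": "https://example.com/c",
}

# Same capping rule as the commit: at most 20 workers, fewer if there are fewer sites.
max_workers = 20 if len(site_data) >= 20 else len(site_data)

# FuturesSession builds its own ThreadPoolExecutor(max_workers=...), so the
# explicit executor the old code created is no longer needed and thread-pool
# memory no longer grows with the number of sites.
session = FuturesSession(max_workers=max_workers)

# Queue every request; only max_workers of them are in flight at any one time.
futures = {name: session.get(url) for name, url in site_data.items()}
for name, future in futures.items():
    print(name, future.result().status_code)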
parent 4b6d2c1166
commit e0d2102810
1 changed file with 10 additions and 6 deletions
sherlock.py | 16 ++++++++++------

@@ -180,9 +180,6 @@ def sherlock(username, site_data, verbose=False, tor=False, unique_tor=False,
     """
     print_info("Checking username", username, color)
 
-    # Allow 1 thread for each external service, so `len(site_data)` threads total
-    executor = ThreadPoolExecutor(max_workers=len(site_data))
-
     # Create session based on request methodology
     if tor or unique_tor:
         #Requests using Tor obfuscation
@@ -193,9 +190,16 @@ def sherlock(username, site_data, verbose=False, tor=False, unique_tor=False,
         underlying_session = requests.session()
         underlying_request = requests.Request()
 
-    # Create multi-threaded session for all requests. Use our custom FuturesSession that exposes response time
-    session = ElapsedFuturesSession(
-        executor=executor, session=underlying_session)
+    #Limit number of workers to 20.
+    #This is probably vastly overkill.
+    if len(site_data) >= 20:
+        max_workers=20
+    else:
+        max_workers=len(site_data)
+
+    #Create multi-threaded session for all requests.
+    session = ElapsedFuturesSession(max_workers=max_workers,
+                                    session=underlying_session)
 
     # Results from analysis of all sites
     results_total = {}
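For readers skimming the diff, the added if/else simply clamps the worker count at 20; a hypothetical one-line equivalent using Python's built-in min() would be:

# Illustrative rewrite only, not part of the commit: clamp the worker count.
max_workers = min(len(site_data), 20)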