Commit graph

1137 commits

Author SHA1 Message Date
Christopher K. Hoadley
de0ccfebb7 Add option to skip test if site returns error status (e.g. timeout connecting with site). This makes it easier to interpret the test results. 2020-01-02 06:03:41 -06:00
Siddharth Dushantha
514ce0cb08
Merge pull request #466 from ZephrFish/patch-1
Fixed docker file, git is not included in alpine by default
2020-01-01 15:51:57 +01:00
Christopher K. Hoadley
2c9fb4f295 Change SitesInformation() to use a generator when iterating thru the sites. This avoids the problem of the state (i.e. self.__iteration_index) getting corrupted if any of the methods of a given object needed to iterate for their own purposes while a caller was already iterating thru the same object. The code is also much simpler to follow. 2019-12-31 21:18:49 -06:00
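The problem described in this commit can be reproduced in a few lines. The sketch below is illustrative only: the class names, the count() method, and the helper function are hypothetical and are not taken from the project. It contrasts an object that stores its iteration position on itself with one whose __iter__ is a generator.

```python
class SitesWithIndex:
    """Hypothetical container that keeps its iteration position on the object."""

    def __init__(self, names):
        self._names = list(names)
        self._index = 0

    def __iter__(self):
        self._index = 0
        return self

    def __next__(self):
        if self._index >= len(self._names):
            raise StopIteration
        name = self._names[self._index]
        self._index += 1
        return name

    def count(self):
        # Iterating internally resets and then exhausts the shared index.
        return sum(1 for _ in self)


class SitesWithGenerator:
    """Hypothetical container whose __iter__ is a generator function."""

    def __init__(self, names):
        self._names = list(names)

    def __iter__(self):
        # Every call creates an independent generator with its own position.
        for name in self._names:
            yield name

    def count(self):
        return sum(1 for _ in self)


def names_seen_while_counting(sites):
    """Iterate over sites while a method of the same object also iterates."""
    seen = []
    for name in sites:
        sites.count()
        seen.append(name)
    return seen
```

With three names, the index-based version stops after the first one because count() leaves the shared index at the end, while the generator-based version visits all three.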
Christopher K. Hoadley
1101af8132 Add @sdushantha suggestion in creating directory. 2019-12-31 15:51:40 -06:00
Christopher K. Hoadley
f48a2980f5 Use SitesInformation() object in tests. For now, use the new SitesInformation() object to calculate the original JSON dictionary: the rest of the code will be updated in the future. 2019-12-31 15:33:48 -06:00
Christopher K. Hoadley
8f6938ecb1 Add option to *not* print out results. Configure tests so there is no print output. This simplifies looking at the error output when the tests fail. 2019-12-31 15:26:15 -06:00
Christopher K. Hoadley
f29cab49e4 Add popularity rank to Site Information object. Add method to retrieve list of names of the sites (sorted by alphabetical or popularity rank). 2019-12-31 14:48:21 -06:00
Christopher K. Hoadley
2e195d4439 Move all writing of output files to occur after query takes place. Use with statement for results file, as that is more graceful on errors. Use try block for result directory creation: this has a smaller window for a race condition. 2019-12-31 11:19:15 -06:00
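A minimal sketch of the two patterns this commit mentions, assuming a hypothetical write_results() helper and a one-file-per-username layout (neither is taken from the project). Attempting the directory creation and handling the failure narrows the race window compared with checking for existence first, and the with statement closes the file even if a write raises.

```python
import os


def write_results(directory, username, lines):
    """Write result lines to <directory>/<username>.txt and return the path."""
    try:
        # Try to create the directory rather than testing for it first:
        # another process could create it between a check and the call.
        os.makedirs(directory)
    except FileExistsError:
        # Fine if it is already a directory; re-raise if a file is in the way.
        if not os.path.isdir(directory):
            raise

    path = os.path.join(directory, f"{username}.txt")
    # The with statement guarantees the file is closed on error.
    with open(path, "w", encoding="utf-8") as results_file:
        for line in lines:
            results_file.write(line + "\n")
    return path
```

On current Python versions, os.makedirs(directory, exist_ok=True) gives similar behavior in a single call.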
Christopher K. Hoadley
123e4d47e0 Merge remote-tracking branch 'origin/master' into restructure_take1 2019-12-31 10:53:12 -06:00
Christopher Kent Hoadley
37cc116dd9
Merge pull request #500 from sherlock-project/site_updates3
More Site Addition And Fixes
2019-12-31 10:50:43 -06:00
Christopher K. Hoadley
0fd89843b2 Update version and site list. 2019-12-31 10:47:17 -06:00
Christopher K. Hoadley
0fc25e979c Add "nnRU". 2019-12-31 10:36:32 -06:00
Christopher K. Hoadley
9ea42a3207 Add "ingvarr.net.ru". 2019-12-31 10:34:46 -06:00
Christopher K. Hoadley
6369e23ad5 Reinstate "easyen". Looks like some of the links on the site redirect to an internal index, but if you start out with a valid username, things do work. 2019-12-31 10:32:26 -06:00
Christopher K. Hoadley
4c6f9acd53 Fix claimed username for "phpRU". 2019-12-31 10:29:30 -06:00
Christopher Kent Hoadley
5123bf1f74
Merge pull request #499 from sherlock-project/site_updates2
More Fixes To Site Coverage
2019-12-31 08:05:16 -06:00
Christopher K. Hoadley
5649d6b721 Update version and site list. 2019-12-31 08:00:17 -06:00
Christopher K. Hoadley
2b0f1fd55c Merge remote-tracking branch 'origin/master' into site_updates2 2019-12-31 07:55:28 -06:00
Christopher K. Hoadley
5f5a81b083 Fix "Football" claimed username. 2019-12-31 07:47:04 -06:00
Siddharth Dushantha
8c289b1db3
version bump 0.10.1 --> 0.10.2 2019-12-31 14:42:28 +01:00
Christopher K. Hoadley
b9e89edc82 Remove "RamblerDating". As of 2019-12-31, site always times out. 2019-12-31 07:39:52 -06:00
Siddharth Dushantha
37160f259c
Merge pull request #498 from sherlock-project/sdushantha-patch-1
added many more sites requested by @torerobo
2019-12-31 14:39:16 +01:00
Siddharth Dushantha
7441eac71c
added many more sites requested by @torerobo 2019-12-31 14:36:40 +01:00
Christopher K. Hoadley
2ad96a8a7b Remove "YandexMarket". As of 2019-12-31, all usernames are reported as existing. 2019-12-31 07:36:09 -06:00
Christopher K. Hoadley
83ecddac91 Remove "easyen". As of 2019-12-31, usernames appear to redirect to an internal index. 2019-12-31 07:22:21 -06:00
Christopher K. Hoadley
3deb08d724 Fix "opennet" claimed username. 2019-12-31 07:14:09 -06:00
Christopher K. Hoadley
71a6697b20 Remove "Codementor". All usernames come back as unclaimed. 2019-12-31 07:08:37 -06:00
Christopher K. Hoadley
519795a1c8 Update claimed username for "toster". 2019-12-31 06:52:24 -06:00
Christopher K. Hoadley
216e1ea40c Update user URL for "Zomato". Site did work before, but it is better to use preferred location. 2019-12-31 06:51:14 -06:00
Christopher K. Hoadley
b28462d5c9 Fix claimed username for "LOR". 2019-12-31 06:45:02 -06:00
Christopher K. Hoadley
ef0352b0fc Do not use API call for "Brew". It probably needs to be authenticated now. 2019-12-31 06:43:04 -06:00
Christopher K. Hoadley
67693767e2 Update claimed user name for "Gitee". 2019-12-31 06:42:19 -06:00
Christopher K. Hoadley
b1fc363d31 Remove "KiwiFarms". You now have to be logged in to see any profile. 2019-12-31 06:30:53 -06:00
Christopher K. Hoadley
ea173cf313 Fix unclaimed user name for "Insanejournal". 2019-12-31 06:24:56 -06:00
Christopher K. Hoadley
ba0a44e0ae Merge remote-tracking branch 'origin/master' into restructure_take1
# Conflicts:
#	sherlock/resources/data.json
2019-12-29 11:57:25 -06:00
Christopher Kent Hoadley
d47a8b6f72
Merge pull request #486 from sherlock-project/site_updates
Fix "interpals", Add "Windy", "uid", And "opensource"
2019-12-29 11:53:10 -06:00
Christopher K. Hoadley
9abae2e341 Update version and site list. 2019-12-29 11:48:53 -06:00
Christopher K. Hoadley
1373c4c2f9 Add "uid" support. 2019-12-29 11:45:43 -06:00
Christopher K. Hoadley
e40051204c Add "opensource" support. 2019-12-29 11:45:21 -06:00
Christopher K. Hoadley
4144b7ff50 Add Windy support. 2019-12-29 11:32:32 -06:00
Christopher K. Hoadley
a036ca1f32 Fix error message for interpals. 2019-12-29 11:29:55 -06:00
Christopher K. Hoadley
7f87f5fcc4 Add module to store information about the sites. This handles getting the information loaded from the JSON file. For now, use the new SitesInformation() object to calculate the original JSON dictionary: the rest of the code will be updated in the future. 2019-12-29 00:50:06 -06:00
Siddharth Dushantha
deabd42a08
Merge pull request #473 from zero77/patch-2
Update data.json
2019-12-27 21:49:29 +01:00
Christopher K. Hoadley
647aea577c Factor out all print statements from portion of code that determines the query results. 2019-12-27 11:34:33 -06:00
Christopher K. Hoadley
bbb44d7ef9 Add defensive check for unknown Error Type. If it does happen, an exception will be thrown, instead of using the previous site's results. 2019-12-27 11:12:34 -06:00
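The failure mode this check guards against can be sketched as follows. The function name, the dictionary keys, and the two error types shown are assumptions made for illustration. Without the final else branch, an unrecognized type would leave the variable holding the value assigned for the previous site in the loop.

```python
def classify_sites(sites):
    """Map each site name to True (username exists) or False."""
    results = {}
    for name, info in sites.items():
        error_type = info["errorType"]
        if error_type == "status_code":
            exists = info["status"] == 200
        elif error_type == "message":
            exists = info["errorMsg"] not in info["body"]
        else:
            # Fail loudly instead of silently reusing the value of
            # 'exists' left over from the previous loop iteration.
            raise ValueError(
                f"Unknown error type '{error_type}' for site '{name}'"
            )
        results[name] = exists
    return results
```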
Christopher K. Hoadley
2a8f83924d Remove some unneeded imports. Add minor comment. 2019-12-27 11:01:34 -06:00
Christopher K. Hoadley
2a1ab1c281 Add result module to hold results of site queries. The QueryResult() object contains an enumeration for the possible status about a given username on a site, and additional error information that might be handy. Rework all code to use this object instead of the "exists" key in the result dictionary that was used previously. 2019-12-27 10:17:10 -06:00
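One possible shape for such a result object is sketched below. Only the QueryResult name comes from the commit message. The QueryStatus enumeration, its members, and the field names are assumptions for illustration and may differ from the project's actual module.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class QueryStatus(Enum):
    """Assumed set of outcomes for a username lookup on one site."""
    CLAIMED = "Claimed"
    AVAILABLE = "Available"
    UNKNOWN = "Unknown"


@dataclass
class QueryResult:
    """Result of querying one site, replacing a bare 'exists' dictionary key."""
    username: str
    site_name: str
    status: QueryStatus
    context: Optional[str] = None  # extra error detail, if any

    def __str__(self):
        if self.context is None:
            return self.status.value
        return f"{self.status.value} ({self.context})"
```

Callers can then compare result.status against enumeration members rather than against strings stored in a dictionary.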
zero77
b5f676be95
Update data.json
Added https://allmylinks.com
2019-12-27 14:26:54 +00:00
Christopher K. Hoadley
519ac34346 Extract all print statements from function that gets the response. Also, print out social network for error messages. 2019-12-26 14:00:44 -06:00
Christopher K. Hoadley
6114ca263d Remove Proxy List Support
While doing the restructuring, I am testing in more depth as I change the code. And, I am trying to grok how the proxy options work. Specifically, how the proxy list works. Or, does not work.

There is code in the main function that randomly selects proxies from a list, but it does not actually use the result. This was noticed in #292. It looks like the only place where the proxy list is used is when there is a proxy error during get_response()...in that case a new random proxy is chosen. But, there is no care taken to ensure that we do not get the same proxy that just errored out. It seems like problematic proxies should be blacklisted if there is that type of failure.

Moreover, there is a check earlier in the code that does not allow the proxy list and proxy command line option to be used simultaneously. So, I can see no way that the proxy list has any functionality: if you do define the proxy list, then there is no way to kick off the general request with a proxy.

I also noticed that the recursive get_response() call does not pass its return tuples back up the call chain. The existing code would never get any good from the switchover to an alternate proxy (even if the other problems mentioned above were resolved).

For now, I am removing the support. This feature may be looked at after the restructuring is done.
2019-12-26 10:59:52 -06:00
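Two of the issues raised in the commit message above (re-selecting the proxy that just failed, and the recursive call not returning its result) can be illustrated with a small sketch. This is not the removed code: pick_proxy(), the fetch callable, and this get_response() signature are hypothetical, and ConnectionError stands in for whatever proxy error the real code handled.

```python
import random


def pick_proxy(proxies, failed):
    """Choose a random proxy that has not already failed, or None."""
    candidates = [p for p in proxies if p not in failed]
    if not candidates:
        return None
    return random.choice(candidates)


def get_response(fetch, proxies, proxy, failed=None):
    """Call fetch(proxy); on a connection error, retry with another proxy.

    Returns a (response, proxy_used) tuple.
    """
    failed = set() if failed is None else failed
    try:
        return fetch(proxy), proxy
    except ConnectionError:
        # Remember the bad proxy so it is not chosen again.
        failed.add(proxy)
        replacement = pick_proxy(proxies, failed)
        if replacement is None:
            raise
        # The 'return' matters: without it, the retry's result is
        # discarded and the caller never sees the successful response.
        return get_response(fetch, proxies, replacement, failed)
```

Tracking failed proxies in a set is the simplest form of the blacklisting the message suggests; a real implementation would likely also want a retry limit.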