HTTPSConnectionPool(host='stats.nba.com', port=443): Read timed out. (read timeout=30) #176
Comments
I tried it and didn't get a ReadTimeout. Is Miami's dashboard the only request in your program, or are there some before it? |
I think it's because you tried too many times and the API is blocking you for a while. |
I am using the teamgamelog, teamdashboardbyopponent and teamdashboardbygeneralsplits endpoints. All three work correctly in my local environment, but as soon as I deployed the app, I started receiving the ReadTimeoutError. I have since deployed the app using only the teamgamelog endpoint, but even that one endpoint is not working. I have also increased the timeout to 45 seconds, but that did not help either. Any suggestions? |
How did you increase the timeout? You need to make as few requests as possible. To avoid timeouts I use local variables: for example, if I want to know the points, assists, and rebounds of a team, instead of calling the API three times I save that team's JSON in a local variable and then use that variable to get everything I wanted. |
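The "save the response in a local variable" idea above can be sketched as a small cache. This is only an illustration: `make_cached_fetcher`, `fake_fetch`, the team ID `1`, and the dummy stat values are all hypothetical, and in a real program `fetch` would wrap whichever nba_api endpoint call you already make.

```python
from typing import Any, Callable, Dict


def make_cached_fetcher(fetch: Callable[[int], Any]) -> Callable[[int], Any]:
    """Wrap `fetch` so each team ID is requested at most once per process."""
    cache: Dict[int, Any] = {}

    def get(team_id: int) -> Any:
        if team_id not in cache:
            # Only the first lookup for a team reaches the (slow) API.
            cache[team_id] = fetch(team_id)
        return cache[team_id]

    return get


# Hypothetical stand-in for a real API call; the values are dummies.
def fake_fetch(team_id: int) -> Dict[str, int]:
    return {"PTS": 1, "AST": 2, "REB": 3}


get_team = make_cached_fetcher(fake_fetch)
team = get_team(1)  # one request
points, assists, rebounds = team["PTS"], team["AST"], team["REB"]  # no further requests
```

The standard library's `functools.lru_cache` gives similar behaviour with less code when the arguments are hashable.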
@OmegaP1 I copied the teamgamelog file and some other files locally and increased the timeout variable. I did this because I read on a StackOverflow comment that the response time is 40 seconds. |
Then I don't understand why that happens, but you could try using time.sleep(.600) between the calls you make; that worked for me as well. |
I am calling the API only once and that too in the first line. How will time.sleep() help ? |
I mean when you call the API's functions. Imagine you ask for a team that has 17 players and then you want all the players' info: you call the API 17 times, so you should sleep between players. |
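A minimal sketch of "sleep between players", assuming the 0.6 second gap suggested earlier in the thread is enough (the NBA does not document a limit). `Throttle` and `fetch_all` are hypothetical helpers, and `fetch` stands in for whatever per-player endpoint call you make.

```python
import time
from typing import Any, Callable, Dict, Iterable, Optional


class Throttle:
    """Ensure at least `min_interval` seconds pass between successive calls."""

    def __init__(self, min_interval: float = 0.6,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                # Sleep only for the part of the interval that has not elapsed yet.
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_all(ids: Iterable[int], fetch: Callable[[int], Any],
              throttle: Throttle) -> Dict[int, Any]:
    """Call `fetch` once per ID, pausing between calls."""
    results: Dict[int, Any] = {}
    for item_id in ids:
        throttle.wait()
        results[item_id] = fetch(item_id)
    return results
```

Compared with a bare `time.sleep(0.6)`, this only waits for the time still owed, so slow responses do not add an extra delay on top.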
I am just creating a dashboard of a team currently. So I just use the teamgamelog endpoint to get all the details. Other endpoints I use are for comparison between different teams. Since I have faced the error I have commented out all the other endpoints and only kept the teamgamelog endpoint. But that single endpoint is also not working |
Having the same timeout problem. I'm using the BoxScoreAdvancedV2 endpoint, and it seems to be very inconsistent with respect to success. I wonder if there is a maximum number of times I can pull info from the API before it starts to time out. Especially frustrating when iterating in a loop, as the BoxScoreAdvancedV2 endpoint only provides stats for one game. Currently using v1.1.8. |
I am experiencing the same issue. On my local environment, I have a timeout after each call, and all my endpoint calls work as expected. However, on my web app, I receive a Any insight as to how this might be fixed is extremely appreciated. On Issue 55 some have suggested that the NBA might be blocking calls from cloud providers such as AWS and calls from Google Colab. My web app runs on Streamlit, and the NBA seems to be blocking calls from whatever stack Streamlit uses to deploy. |
For those having issues with calls, there are a couple of known factors to consider.
The best option is to always try it locally first to see if all is well. If it is, then it's likely a block. While I have not tried it, there is an option of using a proxy. You could attempt to use that from cloud to determine if your deploy worked, but you are in fact getting blocked. Hope that helps. It's a common issue that is raised often. |
I haven't found the perfect solution to this problem, but as of 2/2 and using the BoxScoreAdvancedV2 endpoint, I've been able to successfully call the endpoint in a loop if I add time.sleep(1) in the loop. Since I'm trying to find all advanced stats for each game for any given season, this turns out to be about 2100 cycles (and so about 2100 seconds of sleep), I believe. Time is not a huge issue since I'm just scraping the data to a csv, but adding a slight delay between loop iterations has helped me achieve consistent results as of now. |
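A sketch of that kind of scrape loop, under a few assumptions: `scrape_to_csv` and `fetch_rows` are hypothetical names, `fetch_rows(game_id)` is expected to return a list of dicts for one game, and all rows share the same keys. Flushing after each game is a design choice added here so that a timeout late in the run does not lose the games already fetched.

```python
import csv
import time
from typing import Any, Callable, Dict, Iterable, List


def scrape_to_csv(game_ids: Iterable[str],
                  fetch_rows: Callable[[str], List[Dict[str, Any]]],
                  path: str,
                  delay: float = 1.0,
                  sleep: Callable[[float], None] = time.sleep) -> int:
    """Fetch rows game by game, pausing `delay` seconds between games."""
    written = 0
    with open(path, "w", newline="") as handle:
        writer = None
        for index, game_id in enumerate(game_ids):
            if index > 0:
                sleep(delay)  # pause between requests, not before the first one
            for row in fetch_rows(game_id):
                if writer is None:
                    # Take the column names from the first row seen.
                    writer = csv.DictWriter(handle, fieldnames=list(row))
                    writer.writeheader()
                writer.writerow(row)
                written += 1
            handle.flush()  # keep finished games on disk if a later call fails
    return written
```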
This is incredibly useful. Although I don't know where the source of "The NBA blocks all cloud hosting providers." is, I feel this matches what I am experiencing. |
@leimao - The NBA does not make its firewall rules public. That being said, I have spent some time in the networking space. Here are the basics of what is likely happening. I'm going to assume no prior knowledge. I should extend this and put this out on Medium! 👍
THE CLOUD ARCHITECTURE
Cloud providers, like AWS, run millions of physical servers. On top of that, those servers are virtualized, creating millions more virtual machines. You can split those into millions of containers, such as those running on top of Kubernetes (K8s). If that's not your preferred route, you can simply run a serverless application and use FaaS (Functions as a Service) like Lambda.
THE PROBLEM
Any single cloud provider has enough computing power that any single person could scale an application to effectively take down any site in the world. This is referred to as DDOS (Distributed Denial of Service); it doesn't even have to be intentional, someone could have just written an infinite loop.
DEFENSE IN DEPTH
Security is managed via the concept of Defense in Depth. This means that security is provided in layers. Should any layer be compromised, there is yet another layer that must be breached. In the same way, protecting a highly available service like the NBA's website, stats, and other services is done using this practice. Multiple tools can be used, and prices range from relatively inexpensive to very expensive. A good article on this from Fortinet is titled Defense in Depth. I will cover three primary DDOS defenses that can be put into place with relative ease, though the extent of implementation determines price.
IP ALLOW LISTS AND BLOCK LISTS
Probably the easiest implementation is to ask the question, who are my customers? I don't think it will take you long to guess that you and I, running programs on the cloud that get statistical data from the NBA for free, are not their target audience. To the NBA, our programs are nothing more than bots. There is zero revenue to be gained.
With that in mind, the NBA can ask themselves why they would allow any cloud provider to connect to their APIs, given the potential for a DDOS attack. There is no good reason. In short, they want human traffic with which they can build their brand, interact with fans, and generate revenue. In this case, it is relatively easy for the NBA to block all IPs from cloud providers. The majority of cloud providers make their IP addresses publicly known. AWS makes its IP address ranges available, and companies can subscribe to them.
RATE LIMITING
Beyond the allow and block lists, the next defensive measure is to limit how many times an individual can make requests to a given service. This is called rate limiting. Cloudflare has a good article titled What is rate limiting? Rate limiting and bots. Rate limiting is also a form of DDOS protection. Through trial and error (reverse engineering), I determined I could request the NBA's API once every 600ms. Like allow and block lists, rate limiting is typically implemented within a firewall. When implementing a rate limit, the firewall rule can be set with filtering keys. A key can be as simple as an IP address and may contain other characteristics. A quick article on AWS WAF (Web Application Firewall) titled Rate-based rule statement will give you an idea.
DETECTION AND MITIGATION
This is where DDOS protection can get pricey. Is it worth it? Yes. There is simply too much risk today not to have DDOS protection. On top of that, several companies and products are available. One of the leading companies in this space is Radware. You can learn more on their DDoS Attack Prevention Services: Multi Layered DDoS Protection and Security Solutions page. Check out their Live Threat Map for some real cool data! The idea here is that even if I have an IP allow list, an IP block list, and rate limiting configured, that does not stop someone from making repeated calls over and over without end. This is where detection and mitigation come in.
Should a firewall become so overwhelmed that it is no longer able to respond to legitimate traffic, a product such as Radware will step in the middle and begin absorbing, filtering, and redirecting that traffic. Note, while I say step in, Radware is always there inspecting the traffic; it's just quietly analyzing it.
IN SUMMARY
While I do not have any details regarding how the NBA has configured their networking infrastructure, there are some general design patterns the industry uses that can be inferred from observation. Even if you were lucky enough to find someone who works for the NBA and specifically works on their network, they would not tell you either, simply because of the potential for a security breach. We all know how bad that can get. I hope you enjoyed this, that it was filled with some things that perhaps you didn't know, and that it didn't bore you so much that you fell asleep reading it. 😂 |
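To make the block-list idea concrete, here is a minimal sketch of how a firewall rule might test a client address against published provider ranges. It says nothing about the NBA's actual configuration. `is_blocked` and `BLOCKED_NETWORKS` are hypothetical, and the two CIDR ranges are reserved documentation ranges, not real cloud provider ranges.

```python
import ipaddress
from typing import Iterable, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]

# Placeholder ranges reserved for documentation; a real list would be
# loaded from the ranges a cloud provider publishes.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip: str, networks: Iterable[Network] = BLOCKED_NETWORKS) -> bool:
    """Return True if `client_ip` falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)


print(is_blocked("203.0.113.7"))  # inside the first placeholder range
print(is_blocked("192.0.2.1"))    # outside both placeholder ranges
```

Because the check depends only on the source address, it would apply regardless of headers or request rate, which matches the reports above of code working locally and timing out once deployed.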
as someone who is hoping to use the NBA API on a site that will allow users to make calls in real time, getting past the cloud block is a massive step. this explanation was extremely helpful, thank you |
Did you find a solution? Will using a custom proxy for each request (and passing that to the nba_api library call) circumvent this? Running an app on digital ocean and hitting the cloud firewall. |
I tried proxy a while ago and it did not work. |
what service? i use smart proxy residential rotating proxy and it works perfectly |
I think I was using the free proxies found online (don't remember exactly what it was). Probably they have all been blocked. |
Hi @ChristopherBanas! Would you mind sharing how you set up requests to endpoints using the smart proxy service? Is it anything similar to the snippet below?
from nba_api.stats.endpoints import commonallplayers
# Setting up a proxy dict (hostname and port are placeholders for the proxy's address)
proxy_dict = {
    "http": f"http://{hostname}:{port}",
    "https": f"http://{hostname}:{port}"
}
# Making the request to an endpoint (e.g. commonallplayers)
endpoint = commonallplayers.CommonAllPlayers(proxy=proxy_dict)
data = endpoint.common_all_players.get_data_frame() |
Looks similar to mine. What I did was store mine as an environment variable (needed this to not hard code the proxy in prod) and send it into a forked API I made. It was just a string but the approach you're going with above will probably work. The string proxy I would send to the package looked like this http://<USER_NAME>:@us.smartproxy.com: |
Similarly to how it is used here https://github.com/swar/nba_api/blob/master/docs/nba_api/stats/examples.md#endpoint-usage-example |
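A sketch of the environment-variable approach described above, passing the proxy to the endpoint as a single string. The variable name `NBA_PROXY_URL` and the helpers `build_proxy_url`, `load_proxy`, and `fetch_all_players` are hypothetical; the host and port in the usage note are placeholders, not any provider's real address. Whether a given proxy gets past the block is something only a deployed test will show.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote

PROXY_ENV_VAR = "NBA_PROXY_URL"  # hypothetical name; use whatever your deployment defines


def build_proxy_url(user: str, password: str, host: str, port: int) -> str:
    """Assemble a proxy URL, percent-encoding the credentials."""
    return f"http://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


def load_proxy(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Read the proxy URL from the environment; None means no proxy is configured."""
    return env.get(PROXY_ENV_VAR) or None


def fetch_all_players(proxy: Optional[str], timeout: int = 60):
    """Request the player list through `proxy` (performs a real network call)."""
    # Imported here so the helpers above can be used without nba_api installed.
    from nba_api.stats.endpoints import commonallplayers

    endpoint = commonallplayers.CommonAllPlayers(proxy=proxy, timeout=timeout)
    return endpoint.common_all_players.get_data_frame()
```

In production you would set the variable once, for example to the value of `build_proxy_url("user", "secret", "proxy.example.com", 10000)`, and call `fetch_all_players(load_proxy())`, which keeps the credentials out of the source code.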
Thank you very much! I will try this proxy workaround in order to deploy a Lambda function on AWS. This whole thread helped me a lot to understand the |
Very helpful for deploying on cloud in 2024 with smart proxy |
Noticed there are a few threads on this issue, yet the solutions provided haven't worked. Maybe I'm doing something wrong? Thanks for the help in advance!
from nba_api.stats.static.teams import find_teams_by_full_name
from nba_api.stats.endpoints.teamplayerdashboard import TeamPlayerDashboard
mia_id = find_teams_by_full_name("Miami Heat")[0]['id']
mia = TeamPlayerDashboard(measure_type_detailed_defense="Base", per_mode_detailed="Totals", team_id=mia_id, season="2019-20").players_season_totals.get_data_frame()
ERROR:
`---------------------------------------------------------------------------
timeout Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
383 # otherwise it looks like a programming error was the cause.
--> 384 six.raise_from(e, None)
385 except (SocketTimeout, BaseSSLError, SocketError) as e:
24 frames
timeout: The read operation timed out
During handling of the above exception, another exception occurred:
ReadTimeoutError Traceback (most recent call last)
ReadTimeoutError: HTTPSConnectionPool(host='stats.nba.com', port=443): Read timed out. (read timeout=30)
During handling of the above exception, another exception occurred:
ReadTimeout Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
527 raise SSLError(e, request=request)
528 elif isinstance(e, ReadTimeoutError):
--> 529 raise ReadTimeout(e, request=request)
530 else:
531 raise
ReadTimeout: HTTPSConnectionPool(host='stats.nba.com', port=443): Read timed out. (read timeout=30)`