[Bug]: start the overlay mesh before the routing is added #3227
Comments
@atanas18 can you please share more details?
Hi @yabinma, currently this is the v0.26.0 client. Thanks.
@atanas18 , if the ip route change is managed by a systemd service as well, can you please try adding the dependency in the systemd configuration file?
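As a minimal sketch of that suggestion (assuming the routes are installed by a separate unit, here hypothetically named `my-routes.service`, and that netclient runs as `netclient.service`), a drop-in could order netclient after the route unit:

```
# /etc/systemd/system/netclient.service.d/10-wait-for-routes.conf
# Hypothetical drop-in: delay netclient until the unit that installs the ip routes has run.
[Unit]
After=my-routes.service
Wants=my-routes.service
```

Then `systemctl daemon-reload && systemctl restart netclient` to pick it up.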
@yabinma ah, maybe I didn't understand your 3rd question correctly. I meant the ip route that netclient adds, not custom routes that I add myself.
@atanas18 , my bad, I may have misunderstood the issue.
I sat down and rethought the situation, and I think I misled you somehow.
@atanas18 , hope my understanding is correct. You could create a customized policy to tag all the nodes behind NAT as a group and allow them to communicate with each other, and set up a different policy for the nodes with public IPs.
Well, in the end they must all communicate with each other; it's just that in the beginning the nodes with only public IPs should not try to reach the ones with only private IPs. When the nodes with private IPs try to reach the public ones, the hole punch is enough to get communication working both ways. But when a node with only a public IP tries to reach the ones with private IPs, it's a no-go anyway; it can't reach them. It just generates traffic which Hetzner catches, flags as a portscan, and sends an abuse email about. There's really no point trying to reach rfc1918 IPs if you only have a public interface (except the netmaker interface of course, which can be rfc1918, but either way it shouldn't be the one initiating the first attempt to reach the clients).

Is your proposal going to work for what I'm trying to explain? Can I make the group behind NAT able to communicate with EVERYTHING, and make the nodes with only public IPs also able to communicate with the ones behind NAT, but only and ONLY after the ones behind NAT initiate the hole punch? The nodes with public IPs should not try to reach the ones behind NAT before that (otherwise, as I said, it generates an abuse ticket on Hetzner, which I have to reply to and explain so they don't shut off the server; without an explanation that's their final decision: shutting off the server). Thanks.
@atanas18 , what traffic is captured before the netmaker interface is up? Are there source and destination IPs in the Hetzner scan? As I checked the code, there is no peer communication before the netmaker interface is up.
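One way to see exactly what leaves the public interface toward private ranges before the netmaker interface comes up (the interface name `eth0` is an assumption; replace it with the actual public interface):

```
# Capture packets leaving the public interface toward rfc1918 destinations.
tcpdump -ni eth0 'dst net 10.0.0.0/8 or dst net 172.16.0.0/12 or dst net 192.168.0.0/16'
```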
Doesn't ENDPOINT_DETECTION affect only the server? I'll check the documentation about that. The problem for me is definitely on the client side.
@atanas18 the client will update its peer endpoint to a private IP only if it's able to communicate over it; otherwise it uses the public IP.
ENDPOINT_DETECTION is a server-side setting, but it is cascaded to the client side and changes netclient's behavior.
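For reference, a sketch of how that server-side setting is typically toggled (assuming a Docker-based deployment that reads its configuration from netmaker.env; verify the exact file and default value against the Netmaker docs for your version):

```
# netmaker.env (server side, assumed Docker/compose deployment)
# Disabling endpoint detection should stop clients from probing peers' private endpoints.
ENDPOINT_DETECTION=false
```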
What happened?
On a server reboot, the overlay mesh is started before the necessary ip route rules are added (the ones that route mesh traffic through the netmaker interface).
Because we have hundreds of mesh nodes on rfc1918 addresses (behind NAT), on a reboot a public node (one that is not behind NAT) starts searching for mesh nodes on their rfc1918 addresses, which triggers a Hetzner abuse report for "Netscan detected" (because of the hundreds of requests to rfc1918 addresses). Hetzner doesn't like traffic to rfc1918 subnets over the public interface, and while the route is not up yet, the node sends hundreds of TCP connection attempts trying to reach the nodes behind NAT. Once the route is up, the problem stops and rfc1918 traffic is no longer sent over the public interface.
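A quick way to check which interface a given peer's private address would currently leave through (the address 192.168.1.10 is just a placeholder for one of the NATed peers):

```
# Before the netmaker route exists this resolves to the default (public) interface;
# once the route is up it should show the netmaker interface instead.
ip route get 192.168.1.10
```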
This behavior started after updating from 0.24.1 to 0.25.0, and it also happens on 0.26.0. On 0.24.1 and earlier we didn't have this problem, so I guess something changed between these versions.
Thanks.
Version
v0.25.0
What OS are you using?
Linux
Relevant log output
No response