Node Health status #131
base: master
Added a node health status based on messages in the last 5 minutes.
@waku-org/nwaku would it be possible to have a Prometheus entry that returns something similar to
Yes, I think we can start the metrics server ahead of initialization, just as we do with the REST service. @YAMISHKA02: Thank you for the initiative. I was thinking of this. While the fact that the node can relay messages is a superior indicator of healthy operation, we have so far checked the mounted protocols and the discovered node count instead. These can tell whether the node is up and ready to use. Relaying messages depends heavily on actual network traffic, which is independent of the current node.
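To illustrate the readiness check described above (mounted protocols plus discovered node count), here is a minimal Python sketch. It is not nwaku's implementation: `NodeState`, `is_ready`, the protocol name `"relay"`, and the `min_peers` threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeState:
    """Hypothetical snapshot of a node; field names are assumptions."""
    mounted_protocols: frozenset
    discovered_peers: int


def is_ready(state, required_protocols=frozenset({"relay"}), min_peers=1):
    """Return True when every required protocol is mounted and enough
    peers have been discovered. Does not look at message traffic."""
    missing = required_protocols - state.mounted_protocols
    return not missing and state.discovered_peers >= min_peers
```

A check like this says the node is up and usable regardless of how much traffic the network currently carries, which is the trade-off mentioned in the comment.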
Hello, the best way is of course to add something similar to chkhealth.sh. Can you please send me a link to the file that serves as the reference for the metrics exporter? I can modify this file to add new metrics exported by it.
@YAMISHKA02: Sorry for not answering sooner. I'm afraid there is no single link I can point to, because the health status of a node, if I think of it as a continuous report, consists of several properties. We need to think about what is worth measuring. Currently chkhealth.sh mainly helps node operators with the boot status of the node, because the very first boot with RLN sync can take a while, and that was misunderstood in many ways. So of course there is plenty of room for improvement, and I believe it will come into scope shortly.
Added one panel to the dashboard showing the node's healthy/unhealthy status.
It is based on messages produced by the node in the last 5 minutes.
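The 5-minute rule can be sketched as follows. This is only an illustration under assumptions, not the query used by the panel in this PR: it assumes the node exposes a monotonically increasing message counter that is sampled periodically, and `Sample`, `messages_in_window`, and `is_healthy` are hypothetical names.

```python
from dataclasses import dataclass

WINDOW_SECONDS = 5 * 60  # the 5-minute window mentioned above


@dataclass(frozen=True)
class Sample:
    """One scrape of a hypothetical cumulative message counter."""
    timestamp: float       # seconds
    total_messages: float  # counter value at that time


def messages_in_window(samples, now, window=WINDOW_SECONDS):
    """Sum the counter increase across samples inside [now - window, now]."""
    recent = sorted(
        (s for s in samples if now - window <= s.timestamp <= now),
        key=lambda s: s.timestamp,
    )
    increase = 0.0
    for prev, cur in zip(recent, recent[1:]):
        delta = cur.total_messages - prev.total_messages
        # A negative delta means the counter was reset (e.g. node restart);
        # count the new value as the increase since the reset.
        increase += delta if delta >= 0 else cur.total_messages
    return increase


def is_healthy(samples, now, window=WINDOW_SECONDS):
    """Healthy when at least one message was counted in the window."""
    return messages_in_window(samples, now, window) > 0
```

Note that a node on a quiet network would be reported as unhealthy by this rule even if it is running correctly, since relaying depends on network traffic.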