Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add EC / BN client logging and better request timeouts #13

Open
wants to merge 9 commits into
base: dev
Choose a base branch
from

Conversation

jclapis
Copy link
Member

@jclapis jclapis commented Jun 28, 2024

This PR adds supplemental logging support to the standard EC and BN API bindings, including HTTP tracing if requested. It also refactors the primary / fallback timeout system used by the client managers to make it easier to understand what's going on with the reconnect checks. Finally it adds config parameters for all of the above so users can tweak the new behavior if desired.

This comes from feedback provided by some HD operators while running on Mainnet. They'd like more logging granularity when specifically running requests against the standard EC and BN bindings, so they can diagnose why the daemon(s) can't contact the clients when tools like cURL can.

Comment on lines 43 to 45
fastGetMethod string = "Fast GET"
slowGetMethod string = "Slow GET"
postMethod string = "POST"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of postMethod there's http.MethodPost.

Instead of multiple clients, use https://pkg.go.dev/net/http#NewRequestWithContext which gives you control over the timeout via a context with a timeout

Pass the resulting Request to client.Do and pick a large timeout for client.Timeout (at least the max of the timeouts specified in context.WithTimeout)

Better yet, use the provided context.Context in each of the getters unless nil, and default to an appropriate-lengthed timeout. It would be an inflexible abstraction if it prescribed a two-tier timeout system instead of just using context.Context


logger.Debug("Calling BN request",
slog.String(log.MethodKey, methodName),
slog.String(log.PathKey, path),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be nice to log the host, too...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The host is included in the path; it's the whole URL minus the query because queries can get unwieldy for certain calls.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well i leave it up to you but i would probably parse it using net/url so that we can leverage structured logging fully

return nil, 0, fmt.Errorf("error running GET request to [%s]: %w", trimmedPath, err)
}
return response.Body, response.StatusCode, nil
}

// Log a request and add HTTP tracing to the context if the logger has it enabled
func logRequest(ctx context.Context, methodName string, path string) context.Context {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like this should be a receiver on BeaconHttpProvider so it can log info specific to the instance

// Create a new client instance.
// Most calls will use the fast timeout, but queries to validator status will use the slow timeout since they can be very large.
// Set a timeout of 0 to disable it.
func NewStandardHttpClient(providerAddress string, fastTimeout time.Duration, slowTimeout time.Duration) *StandardHttpClient {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add a type StandardHttpClientOpts struct with two time.Duration fields. if non-0 they can be used as the defaults to context.WithTImeout. If 0, default to something sensible.

NewStandardHttpClient("http://blah", nil) would just use all defaults.

Comment on lines 19 to 22
FastTimeout Parameter[uint64]

// Number of seconds to wait for a slow request to complete
SlowTimeout Parameter[uint64]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add units to these field names? ie FastTimeoutMS

Seconds seems too rough of a granularity

@@ -174,6 +177,18 @@ func NewLoggerConfig() *LoggerConfig {
Network_All: true,
},
},

EnableHttpTracing: Parameter[bool]{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, exciting times

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to wave my hand and say "see above commends on using context.Context for this file too"

log/logger.go Outdated
@@ -71,6 +78,11 @@ func (l *Logger) GetFilePath() string {
return l.path
}

// Get the HTTP client tracer for this logger if HTTP tracing was enabled
func (l *Logger) GetHttpTracer() *httptrace.ClientTrace {
return l.tracer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might as well just expose this field (i'd rename to HttpTracer) and conditionally populate it in NewLogger when options.EnableHttpTracing is true

@@ -73,10 +79,49 @@ func (m *BeaconClientManager) GetClientTypeName() string {

func (m *BeaconClientManager) SetPrimaryReady(ready bool) {
m.primaryReady = ready
if ready {
m.primaryFailTime = time.Time{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of primaryFailTime and fallbackFailTime as fields on BeaconClientManager you could use a unexported struct which has a boolean field to indicate failure (rather than relying on 0 value to mean success)

}
}

func (m *BeaconClientManager) RecheckFailTimes(logger *log.Logger) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be better to name this function something that contains the word 'throttle', since it throttles reconnection attempts to recheckTime intervals. Might even be sensible to return a boolean to indicate whether or not a reconnection was attempted.

}

logger.Debug("Calling BN request",
slog.String(log.MethodKey, methodName),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems way more practical to log ctx.Deadline() btw

@jclapis
Copy link
Member Author

jclapis commented Jun 30, 2024

Alright I addressed most of the stuff in here, lower priority stuff like refactoring / renaming the reconnect delay is on hold because it's the weekend.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants