Promsafe: Strongly-typed safe labels #1598

Open · 13 commits to merge into base: main


@amberpixels amberpixels commented Aug 28, 2024

Promsafe

Introducing promsafe, an optional helper library (similar to promauto) that enables type-safe labels.

Motivation

This PR covers only Counter functionality as an example. If the community is fine with the idea, I'll push further commits extending the logic to Gauge, Histogram, etc.
For the detailed motivation, see my comment below.

Fixes #1599

Why?

Currently, unsafe labels lead to one of two problems: an error-handling nightmare, or panics (if you use promauto).

Unsafe labels can lead to the following issues:

  • Misspelling
  • Mistyping
  • Misremembering
  • Too many labels
  • Too few labels

With generics available in modern Go versions, we can solve these issues.

Examples of how to use it

1. Multi-label mode (safe structs)

type MyCounterLabels struct {
	promsafe.StructLabelProvider
	EventType string
	Success   bool
	Position  uint8 // yes, it's a number, but be careful with high-cardinality labels

	ShouldNotBeUsed string `promsafe:"-"`
}

// Note on performance:
// By default, if no custom logic is provided, MyCounterLabels uses reflection to extract label names and values.
// If performance matters, you can define your own export methods instead.

// Optional! (If not specified, it falls back to reflection.)
func (f MyCounterLabels) ToPrometheusLabels() prometheus.Labels {
	return prometheus.Labels{
		"event_type": f.EventType,
		"success":    fmt.Sprintf("%v", f.Success),
		"position":   fmt.Sprintf("%d", f.Position),
	}
}

// Optional! (If not specified, it falls back to reflection.)
func (f MyCounterLabels) ToLabelNames() []string { return []string{"event_type", "success", "position"} }

// Creating a typed counter providing specific labels type
c := promsafe.NewCounterVecT[MyCounterLabels](prometheus.CounterOpts{
	Name: "items_counted",
})

// Manually register the counter
if err := prometheus.Register(c.Unsafe()); err != nil {
	log.Fatal("could not register: ", err.Error())
}

// With() accepts ONLY a MyCounterLabels value here
c.With(MyCounterLabels{
	EventType: "request", Success: true, Position: 1,
}).Inc()

2. (Experimental, syntactic sugar) Single-label mode (plain string, no structs)

c := promsafe.NewCounterVecT1(prometheus.CounterOpts{
	Name: "items_counted_by_status",
}, "status") // provide only the single label name

// Manually register the counter
if err := prometheus.Register(c.Unsafe()); err != nil {
	log.Fatal("could not register: ", err.Error())
}

// The main difference from the original metric: With() requires exactly ONE string here
c.With("active").Inc()

Compatibility with promauto

1. promauto.With call migration

var myReg = prometheus.NewRegistry()

counterOpts := prometheus.CounterOpts{
	Name: "items_counted",
}

// Old unsafe code
// promauto.With(myReg).NewCounterVec(counterOpts, []string{"event_type", "source"})
// becomes:

type MyLabels struct {
	promsafe.StructLabelProvider
	EventType string
	Source    string
}
c := promsafe.WithAuto(myReg).NewCounterVecT[MyLabels](counterOpts)

c.With(MyLabels{
	EventType: "reservation", Source: "source1",
}).Inc()

Note:

All non-string values are automatically converted to strings. We could add reasonable type validation here so that it works only with fields that are strings, bools, or ints.
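
The type validation suggested in this note might look like the sketch below. It is an illustration under assumptions, not part of the PR: `labelValue` is a hypothetical helper, and the set of accepted types (string, bool, a few integer types) is only one possible choice.

```go
package main

import (
	"fmt"
	"strconv"
)

// labelValue converts a supported field value to its label string and
// rejects every other type instead of formatting it silently.
// Hypothetical helper; the accepted types are an assumption.
func labelValue(v any) (string, error) {
	switch x := v.(type) {
	case string:
		return x, nil
	case bool:
		return strconv.FormatBool(x), nil // always "true" or "false"
	case int:
		return strconv.Itoa(x), nil
	case int64:
		return strconv.FormatInt(x, 10), nil
	case uint8:
		return strconv.FormatUint(uint64(x), 10), nil
	default:
		return "", fmt.Errorf("unsupported label value type %T", v)
	}
}

func main() {
	for _, v := range []any{"request", true, uint8(1), 1.5} {
		s, err := labelValue(v)
		fmt.Printf("%q %v\n", s, err)
	}
}
```

Rejecting floats, slices, and other types up front keeps label values predictable, at the price of an error path that callers (or the library) must handle.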

2. Global promauto setup (all New* calls behave the same as promauto.New*)

// Set things up so that every NewCounter* call uses the default registry,
// like promauto does.
// Note: it also accepts another registry to use as the default one.
promsafe.SetupGlobalPromauto()

counterOpts := prometheus.CounterOpts{
	Name: "items_counted",
}

// Old code:
//c := promauto.NewCounterVec(counterOpts, []string{"status", "source"})
//c.With(prometheus.Labels{
//	"status": "active",
//	"source": "source1",
//}).Inc()
// becomes:

type MyLabels struct {
	promsafe.StructLabelProvider
	Status string
	Source string
}
// Pointer types work as well
c := promsafe.NewCounterVecT[*MyLabels](counterOpts)

c.With(&MyLabels{
	Status: "active", Source: "source1",
}).Inc()

@amberpixels amberpixels changed the title Promsafe feature introduced Promsafe: Strongly-typed safe labels Aug 28, 2024
@amberpixels amberpixels force-pushed the feature/promsafe branch 3 times, most recently from c064fc4 to 83aba46 Compare August 28, 2024 16:11
Signed-off-by: Eugene <[email protected]>
Member

bwplotka commented Sep 3, 2024

Hi! Thanks for innovating here 💪🏽

I presume this is about using generics for type safety of label values, in relation to the defined label names.

Currently, unsafe labels lead to one of two problems: an error-handling nightmare, or panics (if you use promauto).

Can you share the exact requirements behind promsafe? That would help us decide whether such a package is useful enough to maintain in client_golang, whether existing solutions are enough, or whether there is a way to extend existing packages with improvements toward the same goals.

For example, how often do you see that error-handling nightmare and those panics in practice? Can you share some experience or data?

Generally, the recommendation is to hardcode label values in WithLabelValues; by testing the given code path, you know immediately whether it panics. If you use dynamic values (e.g. variables) as label values, the code is generally prone to cardinality issues anyway, which is why we experimented with a constrained labels solution.

So let's circle back to the bare-bones requirements we want here 🤗 E.g. generally you should avoid using With. What cases are we solving here?

Additionally, performance is important for this increment flow, so it would be nice to check how this affects it.

Author

amberpixels commented Sep 4, 2024

Hey. Thanks for the feedback. Let me share the details of my motivation behind the promsafe package.

By error-handling nightmare / panics I meant the following cases:

// Counter registration: we're fine with possible panic here :)
myCounter := promauto.NewCounterVec(prometheus.CounterOpts{
    Name: "items_counted",
}, []string{"event_type", "success", "slot" /* 1/2/3/.../9 */})

// But using the counter is where the motivation comes from:

// The .GetMetricWith*() methods return an error if the labels are messed up
myCounterWithValues, err := myCounter.GetMetricWith(prometheus.Labels{
    "event_type": "reservation",
    "success":    "true",
    "slot":       "1",
})
if err != nil {
    // TODO: handle error
}
// The same error can happen with the *WithLabelValues version:
// myCounterWithValues, err := myCounter.GetMetricWithLabelValues("reservation", "true", "1")

// To avoid error handling we can use .With/.WithLabelValues, but they simply panic for the same reasons:
myCounter.WithLabelValues("reservation", "true", "2").Inc()

💡 So from here on, I use "panic" to mean both a panic in the .With* methods and an error returned by the .GetMetricWith* methods.

Why panic? Why does it matter?

Here are several reasons:

  1. Misspelling. You can misspell label names. (Not relevant for WithLabelValues, though.)
  2. Misremembering. You can forget the name of a label. With WithLabelValues you still need to remember the number of labels and their order (and what they mean).
  3. Missing labels. You can forget a label (both in the map passed to With() and in the slice passed to WithLabelValues()).
  4. Extra labels. You can accidentally pass extra labels (both in the map passed to With() and in the slice passed to WithLabelValues()).
  5. Manual string conversion can lead to failures as well. E.g. you must know to use fmt.Sprintf("%v", boolValue), and choosing the wrong format verb can ruin the values.

All of these can break code by causing a panic in .With() or .WithLabelValues(). Let's not spend time and effort in code reviews ensuring that a new "counter inc" call doesn't break everything.

Also, one more reason, which is not about failures but about consistency:

  6. Type safety makes you both less error-prone and more consistent. E.g. you pass bool values as label values and know they will always be "true"/"false", not "1", "0", "on", "off", "yes", ... The same goes for numbers.

How it's solved by promsafe?

// Promsafe example:
// Registering a metric by simply providing the type that contains the labels
type MyCounterLabels struct {
    promsafe.StructLabelProvider
    EventType string
    Success   bool
    Slot      uint8 // yes, it's a number, still should be careful with high-cardinality labels
}
myCounterSafe := promsafe.NewCounterVecT[MyCounterLabels](prometheus.CounterOpts{
    Name: "items_counted_detailed",
})

// Calling the counter is simple: just provide a filled struct of the dedicated type.
//
// None of the 5 reasons above can cause a panic here. You simply can't mess up the struct.
// With() accepts ONLY this struct type; you can't pass any other struct.
// You don't need to remember the fields or their order. Your IDE will show them.
// You can't pass extra fields.
// You can pass fewer fields (the missing ones get default values, or can be handled by other custom non-panicking logic).
// You can't mess up types.
// You're consistent with types.
myCounterSafe.With(MyCounterLabels{
    EventType: "reservation", Success: true, Slot: 1,
}).Inc()

P.S. An inconsistency in the promsafe version of WithLabelValues()

// One thing I need to point out is the inconsistency in the promsafe version of WithLabelValues().

// WithLabelValues() accepts ordered raw strings, which unfortunately breaks the "safety" concept.
// We can't control the order of the given strings, or even how many there are.
// That's why in promsafe, .WithLabelValues() and .GetMetricWithLabelValues() are disabled:
// they are marked deprecated and panic (so using them is strongly discouraged).

@dmvinson

This API is really nice, would love to see this merged. I've already run into a few of the failure modes @amberpixels mentioned in my first few weeks of using this library.

@amberpixels
Author

Small update.

I've pushed some improvements to the API, so it's more consistent and stable.
Also, I've updated the PR description with cleaner and clearer examples, and added a note on the performance issue.

Member

@bwplotka bwplotka left a comment

Thanks for the amazing work!

I think this is a great direction, but I'm not sure it's at the stage where we want to claim full stability and maintenance of it in client_golang v1.

I would like to explore a slimmer "adapter" that just offers a With(labels T) K method; it would simplify the code to maintain and allow composability.

Then there is the efficiency aspect, which I would like to understand, given this is a hot path.

Also before committing to any of this we have to ask ourselves what to recommend or deprecate in this place. We are getting to the place where there are many ways of doing the same thing, so would love to decide what to remove, if we think this is the way to go.

To achieve and answer all of this, I wonder:
A) How bad would it be to host promsafe in another repository for an incubation period?
B) Is there room for a prometheus/client_golang/exp module, which we could version v0.x and use for other experimental stuff like the Remote API?

// limitations under the License.

// Package promauto_adapter provides compatibility adapter for migration of calls of promauto into promsafe
package promauto_adapter
Member

I would put it in promsafe honestly.
