fix: values escaping bugs #727

ywwg · 2024-11-21T15:16:07Z

I missed some test cases the first time around. Confirmed that without this fix the test cases fail the round trip.

jesusvazquez

I've had a look, and since the tests pass, that gives us some assurance.

I left a couple of questions for my own understanding.

jesusvazquez · 2024-11-22T14:43:08Z

model/metric.go

@@ -391,7 +382,7 @@ func UnescapeName(name string, scheme EscapingScheme) string {
 			var utf8Val uint
 			for j := 0; i < len(escapedName); j++ {
 				// This is too many characters for a utf8 value.
-				if j > 4 {
+				if j > 8 {


Where does this come from?

The maximum Unicode code point is '\U0010FFFF', so the maximum number of hex characters is actually 6, not 8.

Sorry, I dug in a bit and found a few constants. Have a look at this: https://go.dev/play/p/PoHxOOeoyQu

The maximum number of bytes needed to represent unicode.MaxRune is 4. There is also utf8.UTFMax: 4.

Shouldn't this remain 4?

Wait, it should be 7, right? The 4 bytes in the longest possible rune encode as 8 hexadecimal digits, so j can legally be between 0 and 7.

ah yup, 7 is correct

well, maxrune is \U0010FFFF so maybe 6 is too many characters?

Sorry for slow thinking. I think I got it now:

utf8.UTFMax is about the number of bytes needed in UTF-8 encoded form. But what we put into the U__ encoding is the code point, which is different from the encoded byte sequence. The code point \U0010FFFF that @ywwg mentioned would be written as U___10FFFF_. As a UTF-8 string, it would take the byte pattern F4 8F BF BF (the 4-byte maximum length that utf8.UTFMax refers to).

But this means the correct number here is 5, don't you think so?

aha yes -- or as you suggest, >=6

jesusvazquez · 2024-11-22T14:44:46Z

model/metric.go

@@ -318,21 +315,15 @@ func EscapeName(name string, scheme EscapingScheme) string {
 		}
 		escaped.WriteString("U__")
 		for i, b := range name {
-			if isValidLegacyRune(b, i) {
+			if b == '_' {
+				escaped.WriteString("__")


It rings a bell that we decided to write two underscores whenever the name contains an underscore, but could we add a reference to where this was agreed on?

This is necessary so that we know to start parsing a Unicode value when we see only one underscore. It is similar to dots escaping, where dots become _dot_ and we double underscores.
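The round trip described here can be sketched as follows. This is a simplified illustration, not the code in model/metric.go: `escapeValue`, `unescapeValue`, and `isLegacyRune` are hypothetical names, and the legacy-character check is an assumption that only approximates the real one.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode"
)

// isLegacyRune reports whether r may appear unescaped at position i.
// Simplified, hypothetical stand-in for the library's check.
func isLegacyRune(r rune, i int) bool {
	return (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z') || r == '_' || r == ':' ||
		(r >= '0' && r <= '9' && i > 0)
}

// escapeValue writes "U__", doubles literal underscores, and writes any
// other non-legacy rune as "_<hex code point>_".
func escapeValue(name string) string {
	var b strings.Builder
	b.WriteString("U__")
	for i, r := range name {
		switch {
		case r == '_':
			b.WriteString("__")
		case isLegacyRune(r, i):
			b.WriteRune(r)
		default:
			b.WriteString("_" + strconv.FormatInt(int64(r), 16) + "_")
		}
	}
	return b.String()
}

// unescapeValue reverses escapeValue; ok is false on malformed input.
func unescapeValue(s string) (string, bool) {
	if !strings.HasPrefix(s, "U__") {
		return "", false
	}
	rest := s[len("U__"):]
	var b strings.Builder
	for i := 0; i < len(rest); i++ {
		if rest[i] != '_' {
			b.WriteByte(rest[i])
			continue
		}
		// "__" is a literal underscore.
		if i+1 < len(rest) && rest[i+1] == '_' {
			b.WriteByte('_')
			i++
			continue
		}
		// A single "_" starts a hex code point terminated by "_".
		end := strings.IndexByte(rest[i+1:], '_')
		if end < 1 || end > 6 { // at most 6 hex digits, see unicode.MaxRune
			return "", false
		}
		v, err := strconv.ParseUint(rest[i+1:i+1+end], 16, 32)
		if err != nil || v > unicode.MaxRune {
			return "", false
		}
		b.WriteRune(rune(v))
		i += end + 1 // now points at the closing underscore
	}
	return b.String(), true
}

func main() {
	for _, name := range []string{"no_dots", "a.b", "_é_", "\U0010FFFF"} {
		esc := escapeValue(name)
		back, ok := unescapeValue(esc)
		fmt.Printf("%q -> %q -> %q (ok=%v)\n", name, esc, back, ok)
	}
}
```

Because every literal underscore is doubled, a lone underscore in the escaped form can only mean "a hex code point follows", so the decoder never has to guess.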

model/metric_test.go

Issues with underscores and large unicode value conversion Signed-off-by: Owen Williams <[email protected]>

ywwg force-pushed the owilliams/underscores branch from bf1fba0 to 7d51d16 Compare November 21, 2024 15:18

ywwg marked this pull request as draft November 21, 2024 17:49

ywwg force-pushed the owilliams/underscores branch from 7d51d16 to 0cf7f55 Compare November 21, 2024 18:11

ywwg marked this pull request as ready for review November 21, 2024 18:11

ywwg changed the title ~~fix: values escaping needs to double underscores~~ fix: values escaping bugs Nov 21, 2024

ywwg force-pushed the owilliams/underscores branch from 0cf7f55 to 14d8b64 Compare November 21, 2024 18:12

ywwg requested a review from beorn7 November 21, 2024 18:36

jesusvazquez approved these changes Nov 22, 2024

npazosmendez reviewed Nov 22, 2024

model/metric_test.go (outdated, resolved)

ywwg force-pushed the owilliams/underscores branch from 9acceae to 95defda Compare November 22, 2024 16:13

ywwg requested review from jesusvazquez and npazosmendez November 25, 2024 15:55

ywwg force-pushed the owilliams/underscores branch from b35c146 to 0288113 Compare November 25, 2024 16:20

fix: values escaping bugs

febd997

Issues with underscores and large unicode value conversion Signed-off-by: Owen Williams <[email protected]>

ywwg force-pushed the owilliams/underscores branch from 0288113 to febd997 Compare November 27, 2024 18:19

beorn7 approved these changes Nov 27, 2024

ywwg merged commit 39a62f7 into main Nov 27, 2024
8 checks passed

ywwg deleted the owilliams/underscores branch November 27, 2024 18:50
