Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement :unit as Proposed RECOMMENDED in the registry #922

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

aphillips
Copy link
Member

This introduces :unit as an optional function, per discussion elsewhere.

@aphillips aphillips changed the title Implement :unit as _OPTIONAL_ in the registry Implement :unit as OPTIONAL in the registry Oct 30, 2024
@aphillips aphillips added registry LDML46.1 MF2.0 Draft Candidate labels Oct 30, 2024
@aphillips aphillips marked this pull request as ready for review October 30, 2024 22:56
spec/registry.md Outdated Show resolved Hide resolved
spec/registry.md Outdated Show resolved Hide resolved
spec/registry.md Outdated Show resolved Hide resolved
spec/registry.md Outdated Show resolved Hide resolved
spec/registry.md Show resolved Hide resolved
spec/registry.md Outdated Show resolved Hide resolved
Copy link
Collaborator

@eemeli eemeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As with :currency currencyDisplay, I am not comfortable with introducing functionality to the core function set that is not already available in the JS Intl formatters. Here this means that I think we should:

  1. leave out unit conversion,
  2. leave out usage, and
  3. not require support for the full list of unit identifiers.

I'm quite open to discussing these aspects as iterations on top of initial proposals for :unit and :currency, but I find their inclusion to be a significant divergence from what we've done so far. As we want to conclude work on the initial release within a few weeks, we should aim for a smaller :unit that we can all agree to. We can and should improve the functions later, rather than trying to cram in all the features right away.

The basis for my position here is that we are defining a message formatter, and that the original impetus for this work came from TC39 TG2, under which this group was initially organised. We should therefore be able to deliver a message formatter that does not impose additional requirements on other formatters, such as unit or currency formatters, or introduce a deeper dependency on CLDR data than is minimally required.

I am aware that :unit is here proposed as optional, and that unit conversion is even more optional, but they're still all part of the core function set.

Otherwise, this proposal looks quite good.

Co-authored-by: Mark Davis <[email protected]>
@macchiati
Copy link
Member

As with :currency currencyDisplay, I am not comfortable with introducing functionality to the core function set that is not already available in the JS Intl formatters. Here this means that I think we should:

  1. leave out unit conversion,
  2. leave out usage, and
  3. not require support for the full list of unit identifiers.

I'm quite open to discussing these aspects as iterations on top of initial proposals for :unit and :currency, but I find their inclusion to be a significant divergence from what we've done so far. As we want to conclude work on the initial release within a few weeks, we should aim for a smaller :unit that we can all agree to. We can and should improve the functions later, rather than trying to cram in all the features right away.

The basis for my position here is that we are defining a message formatter, and that the original impetus for this work came from TC39 TG2, under which this group was initially organised. We should therefore be able to deliver a message formatter that does not impose additional requirements on other formatters, such as unit or currency formatters, or introduce a deeper dependency on CLDR data than is minimally required.

While the original impetus was TC39, we need to make sure that it meets the needs of the broader community of implementers. The fact that it is optional means that it is not required of conformant implementations; thus a 'deeper dependency on CLDR' is also not required.

usage is a key requirement, because it allows implementers to format units in a customary way for a target locale, without having to investigate what all the customary units are for the target locale (country+region) and special-case each one themselves — just as number formatting doesn't force implementers to investigate what all the decimal/grouping separators and digits are needed for all the target locales. units is not internationalized (the main point of message formatting) without including that.

On the other hand, we should talk about your #3.

I am aware that :unit is here proposed as optional, and that unit conversion is even more optional, but they're still all part of the core function set.

Otherwise, this proposal looks quite good.

@duerst
Copy link

duerst commented Nov 4, 2024

As with :currency currencyDisplay, I am not comfortable with introducing functionality to the core function set that is not already available in the JS Intl formatters. Here this means that I think we should:

  1. leave out unit conversion,

This reminds me of the early work on HTML internationalization (1995). In the first draft (see
https://datatracker.ietf.org/doc/html/draft-ietf-html-i18n-00#section-4.2) we included a facility to convert units. I told my coauthors this was a bad idea (premature and not well cooked), but they said they put it in so that the people who didn't understand the need for internationalization (at that time, that was essentially everybody) had something to shoot down. That's exactly what happened; the relevant section is gone in https://datatracker.ietf.org/doc/html/draft-ietf-html-i18n-01.

It's very clear that in this day and age, we don't need any such maneuvers anymore. It's also still very clear that while unit conversion sounds like a good idea at first, it's very context-dependent and therefore very difficult to get right.

@aphillips
Copy link
Member Author

(as contributor)

I agree with @duerst that conversion is tricky. I think the usage stuff is supposed to make it "skeleton-like" in that locale data would be used to perform appropriate conversions without having to hardcode the mapping into the message. However, it strikes me as a complex mechanism that is implemented in ICU but not generally. I think users will find it more difficult to use, at least initially. But I see the value in providing the syntax and encouraging this part of CLDR to be adopted.

Elsewhere in the commentary with @macchiati, I note that I started from an API that I'm more personally familiar with and that said API in question has some quirks in it. Having looked into the more modern API, I was able to produce this version, but I'm reluctant to commit it. I think we should sideline this work until post-46.1 (focus on finishing standard parts of the spec--we have just a few weeks remaining in our schedule) and ensure that what we do produce is fully baked. It could easily serve as an early instance of API function set management (to proof the process).

@mihnita
Copy link
Collaborator

mihnita commented Nov 4, 2024

The usage stuff is a mixture of several things.
They work part as a skeleton, but sometimes they determines the measurement unit.

If (for example) the usage is "person-height" then the exact same measure amount (let's say 175 cm) would result in "1.75m" or "175cm" or "1m 75cm" depending on locale (works like a skeleton).

For temperature, if the region is Bahamas (BS) and the usage is "weather" the result will use Fahrenheit, but will use Celsius for everything else (determines the unit).


Units are more complicated than they seem.

I think that leaving them post-46.1 would be a good idea.
Take them on only if we finish everything else :-)

@macchiati
Copy link
Member

macchiati commented Nov 4, 2024 via email

- Cleanup the `usage` option (which appeared twice).
- Remove `ordinal` selection and add a note to that effect.
- Clean up text about conversion
@aphillips aphillips changed the title Implement :unit as OPTIONAL in the registry Implement :unit as **standard** in the registry Nov 8, 2024
@aphillips aphillips changed the title Implement :unit as **standard** in the registry Implement :unit as **optional** in the registry Nov 8, 2024
- Replace `Composition` with `Resolved Value`
- Match phrasing to other functions
- Move `Resolved Value` before `Selection` like other functions
Copy link
Member

@macchiati macchiati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks

Copy link
Collaborator

@eemeli eemeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for leaving this until so late, but I simply hadn't had time for this PR. I've left a bunch of concerns inline, and I don't think this is ready yet.

spec/registry.md Outdated Show resolved Hide resolved
spec/registry.md Show resolved Hide resolved
spec/registry.md Outdated Show resolved Hide resolved
spec/registry.md Outdated Show resolved Hide resolved
Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flow of options seems okay to me. I defer to others on error handling, etc.

@macchiati
Copy link
Member

macchiati commented Nov 19, 2024

Note, while doing #949 I realized that we say that all RECOMMENDED functions/options need a maturity status. It sounds like (from the WG discussion) that it would be Proposed for now. Since that is only in exploration/maintaining-registry.md, I presume it would be removed for release.

spec/registry.md Outdated Show resolved Hide resolved
spec/registry.md Show resolved Hide resolved
spec/registry.md Outdated Show resolved Hide resolved
spec/registry.md Outdated Show resolved Hide resolved
Copy link
Member

@macchiati macchiati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made suggestions to address Eemeli's comments, and would approve with those changes.

spec/registry.md Outdated Show resolved Hide resolved
spec/registry.md Outdated Show resolved Hide resolved
spec/registry.md Outdated Show resolved Hide resolved
spec/registry.md Outdated Show resolved Hide resolved
spec/registry.md Outdated Show resolved Hide resolved
spec/registry.md Outdated Show resolved Hide resolved
- Change `:unit` to **Proposed**
- Modify `usage` to use valid
- Modify `usage` to be in its own subsection
- Modify `usage` to SHOULD _Unsupported Operation_ when the conversion isn't supported
@aphillips aphillips changed the title Implement :unit as **optional** in the registry Implement :unit as Proposed RECOMMENDED in the registry Nov 20, 2024
Copy link
Member

@macchiati macchiati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small touchup

spec/registry.md Outdated Show resolved Hide resolved
spec/registry.md Outdated Show resolved Hide resolved
Copy link
Collaborator

@eemeli eemeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine to land as proposed, because that does not imply any stability.

My comments below can be addressed later, separately.

Comment on lines +677 to +679
If the _operand_ of the _expression_ is an implementation-defined type,
such as the _resolved value_ of an _expression_ with a `:unit` _annotation_,
it can include _option_ values.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear here that not all "implementation-defined types" are necessarily supported as operands.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand. I don't think this needs to indicate that some types are not supported as operands.

This text is not about the operand. It's about the handling of option values within a resolved value. It is nearly boilerplate, since most of our functions have a variation of this sentence. In practice, it might be better to say something like the following, which doesn't care what the operand is or is not:

Suggested change
If the _operand_ of the _expression_ is an implementation-defined type,
such as the _resolved value_ of an _expression_ with a `:unit` _annotation_,
it can include _option_ values.
The _resolved value_ of an _expression_ with a `:unit` _annotation_
includes the result of _option resolution_.

Such a boilerplate would be altered if some of the options were required to be filtered by the function. unit is candidate for this in :unit (since the unit presumably is now part of the resolved operand).

Comment on lines +693 to +700
The _resolved value_ of an _expression_ with a `:unit` _function_
consist of an implementation-defined unit value
of the _operand_ of the annotated _expression_,
together with the resolved _options_ and their resolved values.

#### Selection

The _function_ `:unit` performs selection as described in [Number Selection](#number-selection) below.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These don't really work right together. If unit conversion is applied, the formatted value can be completely different from the "unit value of the operand of the annotated expression".

Copy link
Member

@macchiati macchiati Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take 2 cases:

Suppose $n is an implementation-defined unit value, and the locale is en-US.

  1. If $n = {unit=kilogram, number=30} and we format {$n :unit maximumFractionDigits=0 usage=person} we'll get "66 pounds".
  2. If $n = {unit=pound, number=66.1387} and we format {$n :unit maximumFractionDigits=0 usage=person}, we'll also get "66 pounds".

So what is happening in both cases is that the numeric value for selection is equivalent to what the formatted value has. And that is what needs to be selected on, namely 66.

The wording we have for that in selection as described in

where resolvedSelector is the resolved value of a selector

So in both cases, the resolved value has to be 66.

There are of course other cases: n$ = {number=66.1387} and we format {$n :number maximumSignificantDigits=1} we'll get "70" (In English), and "٧٠" in Arabic. And that must match a selection key of 70.

So in this case, the resolved value has to be 70.

I think we haven't yet nailed 'resolved value', because the resolved value for selection and formatting ends up needing to be different than the resolved value for functional composition. But that we cannot 'resolve' at this point in the release.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that what you describe is likely the intent here, but the resolved value description talks of "an implementation-defined unit value of the operand of the annotated expression", with no reference to any conversion possibly being made to it, and so it's reasonable to assume that e.g. your case 1 example would end up with the numerical value 30 being the value that selection is done on, and as the selection is the same as used for other numerical values, it doesn't know about needing to multiply the number by any such factor.

Note how in :math, the numeric value held in resolved value is explicitly modified by the add and subtract option values. We need something like that here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we permitted unit to convert, I agree that the resolved value should be modified. But I think we should do like :currency and only allow unit to compose with a numeric operand.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works for me, and is safe and far simpler.

It would still be possible for a future version to allow that, but I think one would have to make a good case for doing so.

spec/registry.md Outdated
Comment on lines 576 to 577
The _option_ `unit` MAY be used to override the units of an implementation-defined type,
provided the units are compatible and the implementation supports conversion.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make it clearer whether a function handler is allowed not to implement this conversion behaviour, of if the "MAY" is permitting a user to use it in such a fashion and expect it to work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm. I missed this.

I have never thought of 'unit=x' as involving or requiring conversion, but rather simply using the numeric value of the operand with the specified unit x. I think that you are right; this should take a bit of rewording. Perhaps:

Suggested change
The _option_ `unit` MAY be used to override the units of an implementation-defined type,
provided the units are compatible and the implementation supports conversion.
The _option_ `unit` MAY be used to override the unit value of an implementation-defined type,
if the implementation supports unit conversion.
However, if the units cannot be converted, then the unit value is not overridden and a Bad Option error must be emitted.

Side issue

Frankly, currency is even worse. Suppose we have input of a implementation-defined parameter $n = {value=300, currency=EUR} and we format or select {$n :currency currency=JPY}. That is really nasty, displaying 300¥. However, there is no stable conversion possible.

So for :currency, we should make it really clear that if there is an implementation-defined operand that contains a currency value, the currency option is invalid, no matter its value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currency explicitly says that this is an error.

The unit example is a leftover from early on. I will make it use usage. We should do the same thing with the unit option that we do with the currency option: you can use it to create a unit value with a number operand. Otherwise it is an error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds great.


The _function_ `:unit` performs selection as described in [Number Selection](#number-selection) below.

#### Unit Conversion
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the text above, unit conversion is also possible via the unit option. That should be mentioned here beyond its errant use in the example below, possibly along with some description of what happens if you define both unit and usage.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

Copy link
Member

@macchiati macchiati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since Eemeli is ok with this, I suggest we merge it, and then have a followup PR where we can discuss touchups.

Copy link
Member

@macchiati macchiati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Latest changes look good!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LDML46.1 MF2.0 Draft Candidate registry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants