-
-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix tests for bidirectional isolation #917
Conversation
{ | ||
"locale": "ar", | ||
"src": "أهلاً {بالعالم :string}", | ||
"exp": "أهلاً \u2067بالعالم\u2069" | ||
"exp": "أهلاً \u2068بالعالم\u2069" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may need to drop this test, or more explicitly define whether :string
applies introspection to its value to determine its directionality. As we don't forbid introspection, an implementation could use either an RLI or FSI here.
Presuming that the introspection would follow UAX#9, the user's perception of the results would be the same, so my preference would be to make the character-level output predictable by requiring that no introspection is done in the absence of a u:dir
override or an operand that includes an explicit direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I puzzled with it for a while but think this is right. The prevalence of FSI/PDI wrappers even in a purely LTR context makes me itch.
@@ -114,7 +115,7 @@ | |||
{ | |||
"description": " name... excludes U+FFFD and U+061C -- this pases as name -> [bidi] name-start *name-char", | |||
"src": ".local $\u061Cfoo = {1} {{ {$\u061Cfoo} }}", | |||
"exp": " 1 " | |||
"exp": " \u20681\u2069 " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess in this case, the direction of the message is determined to be unknown
? Which is a little surprising to me, since the only bidi control character appears in one of the variable names, which I would expect not to have an effect on the formatted result of the message. Maybe what's throwing me off is that the explanation of the "Default Bidi Strategy" at the end of this section doesn't say how the message directionality is determined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bidi controls in the syntax source don't matter here at all, and the message has LTR direction because of the en-US
locale. It's the placeholder's direction that is unknown. For it to be known, we'd need to introspect the 1
string with something like UAX#9 P2, which will ultimately be done by the rendering engine, and even if we do, the direction would still be unknown because European numbers are only weakly directional. And so this is appropriately reflected by FSI/PDI isolating it.
This is much the same thing as I mention in #917 (comment), but for LTR instead of RTL content.
Maybe what's throwing me off is that the explanation of the "Default Bidi Strategy" at the end of this section doesn't say how the message directionality is determined?
Good point. We may want to add a mention under Formatting Context about deducing the message direction from the locale if it's not explicitly given.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If not explicitly provided, the direction of the message should be computed from the locale. As @eemeli notes, controls in the source don't matter. As noted elsewhere, you can't just look at the rendered placeholder to determine its desired direction--even strongly directional characters do not necessarily provide an accurate signal.
@@ -118,6 +118,9 @@ | |||
"src": { | |||
"$ref": "#/$defs/src" | |||
}, | |||
"bidiIsolation": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/unicode-org/conformance/blob/main/schema/message_fmt2/testgen_schema.json will need to be updated to reflect this change.
The current tests do not account for the bidirectional isolation of the formatted results. This adds an option
bidiIsolation
to the tests, with values'default'
and'none'
corresponding to the Default BiDi Strategy and to a strategy of not applying any isolation.Most of the existing tests follow the
'none'
strategy, and have been left unchanged except for noting this state of affairs. Thebidi.json
tests have been updated to use the'default'
strategy, as that seemed appropriate.A new
type: 'bidiIsolation'
part is added, and included in theu-options.json
expParts
results as appropriate; some bugs in those tests have also been addressed.