Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix tests for bidirectional isolation #917

Merged
merged 1 commit into from
Nov 4, 2024
Merged

Fix tests for bidirectional isolation #917

merged 1 commit into from
Nov 4, 2024

Conversation

eemeli
Copy link
Collaborator

@eemeli eemeli commented Oct 27, 2024

The current tests do not account for the bidirectional isolation of the formatted results. This adds an option bidiIsolation to the tests, with values 'default' and 'none' corresponding to the Default BiDi Strategy and to a strategy of not applying any isolation.

Most of the existing tests follow the 'none' strategy, and have been left unchanged except for noting this state of affairs. The bidi.json tests have been updated to use the 'default' strategy, as that seemed appropriate.

A new type: 'bidiIsolation' part is added, and included in the u-options.json expParts results as appropriate; some bugs in those tests have also been addressed.

Comment on lines 120 to 124
{
"locale": "ar",
"src": "أهلاً {بالعالم :string}",
"exp": "أهلاً \u2067بالعالم\u2069"
"exp": "أهلاً \u2068بالعالم\u2069"
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may need to drop this test, or more explicitly define whether :string applies introspection to its value to determine its directionality. As we don't forbid introspection, an implementation could use either an RLI or FSI here.

Presuming that the introspection would follow UAX#9, the user's perception of the results would be the same, so my preference would be to make the character-level output predictable by requiring that no introspection is done in the absence of a u:dir override or an operand that includes an explicit direction.

Copy link
Member

@aphillips aphillips left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I puzzled with it for a while but think this is right. The prevalence of FSI/PDI wrappers even in a purely LTR context makes me itch.

@@ -114,7 +115,7 @@
{
"description": " name... excludes U+FFFD and U+061C -- this pases as name -> [bidi] name-start *name-char",
"src": ".local $\u061Cfoo = {1} {{ {$\u061Cfoo} }}",
"exp": " 1 "
"exp": " \u20681\u2069 "
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess in this case, the direction of the message is determined to be unknown? Which is a little surprising to me, since the only bidi control character appears in one of the variable names, which I would expect not to have an effect on the formatted result of the message. Maybe what's throwing me off is that the explanation of the "Default Bidi Strategy" at the end of this section doesn't say how the message directionality is determined?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bidi controls in the syntax source don't matter here at all, and the message has LTR direction because of the en-US locale. It's the placeholder's direction that is unknown. For it to be known, we'd need to introspect the 1 string with something like UAX#9 P2, which will ultimately be done by the rendering engine, and even if we do, the direction would still be unknown because European numbers are only weakly directional. And so this is appropriately reflected by FSI/PDI isolating it.

This is much the same thing as I mention in #917 (comment), but for LTR instead of RTL content.

Maybe what's throwing me off is that the explanation of the "Default Bidi Strategy" at the end of this section doesn't say how the message directionality is determined?

Good point. We may want to add a mention under Formatting Context about deducing the message direction from the locale if it's not explicitly given.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If not explicitly provided, the direction of the message should be computed from the locale. As @eemeli notes, controls in the source don't matter. As noted elsewhere, you can't just look at the rendered placeholder to determine its desired direction--even strongly directional characters do not necessarily provide an accurate signal.

@@ -118,6 +118,9 @@
"src": {
"$ref": "#/$defs/src"
},
"bidiIsolation": {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aphillips aphillips merged commit 313bc38 into main Nov 4, 2024
2 checks passed
@aphillips aphillips deleted the fix-bidi-tests branch November 4, 2024 18:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants