templatize CollationTest.html
markusicu committed Aug 23, 2023
1 parent 51e7890 commit 1926a16
Showing 3 changed files with 17 additions and 7 deletions.
5 changes: 5 additions & 0 deletions pub/copy-beta-to-draft.sh
@@ -14,6 +14,9 @@ UNITOOLS_DATA=$UNICODETOOLS/unicodetools/data
COPY_YEAR=2023
UNI_VER=15.1.0
EMOJI_VER=15.1
+# UTS #10 release revision number to be used in CollationTest.html:
+# One more than the last release revision number.
+TR10_REV=tr10-48

TODAY=`date --iso-8601`

@@ -25,6 +28,7 @@ s/PUB_DATE/$TODAY/
s/PUB_STATUS/draft/
s/UNI_VER/$UNI_VER/
s/EMOJI_VER/$EMOJI_VER/
+s/TR10_REV/$TR10_REV/
s%PUBLIC_EMOJI%Public/draft/emoji/%
s%PUBLIC_UCD_EMOJI%Public/draft/UCD/ucd/emoji/%
eof
@@ -37,6 +41,7 @@ rm $DRAFT/UCD/ucd/zipped-ReadMe.txt

mkdir -p $DRAFT/UCA
cp -r $UNITOOLS_DATA/uca/dev/* $DRAFT/UCA
+sed -i -f $DEST/sed-readmes.txt $DRAFT/UCA/CollationTest.html

mkdir -p $DRAFT/emoji
cp $UNITOOLS_DATA/emoji/dev/* $DRAFT/emoji
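
The substitution step added above can be tried in isolation. The following is a minimal sketch, not the project's script: the temporary directory, the file names `sed-demo.txt` and `template.html`, and the two-line template are hypothetical stand-ins. It writes a sed script through an unquoted heredoc (so the shell expands the variables first), then applies it to a template that contains the `UNI_VER` and `TR10_REV` placeholders. It writes to a new file instead of using `sed -i`, only to keep the demo portable across sed implementations.

```shell
#!/bin/sh
# Hypothetical stand-alone demo; paths and file names are illustrative only.
set -eu
WORK=$(mktemp -d)
UNI_VER=15.1.0
TR10_REV=tr10-48

# Unquoted heredoc delimiter: the shell expands $UNI_VER and $TR10_REV
# while writing the sed script, so sed sees literal replacement values.
cat > "$WORK/sed-demo.txt" <<eof
s/UNI_VER/$UNI_VER/
s/TR10_REV/$TR10_REV/
eof

# Quoted heredoc delimiter: the template is written verbatim.
cat > "$WORK/template.html" <<'eof'
<h2>Version UNI_VER</h2>
<a href="https://www.unicode.org/reports/tr10/TR10_REV.html">UTS #10</a>
eof

# Apply every substitution in the script to the template.
sed -f "$WORK/sed-demo.txt" "$WORK/template.html" > "$WORK/CollationTest.html"
cat "$WORK/CollationTest.html"
```

Each `s/.../.../` command here has no `g` flag, so it replaces only the first match on a line; that suffices for this template because each placeholder appears at most once per line.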
5 changes: 5 additions & 0 deletions pub/copy-final.sh
@@ -14,6 +14,9 @@ UNITOOLS_DATA=$UNICODETOOLS/unicodetools/data
COPY_YEAR=2023
UNI_VER=15.1.0
EMOJI_VER=15.1
+# UTS #10 release revision number to be used in CollationTest.html:
+# *Two* more than the last release revision number.
+TR10_REV=tr10-49

TODAY=`date --iso-8601`

@@ -25,6 +28,7 @@ s/PUB_DATE/$TODAY/
s/PUB_STATUS/final/
s/UNI_VER/$UNI_VER/
s/EMOJI_VER/$EMOJI_VER/
+s/TR10_REV/$TR10_REV/
s%PUBLIC_EMOJI%Public/emoji/$EMOJI_VER/%
s%PUBLIC_UCD_EMOJI%Public/$UNI_VER/ucd/emoji/%
eof
@@ -38,6 +42,7 @@ mv $DEST/$UNI_VER/ucd/zipped-ReadMe.txt $DEST/zipped/$UNI_VER/ReadMe.txt

mkdir -p $DEST/UCA/$UNI_VER
cp -r $UNITOOLS_DATA/uca/dev/* $DEST/UCA/$UNI_VER
+sed -i -f $DEST/sed-readmes.txt $DEST/UCA/$UNI_VER/CollationTest.html

mkdir -p $DEST/emoji/$EMOJI_VER
cp $UNITOOLS_DATA/emoji/dev/* $DEST/emoji/$EMOJI_VER
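
The two scripts hard-code different `TR10_REV` values, following the rule in their comments: one more than the last release revision for the draft, two more for the final. A small sketch of that arithmetic is given here. The variable names `LAST_RELEASE_REV`, `DRAFT_TR10_REV`, and `FINAL_TR10_REV` are hypothetical, and the value 47 is an assumption inferred from the "one more" comment next to `TR10_REV=tr10-48`; the commit itself does not compute these values.

```shell
#!/bin/sh
set -eu
# Assumption: 47 is the last released UTS #10 revision, inferred from the
# "one more" comment beside TR10_REV=tr10-48 in copy-beta-to-draft.sh.
LAST_RELEASE_REV=47

# Draft uses last + 1, final uses last + 2, per the script comments.
DRAFT_TR10_REV="tr10-$((LAST_RELEASE_REV + 1))"
FINAL_TR10_REV="tr10-$((LAST_RELEASE_REV + 2))"

echo "draft: $DRAFT_TR10_REV"
echo "final: $FINAL_TR10_REV"
```

Deriving both values from one number would keep the draft and final scripts in step, at the cost of sharing state between two scripts that are currently independent.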
14 changes: 7 additions & 7 deletions unicodetools/data/uca/dev/CollationTest.html
@@ -22,9 +22,9 @@
</tbody></table>
<div class="body">
<h1>Unicode® Collation Algorithm<br>Conformance Tests</h1>
-<h2 align="center" class="changed">Version 15.1.0<br>2023-05-16</h2>
+<h2 align="center">Version UNI_VER<br>PUB_DATE</h2>
<p>The following files provide conformance tests for the Unicode Collation Algorithm
-(<a href="https://www.unicode.org/reports/tr10/tr10-48.html">UTS #10: Unicode Collation Algorithm</a>).</p>
+(<a href="https://www.unicode.org/reports/tr10/TR10_REV.html">UTS #10: Unicode Collation Algorithm</a>).</p>
<ul>
<li>CollationTest_SHIFTED.txt</li>
<li>CollationTest_NON_IGNORABLE.txt</li>
@@ -53,7 +53,7 @@ <h2>Format</h2>
<p>There are four different files:</p>
<ul>
<li>The shifted vs non-ignorable files correspond to the two alternate
-<a href="https://www.unicode.org/reports/tr10/tr10-48.html#Variable_Weighting">Variable Weighting</a> values.</li>
+<a href="https://www.unicode.org/reports/tr10/TR10_REV.html#Variable_Weighting">Variable Weighting</a> values.</li>
<li>The SHORT versions omit the comments, for more compact storage.</li>
</ul>
<p>The format is illustrated by the following example:</p>
@@ -75,7 +75,7 @@ <h2>Format</h2>
<h2>Testing</h2>
<p>The files are designed so each line in the file will order as being greater than or equal to
the previous one, when using the UCA and the
-<a href="https://www.unicode.org/reports/tr10/tr10-48.html#Default_Unicode_Collation_Element_Table">Default
+<a href="https://www.unicode.org/reports/tr10/TR10_REV.html#Default_Unicode_Collation_Element_Table">Default
Unicode Collation Element Table</a>.
A test program can read in each line, compare it to
the last line, and signal an error if order is not correct. The exact comparison that should be
@@ -123,17 +123,17 @@ <h3>Discontiguous contractions</h3>
<li>S2.1.1 loops over each of the following three characters C,
but there is no table entry for any of those three S+C.
In particular, there is no DUCET mapping for 0FB2+0F71
-(see <i><a href="https://www.unicode.org/reports/tr10/tr10-48.html#Well_Formed_DUCET">Tibetan and
+(see <i><a href="https://www.unicode.org/reports/tr10/TR10_REV.html#Well_Formed_DUCET">Tibetan and
Well-Formedness of DUCET</a></i>).</li>
<li>The loop exits without finding any match beyond S=0FB2.</li>
</ul>

<p>See “Also note that the Algorithm employs two distinct contraction matching methods:”
at the end of <i>Section 7.2,
-<a href="https://www.unicode.org/reports/tr10/tr10-48.html#Step_2">Produce Collation Element Arrays</a></i>.</p>
+<a href="https://www.unicode.org/reports/tr10/TR10_REV.html#Step_2">Produce Collation Element Arrays</a></i>.</p>

<hr width="50%">
-<p class="copyright">© <span class="changed">2023</span> Unicode, Inc. All Rights Reserved.
+<p class="copyright">© COPY_YEAR Unicode, Inc. All Rights Reserved.
The Unicode Consortium makes no expressed or implied warranty
of any kind, and assumes no liability for errors or omissions. No liability
is assumed for incidental and consequential damages in connection with or arising