From: Andrew West (andrewcwest@gmail.com)
Date: Fri Mar 23 2007 - 04:59:39 CST
On 21/03/07, Eric Muller <emuller@adobe.com> wrote:
> Andrew West wrote:
> > Take for example the compatibility ideographs U+F914, U+F95C and
> > U+F9BF, which are all canonically equivalent to U+6A02 and which all
> > have exactly the same glyph shape. Would it have been acceptable to
> > represent them using variation selectors as 6A02-VS1, 6A02-VS2 and
> > 6A02-VS3 ?
>
> The case of the pronunciation variants is a bit more delicate. With
> today's understanding of what character encoding is about, I think it's
> fair to say that accommodating pronunciation variants in plain text is a
> non-goal, and in fact a misguided effort, in any character standard. Can
> you imagine having two coded characters for each ideograph used in
> Japan, one for On reading and one for Kun reading?
>
I can imagine it, but I can't imagine such a character encoding
standard existing outside of my imagination.
But nobody, especially not me, said anything about representing
pronunciation variants in plain text. The compatibility ideographs were
encoded for roundtrip compatibility with existing standards, not so
that pronunciation variants of ideographs could be represented in
Unicode. It was you who suggested that variation selectors would have
been a preferable solution to compatibility ideographs, and as most
of the compatibility ideographs in the BMP are pronunciation variants
I wanted to understand how and whether variation selectors could be
used to represent non-glyphic differences.
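To make the roundtrip concern concrete, here is a minimal sketch in Python
using only the standard unicodedata module. It checks what normalization
does to the three compatibility ideographs mentioned above, and what it
does to the hypothetical 6A02-VS1, 6A02-VS2 and 6A02-VS3 sequences. The
pairing of VS1..VS3 with U+6A02 is purely illustrative, taken from the
question above, and the helper name survives_nfc is my own.

```python
import unicodedata

BASE = "\u6a02"                           # unified ideograph U+6A02
COMPAT = ["\uf914", "\uf95c", "\uf9bf"]   # compatibility ideographs discussed above
VS = ["\ufe00", "\ufe01", "\ufe02"]       # VS1, VS2, VS3 (illustrative pairing only)


def survives_nfc(text: str) -> bool:
    """Return True if NFC normalization leaves the text unchanged."""
    return unicodedata.normalize("NFC", text) == text


# Each compatibility ideograph has a singleton canonical decomposition,
# so normalization replaces it with the unified ideograph and the
# distinction between the three source code points is lost.
compat_after_nfc = [unicodedata.normalize("NFC", c) for c in COMPAT]

# A base character followed by a variation selector has no canonical
# decomposition, so the sequence passes through normalization intact.
vs_sequences = [BASE + v for v in VS]

for original, normalized in zip(COMPAT, compat_after_nfc):
    print(f"U+{ord(original):04X} -> U+{ord(normalized):04X}")

for seq in vs_sequences:
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in seq)
    print(f"{code_points} unchanged by NFC: {survives_nfc(seq)}")
```

If that holds, a variation sequence could in principle carry a roundtrip
distinction through normalized text where a compatibility ideograph
cannot, whatever one thinks of using it for a non-glyphic difference.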
> > Thinking forward to Tangut,
> I suspect it would be a hard sell today to convince the Unicode
> community to support round-tripping with a "standard" that encodes
> pronunciation differences.
>
It will be a hard sell to get Tangutologists to migrate from the
Mojikyo Tangut encoding to Unicode if we can't guarantee
roundtripping.
Andrew