From: Jim Allan (jallan@smrtytrek.com)
Date: Mon Dec 29 2003 - 13:21:59 EST
Philippe Verdy wrote:
> We have:
> 02A7;LATIN SMALL LETTER TESH DIGRAPH;Ll;0;L;;;;;N;LATIN SMALL LETTER T ESH;;;;
> but no canonical or compatibility decomposition as t + esh, even though it
> is a clear ligature using the short-leg esh.
>
> I wonder why there's no VARIANT defined for the short leg ESH (i.e. that has
> no descender below the baseline).
>
> In fact other interesting "digraphs" are:
> 02A3;LATIN SMALL LETTER DZ DIGRAPH;Ll;0;L;;;;;N;LATIN SMALL LETTER D Z;;;;
> 02A4;LATIN SMALL LETTER DEZH DIGRAPH;Ll;0;L;;;;;N;LATIN SMALL LETTER D YOGH;;;;
> 02A5;LATIN SMALL LETTER DZ DIGRAPH WITH CURL;Ll;0;L;;;;;N;LATIN SMALL LETTER D Z CURL;;;;
> 02A6;LATIN SMALL LETTER TS DIGRAPH;Ll;0;L;;;;;N;LATIN SMALL LETTER T S;;;;
> 02A7;LATIN SMALL LETTER TESH DIGRAPH;Ll;0;L;;;;;N;LATIN SMALL LETTER T ESH;;;;
> 02A8;LATIN SMALL LETTER TC DIGRAPH WITH CURL;Ll;0;L;;;;;N;LATIN SMALL LETTER T C CURL;;;;
> 02A9;LATIN SMALL LETTER FENG DIGRAPH;Ll;0;L;;;;;N;;;;;
> 02AA;LATIN SMALL LETTER LS DIGRAPH;Ll;0;L;;;;;N;;;;;
> 02AB;LATIN SMALL LETTER LZ DIGRAPH;Ll;0;L;;;;;N;;;;;
>
> For D Z CURL, it's strange that we don't find in the UCD a decomposition
> similar to the decomposition of D Z...
None of this is strange.
The point of these characters is that they are used in phonetics for
particular purposes, and may even contrast with what appear to be their
graphic elements when those elements also appear separately.
For example, 02A6 LATIN SMALL LETTER TS DIGRAPH strongly suggests that
the user intends it to represent a single phoneme within whatever
phonemic system is being represented, and it may contrast with _t_
followed by _s_, which would be a simple /ts/ sequence.
None of these are *optional* ligatures which can be broken down into
their graphic components without losing semantics. Therefore they have
no canonical or compatibility equivalents.
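
A minimal sketch of how one might check this against the character
database, assuming Python's standard unicodedata module (which tracks a
later UCD version than the one quoted above). The helper names
has_decomposition and survives_normalization are made up for this
illustration. U+01F3 LATIN SMALL LETTER DZ is included only as a contrast,
since it does carry a compatibility mapping.

```python
import unicodedata

# Code points taken from the UnicodeData.txt lines quoted above.
DIGRAPHS = [0x02A3, 0x02A4, 0x02A5, 0x02A6, 0x02A7,
            0x02A8, 0x02A9, 0x02AA, 0x02AB]


def has_decomposition(cp: int) -> bool:
    """True if the UCD lists a canonical or compatibility decomposition."""
    return unicodedata.decomposition(chr(cp)) != ""


def survives_normalization(cp: int) -> bool:
    """True if all four normalization forms leave the character unchanged."""
    ch = chr(cp)
    return all(unicodedata.normalize(form, ch) == ch
               for form in ("NFC", "NFD", "NFKC", "NFKD"))


for cp in DIGRAPHS:
    ch = chr(cp)
    print(f"U+{cp:04X} {unicodedata.name(ch)}: "
          f"decomposition={unicodedata.decomposition(ch)!r}, "
          f"stable={survives_normalization(cp)}")

# Contrast: U+01F3 LATIN SMALL LETTER DZ folds to plain "dz" under NFKD.
print(repr(unicodedata.decomposition("\u01f3")))
print(unicodedata.normalize("NFKD", "\u01f3"))
```

Under this check a search or collation step that normalizes to NFKD would
still see the phonetic digraphs as single characters, distinct from the
two-letter sequences.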
> Finally, it seems that these two:
> 021C;LATIN CAPITAL LETTER YOGH;Lu;0;L;;;;;N;;;;021D;
> 021D;LATIN SMALL LETTER YOGH;Ll;0;L;;;;;N;;;021C;;021C
> are variants of
> 01B7;LATIN CAPITAL LETTER EZH;Lu;0;L;;;;;N;LATIN CAPITAL LETTER YOGH;;;0292;
> 0292;LATIN SMALL LETTER EZH;Ll;0;L;;;;;N;LATIN SMALL LETTER YOGH;;01B7;;01B7
> and I wonder how these YOGH differ from EZH, or if the Unicode 1.0 name of
> EZH was misleading...
See Michael Everson's discussion at
http://www.evertype.com/standards/wynnyogh/ezhyogh.html
Of course, in practice EZH has often been used for YOGH and was well on
its way to becoming the modern glyph for YOGH in citations of Middle
English and in some linguistic work.
This creates a quandary. When citing a text which uses an EZH glyph for
YOGH (as is found, for example, in many of the "History of Middle-earth"
books containing Christopher Tolkien's editing of his father J.R.R.
Tolkien's unpublished papers), should one quote the spelling as printed
and display the Unicode EZH character, or silently substitute the Unicode
YOGH character?
There is no obviously right answer for all cases or even for many
individual cases.
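
For anyone who does decide to substitute, here is a rough sketch in Python
of what that editorial choice amounts to at the character level. The
function name substitute_yogh and the mapping table are hypothetical,
invented for this illustration; nothing in Unicode defines such a mapping,
and normalization will never perform it, since the two pairs are unrelated
characters in the UCD.

```python
import unicodedata

EZH_UPPER, EZH_LOWER = "\u01b7", "\u0292"
YOGH_UPPER, YOGH_LOWER = "\u021c", "\u021d"

# Hypothetical editorial table: an explicit choice by the person quoting,
# not anything defined by the standard.
EZH_TO_YOGH = str.maketrans({EZH_UPPER: YOGH_UPPER, EZH_LOWER: YOGH_LOWER})


def substitute_yogh(text: str) -> str:
    """Replace every EZH with the YOGH of the same case."""
    return text.translate(EZH_TO_YOGH)


def unified_by_normalization(a: str, b: str) -> bool:
    """True if any normalization form maps the two strings together."""
    return any(unicodedata.normalize(form, a) == unicodedata.normalize(form, b)
               for form in ("NFC", "NFD", "NFKC", "NFKD"))


# "knyght" as it might be printed with an EZH glyph standing in for YOGH.
printed = "kny\u0292t"
print(substitute_yogh(printed))

# The case pairs are kept separate, and normalization does not merge them.
print(EZH_UPPER.lower() == EZH_LOWER, YOGH_UPPER.lower() == YOGH_LOWER)
print(unified_by_normalization(EZH_LOWER, YOGH_LOWER))
```

The substitution is lossy in one direction: once applied, a genuine EZH in
the source can no longer be told apart from a YOGH, which is one reason the
choice has to be made case by case.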
When in doubt use the number three. ;-)
Jim Allan