Re: CYRILLIC SMALL/CAPITAL LETTER SELKUP OE (ISO 10756:1996)

From: Philippe Verdy <verdy_p_at_wanadoo.fr>
Date: Mon, 5 Mar 2012 21:30:21 +0100

On 5 March 2012 19:35, Denis Jacquerye <moyogo_at_gmail.com> wrote:
> On Tue, Feb 28, 2012 at 4:00 AM, Philippe Verdy <verdy_p_at_wanadoo.fr> wrote:
>> I am looking for the code points or assignment status of the Cyrillic
>> letter OE/oe (ligatured) as used in Selkup (visually identical to the
>> Latin pair).
>>
>> This character pair has been part of the registration nr. 223 (in
>> 1998) by ISO of the (8-bit) "extended Cyrillic character set for
>> non-Slavic languages for bibliographic information interchange" :
>>
>> http://www.itscj.ipsj.or.jp/sc2/open/02n3136.pdf
>>
>> According to this document, this character set had also been
>> standardized as ISO 10756:1996. Note that it contains many other
>> characters for which it did not document any mapping to the UCS in the
>> then emerging ISO 10646 standard.
>>
>> It was even part of proposals at the UTC and ISO the same year
>> for inclusion in the UCS, along with other characters (at that time,
>> Michael Everson wrote a proposal placing them at U+04EC and U+04ED, but
>> since then those slots have been used for other characters; that block
>> is now full).
>>
>> It is also referenced in the ISO 9 Cyrillic/Latin transliteration standard.
>>
>> Still, I cannot find this letter encoded anywhere in the UCS, not even
>> in the other Cyrillic extended blocks that are not full (for example, the
>> CYRILLIC SUPPLEMENT block at U+0500-052F).
>>
>> Where are those characters? And what about the remaining characters
>> found in Registration nr. 223 and ISO 10756:1996? And what is their
>> status in the ISO 9 standard itself?
>>
>> Thanks.
>>
>> -- Philippe.
>>
>
> According to ftp://std.dkuug.dk/jtc1/sc2/WG2/docs/n2463.doc the
> Cyrillic Selkup OE is mapped to Latin OE:
> CYRILLIC SMALL LETTER SELKUP O E to U+0153 LATIN SMALL LIGATURE OE
> CYRILLIC CAPITAL LETTER SELKUP O E to U+0152 LATIN CAPITAL LIGATURE OE
> Several other of those missing Cyrillic characters are simply mapped
> to Latin ones or sort of decomposed.

Apparently this document is obsolete. Some of the characters it proposed
mapping to Latin have since been encoded as separate Cyrillic letters, such as:

CYRILLIC SMALL LETTER KURDISH QA

(not the initially proposed mapping to LATIN SMALL LETTER Q)

This document was still a draft, and not a decision.

The document specifically says "The issue with these letters is
whether they should be deunified from Latin, and encoded in the
Cyrillic block".
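
For anyone who wants to check the current state, here is a minimal
sketch using Python's standard unicodedata module. It only reports what
the Unicode version bundled with the interpreter knows. The helper names
(DRAFT_MAPPING, is_encoded) are mine, and I am assuming that the Kurdish
QA mentioned above corresponds to the character encoded as U+051B
CYRILLIC SMALL LETTER QA.

```python
import unicodedata

# Mapping quoted from the n2463 draft earlier in this thread.
# The Selkup names are the draft's labels, not encoded UCS names.
DRAFT_MAPPING = {
    "CYRILLIC SMALL LETTER SELKUP O E": "\u0153",
    "CYRILLIC CAPITAL LETTER SELKUP O E": "\u0152",
}


def is_encoded(name):
    """Return True if a character with this exact name exists in the
    Unicode data shipped with this Python interpreter."""
    try:
        unicodedata.lookup(name)
        return True
    except KeyError:
        return False


for draft_name, target in DRAFT_MAPPING.items():
    # Show where the draft sends each Selkup letter.
    print(f"{draft_name} -> U+{ord(target):04X} {unicodedata.name(target)}")
    print(f"  encoded under its own name: {is_encoded(draft_name)}")

# Assumption: this is the letter referred to above as Kurdish QA.
print("CYRILLIC SMALL LETTER QA encoded:",
      is_encoded("CYRILLIC SMALL LETTER QA"))
```

The mapped targets are Latin ligatures, so a script check based on the
character name would classify them as Latin, which is exactly the
deunification question the draft raises.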
Received on Mon Mar 05 2012 - 14:32:54 CST