Re: Proposal to add four characters for Kashmiri to the BMP of the UCS

From: Christopher Fynn (cfynn@gmx.net)
Date: Sun Jul 06 2008 - 23:16:52 CDT


    Philippe Verdy wrote:

    ...

    > Your suggestion would be valid if all other Devanagari independent vowels
    > were treated as if they were in fact composed of a base
    > "consonant" letter A (the unpronounced/missing consonant plus the implicit
    > vowel A) plus an optional vowel sign. This was not done in Devanagari: those
    > independent vowels other than A are not decomposed. There's no reason to
    > decompose them for the case of the Kashmiri variants.

    I suspect that most of the pre-composed isolate vowels were included for
    backwards compatibility with pre-existing standards like ISCII.
    IMO there is no good reason to add additional pre-composed characters
    when a base character + combining mark will work fine, particularly when
    these characters are for what seems to be pretty much a brand-new
    orthography.

    Generally I think it is a good idea to try to conserve as much space as
    possible in the Devanagari block on the BMP as, given the number of
    languages written in Devanagari, it seems likely that there will
    eventually be more characters that it would be best to have there. IMHO
    adding unnecessary pre-composed characters when a combination (base char
    + combining mark) will do is not the best use of valuable space.

    > That's the way I understand it. The proposal is preserving the consistency.

    "Preserving consistency" could equally be used as an argument the next
    time someone wants to add more pre-composed Latin chars. Actually I don't
    see that encoding only the combining chars breaks the encoding model used
    for Devanagari, which already has many combining chars.
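
To make the distinction concrete, here is a minimal sketch, assuming Python's
standard unicodedata module. The helper name is_precomposed and the list
DEVANAGARI_INDEPENDENT_VOWELS are illustrative only and come from neither the
proposal nor the standard. It checks that the existing Devanagari independent
vowels have no canonical decomposition, unlike a pre-composed Latin letter,
and that a base letter followed by a combining vowel sign is left alone by
normalization.

```python
import unicodedata


def is_precomposed(ch: str) -> bool:
    """True if the character has a canonical decomposition (NFD changes it)."""
    return unicodedata.normalize("NFD", ch) != ch


# Hypothetical name: U+0905 (LETTER A) through U+0914 (LETTER AU).
DEVANAGARI_INDEPENDENT_VOWELS = [chr(cp) for cp in range(0x0905, 0x0915)]

# None of the existing independent vowels decomposes into base + vowel sign.
vowels_that_decompose = [
    v for v in DEVANAGARI_INDEPENDENT_VOWELS if is_precomposed(v)
]

# A pre-composed Latin letter, by contrast, does decompose:
# U+00E9 becomes U+0065 + U+0301 under NFD.
latin_e_acute_nfd = unicodedata.normalize("NFD", "\u00e9")

# A base letter plus a combining vowel sign (LETTER A + VOWEL SIGN U) stays a
# two-code-point sequence; NFC has nothing to compose it into.
base_plus_sign = "\u0905\u0941"
base_plus_sign_nfc = unicodedata.normalize("NFC", base_plus_sign)

if __name__ == "__main__":
    print("independent vowels with a decomposition:", vowels_that_decompose)
    print("NFD of U+00E9:", [f"U+{ord(c):04X}" for c in latin_e_acute_nfd])
    print("NFC of A + sign U:", [f"U+{ord(c):04X}" for c in base_plus_sign_nfc])
```

The sketch says nothing about whether such a sequence renders acceptably; that
depends on fonts and shaping engines, which is a separate question from the
encoding model.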

    > It would probably be better to use the existing letters U and UU with a
    > visarga for denoting these variants, but I'm quite sure that there exist
    > cases where visargas are used in Sanskrit (or in other languages written
    > with Devanagari) that do not mean that they are creating variants of the
    > vowel, but instead variants of the base consonant of the akshara. The kind
    > of modification is also not a nasalisation (so an anusvara can't be used to
    > note these phonetic vowel variants, and in fact the Kashmiri vowels can also
    > be used with or without nasalisation, meaning that anusvara must remain
    > usable separately with them).

    Philippe, you almost sound like you are making an argument to use
    "variation selectors" here.

    > You could have also proposed to not encode the long vowels given that they
    > "look" exactly like pairs of short vowels: it would have been enough to add
    > another UE vowel sign after encoding the first UE vowel sign or independent
    > letter UE. But here also this would contradict the encoding model for the
    > rest of the Devanagari script (and of other Indic scripts as well).
    >
    > For this reason, I don't see any defect in the proposal, and also think
    > that, under the given justifications, FOUR characters need to be encoded,
    > and not just two or three. It is interesting also to read the introduction
    > to the Devanagari script in TUS (since main version 2.0 and up to current
    > version 5.0 of the book).

    I thought there was a policy not to add more pre-composed characters. Is
    this not the case?

    - Chris


