Re: Default properties for PUA characters???

From: Mark Davis (mark.davis@jtcsv.com)
Date: Tue Dec 03 2002 - 18:38:33 EST


    > characters*, we have found that it is generally best practice to interpret the

    I should make it clear that the "we" above does not refer to the Unicode
    Consortium!

    Mark
    __________________________________
    http://www.macchiato.com
    ► “Eppur si muove” ◄

    ----- Original Message -----
    From: "Mark Davis" <mark.davis@jtcsv.com>
    To: "John Cowan" <jcowan@reutershealth.com>; <kenw@sybase.com>
    Cc: <wittern@kanji.zinbun.kyoto-u.ac.jp>; <unicode@unicode.org>
    Sent: Tuesday, December 03, 2002 10:23
    Subject: Re: Default properties for PUA characters???

    > Ken is correct: the default properties are somewhat different for
    > ideographs than for PUAs. In addition, PUAs are a special case compared
    > to other characters; implementations are free, within very broad limits,
    > to change the default properties associated with a PUA code point to
    > whatever is appropriate to whatever private-use character definition the
    > application gives to that code point.
    >
    > In other words, an application, if it treats a particular PUA as an
    > ideograph, is free to change the default properties to match Ken's list
    > (and likewise for other properties):
    >
    > gc=Lo (general category = Other_Letter)
    > ccc=0 (combining class = 0, i.e. Not_Reordered)
    > bc=L (bidi class = strong Left_To_Right)
    > sc=Hani (script = Han)
    > lb=ID (line break = Ideographic)
    > ea=W (east asian width = Wide)
    >
    > If an application treated a particular PUA character as a Greek Linear B
    > character, on the other hand, it would assign yet different properties.
    >
    > Now in practice, the vast majority of PUA characters in use represent
    > ideographs, mapped from East Asian standards. Due to this fact, *in the
    > absence of other protocols establishing the precise usage of the PUA
    > characters*, we have found that it is generally best practice to
    > interpret the PUA characters as ideographs. However, applications are
    > free to interpret them however they want.
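
The policy described above can be sketched in code. This is only an illustration, not anything taken from the Unicode Standard or from a particular implementation: the names `Props`, `PuaPropertyTable`, `IDEOGRAPH_PROPS` and `LINEAR_B_LIKE` are hypothetical, and the Linear B style property values are an assumption made for the example. The table falls back to the ideograph properties listed earlier for any private-use code point that has no explicit assignment, and lets an application override individual code points.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Props:
    """A handful of character properties, using the short names from the list."""
    gc: str   # general category
    ccc: int  # canonical combining class
    bc: str   # bidi class
    sc: str   # script
    lb: str   # line break
    ea: str   # East Asian width


# The ideograph-style properties from the list above.
IDEOGRAPH_PROPS = Props(gc="Lo", ccc=0, bc="L", sc="Hani", lb="ID", ea="W")

# Assumed values for a PUA code point treated as a Linear B character
# (illustrative only; an application would choose its own).
LINEAR_B_LIKE = Props(gc="Lo", ccc=0, bc="L", sc="Linb", lb="AL", ea="N")

# The three private-use ranges: BMP PUA, plane 15 and plane 16.
PUA_RANGES = ((0xE000, 0xF8FF), (0xF0000, 0xFFFFD), (0x100000, 0x10FFFD))


def is_private_use(cp: int) -> bool:
    """Return True if cp lies in one of the private-use ranges."""
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


class PuaPropertyTable:
    """Per-application property assignments for private-use code points."""

    def __init__(self, fallback: Props = IDEOGRAPH_PROPS) -> None:
        # Fallback used when no other protocol says what a PUA code point means.
        self._fallback = fallback
        self._overrides: dict[int, Props] = {}

    def assign(self, cp: int, props: Props) -> None:
        """Record the application's own definition for one PUA code point."""
        if not is_private_use(cp):
            raise ValueError(f"U+{cp:04X} is not a private-use code point")
        self._overrides[cp] = props

    def lookup(self, cp: int) -> Props:
        """Return the assigned properties, or the fallback if none were assigned."""
        if not is_private_use(cp):
            raise ValueError(f"U+{cp:04X} is not a private-use code point")
        return self._overrides.get(cp, self._fallback)


if __name__ == "__main__":
    table = PuaPropertyTable()
    table.assign(0xE100, LINEAR_B_LIKE)
    print(table.lookup(0xE100))  # the explicit assignment
    print(table.lookup(0xE101))  # the ideograph fallback
```

Keeping the fallback as a constructor argument reflects the point that treating unassigned PUA code points as ideographs is a convention one application may adopt, not a requirement.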
    >
    > Mark
    > __________________________________
    > http://www.macchiato.com
    > ► “Eppur si muove” ◄
    >
    > ----- Original Message -----
    > From: "John Cowan" <jcowan@reutershealth.com>
    > To: <kenw@sybase.com>
    > Cc: <wittern@kanji.zinbun.kyoto-u.ac.jp>; <unicode@unicode.org>
    > Sent: Monday, December 02, 2002 21:08
    > Subject: Re: Default properties for PUA characters???
    >
    >
    > > Kenneth Whistler scripsit:
    > >
    > > > So I'd say that the XML Core WG has got the situation only
    > > > partially correct for Unicode PUA characters.
    > >
    > > As the actual author of that Core WG text, mea culpa. But I was basing
    > > my remarks on things said on this list.
    > >
    > > --
    > > All Gaul is divided into three parts: the part   John Cowan
    > > that cooks with lard and goose fat, the part     www.ccil.org/~cowan
    > > that cooks with olive oil, and the part that     www.reutershealth.com
    > > cooks with butter. -- David Chessler             jcowan@reutershealth.com
    > >
    > >
    >
    >
    >


