RE: IJ joint in spaced lettering

From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Jan 09 2006 - 18:25:01 CST


    Jukka quoted the standard:

    > >> Being a compatibility decomposable
    > >> character, it is not recommended except in the representation
    > >
    > > No, it does not say that.
    >
    > "Compatibility decomposable characters are a subset of compatibility
    > characters included in the Unicode Standard to represent distinctions in
    > other base standards. They support transmission and processing of legacy
    > data. Their use is discouraged other than for legacy data or other special
                                                                ^^^^^^^^^^^^^^^^
    > circumstances."
      ^^^^^^^^^^^^^
      
    There's your escape clause.

    >
    > Definition D21 in section 3,
    > http://www.unicode.org/versions/Unicode4.0.0/ch03.pdf#G748
    >
    > > There are exceptions to that interpretation
    > > of compatibility characters (and compatibility decomposable characters),
    > > the IJ LIGATURE and the LONG S are among them. I think it is perfectly
    > > fine to recommend their use in situations like this
    >
    > I think so too; we seem to agree on the practical point. But I discussed
    > what the standard says (in a somewhat odd place, but the same general idea
    > can be seen elsewhere in the standard, too).
    ...

    > (Maybe some official statement,
    > constituting an explicit exception to the principle of avoiding
    > compatibility decomposable characters, would be in order.)

    Actually, I don't think so. The bullet at the definition
    of compatibility decomposable characters already provides
    sufficient wiggle-room. They are there for:

      1. use with legacy data (which includes ISO 8859-1, by the way)
      
      2. when you need them (special circumstances)
      
    You wouldn't get very far trying to push a claim that the
    Unicode Standard has a principle of "avoiding compatibility
    decomposable characters", given that technically, they include
    even such characters as U+00A0 NO-BREAK SPACE -- which is elsewhere
    explicitly recommended for use (in special circumstances) for
    preventing line breaks and for display of nonspacing marks
    in apparent isolation, and so on.

    I'd say the IJ is a similar case. Ordinarily you don't need it
    for representing Dutch data, but U+0132 is encoded for those
    circumstances where you might need it: interoperating with legacy
    data (particularly ISO 6937), and special circumstances where you
    might have requirements for letter spacing that couldn't be
    met simply in plain text otherwise.
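
    To make the point about U+00A0 and U+0132 concrete, here is a minimal
    sketch using Python's standard unicodedata module. It lists the
    decomposition mappings of the characters discussed above and shows that
    only compatibility normalization (NFKC) replaces the IJ ligature. The
    helper name compat_tag is hypothetical, introduced only for this
    illustration, and the data reported depends on the Unicode version
    bundled with the interpreter.

```python
import unicodedata


def compat_tag(ch):
    """Return the compatibility formatting tag of ch (e.g. 'compat',
    'noBreak'), or None if ch has no compatibility decomposition.

    Hypothetical helper for illustration; it only parses the string
    returned by unicodedata.decomposition().
    """
    decomp = unicodedata.decomposition(ch)
    if decomp.startswith("<"):
        return decomp[1:decomp.index(">")]
    return None


if __name__ == "__main__":
    # NO-BREAK SPACE, IJ LIGATURE, LONG S: all compatibility decomposable.
    for ch in ("\u00A0", "\u0132", "\u017F"):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
              f"{unicodedata.decomposition(ch)!r} (tag: {compat_tag(ch)})")

    # Canonical normalization (NFC) leaves U+0132 alone;
    # compatibility normalization (NFKC) replaces it with "I" + "J".
    ij = "\u0132"
    print(unicodedata.normalize("NFC", ij) == ij)
    print(unicodedata.normalize("NFKC", ij) == "IJ")
```

    A process that applies NFKC therefore loses the distinction that
    U+0132 carries, which is one reason to reserve it for the legacy and
    special circumstances described above.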

    --Ken
      




    This archive was generated by hypermail 2.1.5 : Mon Jan 09 2006 - 18:26:12 CST