Re: Generic base characters

From: Sinnathurai Srivas (sisrivas@blueyonder.co.uk)
Date: Sun Jul 15 2007 - 15:30:38 CDT

    The ideal representation would be one without any additional intervention
    such as a dotted circle or no-break space.

    I noticed that even when a font contains no dotted circle glyph, the actual
    data contained a dotted circle. This means there is always an additional
    unwanted code within the text, sometimes obvious (if the font contains a
    dotted circle) and sometimes not. This leads to misunderstanding of the
    actual codes within the text. My guess is that processes such as sorting
    and searching can mislead users and developers. Is the behaviour the same
    for no-break space? Does the text data include the code for no-break space?
    If so, then using the visible devil, the dotted circle, would be much
    better than using a hidden devil like no-break space.
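
    One way to check whether a dotted circle or no-break space is really
    stored in the text, rather than only drawn by the renderer, is to list the
    code points directly. The following is a minimal Python sketch under my
    own assumptions: the helper names and the sample strings (built around
    U+0BBE TAMIL VOWEL SIGN AA) are illustrative only and are not taken from
    any particular application.

```python
import unicodedata

DOTTED_CIRCLE = "\u25CC"   # U+25CC DOTTED CIRCLE
NO_BREAK_SPACE = "\u00A0"  # U+00A0 NO-BREAK SPACE


def list_code_points(text):
    """Return (code point, character name) pairs for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


def explicit_bases(text):
    """Return any dotted circle or no-break space characters stored in text."""
    return [ch for ch in text if ch in (DOTTED_CIRCLE, NO_BREAK_SPACE)]


# Hypothetical sample data: an isolated Tamil vowel sign, alone and with a base.
bare = "\u0BBE"
with_nbsp = NO_BREAK_SPACE + "\u0BBE"
with_circle = DOTTED_CIRCLE + "\u0BBE"

for label, sample in (("bare", bare), ("nbsp", with_nbsp), ("circle", with_circle)):
    print(label, list_code_points(sample), len(explicit_bases(sample)))
```

    If the listing for a piece of text shows only the vowel sign, then any
    dotted circle seen on screen was supplied at display time and is not part
    of the data that sorting or searching would see.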

    In any case, Grammar specifically states that there is no need for
    additional base characters when longer-than-long vowels are used. Why add
    unwanted codes to the text, creating unnecessary complications? Why not
    leave them alone, keeping it simple and compliant with Grammar?

    Sinnathurai

    ----- Original Message -----
    From: "Peter Constable" <petercon@microsoft.com>
    To: "Unicode List" <unicode@unicode.org>
    Sent: 15 July 2007 20:53
    Subject: RE: Generic base characters

    > It's not entirely clear to me what you want. If you want a vowel mark to
    > appear in isolation without any visible base, then use U+00A0 NO-BREAK
    > SPACE as the base.
    >
    >
    > Peter
    >
    > -----Original Message-----
    > From: Sinnathurai Srivas [mailto:sisrivas@blueyonder.co.uk]
    > Sent: Sunday, July 15, 2007 1:19 AM
    > To: Peter Constable; Unicode List
    > Subject: Re: Generic base characters
    >
    > The principle behind the dotted circle is very problematic.
    > One does not always need a base character when writing.
    > I know that in Tamil, longer-than-long vowels and longer-than-long
    > diphthongs are a reality and are part of Grammar. But Unicode defines it
    > differently. So it is not a problem with Microsoft, but a problem with
    > the Unicode definition. I think we need to find a way either to remove
    > the restrictions imposed by the dotted circle or to find another
    > technical solution to this problem.
    > Words like Auai, varuuum etc. are in use.
    >
    > Is this proposition to remove dotted circle possible?
    >
    >
    > Sinnathurai
    >
    > ----- Original Message -----
    > From: "Peter Constable" <petercon@microsoft.com>
    > To: "Unicode List" <unicode@unicode.org>
    > Sent: 15 July 2007 06:14
    > Subject: RE: Generic base characters
    >
    >
    >> From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On
    >> Behalf Of Christopher Fynn
    >>
    >>> The Microsoft OpenType shaping engine, Uniscribe, seems to
    >>> automatically insert a dotted circle as a base character
    >>> for isolated combining marks - and this behavior is outside
    >>> of the control of the font developer.
    >>>
    >>> IMO it would be much better if font developers were
    >>> responsible for defining their own sets of these base glyphs
    >>> for combining marks - including the base for isolated
    >>> combining marks - and their own lookups for rendering the
    >>> resulting combinations.
    >>
    >> Uniscribe inserts a dotted circle glyph only when the author has not
    >> included a valid base character for the mark. Font developers are always
    >> responsible for their own lookups for rendering a mark glyph on a base.
    >> But there's nothing the font developer can do about the scenario in which
    >> an author fails to include a base.
    >>
    >> Perhaps you have in mind that a font developer should control what glyph
    >> is used in that situation, but I see no need, on the assumption that
    >> authors should be, and normally are, explicitly intentional about what is
    >> in their document, and that Uniscribe's fallback rendering is just that:
    >> a fallback.
    >>
    >>
    >>> As base glyphs one might want to include the dotted circle;
    >>> non-breaking space or fixed width spaces such as em space or
    >>> en space;
    >>
    >> Unicode specifies that combining marks in isolation -- with no visible
    >> base -- should be combined with NO-BREAK SPACE, not any of the other
    >> space characters in the standard.
    >>
    >>
    >>> Right now if I have a lookup in an OpenType font to place
    >>> an isolated mark on the dotted circle, Uniscribe also inserts a
    >>> dotted circle and I end up getting two dotted circles...
    >>> Not all OpenType shaping engines exhibit this behaviour.
    >>
    >> A bug, which can be looked at.
    >>
    >>
    >>
    >> Peter
    >>


