Re: Prepending vowel exception in Lontara/Buginese script ? from Ngwe Tun on 2011-07-25 (Unicode Mail List Archive)

From: Ngwe Tun <ngwestar_at_gmail.com>
Date: Tue, 26 Jul 2011 06:21:14 +0630

Dear Philippe,

Burmese is still not supported in Windows 7. We hope to get Burmese
support in Windows 8. We are getting Burmese/Myanmar support in AAT and
language support in Lion.

For OpenType, you can try a trick for Buginese. Reordering is handled in
Uniscribe itself, so we tried GSUB features instead. The rlig and liga
features support glyph substitution. Here is the trick:
C = Consonant, E = Vowel, M = Medial
1) CE => ECE => EC
2) CME = CEM => ECEM => ECM
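
The two rules above can be simulated on strings of class letters. The
following Python sketch is only an illustration, not font code: the function
names are hypothetical, and reading "CME = CEM" as "the medial and the vowel
may arrive in either order" is an assumption. In a real font, the first pass
might be written as a contextual one-to-many substitution and the second as a
ligature-style substitution inside a GSUB feature such as rlig.

```python
import re

# One cluster: a consonant followed by a vowel, with an optional medial
# on either side of the vowel (assumption: both orders are accepted).
CLUSTER = re.compile(r"C(ME|EM|E)")


def insert_vowel_copy(classes: str) -> str:
    """Pass 1: put a copy of E in front of the consonant (CE -> ECE)."""
    def repl(match: re.Match) -> str:
        medial = "M" if "M" in match.group(1) else ""
        # The original E is kept for now and normalised to sit before M.
        return "ECE" + medial
    return CLUSTER.sub(repl, classes)


def drop_original_vowel(classes: str) -> str:
    """Pass 2: remove the original E left after the consonant (ECE -> EC)."""
    return classes.replace("ECE", "EC")


def prepend_vowel(classes: str) -> str:
    """Run both passes in order, as in rules 1) and 2)."""
    return drop_original_vowel(insert_vowel_copy(classes))


if __name__ == "__main__":
    for sample in ("CE", "CME", "CEM"):
        print(sample, "=>", insert_vowel_copy(sample), "=>", prepend_vowel(sample))
```

With this approach the stored text stays in logical order (consonant first);
only the glyph sequence produced by the font is changed.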

As for Wikipedia:
you should go to the Wikimedia Incubator, http://incubator.wikimedia.org
There you will see the section "How to start a new test wiki?". We have
started wikis for some minority languages of Myanmar there.

Best

Ngwe Tun

On Tue, Jul 26, 2011 at 1:57 AM, Philippe Verdy <verdy_p_at_wanadoo.fr> wrote:

> 2011/7/25 Peter Constable <petercon_at_microsoft.com>:
> > From: verdyp_at_gmail.com [mailto:verdyp_at_gmail.com] On Behalf Of Philippe Verdy
> >
> >> What would be the behavior of a font that would use GSUB entries (or
> >> ligatures) in a feature to implement the reordering that NO renderer
> >> currently implements for Buginese ? What will happen later if the
> >> renderer does implement it ?
> >
> > Your question is not coherent: OpenType features cannot be used to trigger
> > re-ordering.
>
> Hmmm... Your reply is also incoherent:
>
> (1) There are lots of OpenType features registered that actually
> perform contextual reordering in Indic scripts, including when they
> are in fact mandatory for that script (for example for repha forms of ra,
> or to move ra to a later position after another base consonant, to
> make it show on the next vowel, or other exceptions needed in Khmer,
> Lao, ...).
>
> (2) These features were even registered by Microsoft.
>
> (3) Some of them are for pre-base reordering; others contain exceptions
> to the usually "mandatory" pre-base order, to change it into a post-base
> form in some other contexts.
>
> >> Does the OpenType specification allow specifying a temporary override
> >> for the missing renderer reordering capabilities ?
> >
> > No, and I don't see how that would make any sense: if a rendering system
> > supports Buginese script, then it supports it and does the reordering
> > necessary. It either supports it or it doesn't.
>
> What I asked is whether it is possible to have another feature that would
> be triggered and enabled by default (and should occur before the nukta
> feature and other similar features like repha forms) and tagged with
> the Buginese script, unless the renderer knows that it itself supports
> the reordering of prepended vowels for the Buginese script (in
> which case that feature would be ignored).
>
> This is what I would call a smooth transition: existing renderers
> would work with a font presenting that feature, and future renderers
> that perform the necessary reordering would ignore it and would not
> even require that a Buginese font contain this feature.
>
> >> Note: The Microsoft Font Validator (found in Microsoft Typography
> >> website, section for Downloadable Tools) still does not recognize bit
> >> 96 of the ulUnicodeRange field, officially defined for the Buginese
> >> block range (U+1A00..U+1A1F), and reports an error if this bit is set.
> >
> > I'll report that to the team that maintains that tool.
>
> Thanks.
>
> It should also correctly parse the "head" table instead of reporting
> this (non-documented) internal exception in the validation report:
>
> E0041 : An exception occurred preventing completion of table validation
> System.FormatException: Input string was not in a correct format.
>    at System.Number.StringToNumber(String str, NumberStyles options, NumberBuffer& number, NumberFormatInfo info, Boolean parseDecimal)
>    at System.Number.ParseDouble(String value, NumberStyles options, NumberFormatInfo numfmt)
>    at System.Double.Parse(String s, NumberStyles style, NumberFormatInfo info)
>    at OTFontFileVal.val_head.Validate(Validator v, OTFontVal fontOwner)
>    at OTFontFileVal.OTFontVal.Validate()
>
> >> And the Fonts folder in Windows 7 Explorer does not say that the font
> >> actually supports Buginese (a Buginese font says that it supports no
> >> script at all, even if all code points assigned in the Buginese block
> >> are mapped, and bit 96 is set in Unicode Ranges of the header).
> >
> > Two issues:
> >
> > 1) Windows 7 does not provide text-display support for Buginese script.
>
> OK, so Uniscribe (and IE) do not perform the reordering. It's then
> impossible to display correctly encoded Buginese text on Windows with
> Uniscribe. Other renderers will be needed (but Pango does not know
> that reordering rule either, and none of the tested browsers on Windows
> are working).
>
> It seems that the script is supported only on MacOS, where there are
> indeed commercial Buginese fonts designed for Mac (for example, one
> font from Xerox: I've not tested it, I would need a Mac before even
> buying that font).
>
> > 2) The scripts shown in the "Designed for" column in the Fonts control
> > panel in Windows 7 do not make use of the UnicodeRanges fields in the OS/2
> > table. There are a few reasons for this:
> > - that data is not all that reliable since there's no consistent practice
> > in how it is set (there's no metric to decide when a bit should or shouldn't
> > be set);
> > - the UnicodeRanges fields are not scalable into the future (they were
> > exhausted with Unicode 5.1); and
> > - the UnicodeRanges fields are typically set based on some sense of "can
> > display" whereas what we thought was much more useful to users was to
> > indicate "was designed for". For example, MS Gothic _can_ display English
> > text, but we think it's not a particularly useful choice for English users
> > since that's not the audience it was designed for. The intent is to give
> > useful recommendations that help users differentiate relevant options from
> > distracting noise.
> > Rather than using the OS/2 data, the Fonts cpl uses metadata outside the
> > font. Unfortunately, it has it only for a certain set of fonts that were
> > known when we shipped to be on most systems; so, if you add a Buginese font,
> > the metadata will not include that font.
>
> It's strange : many new international fonts have been added after the
> release of Windows 7. And the CPL explorer extension still detects
> that the fonts support some scripts. How does it perform the test? By
> counting the mapped glyphs? If so it could easily detect Buginese by
> counting that there are at least 28 glyphs mapped from code points in
> the Buginese block.
>
> >> This is the case for all ulUnicodeRange bits defined now after
> >> bit number 87, i.e. the Deseret block of the UCS, meaning that
> >> the validator and the Windows 7 text renderer and Fonts
> >> Explorer are still only based on the (now very old) Unicode 4.1
> >> of... 2003 (with the Deseret additions) or even before in 1996
> >> with Unicode 3.1 only. Who's late ?
> >
> > Font Validator may be out of date; as mentioned, I'll pass that on to the
> > relevant team. As for the Fonts control panel, as mentioned it doesn't use
> > ulUnicodeRange fields at all; but you have spotted a bug in our metadata:
> > Deseret should be listed for the Segoe UI Symbol font.
>
> OK, is it possible to have the Saweri and Code2000 fonts recognized?
> (These two free fonts are widely advertised as a possible solution for
> the Buginese edition of Wikipedia, but for now this edition mostly uses
> the Latin script for that language.)
>
> I was asked on Wikipedia to design a test page for the script, but I
> was completely unable to do that.
>
> All I could do was try to adapt the page presenting the [[Lontara
> script]] with:
>
> - a few text samples (but I am not sure that the samples are logically
> encoded; it seems that they are visually encoded in some places, and
> one word is most probably inconsistent with its Latin transcription),
>
> - and the Unicode block chart, where the vowel e is actually
> rendered after the base glyph: the chart on English Wikipedia
> currently uses a dotted circle symbol (but there's no guarantee that
> reordering would occur with that symbol in a compliant renderer),
> whereas the French Wikipedia page presents all Buginese diacritics
> with the Buginese base letter ka (U+1A00: it should really work).
>
> This brought me to the question of testing other South-East Asian
> Brahmic scripts, like Hanunoo, Buhid, Javanese, or Balinese. It seems
> that they have the same rendering problem in a few cases for prepended
> vowels (plus other problems remaining in Khmer and Burmese for some
> contextual forms).
>
> The rendering problem will be recurring with all other pending Brahmic
> scripts (still not encoded) that feature prepended diacritics. Why
> can't we have now a registered OpenType feature for handling those
> mandatory contextual reorderings (at least for the most frequent
> cases), waiting for a full support of the script in text renderers?
>
>
>

-- 
We will release Myanmar Linux Desktop on 11/11/11.

Received on Mon Jul 25 2011 - 18:54:55 CDT

This archive was generated by hypermail 2.2.0 : Mon Jul 25 2011 - 18:54:56 CDT