Re: About cultural/languages communities flags from Philippe Verdy on 2015-02-12 (Unicode Mail List Archive)

From: Philippe Verdy <verdy_p_at_wanadoo.fr>
Date: Fri, 13 Feb 2015 06:22:42 +0100

Another solution is also to not extend the scope of use of RIS characters
(leave them as they are, for ISO 3166-1 based codes only), but to define a
separate set of "Language Indicator Symbols" (LIS) working the same way,
but based on ISO 639-2 or -3 (3-letter codes, also accepting the language
family codes encoded on 3 letters, as well as all -3 macrolanguages
such as "zho" for Chinese or "que" for Quechua).
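
To make the idea concrete, here is a minimal sketch in Python of how a
letter code maps to a sequence of indicator symbols. The RIS base
(U+1F1E6) is the real REGIONAL INDICATOR SYMBOL LETTER A. The LIS base is
purely an assumption for illustration: no LIS characters are encoded, so
a placeholder in Supplementary Private Use Area-A is used, and the
function names are hypothetical.

```python
RIS_BASE = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A (encoded in Unicode)
LIS_BASE = 0xF0000  # ASSUMPTION: private-use placeholder, LIS is not encoded


def to_indicators(code: str, base: int) -> str:
    """Map each ASCII letter A-Z of `code` to the indicator at base + offset."""
    if not (code.isascii() and code.isalpha()):
        raise ValueError(f"not a plain ASCII letter code: {code!r}")
    return "".join(chr(base + ord(c) - ord("A")) for c in code.upper())


def to_ris(code: str) -> str:
    """Two-letter ISO 3166-1 style code to a pair of RIS characters."""
    if len(code) != 2:
        raise ValueError("RIS sequences are pairs")
    return to_indicators(code, RIS_BASE)


def to_lis(code: str) -> str:
    """Three-letter ISO 639 style code to a triplet of (hypothetical) LIS."""
    if len(code) != 3:
        raise ValueError("LIS sequences would be triplets")
    return to_indicators(code, LIS_BASE)
```

With this, to_ris("FR") gives the usual pair of regional indicators, and
to_lis("zho") gives three placeholder characters that a font could then
ligate.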

Exactly the same principle as RIS, and as easy to produce with a generic
font with very few actual glyphs (only the ligatures OpenType table may look
long, but it can be generated automatically by a basic script, to integrate
it in the font build project). No need for complex ligature support: all can
work with a single lookup table of pairs (of glyph ids), simply
because there's no need for reordering glyphs. And the default glyph ids
for individual LIS characters would be mapped to the default building blocks
showing the "speech bubble frame", so a basic renderer not processing the
font's substitution (GSUB) tables for ligatures would still produce the
basic glyphs and a consistent result, even if no decorated bubble would show
the colorful and decorated content matching a user-expected "flag" that would
be produced in a font whose design is based on country/region flags.
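
As a rough illustration of the "single lookup table of pairs", here is a
sketch in Python that generates such a table from a list of 3-letter codes
and applies it left to right. It only emulates the idea; it is not
OpenType feature syntax, and the glyph names ("lis.Z", "lis.ZH",
"bubble.zho") are hypothetical.

```python
def build_pair_table(codes):
    """Build {(left_glyph, right_glyph): replacement} for each 3-letter code.

    Each code needs two pair rules: first+second letter give an
    intermediate glyph, intermediate+third letter give the decorated bubble.
    """
    table = {}
    for code in codes:
        a, b, c = code.upper()
        prefix = f"lis.{a}{b}"  # hypothetical intermediate glyph name
        table[(f"lis.{a}", f"lis.{b}")] = prefix
        table[(prefix, f"lis.{c}")] = f"bubble.{code.lower()}"
    return table


def shape(glyphs, table):
    """Apply pair substitutions left to right, with no reordering."""
    out = []
    for glyph in glyphs:
        if out and (out[-1], glyph) in table:
            out[-1] = table[(out[-1], glyph)]  # merge with previous glyph
        else:
            out.append(glyph)  # no rule: keep the default building block
    return out


table = build_pair_table(["zho", "que"])
decorated = shape(["lis.Z", "lis.H", "lis.O"], table)
fallback = shape(["lis.Y", "lis.E", "lis.S"], table)
```

One consequence of this design is that an intermediate glyph such as
"lis.ZH" has to draw exactly like the two default building blocks it
replaces, since it stays visible whenever the third letter does not
complete a known code.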

No requirement by Unicode about how the decorated glyphs will look or about
their use or color. Just like fonts with various styles for emojis, the
font to use could be a user preference for the reader. No requirement as
well to use an OpenType renderer: applications can use icons as well, in any
convenient graphic format (GIF, PNG, SVG...), as long as they match in terms
of dimensions within the standard line height (not more than about 1.25 em
in height, including top and bottom bearings). No requirement as well about
their width. Basic font styles (bold, italic) could be rendered as well by
the default glyphs, either on their inner letters or on the type of bubble
frame, including for colorful bubbles whose generic "rounded rectangle"
frame can also be "italicized" and emboldened even when it has a colorful
complex content.

Nowhere will that mean that Unicode defines what is a valid language or
not. All well-formed triplets are valid, and users are free to use 3-code
sequences of LIS to do what they want as long as this respects the known
ISO 639 standard (or its history, including retired codes). So it will be
well-formed to use LIS codes to "say" yes or YES, with LIS[Y]+LIS[E]+LIS[S]
(but if there's an ISO 639 language matching the code "yes", it is also
valid to replace it with a bubble showing inside a culturally associated
"flag-like" decoration). French users could also use LIS[O]+LIS[U]+LIS[I] to
"say" "oui" or "OUI", even if there's another ISO 639 language matching the
code "oui" (there's inherently no violation of the per-character identity
of LIS characters, as Unicode does not encode ligatures or require them to
be used for rendering).
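
The split between "well-formed" and "decorated" can be sketched as
follows (Python, hypothetical names). Well-formedness depends only on the
letters; whether a triplet gets a decorated bubble depends only on what
the chosen font happens to provide, passed here as a plain set. Nothing in
the sketch asserts which codes actually exist in ISO 639.

```python
def render_plan(letters: str, font_decorations: set) -> list:
    """Group letters into triplets and pick a rendering for each group.

    `font_decorations` is the set of lowercase 3-letter codes for which the
    selected font has a decorated bubble (an assumption of this sketch).
    """
    if not (letters.isascii() and letters.isalpha()):
        raise ValueError("only ASCII letters map to indicator symbols")
    plan = []
    for i in range(0, len(letters), 3):
        chunk = letters[i:i + 3]
        if len(chunk) == 3 and chunk.lower() in font_decorations:
            plan.append(("decorated", chunk.lower()))
        else:
            # Default building blocks: letters inside a plain bubble frame.
            plan.append(("letters", chunk.upper()))
    return plan
```

The same sequence LIS[Y]+LIS[E]+LIS[S] then shows as plain letters in one
font and as a decorated bubble in another, without the text itself
changing.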

2015-02-13 5:15 GMT+01:00 Philippe Verdy <verdy_p_at_wanadoo.fr>:

> RIS could represent languages as well, using the BCP 47 principle, except
> that they start with an ISO
> 3166 code (as there's no territory, you'd normally use a 3166 code for an
> undetermined region, but there's no 3166 code that starts with a hyphen).
> So to use a BCP 47 language tag you could use the hyphen re-encoded to RIS
> as the first character.
> The problem is that language codes in BCP 47 have variable sizes. Even if
> you limit just to the ISO 639 compatible repertoire (3-letter codes) you'd
> need to use 4 RIS codes.
> And the language flags would be represented as RIS(HYPHEN)+RIS(ISO 639-3
> code).
>
> 4 codes would work with font rendering engines that can build 3 successive
> ligatures from left to right
>
> If there's no match for a known flag (or if there's an exact multiple of 4
> RIS codes), the default glyphs would just show a blank flag frame showing
> the RIS code converted back to ASCII letters (rendered with a small
> capitals style, where the first glyph shows the flag's hoist and the first
> RIS code, i.e. the hyphen; the 2nd and 3rd glyphs show the top/bottom
> parts of the blank frame and the ASCII character; the 4th glyph is similar
> but adds the flying end of the flag, possibly decorated with a
> non-rectangular frame). If fewer than 4 RIS codes remain, the flag frame
> would add the flying end of the flag, with no letter (or just the SPACE).
> The whole would be in a large dotted frame to exhibit the special format.
>
> These default glyphs are easy to produce in the font. Then to support more
> languages (7000 languages : 7000 flags ? certainly not so many exist...),
> you just have to map new ligatures to replace the default ligatures by more
> accurate "flags".
>
> But my opinion is that "flags" (even if showing them generically) are not
> the right concept for languages (I would highly prefer a "speech bubble
> frame" like in comics, even if some applications could render in them a
> colorful regional flag, or the letter code within the "sound waves" of an
> audio speaker device).
>
>
> 2015-02-09 22:11 GMT+01:00 Joan Montané <joan_at_montane.cat>:
>
>>
>> Hi all,
>>
>> I am the one who made the request to twemoji GitHub.
>>
>>
>> 2015-02-09 20:16 GMT+01:00 Markus Scherer <markus.icu_at_gmail.com>:
>>
>>> On Mon, Feb 9, 2015 at 9:54 AM, Andrea Giammarchi <
>>> andrea.giammarchi_at_gmail.com> wrote:
>>>
>>>> > if a cultural/language TLD is typed with Unicode RIS, then show the
>>>> flag for these culture/language:
>>>>
>>>
>>> This does not work. The "Unicode RIS" are defined to be used in pairs,
>>> with semantics according to corresponding ISO 3166 alpha2 codes. In your
>>> examples, each successive pair will encode a flag.
>>>
>>>
>> AFAIK, this is done in font side. Emoji flags are just ligatures, so a
>> font can provide a ligature for 4 RIS characters. This is not an issue here.
>>
>> I agree some strange behaviour can appear if a 3 RIS string, take CAT, is
>> shown in a system with only 2 RIS support (a Canadian will appear followed
>> by a T).
>>
>>
>>> If you want to represent every flag of every locality, you first have to
>>> figure out how to catalog and label them. You are mentioning provinces, one
>>> level down from nation states; I guess there are thousands of them. In much
>>> of Europe, every little village
>>> <http://de.wikipedia.org/wiki/Butterstadt> has its own flag and coat of
>>> arms. Where do you want the text encoding and fonts to stop?
>>>
>>>
>> I don't request flag support for every flag in the world. I requested
>> flags for culture/language communities *with* an approved TLD (Top Level
>> Domain).
>>
>> I know flags are an issue, and I know flags represent territories, not
>> languages, but I think some support should be provided for these active
>> communities. As I pointed out, some country flag collections expand with a
>> few non-independent countries. See [1], [2] and [3] (search for Scottish or
>> Welsh flag). You can check this [4] petition requesting Catalan flag on
>> WhatsApp.
>>
>> So, there is a demand and they are used in real world. What's the way for
>> encoding them in Unicode standard?
>>
>> Thanks,
>>
>> Joan Montané
>>
>> [1] http://www.famfamfam.com/lab/icons/flags/
>> [2] https://www.gosquared.com/resources/flag-icons/
>> [3] http://www.sherv.net/flag-emoticons.html
>> [4]
>> https://www.change.org/p/whatsapp-inc-incloure-la-senyera-de-catalunya-a-whatsapp
>>
>> _______________________________________________
>> Unicode mailing list
>> Unicode_at_unicode.org
>> http://unicode.org/mailman/listinfo/unicode
>>
>>
>

Received on Thu Feb 12 2015 - 23:24:00 CST
