Re: transforms and language identifiers (was Re: Dozenal chars in music)

From: Mark Davis (mark.edward.davis@gmail.com)
Date: Wed May 27 2009 - 17:26:06 CDT


    > Fine.
    >
    > Whatever.
    >
    > No need to try to apply standards.

    Again, no need to be snide.

    Both Peter and I wouldn't be doing what we do if we didn't believe in the
    need for standards. Had you actually followed some of the links I posted in
    this thread, you would have seen the specification for the transform IDs.
    I'll explain briefly, since you didn't follow the link.

    The IDs have a source and a target and optionally a variant, in the form
    source-target/variant. The source and target can each be Unicode language
    IDs, Unicode script IDs (long or short), or other IDs. The Unicode
    language/script IDs are basically BCP 47 but with some small
    extensions/restrictions and the use of "_" instead of "-" in the canonical
    form (the syntactic differences are for backwards compatibility). So an
    example is:

    ru_RU-en_US/BGN
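
    As a rough illustration of that shape, here is a minimal sketch that
    splits an ID into its three parts. The names TransformId,
    parse_transform_id and to_unicode_language_id are hypothetical, not part
    of any library, and the sketch assumes the first "-" always separates
    source from target (with "_" used inside each side); it ignores the
    other small extensions/restrictions mentioned above.

```python
from typing import NamedTuple, Optional


class TransformId(NamedTuple):
    """Hypothetical container for the parts of a transform ID."""
    source: str
    target: str
    variant: Optional[str]


def to_unicode_language_id(bcp47_tag: str) -> str:
    # Assumption: only the separator differs ("-" becomes "_").
    return bcp47_tag.replace("-", "_")


def parse_transform_id(text: str) -> TransformId:
    # The optional variant is everything after the first "/".
    pair, slash, variant = text.partition("/")
    # The first "-" separates source from target.
    source, hyphen, target = pair.partition("-")
    if not hyphen or not source or not target:
        raise ValueError(f"expected source-target[/variant], got {text!r}")
    if slash and not variant:
        raise ValueError(f"empty variant in {text!r}")
    return TransformId(source, target, variant if slash else None)


print(parse_transform_id("ru_RU-en_US/BGN"))
# TransformId(source='ru_RU', target='en_US', variant='BGN')
```

    Under the same assumption, "en_US-fonipa" would split into source
    "en_US" and target "fonipa", which is why the underscore matters here.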

    While we could use fonipa, it is a variant subtag, so a valid source or
    target would then have to be a full language tag, like "en-fonipa", and we
    would end up with a transform ID like "en-en_fonipa". However, especially
    since we intend to use this as a pivot between scripts, the fallback
    behavior of "en_fonipa" isn't ideal, and it was simpler to just use a
    custom value. If instead of "fonipa" we actually had a real script ID for
    IPA (e.g. 'Fipa'), then that would work. But we don't have the equivalent
    of 'Fipa', so a custom value works better (an alternative would be to use
    a private-use script code).

    Mark

    On Wed, May 27, 2009 at 07:41, Michael Everson <everson@evertype.com> wrote:

    > On 27 May 2009, at 15:33, Peter Constable wrote:
    >
    >>> It's en_US-fonipa, not en_US-ipa
    >>>
    >>
    >> ??!!
    >>
    >> The use of an underscore would make this clearly NOT a BCP 47 tag. If it's
    >> BCP 47, it would be en-US-fonipa; if it's not a BCP 47 tag, it can be
    >> whatever the usage context specifies, including perhaps en_US-ipa.
    >>
    >
    > Oh.
    >
    > Fine.
    >
    > Whatever.
    >
    > No need to try to apply standards.
    >
    > I'm wrong.
    >
    >
    > Michael Everson * http://www.evertype.com/
    >
    >
    >


