From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Mon Jan 28 2008 - 05:16:27 CST
William J Poser wrote:
> Sent: Monday, January 28, 2008 02:47
> To: cldr-users@unicode.org; jr@qsm.co.il; unicode@unicode.org
> Subject: RE: Unicode Transliteration Guidelines released
>
> I agree that I find it very odd for Unicode to be promulgating
> transliterations, since an appropriate transliteration is not
> only specific to a pair of languages but depends on the purpose
> for which it is intended.
>
> There are, however, uses for ASCII transliterations even with the
> advent of Unicode. I have had to create and implement several such
> for the Linguistic Data Consortium. One reason for using them
> is that sometimes people want to use existing software that cannot
> handle Unicode, so you need to ascify the text, run it through,
> and then convert it back. For this purpose, the transliteration can
> be pretty arbitrary so long as it is reversible.
I wouldn't call this a transliteration. In fact, it is much more efficient
to use an alternate representation of the code points, without having to
rely on complex conversion tables.
See, for example, the "\uNNNN" syntactic notation (used along with escaping
mechanisms), which can serve this purpose of "ASCII-fication" and
compatibility with more limited protocols (including cases where letter
case is not preserved).
It's much easier to use this sort of transform, for which you can really
ensure full reversibility, even in the trickiest cases and for arbitrary
Unicode source strings. But I won't try to convince others that
this is a "transliteration"!
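A minimal Python sketch of such a reversible "\uNNNN" transform follows. The
function names asciify and deasciify are hypothetical, as are the choices to
escape the backslash itself as \u005C and to write code points above U+FFFF
as \UNNNNNNNN. It also assumes the channel preserves letter case, so it does
not cover the case-insensitive scenario mentioned above.

```python
import re


def asciify(text: str) -> str:
    """Rewrite text using printable ASCII only, via \\uNNNN / \\UNNNNNNNN escapes."""
    out = []
    for ch in text:
        cp = ord(ch)
        # Escape the backslash too, so that every backslash in the output
        # unambiguously starts an escape sequence.
        if ch == "\\" or cp < 0x20 or cp > 0x7E:
            out.append(f"\\u{cp:04X}" if cp <= 0xFFFF else f"\\U{cp:08X}")
        else:
            out.append(ch)
    return "".join(out)


# Either \u + 4 hex digits or \U + 8 hex digits.
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def deasciify(text: str) -> str:
    """Invert asciify() by turning each escape back into its code point."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), text)


sample = "Привет, \\u0041 is not an escape here"
encoded = asciify(sample)
assert encoded.isascii()
assert deasciify(encoded) == sample
```

No conversion table is involved: the only data the transform needs is the
code point value of each character, which is why reversibility is easy to
guarantee.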
My opinion is that conversion processes for Unicode texts that are FULLY
reversible should be named not "transliterations" but "format transforms"
(this would include, for example, all transcoders compatible with Unicode,
lossless data compressors, and transport encoding syntaxes like hexadecimal
or Base64).
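To illustrate what these "format transforms" have in common, here is a small
Python sketch using only the standard library. The helper name round_trips
is hypothetical, and the sketch assumes well-formed text that can be encoded
as UTF-8. Each transform changes the representation, and each one gives the
original string back exactly.

```python
import base64
import zlib


def round_trips(text: str) -> bool:
    """Check that hex, Base64 and zlib each restore the text exactly."""
    utf8 = text.encode("utf-8")          # transcoder: code points -> UTF-8 bytes

    hex_form = utf8.hex()                # transport syntax: hexadecimal
    b64_form = base64.b64encode(utf8).decode("ascii")  # transport syntax: Base64
    packed = zlib.compress(utf8)         # lossless data compressor

    return (
        bytes.fromhex(hex_form).decode("utf-8") == text
        and base64.b64decode(b64_form).decode("utf-8") == text
        and zlib.decompress(packed).decode("utf-8") == text
    )


print(round_trips("Ελληνικά, עברית, 日本語"))
```

None of the intermediate forms is meant to be read by a person, which is the
point of the distinction drawn here.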
In contrast, the intent of a transliterator is not to preserve the
original text but to make the text readable for humans. Full
reversibility is, most of the time, only a technical need, not a
linguistic one: the linguistic need is not full reversibility for arbitrary
texts, but for texts that make sense in some set of human languages
written in a given source script with their usually accepted or
preferred orthographies.