Re: Character name translations from Asmus Freytag on 2012-12-20 (Unicode Mail List Archive)

From: Asmus Freytag <asmusf_at_ix.netcom.com>
Date: Thu, 20 Dec 2012 16:45:30 -0800

On 12/20/2012 2:36 PM, Jukka K. Korpela wrote:
> 2012-12-20 14:13, David Starner wrote:
>
>>> It may be useful to try to agree on official or semi-official names for
>>> characters in a language. Such a list hardly needs to cover all of the
>>> over 100,000 Unicode characters.
>>
>> Why not? Why should an English speaker sticking an arbitrary character
>> into a character map program get a name for it but a non-English
>> speaker not?
>
> For most characters, a “translated” name would be arbitrary. I would
> compare this to names of biological species. Most species lack names
> in most languages, and when names exist, they are often vaguely and
> inconsistently used.

But when real people, not biologists, want to look up information they
have precisely two choices: they can look at a visual index (for species
that can be arranged visually) or they can look up the scientific name
for the species based on the only thing they know: the local popular name.

> That’s why people use scientific (Linnaean) names. We use common names
> for common animals, but it just would not make sense to assign a name
> to the millions of insect species in each human language. The
> scientific name is a crucial key to information. With Unicode
> characters, both the number and the name act as such keys, though the
> name is usually descriptive of meaning, too.

Unlike species, all characters for living scripts have popular local
names in at least one language other than English.

It may not be desirable to blindly translate ALL such names into ALL
languages, but major languages (not only English) may be used by people
that are familiar with or study many other languages and scripts. For
those languages, their community of scholars represents another set of
users who benefit from translated names.

Finally, for arcane scripts, there's usually an easily translatable part
of the character name (think of LATIN SMALL LETTER) and an arbitrary
part of the name (e.g. A) which comes from a transliteration scheme, a
catalog number or the like.
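
As a rough illustration of that split, here is a minimal Python sketch. It
assumes the arbitrary part is simply the last word of the formal name, which
holds for LATIN SMALL LETTER A but not for names such as LATIN SMALL LETTER A
WITH GRAVE. The function names and the small French term table are
hypothetical, made up for this example, and are not an official translation
resource.

```python
import unicodedata

# Hypothetical term table for the descriptive part of a name.
# Illustrative only; not taken from any official translated name list.
FRENCH_TERMS = {
    "LATIN SMALL LETTER": "LETTRE MINUSCULE LATINE",
    "LATIN CAPITAL LETTER": "LETTRE MAJUSCULE LATINE",
}


def split_name(ch):
    """Split a formal character name into (descriptive part, last word).

    Assumption: the last word is the arbitrary, transliteration-derived part.
    """
    name = unicodedata.name(ch)
    prefix, _, tail = name.rpartition(" ")
    return prefix, tail


def translate_name(ch, terms):
    """Translate the descriptive part and keep the arbitrary part unchanged.

    Returns None when the descriptive part is not in the term table.
    """
    prefix, tail = split_name(ch)
    translated = terms.get(prefix)
    if translated is None:
        return None
    return f"{translated} {tail}"


print(split_name("a"))
print(translate_name("a", FRENCH_TERMS))
```

Only the table needs translating here; the trailing part is carried over
as-is, which is where a language-specific transliteration scheme could be
substituted.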

If a language doesn't have a unique transliteration scheme for a
particular script, the choices are to either use the same as present in
the Unicode Standard, or to use one from another, culturally more
relevant language (e.g. a French-based instead of an English-based
transliteration).

>
>>> So Unicode names should not be translated at all, any more than you
>>> translate General Category values for example.
>>
>> Why wouldn't you?
>
> Because those values are identifiers.

No, names have multiple uses; especially if you take the formal name as
one in a series of "aliases" for each character - that's why it's often
more useful to think of translations of the full code charts and
character index, instead of "just" the formal names. (The latter, by
themselves, are not so useful.)

>
>> There's an argument that they're generally useful
>> for programmers only and programming often requires English knowledge,
>> but if I were explaining the character categories in Esperanto, I
>> would certainly say that Sm is matematikaj simboloj or Simbolo
>> Matematika, not act like "Symbol, Math" should have any importance to
>> my audience.
>
> We can and often should *explain* meanings of identifiers in different
> languages, but that’s different from naming things. The value “Sm” has
> a technical meaning, and it is not identical with the common-language
> expression “mathematical symbol” or its variants, though rather close.
>

The linguistic content of the short labels is indeed limited; however, I
can see good reasons to provide alternate abbreviations for characters,
e.g. for ZWSP or WJ, because these terms are used in places where they
do not act as identifiers.
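
To make the identifier/label distinction concrete, here is a small Python
sketch in which the General Category value stays the lookup key and only the
displayed gloss varies by language. The table and the function name are
hypothetical; the Esperanto wording is the one suggested earlier in this
thread, not an agreed translation.

```python
import unicodedata

# Illustrative glosses only. The key ("Sm") is the identifier and is never
# translated; the values are display text for human readers.
CATEGORY_GLOSSES = {
    "Sm": {"en": "Symbol, Math", "eo": "matematikaj simboloj"},
}


def describe_category(ch, lang):
    """Return (category identifier, localized gloss or None) for a character."""
    code = unicodedata.category(ch)
    gloss = CATEGORY_GLOSSES.get(code, {}).get(lang)
    return code, gloss


print(describe_category("+", "eo"))
print(describe_category("+", "en"))
```

The same shape would work for character abbreviations: the code point stays
the key, and a per-language display abbreviation is attached to it.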

A./
Received on Thu Dec 20 2012 - 18:48:52 CST
