Martin J. Duerst wrote:
> @fr-fr@soixante-dix%@fr-ch@septante%@en@seventy%@de@siebzig
This would work, but is problematic mainly because the "@" is
a toggle. A simplistic non-toggle version could be:
{fr-fr}soixante-dix%{fr-ch}septante%{en}seventy%{de}siebzig
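As a rough illustration, a reader for the brace form above could look like the following Python sketch. The function name parse_alternatives and the exact grammar (tag in braces, "%" as separator, no escaping mechanism) are assumptions made for illustration only; nothing like this was specified in the thread.

```python
import re

# Assumed grammar: each alternative is "{tag}text"; alternatives are separated by "%".
# No escaping of "{", "}" or "%" is handled in this sketch.
_ALTERNATIVE = re.compile(r"\{([^{}]*)\}([^{}%]*)")


def parse_alternatives(field):
    """Return a dict mapping each tag to the text that follows it."""
    result = {}
    for part in field.split("%"):
        match = _ALTERNATIVE.fullmatch(part)
        if match is None:
            raise ValueError(f"malformed alternative: {part!r}")
        tag, text = match.groups()
        result[tag] = text
    return result


names = parse_alternatives(
    "{fr-fr}soixante-dix%{fr-ch}septante%{en}seventy%{de}siebzig"
)
print(names["fr-ch"])  # septante
```

Since every tag has an explicit opening and closing character, the reader never has to remember whether it is currently "inside" a tag, which is the problem with a toggle.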
But something along the lines of Gavin Nicol's (et al.) suggestion is
much more flexible, and avoids the toggles (there were several
other similar suggestions):
<name>
  <alternates>
    <name language="english">.....</name>
    <name language="japanese">.....</name>
  </alternates>
</name>
In this particular context (of names), there should
probably not be a "language" tag, but a "script" tag. E.g., I do
not have an Arabic name, but people who can write in the Arabic
script have sometimes written my name (or a close approximation
to it) in that script.
In addition, people can have several names for other reasons
than language/script: e.g. a "maiden" name and one (or several)
"married" names (presumably there would be only one current name,
but that is another matter).
And people often have optional, rather than alternative, names.
Gavin's proposal can, at least in principle, easily handle such
things. You may of course wish to put that off to a "future
extension", but you are not locked in. One would be essentially
locked in with MLSF or similar, as MLSF and similar schemes are very
difficult to extend to new, and initially maybe unpredicted, functionality.
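To illustrate the "not locked in" point, here is a small Python sketch using the standard xml.etree.ElementTree module. The "script" and "kind" attributes, the element contents, and the helper names_by are hypothetical, chosen only to show that a reader written for the "language" attribute keeps working when other attributes are added later.

```python
import xml.etree.ElementTree as ET

# Hypothetical extension of the structure above: "script" and "kind" are
# illustrative attribute names, and the element texts are placeholders.
DOC = """
<name>
  <alternates>
    <name language="english">english-form</name>
    <name script="arabic">arabic-script-form</name>
    <name kind="maiden">earlier-form</name>
  </alternates>
</name>
""".strip()


def names_by(root, attribute):
    """Map each value of the given attribute to the text of its alternate."""
    return {
        alt.get(attribute): alt.text
        for alt in root.findall("./alternates/name")
        if alt.get(attribute) is not None
    }


root = ET.fromstring(DOC)
print(names_by(root, "language"))  # {'english': 'english-form'}
print(names_by(root, "script"))    # {'arabic': 'arabic-script-form'}
```

A reader that only asks for "language" simply does not see the other alternates, so the new attributes cost it nothing.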
On meta-data marking:
Some here have suggested that meta-text (or other meta-markers)
should be distinguished on a per character basis.
Sure, it might be easier to distinguish proper text from
meta-text if the meta-text was marked as such on a per character
basis, but:
1. We have done without such *per character* "meta"-status
indications for decades now.
2. There seems to be great resistance to introducing any such
"meta"-status indication of any kind in Unicode.
3. If (and that is a big "if") one were to introduce such
per character "meta"-status indication, then:
a. It must *NOT* be restricted to ASCII characters.
There is no reason to disallow, e.g., 'ö' or Han characters
to be used in meta-text, assuming that the meta-data
is at all based on letters/digits/etc. (A possibility
would be to have a "combining meta mark".)
b. It must be made clear which characters are to be marked
as "meta". E.g., is 'Kent' meta-text or not in
'<name value="Kent">' (assuming that the full
'<name value="Kent">' is meta-data)?
4. I think it would be a much better idea to have just the
two '<' and '>' (or '{' and '}') characters 'replaced' by
new "open meta parenthesis" and "close meta parenthesis",
for meta-text purposes. But see 2 above.
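A sketch of what point 4 would mean for a reader, in Python. No such characters exist; the two private-use code points U+E000 and U+E001 are used here purely as stand-ins, and split_meta is a hypothetical helper.

```python
OPEN_META = "\ue000"   # stand-in for a hypothetical "open meta parenthesis"
CLOSE_META = "\ue001"  # stand-in for a hypothetical "close meta parenthesis"


def split_meta(data):
    """Return (kind, run) pairs, where kind is "text" or "meta"."""
    runs = []
    position = 0
    while position < len(data):
        start = data.find(OPEN_META, position)
        if start == -1:
            # No more meta runs: the rest is ordinary text.
            runs.append(("text", data[position:]))
            break
        if start > position:
            runs.append(("text", data[position:start]))
        end = data.find(CLOSE_META, start + 1)
        if end == -1:
            raise ValueError("unclosed meta parenthesis")
        runs.append(("meta", data[start + 1:end]))
        position = end + 1
    return runs


sample = f'{OPEN_META}name value="Kent"{CLOSE_META}a < b{OPEN_META}/name{CLOSE_META}'
print(split_meta(sample))
```

With dedicated delimiters, an ordinary '<' in the text needs no escaping, and the characters between the delimiters can be anything, ASCII or not.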
/kent karlsson