From: Leo Broukhis (leob@mailcom.com)
Date: Wed Jan 16 2008 - 18:41:57 CST
On Jan 16, 2008 1:42 PM, Kenneth Whistler <kenw@sybase.com> wrote:
> GOST 10859 and ALCOR were effectively dead encodings long before
> Unicode even got started collecting repertoire,
It might seem funny, but I've heard of operational BESM-6 machines
(which used the GOST encoding) somewhere in Russia as recently as last
year, at some military installation where it's easier to keep paying
for maintenance, electricity and cooling than to have the headache of
upgrading the system.
> > It cannot be replaced by SUBSCRIPT ONE + SUBSCRIPT ZERO, because it
> > has to occupy one character position for the sake of text aligned for
> > a fixed-width font.
>
> That's debatable. For transcoding obscure character encodings,
> there really is no requirement that you have one-to-one
> mappings for every character. You can certainly represent
> the subscript 10 in GOST 10859 with <2081, 2080> in Unicode
> and convert it back losslessly with no problem.
Lossless conversion is fine, but I'm interested in a portable, exact
representation of a GOST printout, with every character in its column.
I would not object to a rich text approach if there were a way to do
it, e.g. if something like
<halfwidth>₁₀</halfwidth> existed and could do the job.
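
To make the trade-off concrete, here is a minimal Python sketch of the round trip through <2081, 2080>. It assumes the source encoding has no separate subscript digits, so the pair is unambiguous; the token name SUB10_TOKEN is a made-up stand-in for the GOST character, not a real code value.

```python
# Sketch only: SUB10_TOKEN is a hypothetical placeholder for the single
# GOST 10859 "subscript 10" character; it is not an actual code value.
SUB10_TOKEN = "<sub10>"
SUB10_SEQ = "\u2081\u2080"  # SUBSCRIPT ONE + SUBSCRIPT ZERO


def to_unicode(tokens):
    """Map a list of one-column source characters to a Unicode string."""
    return "".join(SUB10_SEQ if t == SUB10_TOKEN else t for t in tokens)


def from_unicode(text):
    """Recover the source characters, folding the pair back into one token."""
    tokens = []
    i = 0
    while i < len(text):
        if text.startswith(SUB10_SEQ, i):
            tokens.append(SUB10_TOKEN)
            i += len(SUB10_SEQ)
        else:
            tokens.append(text[i])
            i += 1
    return tokens


printout = ["1", ".", "5", SUB10_TOKEN, "+", "3"]  # six print columns
encoded = to_unicode(printout)
round_trip_ok = from_unicode(encoded) == printout
```

The round trip succeeds, but `encoded` is seven code points long for six print columns, which is exactly the alignment problem in a fixed-width rendering.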
> > What should an emulator of a computer that used GOST 10859 or ALCOR
> > produce, then?
>
> For an emulator you would have various options, including
> mapping of the sequence <2081, 2080> to your fixed-width
> ACPU-128 drum printer font glyph for a subscript 10. Or,
> if your emulator is making one-to-one character to glyph
> assumptions, then you use a PUA value to stand in for the
> sequence, and map *that* to your fixed-width glyph.
Correct me if I'm wrong, but AFAIK the ways to attach private glyphs
to network documents are neither standardized nor widely supported yet.
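
For reference, the PUA option suggested above might look like the following Python sketch inside an emulator. The choice of U+E000 is purely an assumption for illustration; any private use code point agreed between the emulator and its font would do, which is also why it does not travel with the document.

```python
import unicodedata

SUB10_SEQ = "\u2081\u2080"  # interchange form: SUBSCRIPT ONE + SUBSCRIPT ZERO
SUB10_PUA = "\ue000"        # hypothetical private use stand-in, one per column


def to_display(text):
    """Interchange text -> one code point per print column, for the font."""
    return text.replace(SUB10_SEQ, SUB10_PUA)


def to_interchange(text):
    """Display text -> standard code points only, for sending elsewhere."""
    return text.replace(SUB10_PUA, SUB10_SEQ)


line = "1.5" + SUB10_SEQ + "+3"
shown = to_display(line)
# The stand-in is in general category "Co" (private use), so its meaning
# exists only by private agreement.
category = unicodedata.category(SUB10_PUA)
```

Inside the emulator `shown` has one code point per column, but a recipient without the same font and convention sees only an unassigned private use character.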
> sponsorship is not required to simply add one more symbol
> for compatibility with an old encoding to the standard.
That is good to know. I looked at the submission page before joining
the list; I'm following the suggestion to discuss proposals first.
> However, justification in terms of emulation of long unused
> character sets and computing machinery isn't a very strong
> case, since emulation software is *software*, after all, and
> always has plenty of options to deal with such problems
> creatively, as long as all the component pieces needed for
> character representation are present in Unicode.
Typesetting software has such options too, but that did not seem to
stop people from requesting and acquiring separate codepoints for
monospaced letters and digits
(U+1D670 - U+1D6A3, U+1D7F6 - U+1D7FF).
If we're to follow the spirit of UTN28, we should add a mathematical
decimal exponent base character, at least to allow for the unambiguous
scientific representation of reals in math texts. What does 1.5e+3
without a U+2062 (invisible times) before 'e' really mean? 1500
(1.5 times ten cubed) or 7.077 (1.5 times Euler's e, plus 3)?
Subscripts after numbers do not help either: they already have a
different meaning, indicating the base of the numeral system.
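
The two readings of 1.5e+3 are easy to check; a small Python sketch, using only the standard library:

```python
import math

# Reading 1: 'e' is the decimal exponent marker, as in programming languages.
as_exponent = float("1.5e+3")

# Reading 2: 'e' is Euler's number with an implied multiplication before it.
as_product = 1.5 * math.e + 3

readings = (as_exponent, round(as_product, 3))
```

Here `readings` comes out as (1500.0, 7.077), so the same string of characters yields two values that differ by a factor of more than two hundred.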
Does it look more convincing now?