From: Lars Kristan (lars.kristan@hermes.si)
Date: Tue Dec 14 2004 - 05:19:22 CST
Kenneth Whistler wrote:
> Lars Kristan stated:
>
> > I said, the choice is yours. My proposal does not prevent you from doing it
> > your way. You don't need to change anything and it will still work the way
> > it worked before. OK? I just want 128 codepoints so I can make my own
> > choice.
>
> You have them: U+EE80..U+EEFF, which are yours to use (or abuse)
> in an application as you see fit. Just don't expect others outside
> your application to interpret them as you do.
Well, I DO want someone to interpret them the way I do. And display them.
And let them be entered. And not risk a clash with someone else. We are
talking about the PUA, right?
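
To make the disagreement concrete, here is a minimal sketch in Python of the kind of private conversion under discussion. It is an illustration only, not anyone's actual implementation: the function names, the error-handler name "pua_escape", and the exact mapping (byte 0xNN to U+EENN for NN in 80..FF, so the range U+EE80..U+EEFF) are assumptions. Standard UTF-8 decoding is left untouched; only bytes that fail to decode are escaped.

```python
import codecs

# Assumed mapping: an undecodable byte 0x80..0xFF becomes U+EE80..U+EEFF.
PUA_BASE = 0xEE00


def _escape_to_pua(err):
    """Decode-error handler: replace each undecodable byte with a PUA code point."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]  # bytes rejected by the UTF-8 decoder
    return "".join(chr(PUA_BASE + b) for b in bad), err.end


# "pua_escape" is a hypothetical handler name chosen for this sketch.
codecs.register_error("pua_escape", _escape_to_pua)


def decode_lossless(data: bytes) -> str:
    """Decode as UTF-8, escaping invalid bytes instead of failing."""
    return data.decode("utf-8", errors="pua_escape")


def encode_lossless(text: str) -> bytes:
    """Inverse conversion: escaped code points go back to their raw bytes."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if 0xEE80 <= cp <= 0xEEFF:
            out.append(cp - PUA_BASE)   # restore the original byte
        else:
            out += ch.encode("utf-8")   # ordinary UTF-8 for everything else
    return bytes(out)


raw = b"caf\xe9 ok"                     # Latin-1 byte in an otherwise ASCII name
text = decode_lossless(raw)
round_tripped = encode_lossless(text)

# The clash: input that legitimately contains U+EE80 does not survive.
clash_in = "\uee80".encode("utf-8")
clash_out = encode_lossless(decode_lossless(clash_in))
```

In this sketch `raw` round-trips unchanged, but `clash_in` (valid UTF-8 for U+EE80) comes back as the single byte 0x80. That is the clash risk of borrowing code points that other data may already use.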
>
> > And once and for all, you can treat those 128 codepoints just as you
> > do today.
>
> A number of people on the list have patiently explained why what
> you are proposing to do fundamentally breaks UTF-8 and its
> relationship to other Unicode encoding forms.
It does not. I may have suggested at some point that the conversion from
codepoints to UTF-8 should be changed. But I am no longer proposing that. The
conversion to and from UTF-8 remains EXACTLY as it is today. I will use my
own conversion as I see fit and deal with all the consequences. But I need
128 VALID codepoints. Not in the PUA, not in some supplementary plane, but in
the BMP. And just because I say 'I' need does not mean I am the only one.
One might judge who is right and who is not by the number of responses, but
that would be misleading. A couple of people keep responding, all on more or
less the same theme, because it has been rehearsed time and time again. I
believe there are people who have long since realized that my claims are
correct but are just afraid to speak up. Also, whenever I win an argument, it
is just dropped. In the end all that remains is a 'feeling' by a few people
that 'this is not good'.
>
> The chances that you will get the standard extended to incorporate
> these 128 code points and define their mapping to invalid byte
> values in UTF-8 are somewhere between zilch, nada, and nil.
No, not UTF-8. UTF-8 remains as it is. What I will do with them is my
business. I am only telling you about it so you cannot dismiss it as
'encapsulating arbitrary binary data in Unicode'.
Lars