From: Jefsey_Morfin (jefsey@jefsey.com)
Date: Wed Oct 04 2006 - 09:45:57 CST
Dear Stephane,
This kind of ad hominem, combined with the affirmation of points
everyone agrees on, is not very productive. I respect your desire to
concentrate on the English globalization level; however, I question
the technical interest of a layer violation. As we currently see, the
homograph problem is being patched at the globalization level,
without much scaling benefit. I think it is easier to address at the
multilingualisation layer, where IMHO it belongs. I think you should
reread RFC 4690 and try to offer a solution other than the one a
number of us read as implied in the text; that would be of interest.
All the more so since the IAB shares with you the problem of
disregarding the multilingualisation layer. This is IMHO why it
discusses problems but hesitates to propose the solution its text
seems to imply, and with which I agree.
It is precisely because we all agree that the choice of Unicode in
protocols is not to be revisited that we have a difficulty with the
punycode process. John Klensin and the IAB have honestly considered
the problem and studied the current Unicode response. They certainly
overlooked the technical responsibility of ICANN, but this is not the
point here. They have concentrated on the Unicode aspects and do not
find the Unicode comments satisfactory enough. Today we have no
proposed solution. Or maybe you see one, in which case I would be
glad if you shared it with us, both on confusable characters and on
version updates (this would already be great progress).
This is only related to punycode and Unicode. The initial patch to
address this problem (the language tables) does not work. It is a
partial, external "add-on" to the punycode process. It seems that we
need to use a single grapheme table (by Unicode or others) in
punycoding. This actually means integrating the diversity of the
language tables into one common universal table that is unique (and
not unified, with several occurrences of a grapheme being supported).
The Unicode process (each code having a character) was
computer-inclusive (adding codes); we need a parallel,
human-exclusive "unigraph" process (each graph having a code, keeping
only one grapheme per homograph group) to address the homograph
issue.
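
To make the idea concrete, here is a minimal Python sketch of what a
single table applied before punycoding could look like. Everything in
it is an assumption for illustration: the UNIGRAPH table, its three
entries and the function names are hypothetical, no such table exists
in the punycode process today, and the nameprep step of IDNA is
skipped.

```python
# Hypothetical "unigraph" table: each code point of a homograph group
# is mapped to the one grapheme kept for that group. The three entries
# are illustrative only, not taken from any standard table.
UNIGRAPH = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A  -> LATIN SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE -> LATIN SMALL LETTER E
    "\u043e": "o",  # CYRILLIC SMALL LETTER O  -> LATIN SMALL LETTER O
}


def to_unigraphs(label: str) -> str:
    """Replace every character by the representative of its group."""
    return "".join(UNIGRAPH.get(ch, ch) for ch in label)


def to_ace(label: str) -> str:
    """Fold a label through the table, then punycode it if needed.

    Simplified on purpose: nameprep is not applied here.
    """
    folded = to_unigraphs(label)
    if folded.isascii():
        return folded
    # Python's built-in "punycode" codec implements RFC 3492.
    return "xn--" + folded.encode("punycode").decode("ascii")


# A label using Cyrillic "a" and the all-Latin label now collide:
print(to_ace("p\u0430ypal") == to_ace("paypal"))
```

With such a table, two labels that look alike yield the same encoded
name, so only one of them can be registered.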
So, I repeat my question: has a unique grapheme system been tried somewhere?
I look forward to your comments.
jfc
On 09:59 04/10/2006, Stephane Bortzmeyer said:
>On Tue, Oct 03, 2006 at 10:22:33PM +0200,
> Jefsey_Morfin <jefsey@jefsey.com> wrote
>
> > RFC 4690 documents a certain number of difficulties resulting of the
> > choice of Unicode as the reference table of the punycode process.
>
>Not at all (I mention this for the people who still read Jefsey's
>messages).
>
>The RFC is available here:
>
>http://www.ietf.org/rfc/rfc4690.txt
>
>and does not discuss the choice of Unicode. Quite the contrary:
>
>4.1.6. Use of the Unicode Character Set in the IETF
>
> Unicode and the closely-related ISO 10646 are the only coded
> character sets that aspire to include all of the world's characters.
> As such, they permit use of international characters without having
> to identify particular character coding standards or tables. The
> requirement for a single character set is particularly important for
> use with the DNS since there is no place to put character set
> identification. The decision to use Unicode as the base for IETF
> protocols going forward is discussed in [RFC2277]. The IAB does not
> see any reason to revisit the decision to use Unicode in IETF
> protocols.