Re: UTF-8S (was: Re: ISO vs Unicode UTF-8)

From: Mark Davis (markdavis34@home.com)
Date: Tue Jun 05 2001 - 10:30:00 EDT

Next message: Peter_Constable@sil.org: "Re: UTF-8S (was: Re: ISO vs Unicode UTF-8)"
Previous message: Mark Davis: "Re: UTF-8S (was: Re: ISO vs Unicode UTF-8)"
In reply to: Marco Cimarosti: "RE: UTF-8S (was: Re: ISO vs Unicode UTF-8)"
Next in thread: Michael (michka) Kaplan: "Re: UTF-8S (was: Re: ISO vs Unicode UTF-8)"
Reply: Michael (michka) Kaplan: "Re: UTF-8S (was: Re: ISO vs Unicode UTF-8)"

[Sorry -- hit "Send" again too soon]

The 6-byte sequence is either one code point (lenient parser) or an error
(strict parser). It is never two.
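
As a rough illustration of the two outcomes, here is a minimal Python sketch. It is not taken from the samples page; the function names decode_strict and decode_lenient and the test bytes are assumptions made for this example. It feeds the 6-byte form of U+10000 (the surrogates D800 and DC00, each written as a 3-byte sequence) to both kinds of parser.

```python
# Hypothetical helpers for illustration only; the names are not from the thread.

def decode_strict(data: bytes) -> str:
    # Python's built-in UTF-8 decoder rejects 3-byte sequences that encode
    # surrogate code points, so the 6-byte form raises UnicodeDecodeError.
    return data.decode("utf-8")


def decode_lenient(data: bytes) -> str:
    # Assumption: "lenient" here means accepting surrogates written as 3-byte
    # sequences and joining each high/low pair into ONE supplementary code point.
    # Step 1: let the surrogate code units through.
    units = data.decode("utf-8", errors="surrogatepass")
    # Step 2: round-trip through UTF-16 so that each pair becomes one code point.
    return units.encode("utf-16-le", errors="surrogatepass").decode("utf-16-le")


# ED A0 80 = U+D800 (high surrogate), ED B0 80 = U+DC00 (low surrogate)
six_bytes = bytes([0xED, 0xA0, 0x80, 0xED, 0xB0, 0x80])

result = decode_lenient(six_bytes)
print(len(result), hex(ord(result)))  # one code point, U+10000

try:
    decode_strict(six_bytes)
except UnicodeDecodeError as err:
    print("strict parser:", err.reason)
```

In neither case does the caller see two separate code points: the lenient path returns a single supplementary character, and the strict path reports an error.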

I put samples on:

http://www.macchiato.com/utc/samples_of_utf8.htm

Mark

----- Original Message -----
From: "Marco Cimarosti" <marco.cimarosti@essetre.it>
To: <unicode@unicode.org>
Cc: "'Mark Davis'" <mark@macchiato.com>
Sent: Tuesday, June 05, 2001 05:03
Subject: RE: UTF-8S (was: Re: ISO vs Unicode UTF-8)

> Mark Davis wrote:
> > - I am well aware that one can accept 6-byte supplementary
> > characters on
> > input in UTF-8. (Did you really think I wasn't?)
>
> (O, no, I know you knew!)
>
> But how should this 6-byte sequence be interpreted by a standard UTF-8
> decoder? Does it become one or two code points?
>
> _ Marco
>
>


This archive was generated by hypermail 2.1.2 : Fri Jul 06 2001 - 00:17:18 EDT