Re: UTF-8 syntax

From: Peter_Constable@sil.org
Date: Thu Jun 07 2001 - 09:14:41 EDT


On 06/07/2001 02:32:45 AM Peter Constable wrote:

>We are left to infer that "mapped back" means the exact inverse of the
>mapping defined (in the case of UTF-8) in D36. But note: making that
>inference assumes that the mapping in D36 is invertible. That requires
>that the mapping in D36 is injective; i.e. one-to-one, as D29 requires.
>This reinforces that a 6-byte sequence cannot be used to represent a
>supplementary plane character. But not the corollary: 6-byte sequences
>cannot be mapped back to a Unicode scalar value, and therefore *are
>illegal*.

That should have been "But note the corollary..."
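
To make the corollary concrete, here is a minimal sketch in Python. It is an
illustration only, and it assumes that Python 3's built-in "utf-8" codec is
an acceptable stand-in for a strict decoder; the names FOUR_BYTE, SIX_BYTE
and decodes_strictly are invented for this example and come from neither the
standard nor any library. It takes U+10000, whose UTF-16 form is the
surrogate pair D800 DC00, and compares the single 4-byte sequence with the
6-byte sequence obtained by encoding each surrogate as 3 bytes.

```python
# Illustrative sketch: Python's built-in codec stands in for a strict UTF-8 decoder.

# U+10000 encoded as a single 4-byte sequence.
FOUR_BYTE = bytes([0xF0, 0x90, 0x80, 0x80])

# The surrogates D800 and DC00, each encoded separately as 3 bytes.
SIX_BYTE = bytes([0xED, 0xA0, 0x80, 0xED, 0xB0, 0x80])


def decodes_strictly(data: bytes) -> bool:
    """Return True if the strict UTF-8 codec accepts the byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Encoding the scalar value yields exactly one byte sequence (the 4-byte form).
encoded = "\U00010000".encode("utf-8")

print(encoded == FOUR_BYTE)
print(decodes_strictly(FOUR_BYTE))
print(decodes_strictly(SIX_BYTE))
```

If the mapping is one-to-one, only the 4-byte form can be "mapped back" to
U+10000, so a decoder that inverts the mapping has nothing to return for the
6-byte form.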

- Peter

---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <peter_constable@sil.org>


