Re: UTF-8 validation rules

From: David Starner (dstarner98@aasaa.ofe.org)
Date: Mon Sep 10 2001 - 23:08:37 EDT


On Mon, Sep 10, 2001 at 12:22:20AM +0100, David Hopwood wrote:
> It's for Arabic presentation forms internal to a rendering implementation,
> I assume (although it's not clear why existing private-use characters
> couldn't have been used for that).

Because if the implementation uses them, then the end user can't. A large
part of PUA use is for scripts and characters that will never be encoded, or
for corporate logos for internal use. If the implementation uses codepoints
that others use for Shavian (the PUA implementation actually seeing use in
the wild), or that the rest of your organization uses for its private
symbols, then you're just out of luck.

I don't think anyone was specific about Arabic presentation forms. From
what I understand, it's more so that an application has a large area set
aside for internal use that won't get mixed up with the PUA. The Object
Replacement character and the Ruby characters would have been subsumed in
this if it had come first. It could be used for markup, making a word
processor format that's plain text once you strip these characters.
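
As a rough sketch of that last idea: assuming the internal-use area in
question is the noncharacter range U+FDD0..U+FDEF, a hypothetical word
processor format could reserve two of those codepoints as bold-on and
bold-off tags. The tag assignments and the function names below are made up
for illustration, not taken from any real format.

```python
# Assumption: the internal-use area is the noncharacter range U+FDD0..U+FDEF.
INTERNAL_FIRST = 0xFDD0
INTERNAL_LAST = 0xFDEF

# Hypothetical tag assignments, chosen only for this example.
BOLD_ON = chr(0xFDD0)
BOLD_OFF = chr(0xFDD1)


def is_internal(ch: str) -> bool:
    """Return True if ch lies in the assumed internal-use range."""
    return INTERNAL_FIRST <= ord(ch) <= INTERNAL_LAST


def strip_markup(text: str) -> str:
    """Drop every internal-use codepoint, leaving plain text behind."""
    return "".join(ch for ch in text if not is_internal(ch))


# A document with internal markup plus one user-owned PUA character (U+E000).
doc = "This is " + BOLD_ON + "important" + BOLD_OFF + " text, logo: \uE000."
plain = strip_markup(doc)
print(plain)
```

Note that the PUA character U+E000 survives the stripping, which is the
point: the user's private-use data and the application's internal codes
never collide.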

-- 
David Starner - dstarner98@aasaa.ofe.org
Pointless website: http://dvdeug.dhis.org
"I don't care if Bill personally has my name and reads my email and 
laughs at me. In fact, I'd be rather honored." - Joseph_Greg


