RE: PUA

From: jameskass@att.net
Date: Mon Oct 20 2003 - 19:41:48 CST


Marco Cimarosti wrote,

>
> So far so good. Now I want to use your PUA Plane-14 tags, if present, to
> override the above assumption about PUA characters. E.g., imagine that my
> string contains this:
>
>
> 󠀀󠀂󠁆󠁯󠁏󠁢󠁡󠁲󠀮󠁴󠁴󠁦󠁿> ?
> (U+0E0000 U+0E0002 U+0E0046 U+0E006F U+0E006F U+0E0062 U+0E0061
> U+0E0072 U+0E002E U+0E0074 U+0E0074 U+0E0066 U+0E007F U+E017 U+E009)
>
> This is what I am going to do:
>
> 1) I parse the tags at the beginning of the string and save the relevant
> information in a temporary variable which we will call PuaInterpretation;
>
> 2) I remove the tags.
>
> Now, my PuaInterpretation variable contains the following information:
>
> Foobar.ttf
>
> And my string contains the following text:
>
> 
> (U+E017 U+E009)
>
> Now, what's the next step? What am I supposed to do to find out whether,
> according to the PUA interpretation called "Foobar.ttf", U+E017 and U+E009
> are letters or not?
>

Hmmm, the UTF-8 non-BMP string apparently got munged.

Anyway, the next step is for your function to load the file
"Foobar.puapropertiesclass".

This file is a plain-text file following the same format as UNIDATA. It's
extensible -- if the font vendor doesn't include it with the font download,
then the savvy end-user can simply construct it with a plain-text editor.
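As a rough illustration only, here is how such a file might be read in Python. It assumes that "the same format as UNIDATA" means semicolon-separated lines in the style of UnicodeData.txt, with the code point in the first field and the General_Category in the third. The function name and the two sample entries are made up for this sketch and do not come from any real font.

```python
def parse_pua_properties(text):
    """Map code point -> General_Category from UnicodeData-style lines."""
    categories = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split(";")
        if len(fields) < 3:
            continue  # not a usable record
        code_point = int(fields[0], 16)         # field 0: hex code point
        categories[code_point] = fields[2].strip()  # field 2: General_Category
    return categories


# Hypothetical contents of "Foobar.puapropertiesclass" (invented entries).
SAMPLE = """\
E009;FOOBAR EXAMPLE PUNCTUATION;Po;0;L;;;;;N;;;;;
E017;FOOBAR EXAMPLE LETTER;Lo;0;L;;;;;N;;;;;
"""

pua_categories = parse_pua_properties(SAMPLE)
```

A hand-written file only needs the first three fields for this purpose; the remaining fields are shown here just to mirror the UnicodeData.txt layout.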

Now your function has all the necessary information and can determine
whether the PUA code points are letters, or not.
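Putting the steps together, a minimal sketch might look like the following. Everything in it is an assumption made for illustration: that the tag run sits at the start of the string, that each tag character U+E0020..U+E007E stands for the ASCII character 0xE0000 lower, that other code points in U+E0000..U+E007F are control tags to be skipped, and that a PUA code point missing from the properties file falls back to category Co. The category table is an invented stand-in for the parsed contents of "Foobar.puapropertiesclass".

```python
TAG_BASE = 0xE0000  # start of the Plane 14 tag block


def split_pua_tags(s):
    """Return (interpretation_name, remaining_text) for a tagged string."""
    name = []
    i = 0
    while i < len(s) and TAG_BASE <= ord(s[i]) <= TAG_BASE + 0x7F:
        offset = ord(s[i]) - TAG_BASE
        if 0x20 <= offset <= 0x7E:   # tag counterpart of printable ASCII
            name.append(chr(offset))
        i += 1                       # control tags (e.g. cancel) are skipped
    return "".join(name), s[i:]


def is_letter(code_point, categories):
    """True if the table assigns a letter category (Lu, Ll, Lt, Lm, Lo)."""
    return categories.get(code_point, "Co").startswith("L")


# Hypothetical category table, standing in for the loaded properties file.
categories = {0xE017: "Lo", 0xE009: "Po"}

# Build the tagged string from the example: tags, then U+E017 U+E009.
tagged = (
    "\U000E0000\U000E0002"
    + "".join(chr(TAG_BASE + ord(c)) for c in "Foobar.ttf")
    + "\U000E007F"
    + "\uE017\uE009"
)

pua_interpretation, text = split_pua_tags(tagged)
properties_file = pua_interpretation.rsplit(".", 1)[0] + ".puapropertiesclass"
letters = [is_letter(ord(ch), categories) for ch in text]
```

With the invented table above, U+E017 would count as a letter and U+E009 would not; a real answer depends entirely on what the properties file says.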

Best regards,

James Kass


