Private Use Area in Use (from Tag characters and in-line graphics (from Tag characters))
idou747 at gmail.com
Wed Jun 3 19:17:34 CDT 2015
I don’t use old software, I use up to date versions of everything on a Mac. Very standard setup.
There are a lot of links there. Maybe they do work in PDFs, but they certainly don’t work in the browser, and they don’t work when I click the txt files. Basically what you’re saying is that PDFs have a way to make this work.
Unless we are proposing that everything in the universe be PDF, this doesn’t really help. There should be a standard way to put custom characters anywhere that characters belong and have things “just work”. Clearly right now things don’t just work. And without even trying, I know that if I cut and pasted from those PDFs into somewhere else, it wouldn’t work.
On Wed, Jun 3, 2015 at 11:20 PM, Philippe Verdy <verdy_p at wanadoo.fr> wrote:
> Note that copy-pasting from a PDF to another document is very tricky: the
> PDF format requires that embedded fonts use precise glyph naming
> conventions to map glyphs back to characters. Otherwise the Unicode
> character sequences associated with a glyph (or with multiple glyphs, if
> they are ligatured, in complex layouts, carry uncommon decorations, are
> rendered on a non-uniform background, or are filled with a pattern, such
> as labels over a photograph or cartographic map) will not be recognized.
> This remark about PDFs also applies to PostScript documents.
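
A minimal sketch of the kind of glyph-name mapping described above, assuming the font follows the "uniXXXX" / "uXXXXXX" naming forms from the Adobe Glyph List conventions. The function name glyph_name_to_text is hypothetical, and the lookup table for ordinary names such as "A" or "fi" is deliberately left out, so those return None here. It only illustrates why a name like "glyph123" leaves a reader with nothing to copy.

```python
import re

# "uni" + one or more groups of exactly four uppercase hex digits,
# e.g. "uniE000" or "uni00660069" (two code points).
_UNI = re.compile(r"uni((?:[0-9A-F]{4})+)")
# "u" + four to six uppercase hex digits, e.g. "uE000" or "u1F600".
_U = re.compile(r"u([0-9A-F]{4,6})")


def glyph_name_to_text(name):
    """Return the text a glyph name encodes, or None if it cannot be recovered.

    Hypothetical helper: handles only the "uni"/"u" forms, not named glyphs.
    """
    # A suffix such as ".swash" or ".alt" does not affect the mapping.
    base = name.split(".", 1)[0]
    chars = []
    # Ligature names join their components with underscores.
    for part in base.split("_"):
        m = _UNI.fullmatch(part)
        if m:
            digits = m.group(1)
            codes = [int(digits[i:i + 4], 16) for i in range(0, len(digits), 4)]
        else:
            m = _U.fullmatch(part)
            if not m:
                return None  # e.g. "glyph123": no character information
            codes = [int(m.group(1), 16)]
        for cp in codes:
            # Surrogates and out-of-range values are not valid characters.
            if 0xD800 <= cp <= 0xDFFF or cp > 0x10FFFF:
                return None
            chars.append(chr(cp))
    return "".join(chars)
```

With this sketch, a PUA glyph named "uniE000" maps back to U+E000, while an arbitrarily named glyph gives None, which is the case where a reader has to fall back to OCR or a bitmap.
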
> Some PDF readers in that case attempt to perform some OCR (plus dictionary
> lookups to fix misreadings) for common glyph forms, but will almost always
> fail if the glyphs are too specific, such as when they include swashes,
> ligatures, or unknown scripts and scripts with complex layouts (such as the
> script invented by William for noting sentences with specific
> "characters" with new glyphs, a specific syntax, and specific layout
> rules). In other cases the PDF reader will just put only a bitmap of the
> selection on the clipboard, and it will be up to other software to attempt
> to interpret the bitmap with OCR.
> The glyph naming conventions are documented in PDF specifications, but many
> PDF creators do not follow these rules, and copying text from these PDFs
> 2015-06-03 15:03 GMT+02:00 Philippe Verdy <verdy_p at wanadoo.fr>:
>> This possibly fails because William forgot to embed his font in
>> the document itself (or Serif PagePlus forgets to do it when it creates the
>> PDF document, and refuses to embed glyphs from the font that are bound to
>> Unicode PUAs when it creates the embedded font). However, there is no such
>> problem when creating PDFs with MS Office, or via the Adobe Acrobat
>> "printer" driver or other printer drivers that generate PDF files
>> (including Google Cloud Print).
>> So this could be a misuse of Serif PagePlus when creating the PDF (I don't
>> know this software; maybe there are options set that tell it not to
>> embed fonts from a list of fonts that the recipient is supposed to have
>> installed locally, to save storage space for the document by avoiding
>> such embedding). Another reason may be that the font is marked as "not
>> embeddable" within its exposed properties.
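
As a rough illustration of the "not embeddable" marking, and assuming it refers to the fsType field of an OpenType font's OS/2 table (the message does not say which property is meant), the embedding bits can be read as in the sketch below. The function names are hypothetical, and extracting the fsType value from a font file needs a font library that is not shown here.

```python
# Bit values of the OS/2 fsType field.
RESTRICTED = 0x0002         # restricted-license embedding: must not be embedded
PREVIEW_AND_PRINT = 0x0004  # may be embedded for viewing and printing only
EDITABLE = 0x0008           # may be embedded in documents that can be edited
NO_SUBSETTING = 0x0100      # the font must be embedded whole, not subsetted
BITMAP_ONLY = 0x0200        # only bitmaps from the font may be embedded


def embedding_level(fs_type):
    """Return the usage permission encoded in fs_type as a short label."""
    usage = fs_type & 0x000E
    if usage == 0:
        return "installable"
    # If several usage bits are set (older fonts), the least
    # restrictive permission applies.
    if usage & EDITABLE:
        return "editable"
    if usage & PREVIEW_AND_PRINT:
        return "preview-and-print"
    return "restricted"


def may_embed_outlines(fs_type):
    """True if a PDF producer is allowed to embed the glyph outlines."""
    return embedding_level(fs_type) != "restricted" and not (fs_type & BITMAP_ONLY)
```

A producer that respects these bits would leave out a font whose fsType is 0x0002, and the reader would then fall back to whatever local font it has, which is one way to end up with an empty square.
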
>> Another reason may be that John tries to open the document with software
>> that does not handle embedded fonts, or that ignores them and uses only the
>> fonts John has set in his preferences. In such a case the result
>> depends only on the fonts installed on his local system (which do not
>> include the fonts created by William), or his software is set up to use
>> exclusively a specific local "Unicode" font for all PUAs.
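
To see which characters in a piece of text depend on such a font at all, one can list the code points whose Unicode general category is "Co" (private use). This is only a sketch; the helper name private_use_chars is made up for illustration.

```python
import unicodedata


def private_use_chars(text):
    """Return (index, 'U+XXXX') pairs for every private-use character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        # "Co" covers U+E000..U+F8FF and the private-use planes 15 and 16.
        if unicodedata.category(ch) == "Co"
    ]
```

Any character this reports has no standard glyph, so what the reader sees depends entirely on which font the software picks for it.
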
>> (Software that behaved in this bad way included old versions of Internet
>> Explorer, due to limitations of its text renderers. However, this should not
>> happen with PDFs, provided you have used a correct plugin version for
>> displaying PDFs in the browser: if this fails in the browser, download the
>> document and view it with Adobe Reader instead of via the plugin. There
>> are many PDF plugins on the market that do not support essential features and
>> are just built to display PDFs containing scanned bitmaps, with very poor
>> support for text or vector graphics, or are tuned specifically to adapt the
>> document to another device or paper format.)
>> Without knowing which software is used (and which PDF in the list does
>> not load correctly), it is difficult to tell, but I have no problems
>> with the few docs I saw created by William. So:
>> NO F = NO FAIL for me.
>> 2015-06-03 13:38 GMT+02:00 John <idou747 at gmail.com>:
>>> Yep, I clicked on your document and saw an empty square where your
>>> character should be.
>>> F = FAIL.
>>> On Wed, Jun 3, 2015 at 6:30 PM, William_J_G Overington <
>>> wjgo_10009 at btinternet.com> wrote:
>>>> Private Use Area in Use (from Tag characters and in-line graphics (from
>>>> Tag characters))
>>>> >> That's not agreed upon. I'd say that the general agreement is that
>>>> the private ranges are of limited usefulness for some very limited use
>>>> cases (such as designing encodings for new scripts).
>>>> > They are of limited usefulness precisely because it is pathologically
>>>> hard to make use of them in their current state of technological evolution.
>>>> If they were easy to make use of, people would be using them all the time.
>>>> I’d bet good money that if you surveyed a lot of applications where custom
>>>> characters are being used, they are not using private use ranges. Now why
>>>> would that be?
>>>> Actually, I have used Private Use Area characters a lot, and, once I had
>>>> got used to them, I found them incredibly straightforward to use.
>>>> I have made fonts that include Private Use Area encodings using the
>>>> High-Logic FontCreator program and then used those fonts in Serif PagePlus,
>>>> both to produce PDF documents and PNG graphics, as needed for my particular
>>>> project at the time.
>>>> For example,
>>>> William Overington
>>>> 3 June 2015