From: Peter Kirk (peterkirk@qaya.org)
Date: Fri Mar 25 2005 - 17:55:30 CST
On 25/03/2005 22:12, Erik van der Poel wrote:
> Peter Kirk wrote:
>
>> I also note a wide range of which Unicode characters are used for the
>> apostrophe in the various languages, but that is an issue for those
>> who coded the texts (for some of them, it is me).
>
>
> It may also become an issue for those writing the specs for IDNs and
> maybe even the base spec, Stringprep. Would you please list the
> codepoints of the apostrophes that you are aware of?
>
Several texts, mostly English language versions, use U+0027. Others use
U+2019, although some use this also as a quotation mark paired with U+2018.
The Greek text uses U+1FBD, although this is a coding error - a Greek
apostrophe is not a KORONIS. None of the texts I looked at use U+02BC,
although this is the character which is supposed to be used at least for
the Azerbaijani apostrophe.
It may help you to look at the lists of near equivalents given in the
Unicode code charts. These are mostly, though not always, sufficiently
similar in shape to be confusable, and so should potentially be folded
together for these purposes. I would think it would also be sensible to
fold all of U+02B9, U+02BB through U+02BF, U+2018, U+2019 and U+201B to
U+0027, as all of these are easily confusable in small font sizes, and
some software rather too freely converts between these as "smart quotes".
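
As an illustration only, the folding suggested above could be sketched in
Python roughly as follows. The names APOSTROPHE_LOOKALIKES, FOLD_TABLE and
fold_apostrophes are hypothetical, and the table contains only the code
points listed in this message; it is not taken from Stringprep or any IDN
specification.

```python
# Code points proposed above for folding to U+0027 (APOSTROPHE).
# The names used here are hypothetical, chosen for this sketch only.
APOSTROPHE_LOOKALIKES = (
    [0x02B9]                         # U+02B9
    + list(range(0x02BB, 0x02C0))    # U+02BB through U+02BF inclusive
    + [0x2018, 0x2019, 0x201B]       # single quotation marks
)

# str.translate accepts a dict mapping code points to replacement strings.
FOLD_TABLE = {cp: "'" for cp in APOSTROPHE_LOOKALIKES}


def fold_apostrophes(text: str) -> str:
    """Replace each listed lookalike with U+0027; leave everything else alone."""
    return text.translate(FOLD_TABLE)
```

With this table, a name typed with U+02BC or U+2019 compares equal to the
same name typed with U+0027 once both have been folded. U+1FBD is left
untouched, since it is not in the list above.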
Meanwhile I have found a language which appears to use a double quote as
well as a single quote, also a word initial hyphen, as part of its
orthography. See http://www.worldscriptures.org/pages/attie.html. Double
quotes need a set of lookalike equivalents similar to that for single
quotes, and should also be treated as equivalent to two single quotes, to
avoid spoofing.
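
A rough Python sketch of the double quote equivalence, under stated
assumptions: the set of double quote lookalikes below is my own guess for
illustration (this message does not list them), and the names
DOUBLE_QUOTE_LOOKALIKES, DOUBLE_FOLD, fold_double_quotes and
same_after_folding are hypothetical. Each double quote form is mapped to
two U+0027 characters so that the two spellings compare equal.

```python
# ASSUMPTION: an illustrative set of double quote lookalikes, not an
# authoritative list. U+02BA, U+02EE, U+201C, U+201D, U+201F.
DOUBLE_QUOTE_LOOKALIKES = [0x02BA, 0x02EE, 0x201C, 0x201D, 0x201F]

# Map each lookalike, and U+0022 itself, to two U+0027 characters.
DOUBLE_FOLD = {cp: "''" for cp in DOUBLE_QUOTE_LOOKALIKES + [0x0022]}


def fold_double_quotes(text: str) -> str:
    """Rewrite every double quote form as two single quotes (U+0027 U+0027)."""
    return text.translate(DOUBLE_FOLD)


def same_after_folding(a: str, b: str) -> bool:
    """True if the two strings are identical once double quotes are folded."""
    return fold_double_quotes(a) == fold_double_quotes(b)
```

A fuller treatment would also fold the single quote lookalikes to U+0027
before comparing, so that two curly single quotes and one curly double
quote end up identical.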
There may be other useful information on published orthographies at this
site, http://www.worldscriptures.org/a-z-frameset.html, although there
is sadly no sample of Fe'fe'.
-- Peter Kirk peter@qaya.org (personal) peterkirk@qaya.org (work) http://www.qaya.org/