From: Hans Aberg (haberg@math.su.se)
Date: Mon Feb 05 2007 - 06:52:44 CST
On 4 Feb 2007, at 00:02, Doug Ewell wrote:
>> Well, the apostrophe used in language is not semantically a right
>> single quotation mark. There might be some subtle rendering
>> differences between a U+2019 and a proper, linguistic apostrophe,
>> like in spacing.
>>
>> And if U+0027 is a multipurpose character, then there is a gap
>> in the Unicode character set.
>>
>> And then: a new character should be added.
>
> The NamesList file [1], which is a formal part of the Unicode
> Character Database, says U+2019 is the preferred character for
> apostrophe. It has this annotation under three characters: U+0027,
> U+02BC, and U+2019 itself.
>
> Regardless of whether there is a school of thought that
> "apostrophe" and "right single quotation mark" should be different
> characters, this is what the Unicode Technical Committee has
> decided, and while they may change their minds — in Unicode 1.0 the
> preferred apostrophe was U+02BC — I would be amazed if they did so.
>
> I'm sure Ken Whistler will come along soon with a better-
> articulated and more authoritative version of this.
>
> [1] http://www.unicode.org/Public/UNIDATA/NamesList.txt
Though Unicode has decided to recommend the right single quotation
mark U+2019 to double as the punctuation apostrophe, the two are
semantically different. Such doubling may seem clever in a narrow
context, but when the context widens, problems may ensue. In the
case of this particular character, the problems may not be very
great, but they may still be annoying.
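To make the semantic difference concrete, here is a small sketch that asks Python's standard unicodedata module how the three code points named in this thread are classified. The helper name describe is only an illustration, and the exact properties reported depend on the Unicode version bundled with the Python build in use.

```python
import unicodedata

# The three code points discussed in this thread.
CANDIDATES = ["\u0027", "\u02BC", "\u2019"]

def describe(ch):
    """Return (code point, character name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

for ch in CANDIDATES:
    print(describe(ch))
```

U+2019 is classified as final (closing) punctuation, while U+02BC is classified as a modifier letter, so software that relies on these properties treats the two quite differently, for instance when deciding where a word ends.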
For example, parsing text becomes ambiguous, which is a problem for
computer programs. If correct parsing is needed for further
processing, there will be annoying failures. To remove them, one
will have to set humans to work, together with some computer
language extensions, resolving by hand ambiguities that might have
been avoided in the first place.
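A minimal sketch of the kind of failure meant here, assuming a deliberately naive quote extractor (the function quoted_spans and both sample sentences are made up for illustration). The second sentence uses U+02BC merely as a stand-in for a distinct apostrophe character; that choice is an assumption for the demonstration, not the recommended encoding.

```python
import re

OPEN_QUOTE = "\u2018"   # LEFT SINGLE QUOTATION MARK
CLOSE_QUOTE = "\u2019"  # RIGHT SINGLE QUOTATION MARK, also recommended as apostrophe

def quoted_spans(text):
    """Naively return the text between each opening quote and the next closing quote."""
    pattern = re.compile(OPEN_QUOTE + "(.*?)" + CLOSE_QUOTE)
    return pattern.findall(text)

# Apostrophe encoded as U+2019: indistinguishable from a closing quote.
ambiguous = "She said \u2018don\u2019t go\u2019 and left."
# Apostrophe encoded as a distinct character (U+02BC as a stand-in).
distinct = "She said \u2018don\u02BCt go\u2019 and left."

print(quoted_spans(ambiguous))  # the quotation is cut off after "don"
print(quoted_spans(distinct))   # the whole quotation is recovered
```

With the shared character the program cannot tell where the quotation ends without dictionary lookups or human correction; with a distinct character the same trivial rule gives the right answer.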
Hans Aberg