Re: apostrophe vs. modifier letter apostrophe

From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Mar 25 2002 - 19:10:04 EST

Previous message: Chookij Vanatham: "Re: "UTR#9: Bidirection" and "UTR#14: Line Breaking""
Maybe in reply to: Peter_Constable@sil.org: "apostrophe vs. modifier letter apostrophe"
Next in thread: Asmus Freytag: "Re: apostrophe vs. modifier letter apostrophe"
Reply: Asmus Freytag: "Re: apostrophe vs. modifier letter apostrophe"

Peter continued:

> OK, both you and John mentioned identifiers. Let me ask a slightly
> different question: I'm thinking about all of our linguists who have
> existing data containing 0x27 to represent a glottal stop (some possibly
> also using it as a quotation mark / apostrophe), and I'm thinking about
> getting them migrating to using Unicode. I know that it would be good for
> them to encode this orthographic representation of glottal stop as U+02BC,
> but if they also use 0x27 for a quotation mark, it may not be so trivial
> to get their data converted correctly, and many might be inclined to just
> map 0x27 > U+0027. I'm trying to think of reasons to give them as to why
> they might not want to do this, and usability for identifiers isn't going
> to particularly grab the attention of many of them.
>
> So, why might a linguist want to go through the extra effort to map 0x27 >
> U+02BC in exactly those contexts when it should map to this and not U+2019
> or something else?

This is just the computer-age version of the age-old question as
to why a linguist would want to distinguish anything that functions
differently.

For years back in the late 70's and early 80's, before I got my
first PC, I typed up index slips with a manual typewriter. That
manual typewriter had various custom keys welded on, so that I could
get schwas, open-o's, lambda's, dead-key commas above, and the like.
To do so, it eliminated various "dispensable" keys. Among the
dispensables were "1" and "0" (odd choices to be missing for a
future computer guy -- but I digress). So my only option was to
type "l" for "1" and "O" for "0". That worked fine for my slips,
because I knew the difference. It also worked fine for correspondence,
although it was a bit hinky. And the post office didn't care for
addresses I typed that way. (They still don't, as a matter of fact.)

But if I took all that data and coded it for entry in a modern
database, would I keep my "l"'s and "O"'s for "1"'s and "0"'s?
Of course not. Because I know the difference, and wouldn't want them
mixed up for computational use.

Now take your linguists. If they have been using ASCII 0x27 for
both a single quote and for glottal stop, they are just the next
step along in overloading functionally different characters.
[In fact, if they've been using Macintoshes all along, they shouldn't
be in this quandary, since they could/should have been using
0xD4/0xD5 (= U+2018/U+2019) for their single quotation marks. If
so, then they would already be distinguished from an 0x27 used for
a glottal stop.] If I were in their shoes and were being offered
a conversion to a character encoding that enabled me to make
the difference systematically, I'd be working to filter my data
to do the right thing -- since I'd care about the data integrity
in the more capable representation. Of course, they might also
prefer to actually make use of U+0294 LATIN LETTER GLOTTAL STOP,
but that would depend, in part, on how entrenched the orthography
using the apostrophe-shaped letters is.
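
As a rough illustration of what "filtering the data" could look like, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function name convert_apostrophes is hypothetical, and the rule "0x27 between two letters is a glottal stop, anything else is a quotation mark" is only a heuristic. Orthographies with word-initial or word-final glottal stops, or data that also contains English contractions, would need language-specific rules and manual review.

```python
import re

MODIFIER_APOSTROPHE = "\u02BC"  # MODIFIER LETTER APOSTROPHE (a letter)
LEFT_QUOTE = "\u2018"           # LEFT SINGLE QUOTATION MARK
RIGHT_QUOTE = "\u2019"          # RIGHT SINGLE QUOTATION MARK

# [^\W\d_] matches any Unicode letter in a Python 3 str pattern.
LETTER = r"[^\W\d_]"


def convert_apostrophes(text: str) -> str:
    """Heuristically split legacy 0x27 into U+02BC, U+2018 and U+2019.

    ASSUMPTION: a 0x27 with a letter on both sides is a glottal stop;
    every other 0x27 is a single quotation mark.
    """
    # 1. Word-internal 0x27 becomes the modifier letter.
    text = re.sub(rf"(?<={LETTER})'(?={LETTER})", MODIFIER_APOSTROPHE, text)
    # 2. A remaining 0x27 at the start of the text, or after whitespace
    #    or an opening bracket, becomes an opening quotation mark.
    text = re.sub(r"(?:^|(?<=[\s(\[]))'", LEFT_QUOTE, text)
    # 3. Whatever is left is treated as a closing quotation mark.
    return text.replace("'", RIGHT_QUOTE)


if __name__ == "__main__":
    sample = "he said 'ka'a' loudly"
    print(convert_apostrophes(sample))
```

The point of the sketch is only that the distinction has to be decided once, at conversion time; after that the data carries it explicitly.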

By the way, if identifiers won't grab the attention of your linguists,
then consider a kindred operation: word selection. A properly
implemented word selection should select *inside* quotation marks,
but should include any glottal stops in a word. If your orthography
hopelessly mixes up the two, then your system isn't likely to give very
appropriate word selection feedback.
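
The difference shows up directly in the character properties. The Python sketch below uses the standard unicodedata module to print the General Category of each character, and uses the regular expression \w+ as a crude stand-in for word selection. That stand-in is an assumption made for illustration; it is not a real word-boundary implementation, and the helper name words is hypothetical.

```python
import re
import unicodedata


def words(text: str) -> list[str]:
    """Crude stand-in for word selection: maximal runs of \\w characters."""
    return re.findall(r"\w+", text)


if __name__ == "__main__":
    # General Category: Lm is a modifier letter, Po/Pi/Pf are punctuation.
    for ch in ("\u0027", "\u02BC", "\u2018", "\u2019"):
        print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))

    # Overloaded 0x27: the glottal stop splits the word in two.
    print(words("'ka'a'"))

    # Distinguished characters: the quotation marks fall outside the
    # word, and the glottal stop stays inside it.
    print(words("\u2018ka\u02BCa\u2019"))
```

With the overloaded data the word comes back in two pieces; with U+02BC for the glottal stop and U+2018/U+2019 for the quotation marks it comes back whole, without the quotes.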

--Ken


This archive was generated by hypermail 2.1.2 : Mon Mar 25 2002 - 20:22:44 EST