Re: Demystifying the Politburo (was: Re: Arabic encoding model (alas, static!))

From: Gregg Reynolds (unicode@arabink.com)
Date: Fri Jul 08 2005 - 18:58:27 CDT

Next message: Peter Kirk: "Re: Demystifying the Politburo (was: Re: Arabic encoding model (alas, static!))"

Previous message: Mark Davis: "Re: CLDR plural handling info?"
In reply to: Kenneth Whistler: "Demystifying the Politburo (was: Re: Arabic encoding model (alas, static!))"
Next in thread: Kenneth Whistler: "Re: Demystifying the Politburo (was: Re: Arabic encoding model (alas, static!))"

Kenneth Whistler wrote:
> Gregg Reynolds asked:

Ok, some ideas, questions, etc. (You can't say I didn't warn you!)

(Oh, and by the way, Ken. WOULD YOU STOP BEING SO DARNED REASONABLE?!!
I've got work to do! I'm pretty sure the guy in the next cubicle has
noticed me muttering, darkly. I can cover that (ever try translating
training manuals 8 hours a day for weeks at a time?) But if my boss
ever finds out I've been daydreaming all afternoon about RHO and CHI and
the nature of Unicode, I shall have no choice but to blame Unicode. I
don't want to do that, but you may leave me no choice.)
>
>
>>I think the more interesting question is: what is the composition of
>>these mysterious bodies?
>
>
> http://www.unicode.org/consortium/memblogo.html
>
> http://www.iso.org/iso/en/aboutiso/isomembers/index.html
>
> These are public organizations... it isn't as if anyone is trying
> to hide memberships.

Sure, it *looks* innocent enough; but who is behind the people who are
behind Unicode? And behind them?

(List: that's a JOKE!)

> For what it's worth, since this started out as a thread concerned
> with Arabic encoding, the longtime chair of SC2/WG2 is a native
> speaker of Arabic

Boy, is he gonna be mad at you once he gets my email.

Gosh. A beautiful July afternoon in which actual *information* was
exchanged on the Unicode list! Snotlessly! I feel positively giddy!

Seriously (I'll try), the question of participation of native speakers
is (IMHO) an important and thorny one.

On the one hand, nothing says native speakers are the best informants.
And as a matter of policy I see no reason why a *standards* body
(especially an industry standard body) should have a requirement for
native speaker participation; after all, the (industry-defined) goal is
to get a standard, not to make everybody happy. No doubt such
participation is desirable, but it's quite a different thing to say it's
required. Standards have to work in the marketplace in order to become
standards.

On the other hand, it's pretty obvious (to me at least) that
participation of native speakers in standardization of cultural
artifacts like written language is a Good Thing. (List: I know, I
know, Unicode does not encode written language, it encodes
characters/scripts/whatever. But the perception will always and
inevitably be that it is an encoding or modeling of written language.)

I can't help drawing an analogy (if that's the right word) to the ideas
often discussed by Edward Said, among others. He wrote extensively
about how the West (that fearsome boogeyman) controls the narrative
of/about the East. It doesn't really matter if I as a Westerner get it
right; the East (South, Middle East, slightly East and a little South
but ... etc.) should speak for itself. (Or something like that; it's
been a while). Now, one may agree or disagree with his language (I'm
not so crazy about it myself), but there is no denying that his views
are supported by a large population in both East and West. Defining an
encoding that models (in some way) non-Western languages without
significant - and visible - participation of native speakers seems
analogous to "us" telling their history instead of letting "them" tell
their history.

On the third hand, it's clear (but maybe only to those who follow the
Unicode list) that people like Mr. Everson work very closely with native
speakers, so you can't really argue that the linguistic communities
were/are not represented. We are clearly not the 19th century.

On the fourth hand, it's also clear (to me at least) that Unicode works
great for some linguistic communities and not so great for others. (You
knew it was coming, and here it is: Unicode is very bad indeed for the
RTL community in general and Arabic in particular. ;-) This gets back
to the design principles (and the interests that drive them) of Unicode,
which work better for some languages than others.

And then there are the pragmatic issues which you have outlined
concisely in another message.

Obviously I haven't quite wrapped my mind around these issues yet so I
beg the indulgence of you and other Listerines. I (rashly?) assume that
pretty much everybody on this list is interested in "getting it right"
for everybody, and therefore might be a little interested in such
considerations. It's not a case of blaming, but of understanding. I
think.

Personally, I think Unicode is (well, may be) of enormous historical
significance, yet it flies almost entirely under the cultural radar, at
least in the US. I daresay most places in the world that will
eventually be heavily influenced by Unicode are more or less oblivious
to it.
>>
>>To me, at least, "UTC" and "WG3098098534543" seem to be very fearsome
>>creatures, not so different from the politburo. I'm not saying they
>>*are* like that. I guess I'm saying they have a PR problem.
>
>
> Perhaps -- but this is mostly because the character encoding
> issue touches on very emotional issues of language identity
> and linguistic politics.
>

Understatement of the century.

> Funny that the Unicode Consortium's standard on regular expression
> syntax or the ISO standards on freight containers don't seem to
> draw the same kind of visceral responses and intimations of
> dark conspiracies, despite the fact that they are developed with
> more or less the same procedures as the Unicode Standard and
> ISO/IEC 10646.

*takes a moment to recover from snorting in his tea*

I HATE those freight container standards! Please don't bring them up
again! Sarasvati! Sarasvati!

(Actually, I have some regex ideas for Arabic that will surely cause
numerous aneurisms on the list, but not now.)

>
> The Unicode Consortium is a small non-profit supported almost entirely
> by membership dues:
>
> http://www.unicode.org/consortium/levels.html
>
> Cross-match that with:
>
> http://www.unicode.org/consortium/memblist.html
>
> and you can get a pretty accurate notion of the total annual
> budget of the organization.

<meekly>Ok, do you have a tip-jar somewhere? Can I send a bottle of
Grog somewhere?</>

>
> The organization has 3 employees, listed as staff at:
>

I hope they're well-paid.

>
>>creates an outreach program? Or maybe even an internship
>>program targeting minority language communities? Maybe it's been done.
>
>
> Great ideas require resourcing -- in time, money, and dedicated
> people who will follow up and make things happen. Start and
> maintain an organization to do such things -- that's how they
> happen.

Workin' on it. Well "workin'" might be a little strong. Thinkin' 'bout it.

>
> http://linguistics.berkeley.edu/sei/
>

Thanks, very interesting. I see many of the scripts being worked on
list one "Everson" as the contact. Who is this mysterious and
ubiquitous "Everson", anyway? Is it one person? Sounds an awful lot
like the fictional Cecil Adams to me:
(http://www.straightdope.com/index.html)

-gregg


This archive was generated by hypermail 2.1.5 : Fri Jul 08 2005 - 18:59:51 CDT