From: Mark Davis (mark@macchiato.com)
Date: Fri Jan 16 2009 - 14:08:19 CST
Good suggestions. "not in common ordinary use" is way too long for a menu,
but "Uncommon" would probably do the trick.
Mark
On Fri, Jan 16, 2009 at 11:53, Asmus Freytag <asmusf@ix.netcom.com> wrote:
> On 1/15/2009 9:07 PM, Mark Davis wrote:
>
>> Good points. There are two purposes, really.
>>
> I'll address each of them in turn, but that'll destroy the autonumbering...
>
>>
>> 1. I have a UTC action to update UTR#39, which provides for sets
>> of characters that people may want to exclude from identifiers.
>> It has an 'archaic' category, and I need to update the contents.
>>
> The Latin micro sign does not belong on an "obsolete" list. In an
> identifier context you need to handle it by mapping it to Greek micro, but
> you have to be realistic that many keyboards will support one and not the
> other.
>
>>
>> 1. Independently, in doing a character picker
>> (http://www.macchiato.com/unicode/char-picker), we found it
>> useful to put the archaic/obsolete characters in separate
>> sections. This is work we are looking at at Google, but we're
>> also making the data available so that others could use/tweak if
>> they wish.
>>
> For a character picker, as you explained elsewhere, the task is not a
> partition, but potentially several overlapping sets, each geared to specific
> orthographies or notations.
>
> For IPA, you suggested, again elsewhere, that you might split the
> "official" from the "unofficial" set. Given that the official set has
> changed, I suggest that you use different names for these sets: "core" and
> "extended". The point that you want to make it easier for people to find
> frequently used characters by removing the distractions is well taken, but
> it's important to do it in a way that suggests nothing about a *preference*.
> Similar approaches are useful whenever you have a "basic" and an "extended"
> repertoire.
>
> For mathematical notation purposes, you might look at the data tables with
> UTR#25 to give you an idea how to structure input and what to cover.
>
> For punctuation and symbols you might look around; there's been some work
> done on arranging symbols by shape (dots, dot patterns, stars, circles,
> crosses, lines, angles, curves, etc.) or by symmetry (rotational, vertical,
> horizontal, both, etc.). There was a site "symbols.org" or so, that used
> that scheme to document a large number of symbols. (But, as on that site,
> once you locate a symbol, you need explanation about its context and
> meaning).
>
>> Note that there may have been some confusion from my message. By
>> "obsolete" or "archaic",
>>
> Best avoid such terms - even for #39 I suggest that you rename the category
> to "not in common ordinary use" or something. Remember that any list you
> create *will* be taken out of context by somebody. (That's happened to
> practically all the lists you've generated for Unicode, so this one's not
> going to be an exception). Having the category named in a way that clearly
> relates to the criteria for classification is a good method to mitigate that
> problem.
>
> A./
>
>> we don't mean that the character itself is deprecated or that people
>> shouldn't use it; what we mean is that it isn't customarily used in modern
>> languages in typical publications (corner newspapers, magazines, etc.). For
>> example, you wouldn't expect to see words written in Cuneiform in the NY
>> Times. Of course, they may occur in technical journals, especially those
>> dealing with archaic languages, or have occasional decorative use.
>>
>> Mark
>>
>>
>>
>