From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Oct 27 2010 - 19:40:25 CDT
Gy. Dobner asked:
> But my original question was not how to encode a combining macron in one
> more possible way but how to encode a length mark that would display as
> something _visually_ _distinguishable_ _from_ _a_ _macron_ (because the
> macron is functionally ambiguous and hence unsuitable for my purposes).
> Is it e.g. possible (i.e. is it Unicode-compliant) to combine a macron with
> some non-displaying character for this purpose, and if so, with which
> non-displaying character? I understand that ZERO-WIDTH JOINER is not
> supposed to be used in this way (or am I mistaken?).
But this is the wrong question.
The Unicode Standard encodes characters for scripts (and writing
systems). It doesn't provide a standard for the representation of
syllabic structure or other phonological constructs per se.
Even if you are using some phonetic transcription system like
IPA, which is used as a technical system for representing
sounds, the Unicode Standard's encoding of that is one step
removed. It is the International Phonetic Association that
defines how IPA characters, marks, and other conventions are
used to specify linguistic sounds. What the Unicode Standard
does, in turn, is encode those characters and marks for
digital representation on computers.
So I think you have the cart before the horse here.
What you (or the Classicist community in general) need to
do is specify orthographical conventions for the representation
of whatever length distinctions you are trying to systematically
distinguish.
That could be with a colon. It could be with the IPA length
mark. It could be with a doubled macron. It could be with
some entirely different diacritic. It could be with some
other visible convention.
Once you know *what* you want to write for this, *then* you
ask, how can this written text be represented in Unicode
characters, so I can enter, transmit, print, and otherwise
process it on computers.
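
As a rough illustration of that second step, here is a minimal Python
sketch that spells out a few of the conventions mentioned above as
code point sequences. The CANDIDATES table and the describe() helper
are hypothetical names made up for this example, the base letter "a"
is arbitrary, and "two stacked macrons" is only one possible reading
of a doubled macron. It uses only the standard unicodedata module.

```python
import unicodedata

# Hypothetical candidates for writing a long "a". Which convention to
# use is an orthographic decision, not something Unicode decides.
CANDIDATES = {
    "colon": "a:",
    "IPA length mark": "a\u02D0",
    "macron": "a\u0304",
    "two stacked macrons": "a\u0304\u0304",
}

def describe(text):
    """Return the Unicode character name of each code point in text."""
    return [unicodedata.name(ch) for ch in text]

if __name__ == "__main__":
    for label, text in CANDIDATES.items():
        code_points = " ".join(f"U+{ord(ch):04X}" for ch in text)
        print(f"{label}: {code_points} {describe(text)}")
```

Every sequence here consists of ordinary, visible, already encoded
characters; none of them relies on a hidden format code.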
It isn't a matter of some hidden format code in Unicode that
normatively denotes lengthiness. Rather, you decide what you
want to write and print for the distinction you need to make.
Hint: Pick some *other* diacritic that already exists in
the Unicode Standard. That way, you won't need to spend two
years hassling with the character encoding process to add some
newly invented mark which isn't yet encoded.
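
One quick way to check whether a candidate mark is already encoded is
to ask a Unicode character database for its name and general
category. The sketch below does that with Python's unicodedata module;
the lookup() helper is a hypothetical name for this example, and the
answer reflects whatever Unicode version your Python build ships with.

```python
import unicodedata

def lookup(ch):
    """Return (name, general category) for one character.

    The name is None when the character database has no name for the
    code point, as is the case for private-use and unassigned code
    points.
    """
    return unicodedata.name(ch, None), unicodedata.category(ch)

if __name__ == "__main__":
    # U+0304 is an encoded nonspacing combining mark (category Mn).
    print(lookup("\u0304"))
    # U+E000 is a private-use code point (category Co) with no name,
    # which is where a newly invented mark would have to live until
    # it is formally encoded.
    print(lookup("\uE000"))
```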
Hint #2: Pick some diacritic mark that is already widely
supported in system fonts. That way you won't need to spend
years hassling system vendors to add the glyphs you need,
or scouring the web looking for custom fonts, in order to
be able to easily display your research on the web and
with easily available tools.
--Ken