From: Lars Marius Garshol (
Date: Thu May 02 2002 - 05:44:28 EDT

* Marco Cimarosti
| I don't know if Unicode's UTC has a rule or decides case by case.
| Applying common sense, I would say that an important criterion
| should be the appearance of the symbol (that's why I asked you for a
| picture).
| Although Unicode does not encode glyphs, if the glyph is visually
| distinct, then it's hard to say that a vaguely look-alike sequence
| is appropriate.

It's not really very unusual in appearance, so I would expect an
encoding using composed characters to work from a visual point of
view.

(I might be able to produce a scan, but it would require a bit of

| Another criterion could be semantics and character properties. E.g.:
| - Should that symbol be usable in a file name or resource locator?

I would be somewhat surprised to see it used that way, but I'm not
sure why *any* character would be rejected for use in such a context.

| - Should that symbol be recognized as a Norwegian word with a
| specific pronunciation?

That would be application-dependent, I would say. Some would probably
like to see the text as written, while others would like to see the
"9:" replaced by "det vil si", which is what it represents.

| If yes, a sequence that can be confused with something else can be
| inappropriate for, e.g., a screen reader application.

Hmmm. I wouldn't expect that to be a problem in this case.

* Lars Marius Garshol
| What happens if I find a font that has this as a single character,
| for example?
* Marco Cimarosti
| This is a circular argument: fonts don't contain characters, they
| contain glyphs. And each glyph can be mapped to one character or to
| a sequence of characters, and this mapping can even be subject to
| contextual rules.

That's true, but on the other hand, when people propose new characters
one of the reactions seems to be "can you show a font that contains
this character?" But I guess what you are saying is that if the symbol
can be encoded using existing characters, having a font that contains
it is not enough.

I was thinking of a font where this was a basic character, and not
composed from smaller parts, however. I don't know if that makes any

