a character for an unknown character
martinmueller at northwestern.edu
Tue Dec 20 20:29:59 CST 2016
I’m new to this list. Please excuse my technical incompetence.
Is there a Unicode character that says “I represent an alphanumerical character, but I don’t know which”? This is a very common problem in the transcription of historical texts where you have lacunae. Often, the extent of the lacuna is known, and the alphabet is known as well. The EEBO TCP transcriptions of English texts before 1700 are good examples. They are SGML transcriptions, where missing stuff is represented by <gap/> elements with attributes about this or that. This is efficient when it comes to pages, but very inefficient when it comes to individual characters.
There is a Web character (a diamond with a question mark inside it) which means “I may know what this character represents, but I can’t display it”, which is a very different message. On the other hand, if you extended the use of that character, it probably wouldn’t create much ambiguity.
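To make the difference in meaning concrete: the diamond with a question mark is U+FFFD REPLACEMENT CHARACTER, which software typically emits when it cannot decode its input. The following is only a minimal Python sketch of that behaviour; the sample bytes are invented for illustration.

```python
import unicodedata

# Hypothetical input: "caf" followed by a byte that is not valid UTF-8 here.
raw = b"caf\xe9"

# errors="replace" substitutes U+FFFD for the undecodable byte sequence.
decoded = raw.decode("utf-8", errors="replace")

# Look up the formal Unicode name of the substituted character.
replacement_name = unicodedata.name(decoded[-1])
print(decoded, replacement_name)
```

Here the character signals a decoding failure in software, not a transcriber's judgment that a letter is present but unreadable.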
In the TCP project, various code points from the Geometric Shapes block were used to represent lacunae. The black circle (\u25cf) has been used as the character for a missing character. This is OK and unambiguous in its context. But it would be nice to have a special character for just that purpose, and given the number of emoji, this doesn’t seem to be a particularly frivolous request. Which alphabet, you might ask? But that doesn’t really matter. There is a very high probability that the missing character comes from the character set of the surrounding words. And if that isn’t the case, the transcriber wouldn’t know it. S/he sees that there is something, perhaps even that there is just one of it, but doesn’t know which.
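The black-circle convention described above could be sketched roughly as follows in Python. This is an illustration under stated assumptions, not the TCP project's actual tooling: the `extent="N letters"` attribute format and the function name `gaps_to_placeholders` are hypothetical, and the real TCP markup may differ.

```python
import re

BLACK_CIRCLE = "\u25cf"  # the placeholder character mentioned in the post

# Assumed markup: <gap extent="N letters"/>. The real TCP attributes may differ.
GAP = re.compile(r'<gap\s+extent="(\d+) letters?"\s*/>')

def gaps_to_placeholders(text: str) -> str:
    """Replace each matched gap element with one black circle per missing letter."""
    return GAP.sub(lambda m: BLACK_CIRCLE * int(m.group(1)), text)

# Invented sample text with two lacunae of known extent.
sample = 'the k<gap extent="2 letters"/>g of Engl<gap extent="1 letter"/>nd'
converted = gaps_to_placeholders(sample)
print(converted)
```

Each placeholder occupies exactly one character position, so the word keeps its original length, which is what makes a single per-character symbol more economical than an element with attributes.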
Professor emeritus of English and Classics