From: Doug Ewell (dewell@adelphia.net)
Date: Sun Jun 04 2006 - 11:52:38 CDT
Theodore H. Smith <delete at elfdata dot com> wrote:
> What's that? Like Levenshtein? (EditDistance) If you are talking about
> a Levenshtein-like thing on Unicode, well you can't do it with
> codepoint processing, because a character is not a codepoint, a
> character is a string of codepoints. So if your "cells" must now be
> strings instead of bytes or UInt32s... you might as well use a string
> of UTF-8 instead of a string of UTF-32.
A character is a code point that has an assignment.
Some "letters" consist of a string of characters, and some characters
can be decomposed into a string of characters. But it is not correct to
say that a character is a string of code points.
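
To make the distinction concrete, here is a minimal Python sketch using the
standard unicodedata module. The helper name code_points and the two sample
letters are my own choices for illustration, not anything from the thread
above. It shows one character that can be decomposed into a string of
characters, and one "letter" that exists only as a string of characters.

```python
import unicodedata

def code_points(text):
    """List the code points of text in U+XXXX notation (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]

# One character that can be decomposed into a string of characters:
# U+00E9 LATIN SMALL LETTER E WITH ACUTE.
precomposed = "\u00E9"
decomposed = unicodedata.normalize("NFD", precomposed)

print(code_points(precomposed))  # ['U+00E9']
print(code_points(decomposed))   # ['U+0065', 'U+0301']

# A "letter" that consists of a string of characters: g with diaeresis.
# It has no precomposed form, so composing (NFC) leaves two characters.
g_diaeresis = unicodedata.normalize("NFC", "g\u0308")

print(code_points(g_diaeresis))  # ['U+0067', 'U+0308']
```

In both cases every element of the list is a single character in its own
right; what varies is how many of them it takes to write one "letter".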
-- Doug Ewell Fullerton, California, USA http://users.adelphia.net/~dewell/