From: Peter Kirk (peterkirk@qaya.org)
Date: Tue Nov 25 2003 - 18:40:38 EST
On 25/11/2003 12:02, Peter Kirk wrote:
> The Unicode conformance clauses, in TUS 4.0 section 3.2, are written
> in terms of what "A process" may or may not do, sometimes in relation
> to "another process". But there doesn't seem to be a definition,
> either in this section or in the glossary, of "process". Is this to be
> understood in a general non-technical sense, or in some specific
> technical sense? What makes "a process" distinct from "another
> process"? Are two instances carrying out the same function to be
> considered the same process or distinct processes?
>
As there hasn't been a rush of on-list responses to this one, and partly
in reply to the one off-list response, let me clarify the issue I have
in mind.
Instance A of a program P, version X, writes a Unicode character string
S, in a particular normalisation form, to a storage medium Z. Some time
later (maybe seconds, maybe years) instance B of version Y of that same
program P reads that string from the same storage medium. For the
purposes of Unicode conformance, are instances A and B to be considered
one process or separate processes?
Conformance clause C9 states that "no process can assume that another
process will make a distinction between two different, but
canonical-equivalent character sequences", which implies that no process
can assume that another process has correctly normalised any character
sequence. So, if instances A and B are considered separate processes, B
is not permitted to assume that the string S has been correctly
normalised - even if in fact it is known that all strings on medium Z
have been written by program P and that all versions of program P write
strings in a particular normalisation form.
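
To make the concern concrete, here is a minimal sketch in Python of what instance B would have to do if A and B count as separate processes. It is only an illustration: the choice of NFC as the form written by program P and the function name read_string are my assumptions, not anything taken from the standard.

```python
import unicodedata

# Assumption for illustration: program P always writes strings in NFC.
FORM = "NFC"

def read_string(raw: str) -> str:
    """Instance B: re-normalise on read instead of trusting the writer."""
    return unicodedata.normalize(FORM, raw)

# Two canonically equivalent spellings of the same text.
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# As raw code point sequences they differ...
differs_raw = composed != decomposed

# ...but after B normalises what it reads, they are identical.
same_after_read = read_string(composed) == read_string(decomposed)
```

If A and B are one process, B could skip the call to normalize; if they are separate processes, C9 seems to require it on every read.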
Also, can the storage medium Z be considered a process? Or can low-level
transformations of the data, e.g. defragmentation, backup and
compression, which are invisible to the program P, be considered
processes? If so, these processes are permitted to transform S into a
canonically equivalent form; and so instance B of program P is not
permitted to assume that the string it reads from Z is in the same
normalisation form as the string written by instance A.
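
The scenario above can be sketched as follows, again in Python. The function intermediate_process is purely hypothetical; I do not know of a real storage layer that renormalises text, and the sketch only shows what C9 would appear to permit such a layer to do. The use of NFC, NFD and UTF-8 is likewise assumed for illustration.

```python
import unicodedata

def write_by_a(s: str) -> bytes:
    # Instance A writes S in NFC (assumed), encoded as UTF-8.
    return unicodedata.normalize("NFC", s).encode("utf-8")

def intermediate_process(stored: bytes) -> bytes:
    # Hypothetical low-level layer, invisible to program P, that rewrites
    # the text in a canonically equivalent form (here NFD).
    text = stored.decode("utf-8")
    return unicodedata.normalize("NFD", text).encode("utf-8")

def canonically_equivalent(a: str, b: str) -> bool:
    # Two strings are canonically equivalent if their NFD forms are equal.
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

original = "\u00c5ngstr\u00f6m"          # "Ångström" with precomposed letters
written = write_by_a(original)           # what A put on medium Z
read_back = intermediate_process(written)  # what B later finds on Z

# The stored bytes have changed, so a byte-for-byte comparison fails...
bytes_changed = written != read_back

# ...yet the text is still canonically equivalent to what A wrote.
still_equivalent = canonically_equivalent(
    written.decode("utf-8"), read_back.decode("utf-8")
)
```

In this sketch B can no longer rely on finding NFC on the medium, even though every version of P writes NFC.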
The potential implication is that it is non-conformant to rely on
normalisation stability.
-- Peter Kirk peter@qaya.org (personal) peterkirk@qaya.org (work) http://www.qaya.org/