Re: Last Call: Language Tagging in Unicode Plain Text to Proposed

From: John Cowan (cowan@locke.ccil.org)
Date: Fri Jul 10 1998 - 13:11:14 EDT


Chen, Qifan wrote:

> It seems that we only need to have BEGIN TAG and CANCEL TAG two special
> characters. All other UNICODE characters (except of course the special two)
> should be allowed inside a tag. From language processing point of view,
> this is not more complicated than the proposed approach.

Using that approach requires tag stripping or tag ignoring to be
stateful: you have to keep track of whether you are in tag mode or
not. With the draft's system, you simply strip or ignore all of
the tag characters.
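
To make the difference concrete, here is a rough sketch of both
strippers. It assumes the tag characters occupy U+E0000..U+E007F (the
range later assigned to the Unicode Tags block). For the two-delimiter
alternative it assumes a tag runs from BEGIN TAG up to CANCEL TAG. The
function names and the code points chosen for the two delimiters are
placeholders for illustration and are not taken from either proposal.

```python
# Assumed range of the dedicated tag characters (Plane 14 Tags block).
TAG_FIRST = 0xE0000
TAG_LAST = 0xE007F


def strip_tags_stateless(text: str) -> str:
    """Draft-style stripping: drop every character in the tag range.

    Each character is judged on its own, so no state is carried over.
    """
    return "".join(ch for ch in text if not (TAG_FIRST <= ord(ch) <= TAG_LAST))


# Hypothetical delimiters for the two-character alternative.
BEGIN_TAG = "\U000E0001"
CANCEL_TAG = "\U000E007F"


def strip_tags_stateful(text: str) -> str:
    """Two-delimiter stripping: tag content is made of ordinary characters,
    so the stripper must remember whether it is inside a tag."""
    out = []
    in_tag = False  # the state the stateless version does not need
    for ch in text:
        if ch == BEGIN_TAG:
            in_tag = True
        elif ch == CANCEL_TAG:
            in_tag = False
        elif not in_tag:
            out.append(ch)
    return "".join(out)


# Draft style: the tag "en" is spelled with tag characters only.
draft_text = BEGIN_TAG + "".join(chr(TAG_FIRST + ord(c)) for c in "en") + "hello"

# Alternative style: the tag "en" is spelled with ordinary characters.
alt_text = BEGIN_TAG + "en" + CANCEL_TAG + "hello"
```

With these definitions, strip_tags_stateless(draft_text) yields "hello",
while the same stateless filter applied to alt_text leaves "enhello"
behind; only the stateful version recovers "hello" from alt_text.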

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)


