Edward Cherlin wrote:
> >But, in this case, each *single* character in the block must be
> >independently flagged with the property, so that it retains it also
> >if it is copied&pasted somewhere else: the actual start and end codes
> >will only be generated when rebuilding the Unicode string at the end
> >of editing.
>
> Definitely not the case. Copying would be a problem when using
> embedded codes in linear marked up text, like HTML, where you might
> have to search the whole text to determine what tags were active at a
> specific point.
I am not sure that I totally grasp what you mean.
This discussion did not necessarily involve properties other than embedding
levels.
I was mainly considering plain text (the simplest case), so I imagined
that each edit line could be an array of things like this:
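// One cell of an edit line: a character plus its bidi embedding level.
struct MyWysiwygGlyph
{
    wchar_t GlyphCode;       // the character (or glyph) code
    int     EmbeddingLevel;  // even = LTR, odd = RTL
};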
I think that Roozbeh had something quite similar in mind.
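As a rough illustration of the last step mentioned in the quoted text
(generating the start and end codes only when the Unicode string is rebuilt),
here is a minimal sketch. It assumes that EmbeddingLevel holds only *explicit*
embedding levels, that the paragraph base level is passed in by the caller,
and it ignores implicit levels and directional overrides. The function name
RebuildString and the three constants are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct MyWysiwygGlyph
{
    wchar_t GlyphCode;
    int     EmbeddingLevel;
};

// Explicit bidi embedding controls defined by Unicode.
const wchar_t LRE = L'\x202A';  // LEFT-TO-RIGHT EMBEDDING
const wchar_t RLE = L'\x202B';  // RIGHT-TO-LEFT EMBEDDING
const wchar_t PDF = L'\x202C';  // POP DIRECTIONAL FORMATTING

// Hypothetical helper: rebuild a plain Unicode string from one edit line,
// emitting LRE/RLE/PDF wherever the per-glyph level changes.
// Assumes every level is >= baseLevel and that levels are explicit only.
std::wstring RebuildString(const std::vector<MyWysiwygGlyph>& line, int baseLevel)
{
    std::wstring out;
    std::vector<int> open;   // levels opened by embeddings not yet closed
    int current = baseLevel;

    auto moveTo = [&](int target) {
        assert(target >= baseLevel);
        // Close embeddings until we are at or below the target level.
        while (current > target) {
            out += PDF;
            open.pop_back();
            current = open.empty() ? baseLevel : open.back();
        }
        // Open embeddings until we reach the target level.
        // RLE moves to the next odd level, LRE to the next even level.
        while (current < target) {
            bool targetIsRTL = (target % 2) != 0;
            out += targetIsRTL ? RLE : LRE;
            if (targetIsRTL) current += (current % 2 == 0) ? 1 : 2;
            else             current += (current % 2 == 0) ? 2 : 1;
            open.push_back(current);
        }
    };

    for (const MyWysiwygGlyph& g : line) {
        moveTo(g.EmbeddingLevel);
        out += g.GlyphCode;
    }
    moveTo(baseLevel);       // close everything still open
    return out;
}
```

Because the codes are derived from the per-glyph levels, a glyph pasted
elsewhere simply carries its own level along, and the controls are regenerated
from scratch for the new context.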
To extend this to a rich text context, I can imagine something much more
complex but not conceptually different:
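struct MyTag;
struct MyTagList;

struct MyWysiwygGlyph
{
    wchar_t     GlyphCode;
    int         EmbeddingLevel;
    MyTagList * PointerToInnermostTag;  // head of the list of tags active here
};

// Linked list of the tags active at a glyph, starting from the innermost one.
struct MyTagList
{
    MyTag *     ThisTag;
    MyTagList * PointerToNextTag;       // next enclosing tag, if any
};

struct MyTag
{
    //... Whatever data may represent a tag internally.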
};
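With such a structure, the tags in effect at any glyph can be read off by
following its list, without scanning the surrounding text. A minimal sketch,
assuming that the list runs from the innermost tag outwards and ends with a
null pointer (neither is spelled out above), and giving MyTag a hypothetical
Name field purely for illustration:

```cpp
#include <cassert>
#include <string>
#include <vector>

struct MyTag
{
    std::string Name;   // hypothetical payload, for illustration only
};

struct MyTagList
{
    MyTag *     ThisTag;
    MyTagList * PointerToNextTag;
};

struct MyWysiwygGlyph
{
    wchar_t     GlyphCode;
    int         EmbeddingLevel;
    MyTagList * PointerToInnermostTag;
};

// Hypothetical helper: names of all tags active at one glyph, innermost first.
std::vector<std::string> ActiveTags(const MyWysiwygGlyph& glyph)
{
    std::vector<std::string> names;
    for (const MyTagList* node = glyph.PointerToInnermostTag;
         node != nullptr;
         node = node->PointerToNextTag)
    {
        assert(node->ThisTag != nullptr);  // every list node must refer to a tag
        names.push_back(node->ThisTag->Name);
    }
    return names;
}
```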
> Real rich text editors use *parallel* markup, where each tag is
> associated explicitly with a run of text. The tags can be kept doubly
> indexed. When you cut a section from within a tagged area, you can go
> out and find which tags to associate with the copy very quickly.
I would like to know more about what you are saying here. I am sure that I
have a lot of naive ideas in mind that a word-processor specialist would
avoid.
But remember that Roozbeh, Peter and I are not designing any actual system
(well, not together at least): we were just discussing the general
lines of how bidi editing could work.
I am sure that, when you cut a selection of text, there are many good ways
of retrieving the properties associated with that piece of text (e.g. "bold",
inside "italic", inside "font=Helvetica", inside "language=Italian", etc.)
and carrying them over to the clipboard. My word processor does this all the
time!
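For comparison, the run-based ("parallel") markup described in the quoted text
could look roughly like the sketch below. It is a simplification: runs are
kept in a plain vector and searched linearly rather than through the double
index mentioned in the quote, and the names TagRun and TagsForSelection are
hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One tag applied to the half-open character range [Start, End).
struct TagRun
{
    std::string Tag;
    std::size_t Start;
    std::size_t End;
};

// Returns the runs that overlap [selStart, selEnd), clipped to the selection
// and re-based so that offset 0 is the first selected character.
std::vector<TagRun> TagsForSelection(const std::vector<TagRun>& runs,
                                     std::size_t selStart, std::size_t selEnd)
{
    assert(selStart <= selEnd);
    std::vector<TagRun> result;
    for (const TagRun& run : runs) {
        std::size_t from = std::max(run.Start, selStart);
        std::size_t to   = std::min(run.End, selEnd);
        if (from < to)   // the run really overlaps the selection
            result.push_back({run.Tag, from - selStart, to - selStart});
    }
    return result;
}
```

The clipped runs are what would travel to the clipboard together with the cut
text.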
I am just not so sure that it should work the same way with bidi
embedding levels, because of a number of caveats:
1) Unicode bidi embedding levels are *numbered* (even numbers represent LTR
text, while odd numbers represent RTL text). On the other hand, there is no
such thing in the nesting of rich text properties: they simply sit in
different positions in a hierarchical structure.
2) Embedding levels have a maximum nesting depth (61 as of Unicode 3.0). On
the other hand, rich text and SGML tags normally do not define any maximum
depth of nesting.
3) The lowest level in each paragraph *must* be either 0 (for a LTR
paragraph) or 1 (for a RTL paragraph). I don't know how to parallel this to
any rich text feature.
4) Embedding levels are defined implicitly (e.g. a number in Arabic has an
embedding level higher than the surrounding text) or by means of explicit
bidi controls. In any case, they are *orthogonal* to markup tags. So, if you
have a tagging scheme that requires tags to be properly nested within each
other (e.g. XML), embedding levels do not necessarily follow that rule. E.g., see
how tagging and Unicode embedding overlap in: "<BOLD> abc &RLE; def </BOLD>
ghi &PDF; ijk".
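The caveats above can be made concrete with a small sketch. The helper names
are hypothetical, the maximum depth is passed in as a parameter rather than
hard-coded, and both tags and explicit embeddings are simplified to half-open
character ranges.

```cpp
#include <cassert>
#include <cstddef>

// Caveat 1: the direction of a run follows from the parity of its level.
bool IsRightToLeft(int level) { return level % 2 != 0; }

// Caveats 2 and 3: a level must lie between the paragraph base level
// (0 for LTR, 1 for RTL) and the maximum depth allowed by the algorithm.
bool IsValidLevel(int level, int baseLevel, int maxDepth)
{
    assert(baseLevel == 0 || baseLevel == 1);
    return level >= baseLevel && level <= maxDepth;
}

// Caveat 4: a half-open range [Start, End) standing for either a tag
// or an explicit embedding.
struct Span
{
    std::size_t Start;
    std::size_t End;
};

// Two spans are properly nested if they are disjoint or one contains the other.
bool ProperlyNested(const Span& a, const Span& b)
{
    bool disjoint = a.End <= b.Start || b.End <= a.Start;
    bool aInsideB = b.Start <= a.Start && a.End <= b.End;
    bool bInsideA = a.Start <= b.Start && b.End <= a.End;
    return disjoint || aInsideB || bInsideA;
}
```

Taking the visible text of the example as "abc def ghi ijk", the bold tag
covers roughly [0, 7) and the embedding roughly [4, 11). ProperlyNested
reports false for that pair, which is exactly the overlap that a strictly
nested tagging scheme cannot express.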
_ Marco
This archive was generated by hypermail 2.1.2 : Fri Jul 06 2001 - 00:17:14 EDT