Re: Writing a proposal for an unusual script: SignWriting

From: Asmus Freytag
Date: Mon Jun 14 2010 - 16:11:17 CDT


    On 6/14/2010 1:18 PM, Mark E. Shoulson wrote:
    > On 06/14/2010 02:15 PM, Asmus Freytag wrote:
    >> On 6/14/2010 9:21 AM, Stephen Slevinski wrote:
    >>> Plain text SignWriting should be able to write actual sign language,
    >>> such as "hello world."
    >> You could equally well insist that it should be possible to express the
    >> opening bar of "twinkle, twinkle little star" in plain text, or write
    >> the "square root of the inverse of a plus b" in plain text.
    >> In both cases, you would be disappointed and find that a markup language
    >> is required, such as MathML, although specifically for math, it is
    >> possible to devise an extremely lightweight markup language that comes
    >> close to plain text.
    > It is all too tempting and too easy for discussions of "Why X Should
    > be Encoded in Unicode" to devolve into "Why X is So Incredibly
    > Useful." In this case, I don't think that's the point.
    Correct, we were not discussing that question.
    > Unlike some other proposals, I think it is clear (to me, anyway) that
    > SignWriting has a fairly solid user-base and also an important use
    > (transcribing signed languages, which don't really have too many other
    > ways of being transcribed. Things like HamNoSys are also not encoded
    > yet).
    Mark (Davis) raised the good point that this needs to be substantiated -
    for now, for the purposes of this discussion, I have taken the above as a given.
    > Here, the question is more a matter of "given that SignWriting is
    > nifty, does it qualify as plain text?"

    That is the central question.

    > Or even "Does the way SignWriting does its thing map well to the way
    > Unicode does things?"

    I tried to explain that these are nearly equivalent. A practical
    definition of plain text could be: text encoded as a stream of Unicode
    characters, with no other information. However, there are other
    definitions of plain text based on the ideal concept of the thing, and
    the two don't overlap 100%. Both are useful.

    > If it does not (and cannot be made to do so), then no matter how
    > useful SignWriting is, it may simply not be encodable. It's not
    > because it doesn't deserve to be, and yes, that would really be a
    > bummer because it would relegate signed languages to second-class, but
    > Unicode has its limitations, and SignWriting may well be beyond its
    > capabilities.

    That's where my insistent questions about a layered system come in. One
    where the elements (symbols) are encoded in Unicode, but where some or
    all the details of their relation are encoded in a higher-level protocol.

    I suspect that the XML attempts that exist do not implement a correct
    layering, that is, they probably encode the identity of the symbols not
    as character codes but as named entities. That would explain why Steve
    said "same data, only more complex".
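To make the layering concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the element names `sign` and `sym`, the `x`/`y` attributes, and the two Private Use Area code points standing in for symbols. It is not the existing SignWriting XML format. The point it shows is that symbol identity travels as character data, while placement travels in the markup layer and can be stripped without losing the character stream.

```python
import xml.etree.ElementTree as ET

# Hypothetical code points: two Private Use Area characters standing in
# for SignWriting symbols. No real or proposed assignment is implied.
HAND = "\uE000"
MOVEMENT = "\uE001"

def build_sign(placements):
    """Build a <sign> element (hypothetical schema).

    Symbol identity is stored as character data; placement is stored in
    attributes, i.e. in the higher-level protocol.
    """
    sign = ET.Element("sign")
    for char, x, y in placements:
        sym = ET.SubElement(sign, "sym", x=str(x), y=str(y))
        sym.text = char
    return sign

def plain_text(sign):
    """Drop the markup layer and keep only the stream of characters."""
    return "".join(sign.itertext())

sign = build_sign([(HAND, 10, 20), (MOVEMENT, 14, 5)])
xml_bytes = ET.tostring(sign, encoding="utf-8")

# Round-trip through serialized XML, then recover both layers separately.
round_tripped = ET.fromstring(xml_bytes)
characters = plain_text(round_tripped)
positions = [(s.get("x"), s.get("y")) for s in round_tripped]
```

A design that used named entities for the symbols would leave nothing in `characters` once the markup was removed, which is the incorrect layering suspected above.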
    > (That said, I find myself thinking that it *should* be possible to
    > align Unicode and SignWriting. But I recognize that it might not be.)
    As long as the position of the proponents is that all fine details of
    formatting and layout must be carried in the character encoding level,
    I'm not hopeful.
    >> Not all streams of concrete small integers are ipso facto plain text,
    >> even though you can map these integers to the private use space.
    > I guess you would need to establish a distinct and independent meaning
    > for each code-point, which would have to be something more specific
    > than "...and then you give the x-coordinate."
    Generic placement operators I could possibly fathom, since they serve to
    linearize the text - an analogy would be the Ideographic Description
    Symbols that allow description of a two dimensional layout. But the IDS
    stop short of trying to express the subtle modifications that arise out
    of the context and placement of the elements in the final ideograph. For
    that you have to turn to another source, in this case a font.
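As a rough illustration of the analogy, the following Python sketch parses an Ideographic Description Sequence into a tree. It assumes only the twelve operators in U+2FF0..U+2FFB (later additions to the standard are not handled), and the function names are invented for this example. The result records which components are combined and in what arrangement, and nothing more: how each component is reshaped to fit is left to the font.

```python
def idc_arity(ch):
    """Return the operand count of an Ideographic Description Character.

    Covers U+2FF0..U+2FFB only: U+2FF2 and U+2FF3 take three operands,
    the others take two. Anything else is treated as a component (0).
    """
    cp = ord(ch)
    if cp in (0x2FF2, 0x2FF3):
        return 3
    if 0x2FF0 <= cp <= 0x2FFB:
        return 2
    return 0

def parse_ids(seq, pos=0):
    """Parse a prefix-notation description starting at `pos`.

    Returns (tree, next_position), where tree is either a single
    component character or a tuple (operator, operand, ...).
    """
    ch = seq[pos]
    arity = idc_arity(ch)
    pos += 1
    if arity == 0:
        return ch, pos
    operands = []
    for _ in range(arity):
        node, pos = parse_ids(seq, pos)
        operands.append(node)
    return (ch, *operands), pos

# U+2FF0 (left to right) applied to U+6C35 and U+6BCF: a description of
# the layout of U+6D77, with no information about glyph adjustments.
tree, end = parse_ids("\u2FF0\u6C35\u6BCF")
```

The linear sequence is plain text, and the tree is recoverable from it, but the final appearance is not.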
    >>> For the future, I am considering a browser plugin that will detect and
    >>> render SignWriting character data. A regular expression could scrape
    >>> the appropriate PUA characters. Another regular expression could
    >>> validate that the characters represent valid structures. Then the
    >>> SignWriting display could be built using individual symbols, completed
    >>> signs, or entire columns.
    >> In other words, a layout engine.
    > Is there such a thing as SignWriting without a layout engine? I guess
    > the same question could be asked about Musical notation (though I
    > think it probably could have been coded as plain text. See also
    > for a very powerful musical notation using
    > only ascii, but decidedly *not* plain-text in nature).
    The point is, because one already requires a layout engine (or browser
    plug-in) one might as well use something like MathML in conjunction with
    standard character codes for the basic symbols.
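For reference, the scrape-then-validate step quoted above might look roughly like the following Python sketch. The structure it validates is entirely hypothetical and invented for this example (a marker character, then symbols each followed by two coordinate characters); it is not the actual SignWriting PUA scheme. Only the BMP Private Use Area range, U+E000..U+F8FF, is taken from Unicode itself.

```python
import re

# Runs of characters from the BMP Private Use Area (U+E000..U+F8FF).
PUA_RUN = re.compile(r"[\uE000-\uF8FF]+")

# Hypothetical structure, for illustration only: one marker (U+E000),
# then one or more groups of a symbol character (U+E100..U+E1FF)
# followed by exactly two coordinate characters (U+E200..U+E2FF).
VALID_SIGN = re.compile(r"\uE000(?:[\uE100-\uE1FF][\uE200-\uE2FF]{2})+")

def scrape(text):
    """Return every maximal run of BMP private-use characters in text."""
    return PUA_RUN.findall(text)

def is_valid(run):
    """Check a scraped run against the hypothetical sign structure."""
    return VALID_SIGN.fullmatch(run) is not None

page = "hello \uE000\uE100\uE200\uE201 world \uE100"
runs = scrape(page)
checked = [is_valid(run) for run in runs]
```

Note that everything after this step (turning validated runs into positioned symbols) is still the job of a layout engine, whichever layer the coordinates are carried in.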
    >> If SignWriting cannot be successfully used except with 2 fonts, then I
    >> see little need for standardizing the code. What you describe is a
    >> private use scheme, even though the private group may have many members.
    > I'm not sure I agree with this. Just because only two fonts are out
    > there so far, and the character-shapes perhaps allow a little less
    > flexibility than some, doesn't mean that other fonts aren't possible.
    > Nor is the multiplicity of fonts a requirement for encoding.
    It's a red flag. If the design is truly this limited, it lacks the
    generalization necessary to evolve gracefully in use. For Unicode to
    enshrine such a design using character codes that can never be
    reassigned (or re-interpreted) would be foolhardy.
    >>> SignWriting has the unusual requirement of a 2 color font. One font
    >>> color for the line of the symbols and another for the fill. The fill
    >>> is needed when symbols overlap.
    >> Hmm.
    > AFAIK, Unicode can't do color. I remember someone mentioning that
    > once. But someone who knows the exact rules can explain better.
    This is a red herring. What he means is that when symbols overlap, their
    insides block out the symbol underneath. That is a departure from how the
    usual layout engines work, but perhaps not an unsurmountable obstacle
    from the point of view of encoding.
    > I think it will help when your proposal is ready for review so people
    > will understand just what it is you are suggesting and can judge how
    > much (if at all) it conflicts with Unicode's capabilities.
    This discussion and the feedback contained in it are designed to help
    Steve address these issues up-front.

    Because sign writing has not had as extensive a history of print and then
    digital publication as mathematics or music, a lot of issues probably
    need to be settled, and that will take time. To use mathematics as an
    example: what got encoded in Unicode is not a 1:1 equivalent of what was
    contained in the most successful mathematical layout system ever (TeX),
    but something that corresponded (more or less) to Unicode's concept of
    encoded characters.

    That would be useful to remember.

