Re: Writing a proposal for an unusual script: SignWriting

From: Asmus Freytag
Date: Mon Jun 14 2010 - 13:15:42 CDT

    On 6/14/2010 9:21 AM, Stephen Slevinski wrote:
    > Greetings Asmus Freytag,
    > Plain text SignWriting should be able to write actual sign language,
    > such as "hello world."
    You could equally well insist that it should be possible to express the
    opening bar of "twinkle, twinkle little star" in plain text, or write
    the "square root of the inverse of a plus b" in plain text.

    In both cases, you would be disappointed and find that a markup language
    is required (MathML, for example, in the case of math), although
    specifically for math it is possible to devise an extremely lightweight
    markup language that comes close to plain text.
    > This is a combination of 2 signs and 1 punctuation. The first sign
    > has 2 symbols. The second sign has 4 symbols. Punctuation is always
    > used by itself.
    > You can see this online being processed in JavaScript:
    This does not work for me.
    > I dislike the idea of requiring a higher level protocol in order to
    > encode plain text SignWriting. I have used CSS to change the color
    > and size of SignWriting. I chose not to include color or size in the
    > plain text representation of SignWriting because color and size do
    > belong in the higher level protocol.
    You need to distinguish between CSS (styles) and other higher level
    protocols. While CSS allows positioning, I agree with you that it would
    be inappropriate to use it to express the relative position of symbols
    in a sentence. What range of style information, other than your color
    example, should exist for SignWriting? Answering that was the purpose of
    my suggestion that you sketch a more complete model.

    Music and mathematics are similar to SignWriting as you have explained
    it, in that the spatial arrangement of visual elements is a big part of
    their meaning. Without a staff and the position within the staff, you
    may know that you have a half-note, but not its pitch. Additional
    elements are needed to tell you how the note is to be played.

    Mathematical notation is similar, but much simpler, in that much of the
    layout is determined by the operators used, as well as by the scoping of
    nested expressions (although some arrangements, such as matrices, do not
    have a convenient operator). Because of the strong regularity of
    mathematical expressions, a full-blown markup language is not always
    needed (see Unicode Technical Note #28, "Unicode Nearly Plain-Text
    Encoding of Mathematics"; note the use of "nearly" in the title).
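
    To illustrate the point about operators and scoping, the earlier
    example "square root of the inverse of a plus b" can be spelled as a
    single line in which the radical sign, the slash and the parentheses
    determine the layout. The linear spelling in the comment below is an
    assumption in the general style of UTN #28, not a quotation from it;
    the LaTeX shows the built-up form it is meant to stand for.

```latex
% Linear, nearly plain-text spelling (assumed): √(1/(a+b))
\[
  \sqrt{\frac{1}{a+b}}
\]
```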

    >> From the way you describe the requirements (faithfully representing
    >> the minutest details of the authors choice of placements, etc.) and
    >> your claim that the plain text level should not / does not encode
    >> semantic contents, I get the impression that you have not fully
    >> thought through what information should be represented at what level
    >> of the text architecture.
    > I prefer the term phonemic rather than semantic. Most symbols are
    > phonemic. A minute few of the symbols are featural. It requires
    > several featural symbols to build a phonemic representation, such as a
    > new handshape that a particular author feels is absolutely needed.
    > While not a linguist, I would say that the semantic meaning of
    > SignWriting is contained in the spatial layout of the symbols. We do
    > not encode the semantic meaning directly, but the semantics can be
    > perceived when a sign is considered as a whole. Some of the semantics
    > are not even included in the writing, but left for the reader to
    > infer. An example would be a starting handshape that is not written
    > but can be assumed.
    Unicode uses the word "semantics" in a peculiar sense, one in which the
    meaning is limited to character encoding.
    >> Concretely: do you see the need for, existence of a SignWritingML?
    > I see no advantage to requiring XML for plain text. Years ago, I was
    > using a comma delimited format for the data. I took the advice of
    > others and moved to an XML format: specifically SWML-S. It was a
    > misstep. Same data, but more complex processing.
    Not all XML formats are designed equally well. For the same task you can
    design different XML vocabularies, so your experience with one of them
    doesn't by itself tell me that an XML solution isn't viable.

    Beyond XML there are other markup languages. Perhaps SignWriting would
    lend itself to a "nearly-plaintext" type of lightweight markup.
    > Now, with the character encoding, I can represent plain text
    > SignWriting as character data. Easy to parse, search, sort, and
    > display. There is no advantage to using XML; however, I have created
    > an equivalent XML (called BSWML) with roundtrip mapping between
    > character data and XML. I personally will not be using XML for plain
    > text, but I thought others might appreciate the possibility.
    It's not plain text unless you get all the elements encoded in Unicode.

    For that, you first need to demonstrate that what you are proposing to
    encode matches Unicode's definition of plain text.

    Not all streams of concrete small integers are ipso facto plain text,
    even though you can map these integers to the private use space.
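
    As a minimal sketch of how mechanical such a mapping is, the Python
    below offsets small integers into the Basic Multilingual Plane's
    Private Use Area (U+E000..U+F8FF) and back. The integer IDs and the
    function names are hypothetical and do not describe the actual
    SignWriting assignments; only the PUA range itself is a fact.

```python
# Private Use Area of the Basic Multilingual Plane: U+E000..U+F8FF.
PUA_START = 0xE000
PUA_END = 0xF8FF


def to_pua(ids):
    """Map small non-negative integers (hypothetical symbol IDs) to PUA characters."""
    chars = []
    for n in ids:
        cp = PUA_START + n
        if n < 0 or cp > PUA_END:
            raise ValueError(f"id {n} does not fit in the BMP private use area")
        chars.append(chr(cp))
    return "".join(chars)


def from_pua(text):
    """Recover the integers from a string made only of BMP PUA characters."""
    ids = []
    for ch in text:
        cp = ord(ch)
        if not PUA_START <= cp <= PUA_END:
            raise ValueError(f"U+{cp:04X} is not a BMP private use character")
        ids.append(cp - PUA_START)
    return ids


encoded = to_pua([0, 1, 300])
print([f"U+{ord(c):04X}" for c in encoded])
print(from_pua(encoded))
```

    The round trip succeeds for any sequence of integers in range, which
    is exactly why it demonstrates nothing about whether the sequence is
    plain text: what the code points mean remains a private agreement.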
    >> Do you think, existing HTML could correctly render SignWriting if
    >> that was presented as part of the plain text data (under your proposal)?
    > I have both client side and server side processing of SignWriting.
    > The server side uses PHP and passes completed sign or column images to
    > the client. The client side processing uses JavaScript, CSS, and HTML
    > to correctly represent the SignWriting. The HTML specifically uses
    > DIV tags with relative positioning. Each symbol is positioned with
    > its own DIV.
    OK, we both agree that this is far from plain text.
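
    The kind of per-symbol positioning described in the quoted paragraph
    can be sketched as follows. This is an illustration only, not the
    actual client code: the (text, x, y) tuple layout, the pixel units
    and the function name are assumptions, and the sketch uses a
    relatively positioned container with absolutely positioned children,
    which is one common way to place each symbol in its own DIV.

```python
from html import escape


def sign_to_html(symbols, width, height):
    """Render one sign as a positioned box with one DIV per symbol.

    `symbols` is a list of (text, x, y) tuples; the tuple layout and the
    pixel units are assumptions made for this sketch.
    """
    parts = [
        f'<div style="position:relative;width:{width}px;height:{height}px">'
    ]
    for text, x, y in symbols:
        # Each symbol gets its own DIV, offset from the container's corner.
        parts.append(
            f'<div style="position:absolute;left:{x}px;top:{y}px">'
            f"{escape(text)}</div>"
        )
    parts.append("</div>")
    return "\n".join(parts)


print(sign_to_html([("\ue000", 10, 5), ("\ue001", 12, 30)], 60, 60))
```

    All of the placement lives in the generated markup and styles, which
    is what puts this representation outside plain text.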
    > For the future, I am considering a browser plugin that will detect and
    > render SignWriting character data. A regular expression could scrape
    > the appropriate PUA characters. Another regular expression could
    > validate that the characters represent valid structures. Then the
    > SignWriting display could be built using individual symbols, completed
    > signs, or entire columns.
    In other words, a layout engine.
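
    The first two steps of the quoted plan, scraping and validating, can
    be sketched with two regular expressions. The scraping pattern uses
    the real BMP Private Use Area range; the validation rule (one start
    character followed by one or more symbol characters) is invented for
    this sketch and is not the actual SignWriting structure.

```python
import re

# Runs of BMP private use characters (U+E000..U+F8FF).
PUA_RUN = re.compile(r"[\ue000-\uf8ff]+")

# Hypothetical structure rule, invented for this sketch: a valid unit is
# one "sign start" character (U+E000) followed by one or more symbol
# characters (U+E100..U+E1FF).
VALID_SIGN = re.compile(r"\ue000[\ue100-\ue1ff]+")


def scrape(text):
    """Return every maximal run of PUA characters found in the text."""
    return PUA_RUN.findall(text)


def is_valid(run):
    """Check a scraped run against the hypothetical structure rule."""
    return VALID_SIGN.fullmatch(run) is not None


page = "hello \ue000\ue101\ue102 world \ue150"
for run in scrape(page):
    print([f"U+{ord(c):04X}" for c in run], is_valid(run))
```

    Everything after this point, turning a validated run into positioned
    glyphs, is the work of the layout engine.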
    >> What happens when a user agent selects a different font, because the
    >> one the author used is unavailable on the system used by the reader?
    > For SignWriting the font designer has strict rules of size and shape.
    > The font designer can modify the symbol glyphs only under these
    > restrictions. With the number of symbols involved, no font designer
    > would break these rules and waste their time designing a font that
    > would not work with the writing system.
    That's an example of the statements that make it sound as if
    SignWriting is not following the character-glyph model.

    Some notations do enforce certain restrictions on font design. For
    mathematics they are limited to ensuring that certain characters are
    rendered distinctly (where they might use the same shape in an ordinary
    text font). This includes a requirement on the relative sizes of
    certain symbols, as well as their default alignment on the line. Other
    than that, math fonts may vary widely, and designers have tried to
    create variants that meet their specific artistic interpretation of a
    readable math font.
    > The major choice will be between a raster font and an SVG font. The
    > raster font is completed. The SVG font is a work in progress.
    If SignWriting cannot be successfully used except with 2 fonts, then I
    see little need for standardizing the code. What you describe is a
    private use scheme, even though the private group may have many members.
    > SignWriting has the unusual requirement of a 2 color font. One font
    > color for the line of the symbols and another for the fill. The fill
    > is needed when symbols overlap.
    > Here's a few simplified examples.
    > First the sign for dinner in American Sign Language.
    > Next is the sign for German in German Sign Language:
    > Each sign has two symbols: one hand and one head. The fill color of
    > each hand symbol covers part of the head line. The symbols represent
    > the phonemic information, while the semantics are perceived by the
    > symbol position.
    > In SignWriting we write the symbols in space, so all we encode is the
    > positions of the symbols. In HamNoSys, you would encode dinner as
    > round index hand on chin, and German as closed index hand on forehead.
    > The difference affects how the writers think while they write. For
    > sign language, everything is visual. The language center of their
    > brain is wired to their eyes. They look at the symbols and see
    > phonemic information. They construct a sign by placing symbols on a
    > canvas. They perceive the semantic information not only between any
    > two symbols, but the semantic information contained by all of the
    > symbols taken together. They look for shape and pattern. This
    > greatly affects how someone reads.
    > We've had large discussions about spelling on the SignWriting list.
    > Someone will write a sign and others will comment. People will try to
    > rewrite the sign to be easier to read. It's amazing what a difference
    > a small adjustment in symbol placement can make. A sign that was
    > difficult to read automatically becomes easy and clear. The
    > improvement in the sign was only possible because of an excellent
    > writer who understood the sign as a whole and was able to build a
    > cohesive representation. SignWriting is part artistry. It is simple to
    > start, but the best writers have an eye for symbol choice and symbol
    > position.
    > Maybe that's why I'm such a proponent for exact symbol placement: it's
    > the only way to achieve the best writing. I'm sure a system could be
    > devised that produced average or adequate writing, but it will never
    > be able to produce the quality of writing that an experienced writer
    > can produce.
    The best and most readable layout of text in the Latin script is
    usually not the one generated from plain text. In Arabic, sophisticated
    text engines can create layouts that involve convoluted spatial
    re-arrangement of symbols and their elements to make the result more
    readable. Neither script violates the character-glyph model, nor is it
    limited to just 2 fonts, one of which is not completed.
    > My focus is to make the best encoding for the reader and the writer.
    > Let the designers and programmers handle the extra complexity.
    The reader doesn't care how the text is represented internally, only
    that it is rendered acceptably (or beautifully).

    The writer cares about the tools used to enter the text, so that it can
    be rendered acceptably (or beautifully) for the eventual reader in a way
    that represents the writer's intent. Unless the writer is a programmer,
    the writer does not care how the text is represented internally.
    Usually, the writer cares about the total text input experience.

    Note that I am specifically not claiming that there is something wrong
    in your statement that beautifully written sign language is sensitive to
    small differences in placement of the symbols. But I am not convinced
    that all of these small differences really have a place in plain text,
    the way "plain text" is defined in the context of the Unicode Standard.
    >> In some of your answers you've given a few hints, but for someone
    >> like me who has no firsthand experience of signing and difficulties
    >> visualizing sign writing, you probably will want to be way more
    >> explicit and concrete in your description and examples, so that it
    >> becomes possible to evaluate whether your choices in the encoding
    >> model are the correct ones, or possibly the only ones, or whether, on
    >> the contrary, they represent an unnecessary departure from the way
    >> Unicode deals with non-linear notations.
    > I appreciate your consideration and thought. I will put more detail
    > in the official proposal later this year. The current documents are
    > to define the open standards for data sharing between projects.
    > Perhaps you'd like to read a short piece by Adam Frost titled "Why
    > SignWriting?" in English and ASL.
    > Regards,
    > -Steve
