Re: Unicode lexer

From: Tex Texin (tex@i18nguy.com)
Date: Wed Apr 20 2005 - 17:50:05 CST

    Hans Aberg wrote:

    > >5. Use some Unicode based String data type as primitive datatype to
    > >return the result in the token.
    >
    > Again, it is unclear what you mean here, as the lexer just returns
    > the int token values indicated by hand in the rule actions.
    >
    > More advanced Unicode support might involve support for recognizing
    > common Unicode character classes. For example, one might want to
    > recognize letters, so that one can easily admit identifiers using
    > letters.
    > --
    > Hans Aberg

    We would want to use the Unicode character classes and, in general,
    follow UAX 31.

    Does anyone have experience, good or bad, with the UAX 31 model?
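
As a rough illustration of the character-class approach discussed above, the sketch below scans identifiers using an approximation of the UAX 31 default identifier classes. It is only a sketch under stated assumptions: ID_Start and ID_Continue are approximated from General_Category values alone, without the extra Other_ID_Start/Other_ID_Continue characters or the Pattern_Syntax and Pattern_White_Space exclusions, and the names `is_id_start`, `is_id_continue`, and `scan_identifiers` are hypothetical, not part of any lexer generator mentioned in this thread.

```python
import unicodedata

# Approximation (assumption): UAX 31 default identifier classes derived
# from General_Category only. The real ID_Start/ID_Continue properties
# include a few extra characters and exclusions not modeled here.
ID_START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
ID_CONTINUE_CATEGORIES = ID_START_CATEGORIES | {"Mn", "Mc", "Nd", "Pc"}


def is_id_start(ch):
    """True if ch may begin an identifier (approximate ID_Start)."""
    return unicodedata.category(ch) in ID_START_CATEGORIES


def is_id_continue(ch):
    """True if ch may continue an identifier (approximate ID_Continue)."""
    return unicodedata.category(ch) in ID_CONTINUE_CATEGORIES


def scan_identifiers(text):
    """Return the identifiers found in text, in order of appearance.

    Characters that cannot start an identifier are skipped one at a
    time; a real lexer would hand them to its other rules instead.
    """
    identifiers = []
    i = 0
    n = len(text)
    while i < n:
        if is_id_start(text[i]):
            j = i + 1
            while j < n and is_id_continue(text[j]):
                j += 1
            identifiers.append(text[i:j])
            i = j
        else:
            i += 1
    return identifiers
```

For example, `scan_identifiers("gr\u00f6\u00dfe = x1")` picks out the two identifiers and skips the operator and whitespace. Note that under this approximation the underscore (category Pc) may continue an identifier but not start one, which matches the UAX 31 default; many languages add it to the start set as a profile.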

    -- 
    -------------------------------------------------------------
    Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
    Xen Master                          http://www.i18nGuy.com
                             
    XenCraft		            http://www.XenCraft.com
    Making e-Business Work Around the World
    -------------------------------------------------------------