UnicodeIUC14
Abstract

Most input method frameworks proposed so far try to bridge the gap between an input method implementation, the text editor and the input events. In this paper, we propose to specify the implementation of input methods declaratively, using extended finite-state transducers. We implemented a finite-state model in Java, called Salsa, that handles Unicode strings and events.

A transducer specifying an input method is defined classically as a finite-state transducer (FST) in which each transition between two states carries the left projection of the transduction (input), the right projection (output), and an event that returns a value:

Transition from state i to state j with input "a", event Null, output "a";

The transition is traversed when the FST receives a character string corresponding to the left projection. The associated event is fired and, if it is handled successfully, the FST sends the string on the right projection as output. The Salsa language provides a textual syntax for specifying transducers. This language allows transitions to be factored, so that simple FSTs with few states but many transitions can be written economically. The language also allows the specification of classes of input (character classes or string classes). By default, strings are Unicode strings: the text specifying the transducer is encoded in UTF-8. However, any character set can be specified as input or output by using hexadecimal codes to specify characters.
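The traversal rule described above can be illustrated with a minimal Java sketch. It is not the Salsa API or its textual syntax: the class `SketchTransducer`, its methods, and the use of `BooleanSupplier` to model an event that reports success or failure are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical illustration of the transition model; not part of Salsa. */
final class SketchTransducer {
    /** One transition: from -> to, with an input string, an optional event, and an output string. */
    private static final class Transition {
        final int from;
        final int to;
        final String input;
        final BooleanSupplier event; // null stands in for the "Null" event
        final String output;

        Transition(int from, int to, String input, BooleanSupplier event, String output) {
            this.from = from;
            this.to = to;
            this.input = input;
            this.event = event;
            this.output = output;
        }
    }

    private final List<Transition> transitions = new ArrayList<>();
    private int state = 0; // assumption: state 0 is the start state

    void add(int from, int to, String input, BooleanSupplier event, String output) {
        transitions.add(new Transition(from, to, input, event, output));
    }

    /**
     * Feeds one input string. Returns the output string if a matching transition
     * exists and its event is handled successfully; otherwise returns null and
     * leaves the current state unchanged.
     */
    String feed(String input) {
        for (Transition t : transitions) {
            if (t.from == state && t.input.equals(input)) {
                boolean handled = (t.event == null) || t.event.getAsBoolean();
                if (!handled) {
                    return null; // event failed: no output, no state change
                }
                state = t.to;
                return t.output;
            }
        }
        return null; // no transition for this input in the current state
    }

    int state() {
        return state;
    }
}
```

With this sketch, the example transition (input "a", event Null, output "a") would be registered as `fst.add(0, 1, "a", null, "a")`, after which `fst.feed("a")` yields the output string and moves the transducer to state 1.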

Salsa comes with a compiler which reads a Salsa specification and builds a runtime representation of the transducer, and a Salsa runtime module which can be accessed through the Salsa runtime API.

The class implementing the InputMethodListener interface bridges the gap between the transducer and the outside world. This class is parameterized by a Salsa transducer which defines the input method.

Other applications of Salsa include the specification and implementation of context analysis, and the implementation of codeset converters.
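As a rough illustration of the codeset-converter application, a converter can be viewed as a transducer with a single state and one looping transition per input code. The Java sketch below is an assumption-laden illustration, not Salsa code: the class `SketchCodesetConverter` and its methods are hypothetical, and the two sample mappings are taken from ISO 8859-7 (Greek), where 0xE1 and 0xE2 correspond to U+03B1 and U+03B2.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical single-state converter; each map entry plays the role of one looping transition. */
final class SketchCodesetConverter {
    // Input code (0..255, written in hexadecimal by the caller) -> Unicode output string.
    private final Map<Integer, String> transitions = new HashMap<>();

    void map(int inputCode, String output) {
        transitions.put(inputCode, output);
    }

    /** Converts a byte sequence; unmapped codes become U+FFFD (a choice made for this sketch). */
    String convert(byte[] input) {
        StringBuilder out = new StringBuilder();
        for (byte b : input) {
            int code = b & 0xFF; // treat the byte as unsigned
            String mapped = transitions.get(code);
            out.append(mapped != null ? mapped : "\uFFFD");
        }
        return out.toString();
    }
}
```

A full converter would register one mapping per code point of the source character set; only two are shown here to keep the sketch short.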

International Unicode Conferences are organized by Global Meeting Services, Inc., (GMS). GMS is pleased to be able to offer the International Unicode Conferences under an exclusive license granted by the Unicode Consortium. All responsibility for conference finances and operations is borne by GMS. The independent conference board serves solely at the pleasure of GMS and is composed of volunteers active in Unicode and in international software development. All inquiries regarding International Unicode Conferences should be addressed to info@global-conference.com.

Unicode and the Unicode logo are registered trademarks of Unicode, Inc. Used with permission.

24 January 1999, Webmaster