L2/09-348
Source: Mark Davis
Subject: Recommended Unicode escaping mechanism
Date: October 20, 2009


====

Martin Duerst talked about a nice syntax for escaping Unicode characters that is used in Ruby: \u{...}, where the braces hold one or more hexadecimal code points separated by spaces.
This has a number of good features: it can be more compact than the plain \u notation, and it handles supplementary characters consistently. For example, take the string containing these two characters:

U+12000 ( 𒀀 ) CUNEIFORM SIGN A
U+12001 ( 𒀁 ) CUNEIFORM SIGN A TIMES A

This can be represented in Ruby's notation as \u{12000 12001} instead of resorting to other notation such as \U00012000\U00012001.
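
As a minimal sketch of that equivalence, assuming Ruby 1.9 or later (where the \u{...} escape is available in double-quoted strings), the following compares the multi-character braced form with one braced escape per character. The variable names are illustrative only.

```ruby
# One \u{...} escape holding two space-separated code points.
braced = "\u{12000 12001}"

# The same string written with one braced escape per character.
separate = "\u{12000}\u{12001}"

# Both spellings should produce the same two-character string.
puts braced == separate
puts braced.length

# List the code points in the usual U+XXXX form.
puts braced.codepoints.to_a.map { |cp| format("U+%04X", cp) }.join(" ")
```

When run, this should print true, then 2, then U+12000 U+12001: each supplementary character counts as a single character, with no surrogate pairs visible in the escape.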

I'd like to discuss recommending this notation in UTS #18 and other appropriate places.

Mark