From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Thu Nov 21 2002 - 04:22:03 EST
Carl W. Brown wrote:
> I think that the bigger issue might be how do you extend Morse code to
> incorporate the Unicode character set.
> [...]
Carl, this is unfair!! You spoiled my April 1st joke in mid-November!
Ciao.
Marco :-)
----------------------------------------------------------------------
UTF-Morse - "Bringing Unicode into the telegraph age!"
1. Unicode characters U+0020..U+007E are encoded according to the
following table:
Code: UTF-Morse: Character name:
------ ----------- --------------------------
U+0020 / SPACE
U+0021 -----. EXCLAMATION MARK [1]
U+0022 .-..-. QUOTATION MARK
U+0023 .-.-.. NUMBER SIGN [1]
U+0024 ..-... DOLLAR SIGN [1]
U+0025 ..-..- PERCENT SIGN [1]
U+0026 ..-.-. AMPERSAND [1]
U+0027 .----. APOSTROPHE
U+0028 -.--.- LEFT PARENTHESIS
U+0029 -.---. RIGHT PARENTHESIS [1]
U+002A -.---- ASTERISK [1]
U+002B --.... PLUS SIGN [1]
U+002C --..-- COMMA
U+002D -....- HYPHEN-MINUS
U+002E .-.-.- FULL STOP
U+002F -..-. SOLIDUS [1]
U+0030 ----- DIGIT ZERO
U+0031 .---- DIGIT ONE
U+0032 ..--- DIGIT TWO
U+0033 ...-- DIGIT THREE
U+0034 ....- DIGIT FOUR
U+0035 ..... DIGIT FIVE
U+0036 -.... DIGIT SIX
U+0037 --... DIGIT SEVEN
U+0038 ---.. DIGIT EIGHT
U+0039 ----. DIGIT NINE
U+003A ---... COLON
U+003B ---..- SEMICOLON [1]
U+003C ---.-. LESS-THAN SIGN [1]
U+003D ----.. EQUALS SIGN [1]
U+003E ---.-- GREATER-THAN SIGN [1]
U+003F ..--.. QUESTION MARK
U+0040 -.-.-. COMMERCIAL AT [1]
U+0041 ..-- .- LATIN CAPITAL LETTER A [2]
U+0042 ..-- -... LATIN CAPITAL LETTER B [2]
U+0043 ..-- -.-. LATIN CAPITAL LETTER C [2]
U+0044 ..-- -.. LATIN CAPITAL LETTER D [2]
U+0045 ..-- . LATIN CAPITAL LETTER E [2]
U+0046 ..-- ..-. LATIN CAPITAL LETTER F [2]
U+0047 ..-- --. LATIN CAPITAL LETTER G [2]
U+0048 ..-- .... LATIN CAPITAL LETTER H [2]
U+0049 ..-- .. LATIN CAPITAL LETTER I [2]
U+004A ..-- .--- LATIN CAPITAL LETTER J [2]
U+004B ..-- -.- LATIN CAPITAL LETTER K [2]
U+004C ..-- .-.. LATIN CAPITAL LETTER L [2]
U+004D ..-- -- LATIN CAPITAL LETTER M [2]
U+004E ..-- -. LATIN CAPITAL LETTER N [2]
U+004F ..-- --- LATIN CAPITAL LETTER O [2]
U+0050 ..-- .--. LATIN CAPITAL LETTER P [2]
U+0051 ..-- --.- LATIN CAPITAL LETTER Q [2]
U+0052 ..-- .-. LATIN CAPITAL LETTER R [2]
U+0053 ..-- ... LATIN CAPITAL LETTER S [2]
U+0054 ..-- - LATIN CAPITAL LETTER T [2]
U+0055 ..-- ..- LATIN CAPITAL LETTER U [2]
U+0056 ..-- ...- LATIN CAPITAL LETTER V [2]
U+0057 ..-- .-- LATIN CAPITAL LETTER W [2]
U+0058 ..-- -..- LATIN CAPITAL LETTER X [2]
U+0059 ..-- -.-- LATIN CAPITAL LETTER Y [2]
U+005A ..-- --.. LATIN CAPITAL LETTER Z [2]
U+005B ..---. LEFT SQUARE BRACKET [1]
U+005C .-.... REVERSE SOLIDUS [1]
U+005D ..---- RIGHT SQUARE BRACKET [1]
U+005E .-...- CIRCUMFLEX ACCENT [1]
U+005F ------ LOW LINE [1]
U+0060 ...--- GRAVE ACCENT [1]
U+0061 .- LATIN SMALL LETTER A
U+0062 -... LATIN SMALL LETTER B
U+0063 -.-. LATIN SMALL LETTER C
U+0064 -.. LATIN SMALL LETTER D
U+0065 . LATIN SMALL LETTER E
U+0066 ..-. LATIN SMALL LETTER F
U+0067 --. LATIN SMALL LETTER G
U+0068 .... LATIN SMALL LETTER H
U+0069 .. LATIN SMALL LETTER I
U+006A .--- LATIN SMALL LETTER J
U+006B -.- LATIN SMALL LETTER K
U+006C .-.. LATIN SMALL LETTER L
U+006D -- LATIN SMALL LETTER M
U+006E -. LATIN SMALL LETTER N
U+006F --- LATIN SMALL LETTER O
U+0070 .--. LATIN SMALL LETTER P
U+0071 --.- LATIN SMALL LETTER Q
U+0072 .-. LATIN SMALL LETTER R
U+0073 ... LATIN SMALL LETTER S
U+0074 - LATIN SMALL LETTER T
U+0075 ..- LATIN SMALL LETTER U
U+0076 ...- LATIN SMALL LETTER V
U+0077 .-- LATIN SMALL LETTER W
U+0078 -..- LATIN SMALL LETTER X
U+0079 -.-- LATIN SMALL LETTER Y
U+007A --.. LATIN SMALL LETTER Z
U+007B --.-.. LEFT CURLY BRACKET [1]
U+007C --.--. VERTICAL LINE [1]
U+007D --.-.- RIGHT CURLY BRACKET [1]
U+007E --.--- TILDE [1]
2. All other Unicode characters are encoded with one of seven
multi-Morse schemes:
Code range: Scheme
----------------- ------
U+0000..U+0007 1
U+0008..U+001F 2
U+007F..U+01FF 3
U+0200..U+0FFF 4
U+1000..U+7FFF 5
U+8000..U+3FFFF 6
U+40000..U+10FFFF 7
Each scheme ends with a Morse sequence of the form ".-.yyy", possibly
preceded by one or more Morse sequences of the form "-..yyy". Each
sequence carries three bits of the code point, most significant first:
Scheme  Bits (x: 0 or 1):      UTF-Morse (y: "." if x is 0, "-" if x is 1):
------  ---------------------  ------------------------------------------------
1       000000000000000000xxx  .-.yyy
2       000000000000000xxxxxx  -..yyy .-.yyy
3       000000000000xxxxxxxxx  -..yyy -..yyy .-.yyy
4       000000000xxxxxxxxxxxx  -..yyy -..yyy -..yyy .-.yyy
5       000000xxxxxxxxxxxxxxx  -..yyy -..yyy -..yyy -..yyy .-.yyy
6       000xxxxxxxxxxxxxxxxxx  -..yyy -..yyy -..yyy -..yyy -..yyy .-.yyy
7       xxxxxxxxxxxxxxxxxxxxx  -..yyy -..yyy -..yyy -..yyy -..yyy -..yyy .-.yyy
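As a rough illustration of the multi-Morse schemes, here is a minimal
Python sketch. The function name multi_morse is hypothetical, and
separating the Morse sequences with a single space is an assumption
(the tables show them space-separated but do not define a separator).
It picks the scheme from the code-range table, pads the code point to
three bits per sequence, and marks every sequence but the last with
"-..".

```python
def multi_morse(code_point: int) -> str:
    """Encode a code point outside U+0020..U+007E as multi-Morse (sketch)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if 0x20 <= code_point <= 0x7E:
        raise ValueError("U+0020..U+007E use the table in section 1")

    # Upper bound of schemes 1..7, copied from the code-range table.
    limits = [0x7, 0x1F, 0x1FF, 0xFFF, 0x7FFF, 0x3FFFF, 0x10FFFF]
    scheme = next(n for n, top in enumerate(limits, start=1)
                  if code_point <= top)

    # Scheme n carries 3*n bits, most significant bit first.
    bits = format(code_point, "0{}b".format(3 * scheme))

    sequences = []
    for i in range(0, len(bits), 3):
        payload = bits[i:i + 3].replace("0", ".").replace("1", "-")
        is_last = (i == len(bits) - 3)
        # Leading sequences start with "-..", the final one with ".-.".
        sequences.append((".-." if is_last else "-..") + payload)

    # Assumption: sequences are separated by one space.
    return " ".join(sequences)


print(multi_morse(0x20AC))  # EURO SIGN, scheme 5
```

For U+20AC the 15 payload bits are 010 000 010 101 100, so the sketch
emits five sequences, the last one starting with ".-.".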
3. Notes
[1]: Some sequences are unique to UTF-Morse, and are unknown in
traditional Morse code.
[2]: Capital letters use the same code as the corresponding small
letter, preceded by the sequence "..--" (which is unique to UTF-Morse).
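To make note [2] concrete, here is a small Python sketch limited to
the Latin letters. The names SMALL, CAPITAL_PREFIX and encode_letter
are hypothetical. The codes are copied from the table in section 1,
and the space between the prefix and the letter code follows the way
the table prints them.

```python
# Small-letter codes copied from the table in section 1.
SMALL = {
    "a": ".-",   "b": "-...", "c": "-.-.", "d": "-..",  "e": ".",
    "f": "..-.", "g": "--.",  "h": "....", "i": "..",   "j": ".---",
    "k": "-.-",  "l": ".-..", "m": "--",   "n": "-.",   "o": "---",
    "p": ".--.", "q": "--.-", "r": ".-.",  "s": "...",  "t": "-",
    "u": "..-",  "v": "...-", "w": ".--",  "x": "-..-", "y": "-.--",
    "z": "--..",
}

CAPITAL_PREFIX = "..--"  # per note [2]


def encode_letter(ch: str) -> str:
    """Encode one ASCII letter; capitals get the "..--" prefix."""
    if ch in SMALL:
        return SMALL[ch]
    if len(ch) == 1 and "A" <= ch <= "Z":
        # U+0041..U+005A map to U+0061..U+007A by adding 0x20.
        return CAPITAL_PREFIX + " " + SMALL[chr(ord(ch) + 0x20)]
    raise ValueError("only ASCII letters are handled in this sketch")


print(encode_letter("a"))  # .-
print(encode_letter("A"))  # ..-- .-
```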
----------------------------------------------------------------------