From: Charlie Ruland (ruland@luckymail.com)
Date: Tue Jun 16 2009 - 17:47:27 CDT
Oh, it seems I’ve just found the complete list of my IME’s Cangjie 5 codes.
Go to http://hyperrate.com/thread.php?tid=6172 and click on cj5.cin.bz2
<http://cle.linux.org.tw/trac/attachment/wiki/GcinTables/cj5.cin.bz2?format=raw>
to download.
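A minimal sketch of how such a table might be read in Python. It assumes that cj5.cin.bz2 is a bzip2-compressed text file in the usual .cin layout, where the code/character pairs sit between "%chardef begin" and "%chardef end" lines, one whitespace-separated pair per line. That layout, the UTF-8 encoding and the helper name read_cin_chardefs are assumptions and have not been checked against this particular file.

```python
import bz2
from collections import defaultdict

def read_cin_chardefs(data: bytes, encoding: str = "utf-8"):
    """Map each character to the list of codes found for it (hypothetical helper)."""
    text = bz2.decompress(data).decode(encoding)
    table = defaultdict(list)
    in_chardef = False
    for line in text.splitlines():
        tokens = line.split()
        if tokens[:2] == ["%chardef", "begin"]:
            in_chardef = True
        elif tokens[:2] == ["%chardef", "end"]:
            in_chardef = False
        elif in_chardef and len(tokens) >= 2:
            code, char = tokens[0], tokens[1]
            table[char].append(code)
    return dict(table)

# Tiny in-memory stand-in for the real download (assumed layout).
sample = bz2.compress("%chardef begin\na 日\na 曰\n%chardef end\n".encode("utf-8"))
print(read_cin_chardefs(sample))
```

For the real file one would pass the bytes read from cj5.cin.bz2 (opened in binary mode) instead of the sample.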
Regards,
Charlie
-------- Original Message --------
Subject: Re: [unicode] Unihan database: kCangjie field
From: Charlie Ruland <ruland@luckymail.com>
To: Edward Cherlin <echerlin@gmail.com>
Date: Wed Jun 17 2009 00:15:33 GMT+0200
> I don’t know if the following is helpful:
>
> After I downloaded the 2008 version of the Cangjie 5 IME
> 第五代仓颉输入法 (2008年最新版) from Malaysia’s Friends of Cangjie at
> http://www.chinesecj.com/newsoftware/index3.php?Type=1 and installed
> it on my WinXP machine, the file cj5-win.MB was copied to the
> C:\WINDOWS\system32 folder.
>
> This UTF-16LE-encoded file seems to contain all Cangjie codes that the
> IME makes use of in the following format:
>
> <code><ctrl1><char><ctrl2>
>
> where:
> <code> is the Cangjie 5 code (up to five Latin small letters a-z);
> <ctrl1> is a control character below U+0020;
> <char> is a single Han or other character (incl. Latin a-z), or a
> sequence* of Han characters;
> <ctrl2> is another control character below U+0020, but missing for the
> very last entry.
>
> The start after the file header is: <U+0061> <U+0001> <U+65E5>
> <U+0001> <U+0061> <U+0001> <U+66F0> <U+0002> ...
>
> *The IME supports input of words 詞語輸入 using four-letter codes.
> These Chinese words (i.e., character sequences), as well as letters,
> punctuation, symbols and the like, are of no significance to our
> purpose of mapping Cangjie codes to single Han characters.
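The format described above could be read along these lines. This is only a sketch under stated assumptions: the length of the file header is not given here, so the function takes the bytes that follow the header; entries are assumed to be separated by exactly one control character each; and the Han test uses a few hard-coded code point ranges. The names parse_entries, han_codes and is_han are made up for illustration, and the "abcd" word entry in the sample is invented.

```python
import re
from collections import defaultdict

CONTROL = re.compile(r"[\x00-\x1f]")  # any control character below U+0020

def is_han(ch: str) -> bool:
    """Rough Han test by code point range (an approximation)."""
    cp = ord(ch)
    return (0x3400 <= cp <= 0x4DBF or 0x4E00 <= cp <= 0x9FFF
            or 0xF900 <= cp <= 0xFAFF or 0x20000 <= cp <= 0x2FFFF)

def parse_entries(body: bytes):
    """Split the post-header bytes into (code, chars) pairs."""
    tokens = CONTROL.split(body.decode("utf-16-le"))
    if tokens and tokens[-1] == "":
        tokens.pop()  # a trailing control character leaves an empty token
    # Tokens alternate: code, chars, code, chars, ...
    return list(zip(tokens[0::2], tokens[1::2]))

def han_codes(entries):
    """Keep single Han characters only; one character may have several codes."""
    table = defaultdict(list)
    for code, chars in entries:
        if len(chars) == 1 and is_han(chars) and code not in table[chars]:
            table[chars].append(code)
    return dict(table)

# First two entries as quoted above, plus one invented word entry.
sample = "a\x01日\x01a\x01曰\x02abcd\x01中文".encode("utf-16-le")
entries = parse_entries(sample)
print(han_codes(entries))
```

Keeping a list of codes per character leaves room for characters that have several Cangjie codes.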
>
> Please note that a single Han character may be mapped to several
> Cangjie codes due to glyph variation. Please also note that only
> Chinese glyph variants are taken into account, e.g. ‘禅’ is only
> mapped to ‘ifcwj’, not to ‘iffwj’ according to its standard Japanese
> form. It would of course be nice to have codes for non-Chinese glyph
> variants too.
>
> Thanks, Edward, for your help,
>
> Charlie
>
> -------- Original Message --------
> Subject: Re: [unicode] Unihan database: kCangjie field
> From: Edward Cherlin <echerlin@gmail.com>
> To: John H. Jenkins <jenkins@apple.com>
> Date: Tue Jun 16 2009 09:07:42 GMT+0200
>> Here is a link for Cangjie 5 tables, 第五代倉頡字碼表. It is arranged in
>> "alphabetical" order of Cangjie codes, in 25 pages. (There is no
>> Cangjie code mapped to 'z'.)
>>
>> http://cbflabs.com/book/ocj5/ocj5/16.htm
>>
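>> Appendix 6 (附錄六)
>>
>> Cangjie 5 Code Table (第五代倉頡字碼表)
>>
>> ───────────────────────────
>>
>> The following is the Cangjie 5 code table, arranged in alphabetical
>> order from the 日 section to the 卜 section. In the table, the first
>> column gives the Chinese character; the second column, in slightly
>> smaller type, gives that character's code written in Chinese (Cangjie
>> key) characters; the third column gives the corresponding English
>> letters.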
>>
>> Once we get clear on the license, I can download all of this and put
>> it into a comma-delimited file. Someone else will have to fill in
>> characters and provide the Unicode mapping, since a lot of characters
>> are missing from these tables.
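One possible shape for such a comma-delimited file, sketched in Python. The column names and the helper write_csv are assumptions, not an agreed format, and the code point column can only be filled in for characters that are already available as Unicode text.

```python
import csv
import io

def write_csv(rows, out):
    """Write (character, code) pairs as CSV with a derived code point column."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["character", "code_point", "cangjie5"])
    for char, code in rows:
        writer.writerow([char, f"U+{ord(char):04X}", code])

buf = io.StringIO()
write_csv([("日", "a"), ("曰", "a")], buf)
print(buf.getvalue())
```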
>>
>> On Mon, Jun 15, 2009 at 11:32 PM, Edward Cherlin <echerlin@gmail.com> wrote:
>>
>>> On Sun, Jun 14, 2009 at 6:45 PM, John H. Jenkins <jenkins@apple.com> wrote:
>>>
>>>> If someone is willing to do the work to contact these people, get their
>>>> permission, write up a document for the UTC describing the data, and
>>>> provide Richard Cook or me with the actual data, then I don't think
>>>> there would be any real problem with adding it.
>>>>
>>> I'll write to them, and to Edouard Butler, author of Cangjie Method
>>> (in English), who works with Chu Bong-Foo, inventor of Cangjie.
>>>
>>>
>>>> Basically, here as elsewhere, the actual work involved is likely to be
>>>> more time-consuming than one thinks, and neither Dr. Cook nor I have as
>>>> much time as we would like to devote to it. The best way to see that
>>>> something makes it into the Unihan database is to do the work of data
>>>> collection for us.
>>>>
>>>> On Jun 15, 2009, at 1:57 AM, Charlie Ruland wrote:
>>>>
>>>>
>>>>> If it is true that the Unihan database has Cangjie v.3 input codes for
>>>>> only 29,148 characters, whereas Malaysia’s Friends of Cangjie have
>>>>> Cangjie v.5 codes for all CJK[V] unified ideographs of Unicode 4.0, why
>>>>> not add a “kCangjie5” field based on the more exhaustive data from
>>>>> Malaysia to the Unihan database (or, entirely replace the Cangjie v.3
>>>>> data of the “kCangjie” field with the Cangjie v.5 data)?
>>>>>
>>>>> BTW, Malaysia’s Friends of Cangjie seem to be willing to have their
>>>>> data published: e.g., the English Wiktionary has the page
>>>>> http://en.wiktionary.org/wiki/Wiktionary:Chinese_Cangjie_index where
>>>>> it says: “Cāngjié data was taken from www.chinesecj.com with
>>>>> permission.”
>>>>>
>>>>> Charlie
>>>>>
>>>>> -------- Original Message --------
>>>>> Subject: Re: [unicode] Unihan database: kCangjie field
>>>>> From: mpsuzuki@hiroshima-u.ac.jp
>>>>> To: Charlie Ruland <ruland@luckymail.com>
>>>>> Date: Sun Jun 14 2009 07:30:59 GMT+0200
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Checking the kCangjie entry for U+9762 (面) in Unihan.txt,
>>>>>> we can find this line:
>>>>>>
>>>>>> U+9762 kCangjie MWYL
>>>>>>
>>>>>> I guess this is Cangjie version 3 style.
>>>>>> If it were version 5 style, it would be MWSL.
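A small Python sketch of such a lookup, assuming the Unihan data lines are tab-separated (code point, field name, value) and that comment lines start with "#". The helper name kcangjie_values is made up for illustration.

```python
def kcangjie_values(lines):
    """Return {code point string: kCangjie value} from Unihan-style lines."""
    values = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) == 3 and fields[1] == "kCangjie":
            values[fields[0]] = fields[2]
    return values

# In-memory stand-in for the data file, using the entry quoted above.
sample = ["# comment line", "U+9762\tkCangjie\tMWYL"]
table = kcangjie_values(sample)
print(table["U+9762"], chr(int("9762", 16)))
```

Passing an open file object for the real data file instead of the sample list should work the same way, and len(table) then gives the number of kCangjie entries.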
>>>>>>
>>>>>>
>>>>>> http://zh.wikipedia.org/wiki/%E5%80%89%E9%A0%A1%E8%BC%B8%E5%85%A5%E6%B3%95
>>>>>>
>>>>>>
>>>>>> According to UTR#38, kCangjie field is based on Christian
>>>>>> Wittern's cangjie-table.b5.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Tag: kCangjie
>>>>>>> Status: Provisional
>>>>>>> Category: Dictionary-like Data
>>>>>>> Separator: space
>>>>>>> Syntax: [A-Z]+
>>>>>>> Description: The cangjie input code for the character.
>>>>>>> This incorporates data from the file cangjie-table.b5
>>>>>>> by Christian Wittern.
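The Syntax line above can be checked mechanically. A minimal Python sketch follows; the function names are hypothetical, and the upper-casing helper only reflects that IME tables tend to list codes in small letters while the field syntax is [A-Z]+.

```python
import re

# Syntax of the kCangjie field as quoted from UTR#38: [A-Z]+
KCANGJIE_SYNTAX = re.compile(r"[A-Z]+")

def is_valid_kcangjie(value: str) -> bool:
    """True if the whole value matches the field syntax."""
    return KCANGJIE_SYNTAX.fullmatch(value) is not None

def to_kcangjie_style(code: str) -> str:
    """Upper-case a code written in small letters."""
    return code.upper()

print(is_valid_kcangjie("MWYL"))
print(is_valid_kcangjie("mwsl"))
print(is_valid_kcangjie(to_kcangjie_style("mwsl")))
```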
>>>>>>>
>>>>>>>
>>>>>> According to Christian Wittern's web site at Kyoto Univ.,
>>>>>> it seems that he has not updated cangjie-table.b5 since
>>>>>> 1993-Nov.
>>>>>>
>>>>>> http://kanji.zinbun.kyoto-u.ac.jp/~wittern/publications/data/index.html
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Cangjie Table: Table of all cangjie input keys,
>>>>>>> with radical / stroke and BIG5 code ,
>>>>>>> in: ftp://ifcss.org/software/data, November 1993.
>>>>>>>
>>>>>>>
>>>>>> I think the version of cangjie-table.b5 most widely used in
>>>>>> free software is 1.02, released in May 1993,
>>>>>> e.g.
>>>>>>
>>>>>> http://linenum.info/p/emacs/22.1/leim/MISC-DIC/cangjie-table.b5?page=1
>>>>>>
>>>>>>
>>>>>> http://linenum.info/p/emacs/22.1/leim/MISC-DIC/cangjie-table.b5?page=27
>>>>>>
>>>>>> It includes 13059 entries, covering Big5 with the ETen extension.
>>>>>>
>>>>>> On the other hand, Unihan.txt 5.1.0 (2008-Mar-03) includes
>>>>>> 29148 entries. I don't know who added the extra kCangjie values
>>>>>> to cover the characters that are not included in Christian's
>>>>>> original cangjie-table.b5.
>>>>>>
>>>>>> Regards,
>>>>>> mpsuzuki
>>>>>>
>>>>>> On Sat, 13 Jun 2009 19:14:49 +0200
>>>>>> Charlie Ruland <ruland@luckymail.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Which version of Cangjie do the input codes given in the Unihan
>>>>>>> database follow?
>>>>>>> I couldn't find any explicit information on this in Unicode Standard
>>>>>>> Annex #38: Unicode Han Database (Unihan) at
>>>>>>> http://www.unicode.org/reports/tr38/ .
>>>>>>> FYI, I use a Cangjie version 5 IME (第五代倉頡輸入法) designed by and
>>>>>>> downloaded from Malaysia’s Friends of Cangjie (倉頡之友。馬來西亞 at
>>>>>>> http://www.chinesecj.com/newsoftware/index3.php?Type=1 ), which
>>>>>>> promises to support input of some 70,000 characters.
>>>>>>> Are all Unihan kCangjie codes usable on my IME?
>>>>>>>
>>>>>>> Charlie
>>>>>>>
>>>>>>> --
>>>>>>> ___ Charlie Ruland ___ 冉書慧 ___
>>>>>>> ERROR__COMMVNIS__FACIT__IVS
>>>>>>>
>>>>>>
>>>>> --
>>>>> — Charlie Ruland — 冉書慧 —
>>>>> ERROR·COMMVNIS·FACIT·IVS
>>>>>
>>>>>
>>>>>
>>>> =====
>>>> John H. Jenkins
>>>> jenkins@apple.com
>>>>
>>>
>>> --
>>> Silent Thunder (默雷/धर्ममेघशब्दगर्ज/دھرممیگھشبدگر ج) is my name
>>> And Children are my nation.
>>> The Cosmos is my dwelling place, The Truth my destination.
>>> http://earthtreasury.org/worknet (Edward Mokurai Cherlin)
>>>
>>>
>>
>
-- — Charlie Ruland — 冉書慧 — ERROR·COMMVNIS·FACIT·IVS