Re: FW: Web Form: Other Question: CJK

From: John Delacour ([email protected])
Date: Sat Oct 11 2003 - 10:49:09 CST

Next message: Patrick Andries: "Tai Xuan Jing Symbols, any background information ?"
Previous message: Edward H. Trager: "Re: FW: Web Form: Other Question: CJK"
Maybe in reply to: Magda Danish (Unicode): "FW: Web Form: Other Question: CJK"
Next in thread: John Jenkins: "Re: Web Form: Other Question: CJK"

> > Contact: [email protected]
> > Report Type: Other Question, Problem, or Feedback
> >
> > My problem is to recognize from the 32 bit value of unicode
> > character if this is a chinese character or korean or japanese.
> > How can do this?

You can tell that it is NOT in a legacy character set such as
Shift_JIS or Big5 if converting it to that character set fails. Or
you can look it up in Unihan.txt
<http://www.unicode.org/Public/UNIDATA/Unihan.txt> (25 megabytes,
also at the ftp site). There are also Perl routines for getting at
the information. An entry in that file looks like this:

U+4E01 kAlternateKangXi 0075.003
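
For what it's worth, here is a rough sketch of both approaches in
Python rather than Perl. It is only an illustration: the names
LEGACY_CODECS, legacy_charsets and parse_unihan_line are made up for
this example, the choice of four codecs is an assumption, and the
parser assumes each data line has the three whitespace-separated
columns shown in the entry above.

```python
# Hypothetical helper names; the codec list is an assumption, not a
# complete inventory of CJK legacy character sets.
LEGACY_CODECS = ("shift_jis", "big5", "gb2312", "euc_kr")


def legacy_charsets(ch):
    """Return the codecs in LEGACY_CODECS that can encode the character ch."""
    found = []
    for codec in LEGACY_CODECS:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            # Conversion failed, so ch is NOT in this character set.
            continue
        found.append(codec)
    return found


def parse_unihan_line(line):
    """Split one Unihan-style data line into (code point, field, value).

    Assumes the layout 'U+XXXX <field> <value>' separated by whitespace.
    """
    code, field, value = line.split(None, 2)
    return int(code[2:], 16), field, value.rstrip()
```

Note that only a failed conversion tells you anything definite. An
ideograph such as U+4E01 converts to several of these character sets,
so a successful conversion does not by itself say whether the
character is Chinese, Japanese or Korean.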


This archive was generated by hypermail 2.1.5 : Thu Jan 18 2007 - 15:54:24 CST