[Corpora-List] Codes of Chinese half-characters

Mats Huselius mats.huselius at bredband.net
Mon Sep 10 09:28:33 UTC 2007


You can download the whole Unihan database as a text file from
www.unicode.org. It has tags for the radical of each character in
Unicode, as well as phonetic tags. Unfortunately, these tags do not refer to
Unicode code points, and they are not listed for every character.
Download link: ftp://ftp.unicode.org/Public/UNIDATA/Unihan.zip
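
A rough sketch in Python of how the radical and phonetic tags might be
pulled out of the text file. It assumes the tab-separated layout
"U+XXXX<TAB>field<TAB>value" and the field names kRSUnicode and kPhonetic;
the function names are made up for illustration. The radical in kRSUnicode
is a Kangxi radical number (1 to 214), which can be mapped onto the Kangxi
Radicals block starting at U+2F00. The kPhonetic value is an index rather
than a code point, so it is passed through unchanged.

```python
import re

KANGXI_BASE = 0x2F00  # U+2F00 is KANGXI RADICAL ONE; the block covers radicals 1..214


def parse_unihan(lines, wanted=("kRSUnicode", "kPhonetic")):
    """Collect the wanted fields per character from Unihan-style lines.

    Assumes each data line looks like "U+6CB3<TAB>kRSUnicode<TAB>85.5".
    Comment lines (starting with '#') and blank lines are skipped.
    """
    table = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue
        code, field, value = parts
        if field not in wanted:
            continue
        char = chr(int(code[2:], 16))  # drop the "U+" prefix
        table.setdefault(char, {})[field] = value
    return table


def radical_codepoint(rs_value):
    """Map a kRSUnicode value such as "85.5" to a Kangxi radical code point.

    Only the first of several space-separated values is used, and any
    apostrophe marking a simplified radical form is ignored.
    """
    first = rs_value.split()[0]
    match = re.match(r"(\d+)", first)
    if match is None:
        raise ValueError(f"unexpected kRSUnicode value: {rs_value!r}")
    number = int(match.group(1))
    if not 1 <= number <= 214:
        raise ValueError(f"radical number out of range: {number}")
    return KANGXI_BASE + number - 1


def build_rows(table):
    """Yield (character code, radical code, phonetic tag) triples.

    Missing fields come out as None, since not every character has both.
    """
    for char, fields in table.items():
        rs_value = fields.get("kRSUnicode")
        radical = f"U+{radical_codepoint(rs_value):04X}" if rs_value else None
        yield (f"U+{ord(char):04X}", radical, fields.get("kPhonetic"))
```

An open file object can be passed directly as the lines argument; the
name of the text file inside the archive depends on the Unihan release.
Note that the radical is the dictionary indexing component, which is
often but not always the semantic element of the character.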

Mats Huselius

----- Original Message ----- 
From: "Dan Cristea" <dcristea at info.uaic.ro>
To: <corpora at uib.no>
Sent: Saturday, September 08, 2007 2:46 PM
Subject: [Corpora-List] Codes of Chinese half-characters


Most Chinese characters are composed of two elements: one of a
phonetic nature (it is pronounced) and the other of a semantic nature (it
is not pronounced). Chinese characters are, of course, encoded in Unicode.

I wonder whether somebody has an electronic list of their component elements
and, more importantly, how one could recognise the semantic component of a
character in (electronic) written Chinese.

More exactly, I would like to have a table of the following kind:

Unicode of a Chinese character | code for its semantic graphical element | 
code for its phonetic graphical element (this last one could even be 

Thanks, regards,

Prof. dr. Dan Cristea
Department of Computer Science
Alexandru Ioan Cuza University of Iasi - Romania

Corpora mailing list
Corpora at uib.no
