[Corpora-List] Codes of Chinese half-characters

Daniel Zeman zeman at ufal.mff.cuni.cz
Mon Sep 10 08:21:55 UTC 2007


This is not precisely what you asked for, but it might help: the Unihan
database at unicode.org. Try, for instance,
http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=8000. The
output contains both the Radical and the Phonetic fields. Unfortunately,
the numbers in those fields are apparently not Unicode code points of
other characters, so you would also have to find the appropriate lookup
tables on the site.
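
For the radical number, at least, one possible lookup is the Kangxi
Radicals block (U+2F00 to U+2FD5), which lists the 214 radicals in
numerical order. The sketch below is only an illustration under some
assumptions: that the radical field is a string of the form
"radical.strokes" (for example "38.3", the value assumed here for U+597D),
possibly with an apostrophe after the radical number, and that the indexing
radical is an acceptable stand-in for the semantic component, which is not
always true. The function names are made up for this example. It does not
cover the Phonetic field, whose numbers need a separate table.

```python
import re
import unicodedata

KANGXI_BASE = 0x2F00  # U+2F00 is KANGXI RADICAL ONE, i.e. radical number 1


def radical_number(rs_value: str) -> int:
    """Extract the radical number from a value such as '38.3' or "120'.3".

    Assumption: the value looks like 'radical.strokes'; if several values
    are given separated by spaces, only the first one is used.
    """
    first = rs_value.split()[0]
    match = re.match(r"(\d+)'*\.(-?\d+)$", first)
    if match is None:
        raise ValueError(f"unexpected radical-stroke value: {rs_value!r}")
    number = int(match.group(1))
    if not 1 <= number <= 214:
        raise ValueError(f"radical number out of range: {number}")
    return number


def radical_codepoints(number: int) -> tuple[int, int]:
    """Return (Kangxi radical code point, equivalent unified ideograph)."""
    kangxi = KANGXI_BASE + number - 1
    # Kangxi radicals carry a compatibility mapping to an ordinary CJK
    # ideograph, so NFKC normalization yields the everyday character.
    unified = ord(unicodedata.normalize("NFKC", chr(kangxi)))
    return kangxi, unified


def table_row(char: str, rs_value: str) -> str:
    """Format one row: character | radical | phonetic (left empty here)."""
    _, unified = radical_codepoints(radical_number(rs_value))
    return f"U+{ord(char):04X} | U+{unified:04X} | "


if __name__ == "__main__":
    print(table_row("\u597d", "38.3"))
```

The third column is left empty on purpose, since nothing here resolves the
phonetic component.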

Best,
Dan

Dan Cristea wrote:
> Hi, 
>
> Most Chinese characters are composed of two elements: one of a phonetic nature (it is pronounced) and the other of a semantic nature (it is not pronounced). Certainly, Chinese characters are encoded in Unicode. 
>
> I wonder whether somebody has an electronic list of their component elements and, more importantly, how one could recognise the semantic component of a character in (electronic) written Chinese. 
>
> More exactly, I would like to have a table of the following kind: 
>
> Unicode of a Chinese character | code for its semantic graphical element | code for its phonetic graphical element (this last one could even be missing)
>
> Thanks, regards,
> Dan 
>
> ------
> Prof. dr. Dan Cristea
> Department of Computer Science
> Alexandru Ioan Cuza University of Iasi - Romania
>
>
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>   
