Unicode

Johanna Laakso johanna.laakso at univie.ac.at
Thu Jun 14 17:05:20 UTC 2001


Dear All,

Daniel Abondolo wrote:

>At this end (with Unicode UTF-8) encoding on a fairly early version of IE5,
>all seem to be coming through fine EXCEPT the letters for Mari front
>rounded vowels.

In Japan at least, Win 2000 + IE 5.5 fares better. So does
Netscape 6.01, which is my default Web browser.

NS 4.7x, though it recognizes Mari umlaut vowel letters,
doesn't fully respect page layout instructions (e.g., color,
font, ...) in the style sheet.

Lucida Sans Unicode, a Windows TrueType font that comes with Win
2000/98, seems to have the most complete repertoire of IPA
characters/symbols as well as Cyrillic characters.

It appears, however, that early versions of IE 5 don't use this
font to show UTF-8-encoded pages, font specifications in
the CSS style sheet notwithstanding.

I have all my (standard) Mari corpus material in Latin-1
transcription at the moment. The Latin-1-to-UTF-8 conversion of
the Mari and Russian text for the Web pages was done by a Perl
script, as was the HTML tagging. UTF-8 is the only Unicode encoding
the Perl programming language knows. (If you're thinking of learning
one more "language", choose Perl and you won't regret it :-)
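
For readers who want to see what the re-encoding step amounts to,
here is a minimal sketch. It is written in Python rather than Perl
and is not the script mentioned above. The function name
latin1_to_utf8 and the sample string are made up for illustration,
and the sketch only re-encodes bytes; any mapping from a Latin
transcription to Cyrillic letters, and the HTML tagging, would be
separate steps that are not shown.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode Latin-1 (ISO 8859-1) bytes as UTF-8.

    Hypothetical helper for illustration only.
    """
    # Every byte value 0x00-0xFF is a valid Latin-1 character,
    # so this decode step cannot raise an error.
    text = data.decode("latin-1")
    # Characters above U+007F become two bytes in UTF-8.
    return text.encode("utf-8")


if __name__ == "__main__":
    # Made-up sample: a word containing o-umlaut (byte 0xF6 in Latin-1).
    sample = b"p\xf6rt"
    print(latin1_to_utf8(sample))
```

Plain ASCII text passes through unchanged; only the accented
letters grow from one byte to two.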

As for the FUPA extension in Unicode, I am as patient as the "susukas"
Finns. You should note, however, that inclusion in Unicode does
not automatically mean you can use the characters in your work. The
characters and symbols must first be included in some
Unicode-compatible font. Even our big brother IPA doesn't seem to enjoy
full Unicode support in the Windows environment yet.

Hiljaa kauas pa"astaan...

Regards
Kazuto Matsumura
kmatsum at tooyoo.l.u-tokyo.ac.jp

>>From ura-list-owner at kantti.Helsinki.FI Thu Jun 14 04:17:53 2001
>From: "Johanna Laakso" <johanna.laakso at univie.ac.at>
>To: <ura-list at helsinki.fi>
>Subject: Re: Unicode dictionaries of Mari and Estonian
>Date: Wed, 13 Jun 2001 21:05:01 +0200
>
>Dear All
>
>re Kazuto's ongoing Mari-Russian-Japanese dictionary
>
>At this end (with Unicode UTF-8) encoding on a fairly early version of IE5,
>all seem to be coming through fine EXCEPT the letters for Mari front
>rounded vowels.
>
>Best
>
>Daniel Abondolo
><ye99 at dial.pipex.com>


