[Lexicog] Tone languages in Toolbox/Shoebox

Tim Gaved tim_gaved at SIL.ORG
Wed Apr 12 19:19:10 UTC 2006


I get the same results as Kim, but with Internet Explorer 6.0 on Win XP SP2. I think this shows that it is basically a font question. Doulos SIL has the smarts built in to render the text correctly; Code2000 and Arial Unicode MS do not. Charis SIL would work the same as Doulos SIL. Gentium won’t at the moment, as it doesn’t have the smart capability, though it will one day.

 

Tim Gaved

SIL Senegal

 

  _____  

From: lexicographylist at yahoogroups.com [mailto:lexicographylist at yahoogroups.com] On Behalf Of Kim Blewett
Sent: 12 April 2006 01:46
To: lexicographylist at yahoogroups.com
Subject: Re: [Lexicog] Tone languages in Toolbox/Shoebox

 

When I view the sample paragraph in Firefox, WinXP sp2, I see what Mike describes below until I choose Doulos SIL, which looks quite good to me—all accents and dots appear well-placed in both the paragraph and the character list. I wonder how Gentium looks?

I'm learning a lot but it sure seems complicated ...

Kim Blewett (the only special character we need is a glottal stop; I didn't realize what fun I'm missing out on!  ;o)

Mike Maxwell wrote: 

neduchi at netscape.net wrote:
  

In what sense do these programs handle the dotted vowel + tone mark as
two characters? Are they displaying the tone marks to the right of the
vowel, or is the problem s.t. more subtle than this?
      

 
Please have a look at the link below. It is a sample of an arbitrarily 
tone-marked Igbo text I put together with Andrew Cunningham. You can 
switch between the three different fonts used:  Arial Unicode MS, 
CODE2000 and Doulos SIL. Try out the fonts and observe the location of 
the sub-dots and the tone marks:
http://www.openroad.net.au/languages/african/igbo/sample.html
 
I would like to see the sub-dotted and tone-marked characters 
'compactly' displayed with the tone marks as ONE composite whole and 
not as two or three estranged neighbours.
    

 
OK, now I'm beginning to understand the problem you're seeing!
 
Yes, this appears to be a rendering problem, not a Unicode problem per 
se.  That is to say, either there's a problem with the font, or with the 
technology that displays the font (I'm not sure which).
 
Let me summarize the rendering issues I see, and let me know if I'm 
missing s.t.
 
First, the accent is much too low over upper case vowels.  It's also too 
far to the left over the lower and upper case 'i/I' (these appear in the 
sample paragraph, but not in the list of sample characters).  Also, the 
dot under the upper case 'U' is too far to the right (both in the 
undotted U in the para, and the dotted U in the sample chars), and the 
dot under the lower case 'i' is much too far to the left (in fact, 
almost under the preceding letter).
 
Also, the upper case N with grave (U+01F8) shows up as a box in many 
apps (it looks OK in Firefox).
 
(I also see a dot _over_ n/N in the sample chars--is that correct?)
 
Some of these problems would be solved by using pre-composed chars. 
(That is, many of the chars in the sample para appear to be in NFD 
normalization, rather than NFC.)  For example, the grave vowels without 
dots would probably look just fine if they used the pre-composed 
equivalents.  (If you are going to use a decomposed character, the grave 
accented 'i' should probably be produced with the dotless-i, U+0131. 
This unfortunately doesn't solve the problem of the grave accent being 
too far to the left.)
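
As a rough illustration of the NFD versus NFC point above, the following Python sketch uses the standard `unicodedata` module to compose decomposed sequences into their pre-composed equivalents. The helper names `to_nfc` and `describe` are made up for this example and do not come from the sample page. Whether composing actually improves the display still depends on the font and the renderer.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return the text in Unicode Normalization Form C (composed)."""
    return unicodedata.normalize("NFC", text)


def describe(text: str) -> list[str]:
    """List each code point in the text with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


# Decomposed input: a base letter followed by COMBINING GRAVE ACCENT (U+0300).
decomposed_e = "e\u0300"
decomposed_n = "N\u0300"

for sample in (decomposed_e, decomposed_n):
    composed = to_nfc(sample)
    print(describe(sample), "->", describe(composed))
```

Running `to_nfc` over the whole sample paragraph would be one way to test whether the pre-composed forms render better in a given browser and font.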
 
The dot under problem is more difficult, because there are few 
pre-composed dot-under characters (maybe none, I can't remember), and 
certainly no pre-composed characters having both the dot under and an 
acute or grave.  But the fact that the dots on these characters don't 
show up in the right position is a font/rendering issue, which hopefully 
will get fixed.  FWIW, the problem is noted at the Wikipedia page 
(http://en.wikipedia.org/wiki/UniCode#Ready-made_versus_composite_characters). 
  Of course that's no help right now...
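
The "maybe none" above can be checked against the Unicode tables that ship with Python. The sketch below is only an illustration (the helper names `compose` and `code_points` are invented here). It suggests that the dotted vowels i, o and u do have pre-composed forms, but that a tone mark added on top stays a separate combining character, so its placement is still left to the font and the renderer.

```python
import unicodedata

DOT_BELOW = "\u0323"  # COMBINING DOT BELOW
GRAVE = "\u0300"      # COMBINING GRAVE ACCENT
ACUTE = "\u0301"      # COMBINING ACUTE ACCENT


def compose(base: str, *marks: str) -> str:
    """Attach combining marks to a base letter and normalize to NFC."""
    return unicodedata.normalize("NFC", base + "".join(marks))


def code_points(text: str) -> list[str]:
    """Show the text as a list of U+XXXX code points."""
    return [f"U+{ord(ch):04X}" for ch in text]


for base in "iouIOU":
    dotted = compose(base, DOT_BELOW)
    # The marks are given in the "wrong" order on purpose; normalization
    # reorders them so the dot below is composed with the base first.
    dotted_grave = compose(base, GRAVE, DOT_BELOW)
    print(base, code_points(dotted), code_points(dotted_grave))
```

A result of one code point means a pre-composed character exists; two code points mean the tone mark could not be folded in and must be positioned by the font.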
 
In sum, this appears to me to be a rendering issue, not a Unicode issue 
per se.  It also appears to be a somewhat different question from the 
one the original posters brought up; I believe they were asking for 
tools to do phonology and/or morphology.
 
    Mike Maxwell
 
 
 