[Corpora-List] Request
maxwell
maxwell at umiacs.umd.edu
Tue Apr 24 16:03:52 UTC 2012
On Tue, 24 Apr 2012 17:16:15 +0200, Serge Heiden <slh at ens-lyon.fr> wrote:
> Have a look at TXM 0.6 (free AND open-source):
> https://sourceforge.net/projects/txm [1]
> It handles right-to-left writing systems display*.
> You can check in the demo portal:
> http://txm.risc.cnrs.fr/demo/?locale=en [2]
> in which 'ONUAR' is a small sample UNO-based Arabic text corpus.
> (build a lexicon, double-click on a word line, then double-click
> on a KWIC line to get to the text edition)
>
> Best,
> Serge
> ____________________
>
> (*) Even if this is far from perfect (let alone the bad Arabic
> tokenization, etc.).
> This is done nearly automagically by the technology we use
> (Java+JavaScript GWT, or Java Eclipse RCP/SWT for the desktop version).
In case anyone else is working on the Dhivehi language, there's a bug in
Java which (as far as we have been able to discover) prevents proper
rendering of the Thaana script used for Dhivehi. Thaana has long had a
Unicode block, but Java seems not to recognize that Thaana is written
right-to-left. The bug does not affect Arabic script. I have not checked
other right-to-left scripts, such as Syriac.
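
For anyone who wants to see what a given Java runtime reports for these
scripts, here is a minimal sketch. It assumes only the standard
Character.getDirectionality and java.text.Bidi APIs; the class name
BidiProbe and its helper methods are made up for illustration. It checks
Java's character data and bidi analysis only, not the Swing/SWT/GWT
rendering path, so a correct result here would not by itself rule out a
rendering bug.

```java
import java.text.Bidi;

// Hypothetical diagnostic class; the name and helpers are illustrative only.
class BidiProbe {

    // Map the bidi category Java reports for a code point to a short label.
    static String bidiClass(int codePoint) {
        switch (Character.getDirectionality(codePoint)) {
            case Character.DIRECTIONALITY_LEFT_TO_RIGHT:
                return "L";
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT:
                return "R";
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC:
                return "AL";
            case Character.DIRECTIONALITY_UNDEFINED:
                return "undefined";
            default:
                return "other";
        }
    }

    // True if java.text.Bidi analyses the whole string as right-to-left.
    static boolean isAllRightToLeft(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        return bidi.isRightToLeft();
    }

    public static void main(String[] args) {
        // Sample letters chosen for illustration: three per script.
        String[][] samples = {
            {"Latin",  "abc"},
            {"Arabic", "\u0627\u0644\u0639"},
            {"Syriac", "\u0710\u0712\u0713"},
            {"Thaana", "\u0780\u0781\u0782"},
        };
        for (String[] sample : samples) {
            String text = sample[1];
            System.out.println(sample[0]
                + ": first letter class = " + bidiClass(text.codePointAt(0))
                + ", all right-to-left = " + isAllRightToLeft(text));
        }
    }
}
```

If the Thaana line shows a left-to-right class while the Arabic line shows
a right-to-left one, the problem sits in the runtime's character data; if
both look right-to-left, the rendering layer would be the next place to look.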
Mike Maxwell
University of Maryland
_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora