[Corpora-List] What about Maltese texts in Latin letters?
Claudia
claudiaborg at gmail.com
Thu Oct 18 13:43:10 UTC 2007
Hi All,
Generally, Maltese is written in Latin characters, with the exception of:
ċ
ż
ġ
ħ
à
However, it is quite common (although discouraged) to replace these
characters with the plain c, z, g, h, a. This can be confusing, since
plain g, h, z and a are also letters of the alphabet in their own
right. UTF-8 can handle the above characters easily.
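To illustrate the point about the two spellings, here is a minimal Python
sketch that folds the five characters listed above to their plain
counterparts, so that a text typed without them can be compared with a
properly written one. The names MALTESE_FOLD, fold_maltese and
uses_maltese_letters are hypothetical and not taken from any existing
Maltese tool, and the table covers only the characters mentioned in this
message.

```python
import unicodedata

# Hypothetical mapping: only the characters listed in this message,
# in lower and upper case. Note that "ħ" has no Unicode decomposition,
# so stripping combining marks alone would not be enough.
MALTESE_FOLD = str.maketrans({
    "ċ": "c", "ġ": "g", "ħ": "h", "ż": "z", "à": "a",
    "Ċ": "C", "Ġ": "G", "Ħ": "H", "Ż": "Z", "À": "A",
})


def fold_maltese(text: str) -> str:
    """Replace the Maltese-specific letters with their plain look-alikes."""
    # NFC first, so "c" + combining dot above is treated like a single "ċ".
    return unicodedata.normalize("NFC", text).translate(MALTESE_FOLD)


def uses_maltese_letters(text: str) -> bool:
    """Return True if the text contains any of the letters in the table."""
    return fold_maltese(text) != unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    word = "ħobż"
    print(word.encode("utf-8"))        # the bytes as stored in a UTF-8 file
    print(fold_maltese(word))          # the spelling often typed instead
    print(uses_maltese_letters(word))
```

The folding only goes one way: once a text has been written with plain
letters, the original characters cannot be restored without a word list,
which is the confusion described above.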
As for a Maltese Corpus, there is an ongoing project to produce an
official corpus - check this website
http://mlrs.cs.um.edu.mt/
It might be possible to obtain an unreleased version - but I would
have to check if and what is actually available. Please feel free to
contact me if you require more information or would like to work with
Maltese.
Regards
Claudia
--
Research Assistant
Department of Artificial Intelligence
University of Malta
www.cs.um.edu.mt/~claudia
(T): +356 2340 2252
On 16/10/2007, Daniel Zeman <zeman at ufal.mff.cuni.cz> wrote:
> The JRC Acquis corpus (EU legislation) contains Maltese.
> There is also a Maltese Wikipedia, look at
> http://mt.wikipedia.org/wiki/Lingwa_Maltija
>
> best,
> Dan Zeman
>
> Prof.Dr. Yuri, Alina and Yuliana Tambovtsev wrote:
> > Dear Corpora colleagues, I wonder if you know where to get Maltese
> > texts in Latin letters? Looking forward to hearing from you at my new
> > email address yutamb at mail.ru. Remain yours sincerely, Yuri Tambovtsev, Russia
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > Corpora mailing list
> > Corpora at uib.no
> > http://mailman.uib.no/listinfo/corpora
> >
>