[Corpora-List] extract.py

Rahma Sellami rahma.sellami at gmail.com
Thu Sep 20 15:18:28 UTC 2012


Hello,

Thanks a lot, Motaz.
Are these corpora from UN 2000? And are they aligned manually or
automatically?
I used the script extract.py to extract the plain text of the UN corpora,
and I'm aligning these corpora with the hunalign toolkit.
I also need Arabic-French aligned UN corpora. Do you have these
corpora?

Thanks.



2012/9/20 Motaz SAAD <motaz.saad at inria.fr>

> Hello,
>
> Please find the aligned corpora in the attached files,
>
> best regards,
> Motaz
>
> ------------------------------
>
> *From: *"Rahma Sellami" <rahma.sellami at gmail.com>
> *To: *corpora at uib.no
> *Sent: *Friday, September 14, 2012 10:27:42 PM
> *Subject: *[Corpora-List] extract.py
>
>
> Hi,
> How can I execute the script extract.py to align the UN corpora?
> I use this syntax: "python extract.py en ar", but "0 documents in
> all languages" is always returned.
> The Arabic files are in the directory xmlar/2000/ and the English files
> are in xmlen/2000.
> Thanks
>
> --
>
>
> RAHMA Sellami
> PhD Computer Science Student
> http://sites.google.com/site/rahmasellami/
> Faculty of Economic Sciences and management of Sfax
> ANLP Research Group
> http://sites.google.com/site/anlprg
>
> MIRACL Laboratory
> www.miracl.rnu.tn
>
> Email: rahma.sellami at gmail.com
>
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>
>


-- 

RAHMA Sellami
PhD Computer Science Student
http://sites.google.com/site/rahmasellami/
Faculty of Economic Sciences and management of Sfax
ANLP Research Group
http://sites.google.com/site/anlprg

MIRACL Laboratory
www.miracl.rnu.tn

Email: rahma.sellami at gmail.com