[Corpora-List] machine translation

amin farajian ma.farajian at gmail.com
Tue Dec 18 16:12:29 UTC 2012


Dear Karine,
The corpus you mentioned (at Payame Noor University of Yazd) is
actually the one that is available from ELRA. There is also another parallel
corpus, PEN, which I developed. It is not yet publicly
available, but I plan to publish it. The following paper gives
some information about it:
Mohammad Amin Farajian (2011). PEN: Parallel English-Persian News
Corpus<http://world-comp.org/p2011/ICA4953.pdf>.
Proceedings of 2011 International Conference on Artificial Intelligence
(ICAI'11), Nevada, USA.

There are some other researchers (Dr. Khadivi at Amirkabir University, Dr.
Faili at the University of Tehran, Dr. Analoui at Iran University of Science
and Technology) and research centers (ITRC and SCICT) in Iran that are
working on SMT and building parallel corpora, but as far as I know their
corpora are not available yet.

Best regards,
Amin

On 12/18/2012 03:33 PM, Megerdoomian, Karine wrote:

I haven’t seen any other parallel English-Persian corpora besides the ones
already mentioned below. However, I have heard about a corpus being
developed by the English department at Payame Noor University in Yazd,
Iran. You may want to contact them. Here’s the info online:
http://www.eurac.edu/it/newsevents/focus/Newsdetails.html?entryid=22181

“Our developmental English-Persian parallel corpus consists of about *three
million words* (more than 50,000 corresponding sentences in two languages).
This is a kind of ongoing corpus, that is, an open corpus in which more
material can be added as the need arises.”

Karine

*From:* corpora-bounces at uib.no [mailto:corpora-bounces at uib.no]
*On Behalf Of* Hieu Hoang
*Sent:* Tuesday, December 18, 2012 7:31 AM
*To:* Khamesi Fahime
*Cc:* corpora at uib.no
*Subject:* Re: [Corpora-List] machine translation

Hi Khamesi

According to this website
   http://opus.lingfil.uu.se/
there are three freely available Persian-English parallel corpora:
  TEP
  KDE
  OpenSubtitles

I've noticed that other people, especially in Tehran, are also working on MT and
collecting data, e.g.
  http://ece.ut.ac.ir/iis/resources.html

Kind Regards
Hieu

On 12 December 2012 21:15, Khamesi Fahime <khamesi_fahime at yahoo.com> wrote:

Hi,
I am a student of linguistics in Iran, and I am working on English-to-Persian
statistical machine translation.

Unfortunately, I haven't found any EN-PER corpora except TEP and ELRA.

There are many restrictions in Iran (boycott) on ordering from ELRA.
I would appreciate it if you could help me in this respect.

I look forward to your reply.

Best regards,

Khamesi


_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora

