[Corpora-List] Bilingual Dictionary from Comparable Corpora

Emad Mohamed emohamed at umail.iu.edu
Sun Oct 5 21:51:50 UTC 2014


Hi Javid,
If you have a parallel corpus, you can use an alignment tool such as GIZA++
or fast_align to get word alignments. You can run the alignment in both
directions to get a bi-directional dictionary.
If your corpus is comparable (which I understand as not completely
parallel, but I may be wrong here), then a tool like hunalign can help
extract the parallel sentences, which you can then use for building your
dictionary.
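
Once you have word alignments, turning them into a probabilistic dictionary
can be as simple as counting aligned word pairs. Below is a minimal sketch of
that step, not what Moses or any other toolkit does internally. It assumes
whitespace-tokenized sentence pairs and Pharaoh-style "i-j" links (source
index, target index), which is the format fast_align writes. The function
name lexical_table and the toy sentences are made up for illustration.

```python
from collections import Counter, defaultdict


def lexical_table(sentence_pairs, alignments, reverse=False):
    """Estimate p(target word | source word) by relative frequency.

    sentence_pairs: list of (source_sentence, target_sentence) strings,
                    already tokenized on whitespace.
    alignments:     list of strings such as "0-0 1-1", one per sentence pair,
                    where each "i-j" links source token i to target token j.
    reverse:        if True, estimate p(source word | target word) instead.
    """
    pair_counts = defaultdict(Counter)
    for (src, tgt), links in zip(sentence_pairs, alignments):
        src_words, tgt_words = src.split(), tgt.split()
        for link in links.split():
            i, j = map(int, link.split("-"))
            s, t = src_words[i], tgt_words[j]
            if reverse:
                s, t = t, s
            pair_counts[s][t] += 1

    # Normalize the counts for each head word into a probability distribution.
    table = {}
    for s, counts in pair_counts.items():
        total = sum(counts.values())
        table[s] = {t: c / total for t, c in counts.items()}
    return table


# Toy data, invented for this example only.
pairs = [
    ("das haus", "the house"),
    ("das buch", "the book"),
    ("das haus", "that house"),
]
links = ["0-0 1-1", "0-0 1-1", "0-0 1-1"]

forward = lexical_table(pairs, links)                 # p(english | german)
backward = lexical_table(pairs, links, reverse=True)  # p(german | english)
```

Here forward["das"] holds the candidate translations of "das" with their
relative frequencies, and backward gives the other direction. On real data
you would probably also want to drop rare pairs or keep only links that both
alignment directions agree on.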
HTH,
Emad

On Sun, Oct 5, 2014 at 1:00 PM, javid dadashkarimi <
javiddadashkarimi at gmail.com> wrote:

> Dear Ramesh,
> I only want to extract a dictionary from an aligned bilingual corpus. I
> know that Moses can do it for a parallel, sentence-aligned corpus,
> but can tools like SketchEngine or Tshwanelex extract this kind of
> knowledge?
> Best,
> Javid
>
> On Sun, Oct 5, 2014 at 7:23 PM, Krishnamurthy, Ramesh <
> r.krishnamurthy at aston.ac.uk> wrote:
>
>> hi javid
>> not sure quite what you want,
>> but i'd suggest contacting the
>> people at SketchEngine
>> http://www.sketchengine.co.uk/
>> and Tshwanelex
>> http://tshwanedje.com/tshwanelex/
>> best
>> ramesh
>> -------------
>> Date: Sat, 4 Oct 2014 15:11:02 +0330
>> From: javid dadashkarimi <javiddadashkarimi at gmail.com>
>> Subject: [Corpora-List] Bilingual Dictionary from Comparable Corpora
>> To: corpora at uib.no, gate-users-request at lists.sourceforge.net
>>
>> Hi,
>> Is there any tool for extracting a probabilistic bilingual dictionary from
>> a bilingual comparable corpus? Does Moses support such a task?
>> Best,
>> Javid
>>
>
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>


-- 
Emad Mohamed