[Corpora-List] Collocations from parallel corpora

Philippa Maurer-Stroh aon.912130291 at aon.at
Wed Jul 16 10:13:58 UTC 2003


Dear all, 

I've recently come across Frank Smadja et al.'s Xtract and Champollion, and I wonder whether the two programs are available for research purposes.

I'm now doing test runs with Mike Barlow's ParaConc and Collocate as well as Bill Fletcher's kfngrams on an (admittedly) rather small sentence-aligned German-English parallel corpus (about 10,000 words per language). For my Ph.D., however, I am planning to work with Philipp Koehn's EU proceedings, with about 11 million words per language (does anyone know if it's also available already tagged?).

Furthermore, following the discussion on legal aspects of corpus compilation & exploitation on this list, I'd like to know if there are any legal problems concerning the use of the EU texts for (Ph.D.) research work?

Thanks

Philippa



