<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">

<META content="MSHTML 6.00.2600.0" name=GENERATOR>

<STYLE></STYLE>

</HEAD>

<BODY bgColor=#ffffff>

<DIV><FONT face=Arial size=2>Dear all, </FONT></DIV>

<DIV><FONT face=Arial size=2></FONT> </DIV>

<DIV><FONT face=Arial size=2>I've recently come across Frank Smadja et al.'s 

Xtract and Champollion and I wonder if the two programs are available for 

research purposes. </FONT></DIV>

<DIV><FONT face=Arial size=2></FONT> </DIV>

<DIV><FONT face=Arial size=2>I'm now doing test runs with Mike Barlow's ParaConc 

and Collocate as well as Bill Fletcher's kfngrams on an (admittedly) rather small 

sentence-aligned German-English parallel corpus (about 10,000 words each). For 

my Ph.D., however, I am planning to work with Philipp Koehn's EU proceedings 

with about 11 million words each (does anyone know whether it is also available 

in tagged form?).</FONT></DIV>

<DIV><FONT face=Arial size=2></FONT> </DIV>

<DIV><FONT face=Arial size=2>Furthermore, following the discussion on legal 

aspects of corpus compilation and exploitation on this list, I'd like to know 

if there are any legal problems concerning the use of the EU texts for (Ph.D.) 

research work.</FONT></DIV>

<DIV><FONT face=Arial size=2></FONT> </DIV>

<DIV><FONT face=Arial size=2>Thanks</FONT></DIV>

<DIV><FONT face=Arial size=2></FONT> </DIV>

<DIV><FONT face=Arial size=2>Philippa</FONT></DIV>

<DIV><FONT face=Arial size=2></FONT> </DIV>

<DIV><FONT face=Arial size=2></FONT> </DIV></BODY></HTML>