[Corpora-List] structured data (enu | csy) for IE needed

Paul Buitelaar paulb at dfki.de
Thu Jan 25 10:52:35 UTC 2007


Filip, in the context of the SmartWeb project 
(http://www.smartweb-projekt.de/) we are developing a data set in the 
football domain consisting of textual (unstructured) and tabular 
(semi-structured) match reports. You can find more info at 
http://www2.dfki.de/sw-lt/olp2_dataset/

To get access to the data set just send me an email.

Cheers


   Paul Buitelaar
   DFKI GmbH - Language Technology Lab
   Saarbrücken, Germany

   http://www.dfki.de/~paulb/

>From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On
>Behalf Of Filip Malik
>Sent: Thursday, January 25, 2007 9:32 AM
>To: versley at sfs.uni-tuebingen.de; Filip Malik
>Cc: CORPORA at uib.no
>Subject: Re: [Corpora-List] structured data (enu | csy) for IE needed
>
>  
>
>> My guess would be that Wikipedia fits your description, where you 
>> will find many tables and/or templates, and it is available in 
>> English and Czech. I don't know if anyone has tried extracting 
>> specific information from that, though.
>
>
>Thanks, Yannick, for your suggestion. Your reply made me realize that I
>forgot to mention a very important condition: I need data from a fixed
>domain (e.g. house sales).
>
>Best regards,
>Filip Malik
>-fm
>


