[Corpora-List] structured data (enu | csy) for IE needed

José Manuel Martínez Martínez pitragoras at yahoo.es
Thu Jan 25 11:39:23 UTC 2007


Hello,
Another interesting site could be that of the European Parliament. You will find
  both English and Czech versions of debates, reports, motions and so on.
http://www.europarl.europa.eu/activities/expert.do?language=EN
The scope of the documents available may be too broad for your purposes, though.
Best regards,

Jose Manuel Martinez Martinez

jmm

Filip Malik wrote:
> Hello all,
> 
> for my graduation thesis, I need a set of structured data for some experiments.
> The data set should consist of XML files, HTML files or any other hypertext-based files.
> The next requirement is "highly structured data". This means that I'm not interested
> in data structured like the following example:
> <p>Paragraph, many words in same tag</p>
> I'm looking for data that are more structured, like this example:
> <t> <tag2>Few words (up to 10)</tag2> <tag3>Few words (up to 10)</tag3> </t>
> The last requirement is that the data be in English or Czech.
> 
> I hope that somebody who reads Corpora has used a similar data set that
> could be reused. My goal is IE from hypertext, using both the content and the
> structure of the data.
> 
> Thanks and regards, 
> Filip Malik
> 
> -fm 


		


