[Corpora-List] announcing pukwac and wackypedia

Linas Vepstas linasvepstas at gmail.com
Mon Jan 4 15:17:38 UTC 2010


2010/1/4 Eric Atwell <csc6ea at leeds.ac.uk>:
> What do you see these being used for? What are the useful applications of
> dependency-parsed treebanks?

My apologies in advance for excessive posting, but this seems important:

-- In August 2008, there was a long "Bootcamp" discussion on this mailing
list, on the meaning and nature of corpus linguistics, which was provoked
by Bill Louw when he more or less proclaimed that collocation is everything
to corpus linguistics.

I tried (but failed) to get a point across: "gee wouldn't it be neat if one
could do collocation, concordance, *and* have tags indicating whether
a word had been identified as a subject, object, prepositional object,
etc."  Exactly what sort of discoveries might be made more easily,
as a result of this, I don't know. What sort of traps and pitfalls might
await from using such markup -- especially when the markup is faulty --
is fertile ground for debate. But, in general, performing collocation-like
analysis of tagged, structured text seems to provide extra possibilities
that aren't easily available from bare-naked text.
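
To make the idea concrete, here is a minimal sketch in Python of what
"collocation plus relation tags" could look like. Everything in it is an
assumption for illustration: the (head, relation, dependent) triple format,
the relation names "subj" and "obj", the toy data, and the helper name
collocates() are all hypothetical and not taken from any particular parser
or corpus.

```python
from collections import Counter

# Hypothetical toy data: (head, relation, dependent) triples of the kind a
# dependency parser might emit. The relation labels are invented for this
# sketch and do not follow any specific parser's tag set.
TRIPLES = [
    ("eat", "subj", "dog"),
    ("eat", "obj", "bone"),
    ("eat", "subj", "cat"),
    ("eat", "obj", "fish"),
    ("chase", "subj", "dog"),
    ("chase", "obj", "cat"),
    ("eat", "obj", "bone"),
]


def collocates(triples, head, relation=None):
    """Count words that depend on `head`.

    With relation=None this behaves like a plain co-occurrence count;
    with a relation given, only dependents in that role are counted.
    """
    counts = Counter()
    for h, rel, dep in triples:
        if h == head and (relation is None or rel == relation):
            counts[dep] += 1
    return counts


# Untagged view: everything that co-occurs with "eat".
plain = collocates(TRIPLES, "eat")

# Tagged views: the same counts, split by grammatical role.
objects = collocates(TRIPLES, "eat", "obj")
subjects = collocates(TRIPLES, "eat", "subj")

print(plain.most_common())
print(objects.most_common())
print(subjects.most_common())
```

The untagged count lumps eaters and things eaten together, while the
tagged counts separate them. If the markup is faulty, the split counts
inherit those errors, which is exactly the pitfall mentioned above.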

--linas

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


