[Corpora-List] announcing pukwac and wackypedia
Linas Vepstas
linasvepstas at gmail.com
Mon Jan 4 14:46:04 UTC 2010
2010/1/4 Eric Atwell <csc6ea at leeds.ac.uk>:
> Marco, Linas,
>
> thanks for making available these dependency-parsed English corpora.
Welcome.
> What do you see these being used for? What are the useful applications of
> dependency-parsed treebanks?
I don't quite understand the stress on the word "dependency" --
is this a question about the need for parsed treebanks in general,
or for dependency-parsed treebanks?
Parsing still takes a significant amount of CPU time, so having
pre-parsed text saves a lot of work. Personally, I've used this
data for several tasks:
-- studying correlations between word-sense assignments and
grammatical structure (paper in preparation)
-- building up a statistical database to guide NL output
-- using pattern matching to perform question answering
-- using the parse as input to a knowledge-extraction task
(identifying entities and their properties/attributes)
-- using the parsed text to provide a substrate of
"common-sense" knowledge for use in automated
reasoning systems.
Note that all but the first task are essentially "AI" tasks, rather
than "linguistics" tasks. Certainly, the last set of tasks attracts
various kinds of commercial interest as well, for specialized
search engines and assistants of various sorts.
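To make the "statistical database" item a bit more concrete, here is a
minimal Python sketch that counts (head, relation, dependent) triples
from already-parsed text. The four-column tab-separated layout (token
index, word, head index, relation) and the function name are assumptions
made for illustration only; the actual corpus files may be laid out
differently.

```python
from collections import Counter

def count_dependency_triples(lines):
    """Count (head_word, relation, dependent_word) triples in one sentence.

    Assumed (hypothetical) columns per line, tab-separated:
    token index, word, head index (0 means root), relation label.
    """
    rows = [line.split("\t") for line in lines if line.strip()]
    # Map each token index to its word so head indices can be resolved.
    words = {int(row[0]): row[1] for row in rows}
    counts = Counter()
    for row in rows:
        head = int(row[2])
        if head != 0:  # skip the root, which has no head word
            counts[(words[head], row[3], row[1])] += 1
    return counts

# A made-up three-word sentence in the assumed layout.
sample = [
    "1\tdogs\t2\tnsubj",
    "2\tchase\t0\troot",
    "3\tcats\t2\tdobj",
]
triples = count_dependency_triples(sample)
```

Summing such counters over a whole corpus gives the raw frequencies
that could then guide NL output or feed simple pattern matching.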
--linas
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora