[Corpora-List] Portuguese thesaurus

Rob Freeman lists at chaoticlanguage.com
Mon May 9 06:48:54 UTC 2005


Hi Adam,

On Tuesday 03 May 2005 17:46, you wrote:
> ...We have
> some evidence that, for PP-attachment, for Spanish, a distributional
> thesaurus outperforms Spanish WordNet.

That's an interesting comment...

How do you use your "distributional thesaurus" to do PP-attachment?

I have a parsing engine which works along what may be the same lines.

In general I believe the approach should be more powerful than one based on
any fixed classification (like WordNet), on the grounds that you can store
more (and especially contradictory) grammatical/semantic information in a
distributed representation.
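
As a rough illustration of that point (a minimal sketch on invented toy data,
not the parser's actual implementation), a word described by its contexts can
be close to two groups of words at once, where a fixed class would force a
choice. The pairs, the function names and the use of cosine similarity are all
assumptions made for the example.

```python
from collections import Counter
from math import sqrt

# Toy (word, context) pairs. Purely illustrative; not drawn from any real corpus.
PAIRS = [
    ("bank", "deposit"), ("bank", "loan"), ("bank", "river"), ("bank", "muddy"),
    ("lender", "deposit"), ("lender", "loan"),
    ("shore", "river"), ("shore", "muddy"),
]

def context_vectors(pairs):
    """Map each word to a Counter of the contexts it was seen with."""
    vectors = {}
    for word, context in pairs:
        vectors.setdefault(word, Counter())[context] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def neighbours(word, vectors):
    """Rank all other words by similarity to `word`, most similar first."""
    target = vectors[word]
    scored = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scored, key=lambda item: item[1], reverse=True)

VECTORS = context_vectors(PAIRS)

if __name__ == "__main__":
    for other, score in neighbours("bank", VECTORS):
        print(f"bank ~ {other}: {score:.3f}")
```

Here "bank" is similar to both "lender" and "shore", although those two share
no contexts with each other; the distributional record keeps both readings
side by side.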

You can see a general discussion in:

Freeman R. J., Example-based Complexity--Syntax and Semantics as the
Production of Ad-hoc Arrangements of Examples, Proceedings of the ANLP/NAACL
2000 Workshop on Syntactic and Semantic Complexity in Natural Language
Processing Systems, pp. 47-50. (http://acl.ldc.upenn.edu/W/W00/W00-0108.pdf)

The parser has so far been implemented for English, Danish, and Chinese. I
haven't done much benchmarking against specific class-based systems, however.
I'd be interested to see your exact results.

Best regards,

Rob Freeman


