[Corpora-List] Corpus Development
Marco Baroni
marco.baroni at unitn.it
Tue Apr 29 14:24:10 UTC 2008
Dear Mark,
Thanks for your informative reply.
I think the query you propose is probably a good surrogate, but not
quite what I meant.
If I understand your syntax correctly, this...
> WORD(S): [break].[v*] (i.e. all forms of 'break' as a verb)
> CONTEXT: [nn*] (and also select [0] [5] )
would also find, e.g., "break with a hammer", "break and repair cars",
etc., whereas with a CQP query like
VERB ADV? DET? ADJ* NOUN
I would only find collocates that are plausible direct objects of the
verb (since only adverbs, articles, and adjectives can occur in
between, and not, e.g., prepositions or other verbs). I suspect
that something like this would be extremely tricky to implement in a
relational db while preserving efficiency, but I would be happy to be
proven wrong.
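
To make the contrast concrete, here is a minimal Python sketch. It is neither CQP nor the interface Mark describes. It assumes a toy one-letter tagset (V, R, D, J, N, P, C), tokens stored as (word, lemma, pos) tuples, and a reading of "[0] [5]" as "0 words to the left, 5 to the right"; all of these, like the function names, are hypothetical choices made for illustration only.

```python
# Toy tagset (an assumption, not a real corpus tagset):
# V verb, R adverb, D determiner, J adjective, N noun,
# P preposition, C conjunction.

def window_nouns(tokens, lemma, right=5):
    """Nouns within `right` tokens after any verbal form of `lemma`."""
    hits = []
    for i, (_, lem, pos) in enumerate(tokens):
        if lem == lemma and pos == "V":
            for word, _, p in tokens[i + 1 : i + 1 + right]:
                if p == "N":
                    hits.append(word)
    return hits

def pattern_nouns(tokens, lemma):
    """Nouns matching VERB ADV? DET? ADJ* NOUN after a form of `lemma`."""
    hits = []
    n = len(tokens)
    for i, (_, lem, pos) in enumerate(tokens):
        if lem != lemma or pos != "V":
            continue
        j = i + 1
        if j < n and tokens[j][2] == "R":      # optional adverb
            j += 1
        if j < n and tokens[j][2] == "D":      # optional determiner
            j += 1
        while j < n and tokens[j][2] == "J":   # any number of adjectives
            j += 1
        if j < n and tokens[j][2] == "N":      # the candidate object noun
            hits.append(tokens[j][0])
    return hits

hammer = [("break", "break", "V"), ("with", "with", "P"),
          ("a", "a", "D"), ("hammer", "hammer", "N")]
window = [("broke", "break", "V"), ("the", "the", "D"),
          ("old", "old", "J"), ("window", "window", "N")]

print(window_nouns(hammer, "break"), pattern_nouns(hammer, "break"))
print(window_nouns(window, "break"), pattern_nouns(window, "break"))
```

On these two toy sentences the window search returns "hammer" and "window", while the pattern search returns only "window", because the preposition in "break with a hammer" blocks the match. The sketch says nothing about how efficiently either search could run inside a relational database, which is the open question here.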
Regards,
Marco