[Corpora-List] Incidence of MWEs
David Brooks
D.J.Brooks at cs.bham.ac.uk
Thu Mar 16 10:03:52 UTC 2006
Chris Butler wrote:
> I notice that recent postings on this topic are concerned largely with the
> matter of opacity of meaning in MWEs - Robert Amsler's working principle "if
> you can predict its meaning from its constituent parts, it
> doesn't need a separate entry" effectively equates MWE with the traditional
> idiom.
Yes, and I fear this is my fault. I realise there is some difference of
opinion on many of the matters discussed so far, and I perhaps should
have narrowed this topic down to something more manageable.
My focus being syntax, I'm interested in the implications of MWEs for
evaluating parsers. That is to say, if you get something like "light
pen" in a corpus, it may be analysed as an N-bar, whether as a compound
<N N> or as <Adj N>, but in principle the *syntax* will remain the same
(tag differences aside).
I would imagine this is not the case for "of course", which doesn't
strike me as a natural prepositional phrase. Likewise, "kick the bucket"
is *syntactically* a transitive verb phrase. So here is the core of my
original (underspecified) question: would it be tagged as a transitive
verb phrase, or as an MWE, perhaps an intransitive verb-like MWE?
The reason I ask is that for things like PARSEVAL, this is going to have
an impact on constituent bracket scores, and I was wondering to what
extent it had been investigated, and how noticeable the effect of MWEs
might be.
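To make the concern concrete, here is a minimal sketch of labelled
bracket scoring in the PARSEVAL style. Everything in it is an
assumption for illustration: the nested-tuple tree format, the function
names, and both annotations of "he kicked the bucket" are hypothetical
and not taken from any particular treebank or scoring tool. Spans are
kept in a set, so duplicate brackets from unary chains are not counted
separately.

```python
def brackets(tree, start=0):
    """Collect labelled spans (label, start, end) from a nested-tuple tree.

    A tree is (label, child, ...); a child is either a word (str) or a
    subtree. POS tags are left out, since PARSEVAL-style scoring
    conventionally ignores preterminal brackets.
    """
    spans = set()
    pos = start
    for child in tree[1:]:
        if isinstance(child, str):
            pos += 1                      # a word occupies one position
        else:
            sub, pos = brackets(child, pos)
            spans |= sub
    spans.add((tree[0], start, pos))
    return spans, pos


def parseval(gold, test):
    """Labelled bracket precision, recall and F1 of `test` against `gold`."""
    g, _ = brackets(gold)
    t, _ = brackets(test)
    matched = len(g & t)
    precision = matched / len(t)
    recall = matched / len(g)
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1


# Hypothetical gold annotation: the idiom is a flat, unanalysed VP.
gold = ("S", ("NP", "he"), ("VP", "kicked", "the", "bucket"))
# Hypothetical parser output: the ordinary transitive analysis.
test = ("S", ("NP", "he"), ("VP", "kicked", ("NP", "the", "bucket")))

precision, recall, f1 = parseval(gold, test)
print(precision, recall, round(f1, 3))
```

Under these assumed annotations the parser recovers all three gold
brackets but proposes one extra NP over "the bucket", so precision is
3/4 while recall stays at 1. Swapping the two trees would penalise
recall instead, which is why the treebank's MWE policy matters.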
So, I guess I'm principally interested in MWEs that cause a syntactic
variation (from the compositional norm), and whether or not they are
tagged in treebanks. Still, it's been quite an enlightening debate...
D
--
David Brooks
http://www.cs.bham.ac.uk/~djb