MOR tier for German corpora
Lanna McRae
lannamcrae at gmail.com
Sat Aug 23 22:51:41 UTC 2025
Hi all,
I'm running some CLAN FREQ searches for my MA thesis and could use advice
on how to handle German verbs on the %mor tier. I'm trying to calculate
the frequency of transitive verbs in German child-directed speech, but I'm
running into problems with separable-prefix verbs.
On the %mor tier, I've noticed that separable-prefix verbs appear under the
correct lemma (e.g. *anrufen*) when they are in the infinitive. But in
finite forms where the prefix is separated, the prefix is often tagged as
an adverb/particle (e.g. adv|an v|rufen), so the verb lemma appears without
its prefix. Because of this, it's hard for me to get accurate lemma counts.
I’ve also come across instances where multiword sequences such as *darunter
bauen* (“build underneath it”) are analyzed as a pseudo-lemma like
*darunterbauen*.
- Is there a way in CLAN/MOR to consistently output the full verb lemma
(prefix + stem) without going line by line?
- Is there a recommended process to recombine prefix + stem (while avoiding
false lemmas like *darunterbauen*)?
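To make the second question concrete, here is a minimal Python sketch of the kind of recombination I have in mind. It is only an illustration, not an existing CLAN or MOR feature. It assumes that each %mor token has the form `pos|lemma` (optionally followed by a suffix after `-` or `&`), that a stranded prefix is the last token of the utterance and carries an `adv` or `ptl` tag, and that a hand-made whitelist (`SEPARABLE_PREFIXES`, a hypothetical name) decides which items may be reattached. The function names are also made up for this example.

```python
import re

# Hypothetical whitelist; extend it with the prefixes relevant to your data.
SEPARABLE_PREFIXES = {"ab", "an", "auf", "aus", "ein", "mit", "vor", "zu"}
# Assumed tags under which a stranded prefix may appear on the %mor tier.
PREFIX_TAGS = {"adv", "ptl"}


def split_token(token):
    """Split an assumed 'pos|lemma-SUFFIX' token into (pos, lemma)."""
    pos, _, rest = token.partition("|")
    lemma = re.split(r"[-&]", rest, maxsplit=1)[0]
    return pos, lemma


def verb_lemmas(mor_line):
    """Return the verb lemmas of one %mor line, reattaching a final prefix."""
    # Tokens without '|' (e.g. punctuation) are ignored.
    tokens = [split_token(t) for t in mor_line.split() if "|" in t]
    lemmas = []
    for i, (pos, lemma) in enumerate(tokens):
        is_last = i == len(tokens) - 1
        if pos == "v":
            lemmas.append(lemma)
        elif (is_last and lemmas and pos in PREFIX_TAGS
              and lemma in SEPARABLE_PREFIXES):
            # Attach the utterance-final prefix to the nearest preceding verb.
            lemmas[-1] = lemma + lemmas[-1]
    return lemmas


print(verb_lemmas("pro|ich v|rufen pro|dich adv|an ."))
print(verb_lemmas("pro|ich v|bauen adv|darunter ."))
```

Because *darunter* is not on the whitelist, the second line keeps *bauen* as its lemma instead of producing *darunterbauen*. A whitelist like this would still over-attach in some cases (for example, a clause-final adverb that happens to look like a prefix), so the output would need spot-checking.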
As a separate problem, I’ve also noticed that some forms of *wissen* (like
*weiß*) are being tagged as *weißen* (“to whiten”) on the %mor tier.
- Is this a known issue in the German MOR grammar?
- Is there a standard fix for this in CLAN?
Thank you so much!
Sincerely,
Lanna