vocd (lemmatized version)
Jeanine Treffers-Daller
j.c.treffers-daller at reading.ac.uk
Mon Nov 6 22:01:09 UTC 2017
Hi Leonid and all readers
I am using vocd on the dependent tier and it works fine. I have added a
third switch, +s"*|*~%%", which is not in the manual, to exclude the group
of suffixes marked by ~. Perhaps this is something that could be added to
the manual?
I also think that computing a lemmatized vocd on the dependent tier leads
to some distortions because the grammatical information to the left of the
pipe separator is left in the text. For example: "running" is marked as
"n:gerund|run" and participles are marked "part|go". Any other forms of
these verbs would have different codes to the left of the pipe separator. I
assume that in the case of "run" this would be "v|run" and for "go" this
would be "v|go". So vocd will treat the two forms of these verbs as
different types. That is not what we want in lemmatized analyses of lexical
diversity. The vocd score is inflated because of this.
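To illustrate the inflation, here is a minimal Python sketch (not part of CLAN). The function name count_types is hypothetical, and "v|run" and "v|go" are the codes I assume for the finite forms, as stated above.

```python
def count_types(codes, lemma_only=False):
    """Count distinct types in a list of %mor codes such as 'v|run'."""
    if lemma_only:
        # Keep only what follows the pipe separator, i.e. the lemma.
        codes = [code.split("|", 1)[-1] for code in codes]
    return len(set(codes))

# Codes from the examples above; v|run and v|go are assumed, not taken from output.
codes = ["n:gerund|run", "v|run", "part|go", "v|go"]

full_types = count_types(codes)                     # part of speech kept
lemma_types = count_types(codes, lemma_only=True)   # part of speech dropped

print(full_types, lemma_types)  # prints: 4 2
```

With the part-of-speech prefix kept, two lemmas are counted as four types, which is what pushes the type count and therefore D upwards.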
Would it be possible to also erase the information to the left of the pipe
separator when running vocd on the mor tier? That would solve the problem.
I have tried to create a command to do this, but it doesn't seem to work. I
added +s"*%%|*".
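As a possible workaround outside CLAN, the %mor codes could be reduced to bare lemmas before any diversity measure is computed. The following Python sketch is only an illustration under my own assumptions: the names lemma_of and lemmatize_line are hypothetical, suffix material is taken to start at the first "-", "&" or "~" (the markers targeted by the switches in my command), and compounds are not handled.

```python
import re

# Assumption: suffixes, fusions and clitics start at the first '-', '&' or '~'.
SUFFIX_START = re.compile(r"[-&~]")

def lemma_of(code):
    """Return the bare lemma of one %mor code, e.g. 'n:gerund|run' -> 'run'."""
    stem = code.split("|", 1)[-1]                   # drop everything left of the pipe
    return SUFFIX_START.split(stem, maxsplit=1)[0]  # drop suffix material

def lemmatize_line(mor_line):
    """Reduce a whitespace-separated line of %mor codes to a list of lemmas."""
    return [lemma_of(code) for code in mor_line.split()]

# A fragment in the style of the output below.
lemmas = lemmatize_line("v|come n:gerund|run adj|past conj|and prep|into")
print(lemmas)  # prints: ['come', 'run', 'past', 'and', 'into']
```

After this step "n:gerund|run" and "v|run" both become "run", so they count as one type.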
The output I am getting without the switch I tried to add is given below.
The file itself is also attached.
Another unrelated point I'd like to flag up is that the complementizer
"that" is classified as a relative pronoun in the data set. Is it possible
to change this perhaps?
Thanks a lot for your help!
Best wishes
Jeanine
vocd +t%mor -t* +s"*|*-%%" +s"*|*&%%" +s"*|*~%%" +s"*|%%*" -s@exclude.cut @
Mon Nov 06 21:42:01 2017
vocd (20-Oct-2017) is conducting analyses on:
ONLY dependent tiers matching: %MOR;
****************************************
From file <c:\DOCUMENTS\GRONINGEN\LLP\NATIVE SPEAKERS\post\E103.mor.pst.cex>
det|a adj|small n|boy coord|and pro:poss:det|his n|dad cop|be prep|by det|a
n|large n|lake adj|throw n|stick prep|in pro|it prep|for pro:poss:det|their
n|dog
det|the adj|little n|boy adj|call n:prop|Sam cop|be adj|happy n:gerund|play
prep|with pro:poss:det|his n|dog coord|and det|the n|dog v|swim prep|across
det|the n|lake adj|fetch det|the n|stick prep|for pro:obj|him
det|a adv:int|rather adv|well inf|to v|do n|gentleman v|walk adv|by
coord|and v|think rel|that pro:sub|he mod|will v|like inf|to v|join adv|in
prep|with n:prop|Sam coord|and pro:poss:det|his n|dad coord|and
pro:poss:det|his n|dog
pro:sub|he v|show pro:poss:det|his n|stick prep|to det|the n|dog
pro:indef|everybody v|look prep|at pro:obj|him adv:int|rather
in#adv|credulous
conj|but pro:indef|none det|the adj|less pro:sub|he v|decide inf|to v|join
prep|in det|the n|fun coord|and pro:sub|he v|throw pro:poss:det|his
adj|walk n|stick adv|far adv|away prep|into det|the n|middle prep|of
det|the n|lake
n:prop|Sam coord|and pro:poss:det|his n|dad coord|and pro:poss:det|his
n|dog adv|still mod|do adv:int|quite v|understand rel|what part|go adv|on
coord|and pro:sub|they v|walk adv|away prep|as det|the coord|and det|the
prep|as det|the n|dog mod|do v|want inf|to v|chase det|the n|stick
det|the adj|poor n|gentleman cop|be adv|left adv|there prep|by
pro:refl|himself inf|to v|take prep|off pro:poss:det|his n:pt|clothes
coord|and v|go part|swim prep|in det|the n|lake inf|to v|retrieve
pro:poss:det|his n|stick
n:prop|Sam aux|be part|play prep|on det|the n|pavement adj|outside det|a
n|bank pro:indef|one n|day
conj|as pro:sub|he part|play prep|with pro:poss:det|his n|toy det|a n|man
v|come n:gerund|run adj|past conj|and prep|into det|the n|bank
n:gerund|push n:prop|Sam adv|over adv|as pro:sub|he v|go
n:prop|Sam v|drop pro:poss:det|his n|toy coord|and cop|be qn|most n|upset
pro:poss:det|his n|dad v|run adv|over prep|to pro:obj|him inf|to v|see
rel|what det|the n|matter
coord|and n:prop|Sam v|tell pro:obj|him prep|of det|the adj|rude n|man
pro:dem|that adv:int|just v|push pro:obj|him adv|over conj|and adv|up
coord|and v|spoil pro:poss:det|his n|game
pro:poss:det|his n|dad n|march prep|into det|the n|bank prep|with
n:prop|Sam
det|the n|man rel|that adv:int|just part|push n:prop|Sam prep|over aux|be
part|hold det|the n|bank prep|up adj|use n|gun coord|and pro:indef|everyone
prep|in det|the n|bank adj|frighten
conj|but n:prop|Sam coord|and pro:poss:det|his n|dad mod|do adv:int|even
v|notice det|this pro:sub|they co|so n|cross prep|with pro:obj|him
n:prop|Sam n|dad v|slap det|the n|man
coord|and pro:sub|he v|drop pro:poss:det|his n|gun coord|and n:prop|Sam
adv:int|very adj|happy coord|and part|clap
n:prop|Sam n|dad co|so n|cross prep|with pro:obj|him rel|that det|the
n|bank det|the n|bank n|work cop|be adv|real v|impress rel|that pro:sub|he
aux|have aux|be part|frighten prep|of det|the n|gun coord|and n|thing
pro:indef|everyone v|applaud n:prop|Sam coord|and pro:poss:det|his n|dad
prep|for pro:poss:det|their n|bravery
tokens samples ttr st.dev D
35 100 0.7886 0.062 51.470
36 100 0.7678 0.062 45.692
37 100 0.7589 0.064 44.198
38 100 0.7758 0.053 51.002
39 100 0.7590 0.059 46.604
40 100 0.7552 0.067 46.611
41 100 0.7524 0.063 46.883
42 100 0.7560 0.057 49.174
43 100 0.7428 0.068 46.120
44 100 0.7416 0.057 46.821
45 100 0.7382 0.058 46.841
46 100 0.7233 0.050 43.476
47 100 0.7426 0.056 50.331
48 100 0.7221 0.054 45.027
49 100 0.7314 0.053 48.803
50 100 0.7238 0.055 47.419
D: average = 47.279; std dev. = 2.250
D_optimum <47.19; min least sq val = 0.001>
tokens samples ttr st.dev D
35 100 0.7674 0.058 44.316
36 100 0.7792 0.067 49.484
37 100 0.7654 0.067 46.199
38 100 0.7692 0.066 48.711
39 100 0.7715 0.054 50.809
40 100 0.7485 0.064 44.553
41 100 0.7456 0.061 44.800
42 100 0.7464 0.061 46.142
43 100 0.7477 0.057 47.632
44 100 0.7307 0.059 43.613
45 100 0.7462 0.053 49.370
46 100 0.7361 0.062 47.220
47 100 0.7306 0.056 46.573
48 100 0.7260 0.048 46.180
49 100 0.7218 0.065 45.893
50 100 0.7182 0.053 45.760
D: average = 46.703; std dev. = 1.981
D_optimum <46.63; min least sq val = 0.001>
tokens samples ttr st.dev D
35 100 0.7711 0.066 45.472
36 100 0.7614 0.066 43.732
37 100 0.7708 0.059 47.959
38 100 0.7479 0.063 42.155
39 100 0.7585 0.062 46.442
40 100 0.7650 0.064 49.806
41 100 0.7493 0.060 45.901
42 100 0.7364 0.057 43.210
43 100 0.7347 0.065 43.730
44 100 0.7350 0.057 44.849
45 100 0.7227 0.054 42.370
46 100 0.7261 0.065 44.268
47 100 0.7357 0.059 48.139
48 100 0.7352 0.055 48.992
49 100 0.7243 0.060 46.615
50 100 0.7162 0.062 45.185
D: average = 45.552; std dev. = 2.244
D_optimum <45.50; min least sq val = 0.001>
VOCD RESULTS SUMMARY
====================
Types,Tokens,TTR: <138,320,0.431250>
D_optimum values: <47.19, 46.63, 45.50>
D_optimum average: 46.44
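As a quick sanity check on the summary, the TTR and the D_optimum average can be recomputed from the figures printed above. This is a small Python sketch of the arithmetic only; it does not reproduce how vocd estimates D.

```python
# Values copied from the VOCD RESULTS SUMMARY above.
types, tokens = 138, 320
d_optimum = [47.19, 46.63, 45.50]

ttr = types / tokens                         # type-token ratio
d_average = sum(d_optimum) / len(d_optimum)  # mean of the three D_optimum runs

print(f"TTR = {ttr:.6f}")                      # prints: TTR = 0.431250
print(f"D_optimum average = {d_average:.2f}")  # prints: D_optimum average = 46.44
```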
-------------- next part --------------
A non-text attachment was scrubbed...
Name: E103.mor.pst.cex
Type: application/octet-stream
Size: 5630 bytes
Desc: not available
URL: <http://listserv.linguistlist.org/pipermail/chibolts/attachments/20171106/f5c14e1d/attachment.obj>