vocd (lemmatized version)

Jeanine Treffers-Daller j.c.treffers-daller at reading.ac.uk
Mon Nov 6 22:01:09 UTC 2017


Hi Leonid and all readers

I am using vocd on the dependent tier and it works fine, but I have added
a third switch, +s"*|*~%%", which is not in the manual, to exclude a group
of suffixes marked by ~. Perhaps this is something that could be added to
the manual?

I also think that computing a lemmatized vocd on the dependent tier leads
to some distortions, because the grammatical information to the left of the
pipe separator is left in the text. For example, "running" is marked as
"n:gerund|run" and participles of "go" are marked as "part|go". Any other
forms of these verbs would have different codes to the left of the pipe
separator: I assume "v|run" in the case of "run" and "v|go" in the case of
"go". So vocd will treat the different forms of the same verb as different
types. That is not what we want in lemmatized analyses of lexical
diversity, and the vocd score is inflated because of it.

Would it be possible to also erase the information to the left of the pipe
separator when running vocd on the mor tier? That would solve the problem.
I have tried to do this myself by adding the switch +s"*%%|*", but it
doesn't seem to work.
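
A minimal sketch of the effect described above, written in Python rather
than as a CLAN switch (it is an illustration only, not something vocd
itself does). The helper names strip_pos and count_types are made up for
this example, and the four tokens are taken from the %mor output further
down. It assumes every token has the simple form "category|stem"; prefixes
such as "in#" and compound forms would need extra handling.

```python
from typing import Iterable


def strip_pos(token: str) -> str:
    """Drop everything up to and including the first pipe.

    'n:gerund|run' -> 'run'. Tokens without a pipe are returned unchanged.
    """
    return token.split("|", 1)[1] if "|" in token else token


def count_types(tokens: Iterable[str]) -> int:
    """Number of distinct strings, i.e. what a type count would see."""
    return len(set(tokens))


# Two forms each of "run" and "go", as coded on the %mor tier.
tokens = ["v|run", "n:gerund|run", "v|go", "part|go"]

types_with_pos = count_types(tokens)                             # 4
types_lemma_only = count_types(strip_pos(t) for t in tokens)     # 2

print(types_with_pos, types_lemma_only)
```

With the category kept, the four tokens count as four types; with it
removed, they collapse to two. Note that dropping the category would also
merge homographs from different word classes that share a stem, which may
or may not be wanted.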

The output I am getting without the switch I tried to add is given below. 
The file itself is also attached.

Another, unrelated point I'd like to flag up is that the complementizer
"that" is classified as a relative pronoun in the data set (for example,
"v|think rel|that pro:sub|he" in the output below). Would it be possible
to change this?

Thanks a lot for your help!
Best wishes
Jeanine

vocd +t%mor -t* +s"*|*-%%" +s"*|*&%%" +s"*|*~%%" +s"*|%%*" -s@exclude.cut @
Mon Nov 06 21:42:01 2017
vocd (20-Oct-2017) is conducting analyses on:
  ONLY dependent tiers matching: %MOR;
****************************************
From file <c:\DOCUMENTS\GRONINGEN\LLP\NATIVE SPEAKERS\post\E103.mor.pst.cex>
det|a adj|small n|boy coord|and pro:poss:det|his n|dad cop|be prep|by det|a 
n|large n|lake adj|throw n|stick prep|in pro|it prep|for pro:poss:det|their 
n|dog 
det|the adj|little n|boy adj|call n:prop|Sam cop|be adj|happy n:gerund|play 
prep|with pro:poss:det|his n|dog coord|and det|the n|dog v|swim prep|across 
det|the n|lake adj|fetch det|the n|stick prep|for pro:obj|him 
det|a adv:int|rather adv|well inf|to v|do n|gentleman v|walk adv|by 
coord|and v|think rel|that pro:sub|he mod|will v|like inf|to v|join adv|in 
prep|with n:prop|Sam coord|and pro:poss:det|his n|dad coord|and 
pro:poss:det|his n|dog 
pro:sub|he v|show pro:poss:det|his n|stick prep|to det|the n|dog 
pro:indef|everybody v|look prep|at pro:obj|him adv:int|rather 
in#adv|credulous 
conj|but pro:indef|none det|the adj|less pro:sub|he v|decide inf|to v|join 
prep|in det|the n|fun coord|and pro:sub|he v|throw pro:poss:det|his 
adj|walk n|stick adv|far adv|away prep|into det|the n|middle prep|of 
det|the n|lake 
n:prop|Sam coord|and pro:poss:det|his n|dad coord|and pro:poss:det|his 
n|dog adv|still mod|do adv:int|quite v|understand rel|what part|go adv|on 
coord|and pro:sub|they v|walk adv|away prep|as det|the coord|and det|the 
prep|as det|the n|dog mod|do v|want inf|to v|chase det|the n|stick 
det|the adj|poor n|gentleman cop|be adv|left adv|there prep|by 
pro:refl|himself inf|to v|take prep|off pro:poss:det|his n:pt|clothes 
coord|and v|go part|swim prep|in det|the n|lake inf|to v|retrieve 
pro:poss:det|his n|stick 
n:prop|Sam aux|be part|play prep|on det|the n|pavement adj|outside det|a 
n|bank pro:indef|one n|day 
conj|as pro:sub|he part|play prep|with pro:poss:det|his n|toy det|a n|man 
v|come n:gerund|run adj|past conj|and prep|into det|the n|bank 
n:gerund|push n:prop|Sam adv|over adv|as pro:sub|he v|go 
n:prop|Sam v|drop pro:poss:det|his n|toy coord|and cop|be qn|most n|upset 
pro:poss:det|his n|dad v|run adv|over prep|to pro:obj|him inf|to v|see 
rel|what det|the n|matter 
coord|and n:prop|Sam v|tell pro:obj|him prep|of det|the adj|rude n|man 
pro:dem|that adv:int|just v|push pro:obj|him adv|over conj|and adv|up 
coord|and v|spoil pro:poss:det|his n|game 
pro:poss:det|his n|dad n|march prep|into det|the n|bank prep|with 
n:prop|Sam 
det|the n|man rel|that adv:int|just part|push n:prop|Sam prep|over aux|be 
part|hold det|the n|bank prep|up adj|use n|gun coord|and pro:indef|everyone 
prep|in det|the n|bank adj|frighten 
conj|but n:prop|Sam coord|and pro:poss:det|his n|dad mod|do adv:int|even 
v|notice det|this pro:sub|they co|so n|cross prep|with pro:obj|him 
n:prop|Sam n|dad v|slap det|the n|man 
coord|and pro:sub|he v|drop pro:poss:det|his n|gun coord|and n:prop|Sam 
adv:int|very adj|happy coord|and part|clap 
n:prop|Sam n|dad co|so n|cross prep|with pro:obj|him rel|that det|the 
n|bank det|the n|bank n|work cop|be adv|real v|impress rel|that pro:sub|he 
aux|have aux|be part|frighten prep|of det|the n|gun coord|and n|thing 
pro:indef|everyone v|applaud n:prop|Sam coord|and pro:poss:det|his n|dad 
prep|for pro:poss:det|their n|bravery 

tokens  samples    ttr     st.dev      D
  35      100    0.7886    0.062     51.470
  36      100    0.7678    0.062     45.692
  37      100    0.7589    0.064     44.198
  38      100    0.7758    0.053     51.002
  39      100    0.7590    0.059     46.604
  40      100    0.7552    0.067     46.611
  41      100    0.7524    0.063     46.883
  42      100    0.7560    0.057     49.174
  43      100    0.7428    0.068     46.120
  44      100    0.7416    0.057     46.821
  45      100    0.7382    0.058     46.841
  46      100    0.7233    0.050     43.476
  47      100    0.7426    0.056     50.331
  48      100    0.7221    0.054     45.027
  49      100    0.7314    0.053     48.803
  50      100    0.7238    0.055     47.419

D: average = 47.279; std dev. = 2.250
D_optimum     <47.19; min least sq val = 0.001> 

tokens  samples    ttr     st.dev      D
  35      100    0.7674    0.058     44.316
  36      100    0.7792    0.067     49.484
  37      100    0.7654    0.067     46.199
  38      100    0.7692    0.066     48.711
  39      100    0.7715    0.054     50.809
  40      100    0.7485    0.064     44.553
  41      100    0.7456    0.061     44.800
  42      100    0.7464    0.061     46.142
  43      100    0.7477    0.057     47.632
  44      100    0.7307    0.059     43.613
  45      100    0.7462    0.053     49.370
  46      100    0.7361    0.062     47.220
  47      100    0.7306    0.056     46.573
  48      100    0.7260    0.048     46.180
  49      100    0.7218    0.065     45.893
  50      100    0.7182    0.053     45.760

D: average = 46.703; std dev. = 1.981
D_optimum     <46.63; min least sq val = 0.001> 

tokens  samples    ttr     st.dev      D
  35      100    0.7711    0.066     45.472
  36      100    0.7614    0.066     43.732
  37      100    0.7708    0.059     47.959
  38      100    0.7479    0.063     42.155
  39      100    0.7585    0.062     46.442
  40      100    0.7650    0.064     49.806
  41      100    0.7493    0.060     45.901
  42      100    0.7364    0.057     43.210
  43      100    0.7347    0.065     43.730
  44      100    0.7350    0.057     44.849
  45      100    0.7227    0.054     42.370
  46      100    0.7261    0.065     44.268
  47      100    0.7357    0.059     48.139
  48      100    0.7352    0.055     48.992
  49      100    0.7243    0.060     46.615
  50      100    0.7162    0.062     45.185

D: average = 45.552; std dev. = 2.244
D_optimum     <45.50; min least sq val = 0.001> 

VOCD RESULTS SUMMARY
====================
   Types,Tokens,TTR:  <138,320,0.431250>
  D_optimum  values:  <47.19, 46.63, 45.50>
  D_optimum average:  46.44
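
As a rough cross-check of the figures above, the sketch below assumes the
curve usually given for D in descriptions of vocd,
TTR = (D/N) * (sqrt(1 + 2N/D) - 1), and inverts it in closed form. Whether
the program computes each row in exactly this way is an assumption; the
function names are hypothetical, and all numbers are copied from the
output above.

```python
import math


def ttr_model(n: int, d: float) -> float:
    """TTR predicted for a sample of n tokens under the assumed D curve."""
    return (d / n) * (math.sqrt(1.0 + 2.0 * n / d) - 1.0)


def d_from_ttr(n: int, ttr: float) -> float:
    """Closed-form inverse of ttr_model.

    The curve simplifies to TTR = 2 / (1 + sqrt(1 + 2n/d)), so
    d = 2n / ((2/TTR - 1)**2 - 1), valid for 0 < TTR < 1.
    """
    return 2.0 * n / ((2.0 / ttr - 1.0) ** 2 - 1.0)


# Summary line: Types, Tokens, TTR
types, tokens = 138, 320
overall_ttr = types / tokens
print(round(overall_ttr, 6))        # 0.43125

# Average of the three D_optimum values
d_optimum = [47.19, 46.63, 45.50]
d_average = sum(d_optimum) / len(d_optimum)
print(round(d_average, 2))          # 46.44

# First row of the first trial: N = 35, mean TTR = 0.7886.
# The result should land close to the 51.470 printed in that row; the
# small gap comes from the TTR being rounded to four decimals.
row_d = d_from_ttr(35, 0.7886)
print(round(row_d, 2))
```

Because D is very sensitive to small changes in TTR at these sample
sizes, agreement to within a few hundredths is about as close as values
rounded to four decimals allow.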

-------------- next part --------------
A non-text attachment was scrubbed...
Name: E103.mor.pst.cex
Type: application/octet-stream
Size: 5630 bytes
Desc: not available
URL: <http://listserv.linguistlist.org/pipermail/chibolts/attachments/20171106/f5c14e1d/attachment.obj>

