vocd (lemmatized version)

Jeanine Treffers-Daller j.c.treffers-daller at reading.ac.uk
Tue Nov 7 08:19:32 UTC 2017


Splendid, thanks very much, Leonid. Works beautifully!

On Monday, 6 November 2017 23:13:54 UTC, Leonid Spektor wrote:
>
> Jeanine,
>
> The manual and even VOCD's own "Examples" are a bit out of date. To get the 
> best lemma-based analysis you should use the following command:
>
> vocd +sm;*,o% @
>
>
> Leonid.
>
> On Nov 6, 2017, at 17:01, Jeanine Treffers-Daller <j.c.treff... at reading.ac.uk> wrote:
>
>
> Hi Leonid and all readers
>
> I am using vocd on the dependent tier and it works fine, but I have added a 
> third switch, +s"*|*~%%", which is not in the manual, to exclude a group of 
> suffixes marked by ~. Perhaps this could be added to the manual?
>
> I also think that computing a lemmatized vocd on the dependent tier leads 
> to some distortion, because the grammatical information to the left of the 
> pipe separator is left in the text. For example, "running" is coded as 
> "n:gerund|run" and the participle of "go" as "part|go". Other forms of 
> these verbs carry different codes to the left of the pipe separator: I 
> assume "v|run" for "run" and "v|go" for "go". So vocd will treat 
> "n:gerund|run" and "v|run", or "part|go" and "v|go", as different types, 
> which is not what we want in a lemmatized analysis of lexical diversity. 
> The vocd score is inflated as a result.
>
> Would it be possible to also erase the information to the left of the pipe 
> separator when running vocd on the mor tier? That would solve the problem. 
> I have tried to create a command to do this, but it doesn't seem to work. I 
> added +s"*%%|*". 
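To make the type-inflation point concrete, here is a minimal sketch in Python (outside CLAN, and not how vocd works internally). The helper names strip_pos and type_count are made up for illustration, and the token list is a hand-picked handful of codes that occur in the output further down. It only handles simple "pos|lemma" tokens, not compounds or tokens that still carry suffix markers.

```python
def strip_pos(mor_token: str) -> str:
    """Return what follows the last '|' in a simple %mor token.

    Assumption: the token has the shape 'pos|lemma' with no suffix
    markers left on it, e.g. 'n:gerund|run' -> 'run'.
    """
    return mor_token.rsplit("|", 1)[-1]


def type_count(tokens) -> int:
    """Number of distinct items (types) in a sequence of tokens."""
    return len(set(tokens))


# Codes taken from the %mor output in this message.
tokens = ["v|go", "part|go", "v|run", "n:gerund|run", "n|stick"]

with_pos = type_count(tokens)                           # prefix kept
lemmas_only = type_count(strip_pos(t) for t in tokens)  # prefix erased

print("types with part-of-speech prefix:", with_pos)
print("types after stripping the prefix:", lemmas_only)
```

With the prefix kept, "v|go" and "part|go" count as two types; with it erased they collapse into one. The flip side is that stripping everything left of the pipe also merges homographs from different word classes (for instance "adj|walk" and "v|walk" in this file), so whether that is wanted depends on the analysis.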
>
> The output I am getting without the switch I tried to add is given below. 
> The file itself is also attached.
>
> Another unrelated point I'd like to flag up is that the complementizer 
> "that" is classified as a relative pronoun in the data set. Is it possible 
> to change this perhaps?
>
> thanks a lot for your help!
> best wishes
> Jeanine
>
> vocd +t%mor -t* +s"*|*-%%" +s"*|*&%%" +s"*|*~%%" +s"*|%%*" -s... at exclude.cut @
> Mon Nov 06 21:42:01 2017
> vocd (20-Oct-2017) is conducting analyses on:
>   ONLY dependent tiers matching: %MOR;
> ****************************************
> From file <c:\DOCUMENTS\GRONINGEN\LLP\NATIVE SPEAKERS\post\E103.mor.pst.cex>
> det|a adj|small n|boy coord|and pro:poss:det|his n|dad cop|be prep|by 
> det|a n|large n|lake adj|throw n|stick prep|in pro|it prep|for 
> pro:poss:det|their n|dog 
> det|the adj|little n|boy adj|call n:prop|Sam cop|be adj|happy 
> n:gerund|play prep|with pro:poss:det|his n|dog coord|and det|the n|dog 
> v|swim prep|across det|the n|lake adj|fetch det|the n|stick prep|for 
> pro:obj|him 
> det|a adv:int|rather adv|well inf|to v|do n|gentleman v|walk adv|by 
> coord|and v|think rel|that pro:sub|he mod|will v|like inf|to v|join adv|in 
> prep|with n:prop|Sam coord|and pro:poss:det|his n|dad coord|and 
> pro:poss:det|his n|dog 
> pro:sub|he v|show pro:poss:det|his n|stick prep|to det|the n|dog 
> pro:indef|everybody v|look prep|at pro:obj|him adv:int|rather 
> in#adv|credulous 
> conj|but pro:indef|none det|the adj|less pro:sub|he v|decide inf|to v|join 
> prep|in det|the n|fun coord|and pro:sub|he v|throw pro:poss:det|his 
> adj|walk n|stick adv|far adv|away prep|into det|the n|middle prep|of 
> det|the n|lake 
> n:prop|Sam coord|and pro:poss:det|his n|dad coord|and pro:poss:det|his 
> n|dog adv|still mod|do adv:int|quite v|understand rel|what part|go adv|on 
> coord|and pro:sub|they v|walk adv|away prep|as det|the coord|and det|the 
> prep|as det|the n|dog mod|do v|want inf|to v|chase det|the n|stick 
> det|the adj|poor n|gentleman cop|be adv|left adv|there prep|by 
> pro:refl|himself inf|to v|take prep|off pro:poss:det|his n:pt|clothes 
> coord|and v|go part|swim prep|in det|the n|lake inf|to v|retrieve 
> pro:poss:det|his n|stick 
> n:prop|Sam aux|be part|play prep|on det|the n|pavement adj|outside det|a 
> n|bank pro:indef|one n|day 
> conj|as pro:sub|he part|play prep|with pro:poss:det|his n|toy det|a n|man 
> v|come n:gerund|run adj|past conj|and prep|into det|the n|bank 
> n:gerund|push n:prop|Sam adv|over adv|as pro:sub|he v|go 
> n:prop|Sam v|drop pro:poss:det|his n|toy coord|and cop|be qn|most n|upset 
> pro:poss:det|his n|dad v|run adv|over prep|to pro:obj|him inf|to v|see 
> rel|what det|the n|matter 
> coord|and n:prop|Sam v|tell pro:obj|him prep|of det|the adj|rude n|man 
> pro:dem|that adv:int|just v|push pro:obj|him adv|over conj|and adv|up 
> coord|and v|spoil pro:poss:det|his n|game 
> pro:poss:det|his n|dad n|march prep|into det|the n|bank prep|with 
> n:prop|Sam 
> det|the n|man rel|that adv:int|just part|push n:prop|Sam prep|over aux|be 
> part|hold det|the n|bank prep|up adj|use n|gun coord|and pro:indef|everyone 
> prep|in det|the n|bank adj|frighten 
> conj|but n:prop|Sam coord|and pro:poss:det|his n|dad mod|do adv:int|even 
> v|notice det|this pro:sub|they co|so n|cross prep|with pro:obj|him 
> n:prop|Sam n|dad v|slap det|the n|man 
> coord|and pro:sub|he v|drop pro:poss:det|his n|gun coord|and n:prop|Sam 
> adv:int|very adj|happy coord|and part|clap 
> n:prop|Sam n|dad co|so n|cross prep|with pro:obj|him rel|that det|the 
> n|bank det|the n|bank n|work cop|be adv|real v|impress rel|that pro:sub|he 
> aux|have aux|be part|frighten prep|of det|the n|gun coord|and n|thing 
> pro:indef|everyone v|applaud n:prop|Sam coord|and pro:poss:det|his n|dad 
> prep|for pro:poss:det|their n|bravery 
>
> tokens  samples    ttr     st.dev      D
>   35      100    0.7886    0.062     51.470
>   36      100    0.7678    0.062     45.692
>   37      100    0.7589    0.064     44.198
>   38      100    0.7758    0.053     51.002
>   39      100    0.7590    0.059     46.604
>   40      100    0.7552    0.067     46.611
>   41      100    0.7524    0.063     46.883
>   42      100    0.7560    0.057     49.174
>   43      100    0.7428    0.068     46.120
>   44      100    0.7416    0.057     46.821
>   45      100    0.7382    0.058     46.841
>   46      100    0.7233    0.050     43.476
>   47      100    0.7426    0.056     50.331
>   48      100    0.7221    0.054     45.027
>   49      100    0.7314    0.053     48.803
>   50      100    0.7238    0.055     47.419
>
> D: average = 47.279; std dev. = 2.250
> D_optimum     <47.19; min least sq val = 0.001> 
>
> tokens  samples    ttr     st.dev      D
>   35      100    0.7674    0.058     44.316
>   36      100    0.7792    0.067     49.484
>   37      100    0.7654    0.067     46.199
>   38      100    0.7692    0.066     48.711
>   39      100    0.7715    0.054     50.809
>   40      100    0.7485    0.064     44.553
>   41      100    0.7456    0.061     44.800
>   42      100    0.7464    0.061     46.142
>   43      100    0.7477    0.057     47.632
>   44      100    0.7307    0.059     43.613
>   45      100    0.7462    0.053     49.370
>   46      100    0.7361    0.062     47.220
>   47      100    0.7306    0.056     46.573
>   48      100    0.7260    0.048     46.180
>   49      100    0.7218    0.065     45.893
>   50      100    0.7182    0.053     45.760
>
> D: average = 46.703; std dev. = 1.981
> D_optimum     <46.63; min least sq val = 0.001> 
>
> tokens  samples    ttr     st.dev      D
>   35      100    0.7711    0.066     45.472
>   36      100    0.7614    0.066     43.732
>   37      100    0.7708    0.059     47.959
>   38      100    0.7479    0.063     42.155
>   39      100    0.7585    0.062     46.442
>   40      100    0.7650    0.064     49.806
>   41      100    0.7493    0.060     45.901
>   42      100    0.7364    0.057     43.210
>   43      100    0.7347    0.065     43.730
>   44      100    0.7350    0.057     44.849
>   45      100    0.7227    0.054     42.370
>   46      100    0.7261    0.065     44.268
>   47      100    0.7357    0.059     48.139
>   48      100    0.7352    0.055     48.992
>   49      100    0.7243    0.060     46.615
>   50      100    0.7162    0.062     45.185
>
> D: average = 45.552; std dev. = 2.244
> D_optimum     <45.50; min least sq val = 0.001> 
>
> VOCD RESULTS SUMMARY
> ====================
>    Types,Tokens,TTR:  <138,320,0.431250>
>   D_optimum  values:  <47.19, 46.63, 45.50>
>   D_optimum average:  46.44
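For readers who want to see how the numbers in these tables hang together, here is a rough Python sketch. It assumes the curve vocd fits is the usual Malvern and Richards model, TTR = (D/N) * (sqrt(1 + 2N/D) - 1). That is an assumption on my part rather than something stated in this thread, although the rows above are consistent with it. The function names (model_ttr, invert_d, fit_d) and the simple grid search are illustrative only and are not vocd's actual code.

```python
import math


def model_ttr(n: float, d: float) -> float:
    """TTR predicted for a sample of n tokens under the assumed D model."""
    return (d / n) * (math.sqrt(1.0 + 2.0 * n / d) - 1.0)


def invert_d(n: float, ttr: float, lo: float = 0.01, hi: float = 1000.0) -> float:
    """Find the D that reproduces one (n, ttr) point, by bisection.

    Relies on model_ttr increasing with d for a fixed n.
    """
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if model_ttr(n, mid) < ttr:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def fit_d(points, step: float = 0.01, d_max: float = 100.0):
    """Least-squares D over all (n, ttr) points, via a plain grid search."""
    best_d, best_err = None, float("inf")
    for i in range(1, int(d_max / step) + 1):
        d = i * step
        err = sum((ttr - model_ttr(n, d)) ** 2 for n, ttr in points)
        if err < best_err:
            best_d, best_err = d, err
    return best_d, best_err


# (tokens, ttr) pairs copied from the first table above.
FIRST_TRIAL = [
    (35, 0.7886), (36, 0.7678), (37, 0.7589), (38, 0.7758),
    (39, 0.7590), (40, 0.7552), (41, 0.7524), (42, 0.7560),
    (43, 0.7428), (44, 0.7416), (45, 0.7382), (46, 0.7233),
    (47, 0.7426), (48, 0.7221), (49, 0.7314), (50, 0.7238),
]

overall_ttr = 138 / 320            # Types / Tokens from the summary
row_d = invert_d(35, 0.7886)       # compare with the D column, first row
d_opt, sq_err = fit_d(FIRST_TRIAL) # compare with the first D_optimum

print("overall TTR:", overall_ttr)
print("D for N=35, TTR=0.7886:", round(row_d, 3))
print("least-squares D, first trial:", round(d_opt, 2), "error:", round(sq_err, 4))
```

The per-row inversion should come out close to the D column, and the grid-search fit close to the first D_optimum, though small differences from vocd's own optimiser and from rounding in the printed TTRs are to be expected. The "D_optimum average" line is simply the mean of the three trial values: (47.19 + 46.63 + 45.50) / 3 = 46.44.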
>
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "chibolts" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to chibolts+u... at googlegroups.com.
> To post to this group, send email to chib... at googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/chibolts/1e76840e-bddb-4e83-8ecc-9fa1cab95a7e%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
>
