vocd (lemmatized version)
Leonid Spektor
spektor at andrew.cmu.edu
Mon Nov 6 23:13:51 UTC 2017
Jeanine,
The manual and even VOCD's own "Examples" are a bit out of date. To get the best lemma-based analysis, you should use the following command:
vocd +sm;*,o% @
Leonid.
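
The quoted message below describes why leaving the part-of-speech code to the left of the pipe inflates the type count: "v|run" and "n:gerund|run" are counted as two types even though they share one lemma. The following is a minimal Python sketch of that effect, not a description of how VOCD itself works. The helper names `lemma` and `type_count` are hypothetical, and the sketch assumes tokens of the form code|stem with suffix and clitic markers already removed, as in the output shown below.

```python
# Illustrative sketch only; these helpers are hypothetical and not part of CLAN.

def lemma(token: str) -> str:
    """Return the text after the last '|' (the stem).

    Tokens that contain no '|' are returned unchanged.
    """
    return token.rsplit("|", 1)[-1]


def type_count(tokens, lemmatize=False):
    """Count distinct types, optionally after dropping the code before '|'."""
    forms = [lemma(t) for t in tokens] if lemmatize else list(tokens)
    return len(set(forms))


# A handful of tokens taken from the %mor output in the quoted message.
tokens = ["v|run", "n:gerund|run", "v|go", "part|go", "n|dog"]

with_pos = type_count(tokens)                     # codes kept: every token is its own type
lemma_only = type_count(tokens, lemmatize=True)   # codes dropped: run, go, dog

print("types with codes:", with_pos)
print("types by lemma:  ", lemma_only)
```

With the codes kept, the two forms of "run" and the two forms of "go" each count twice, so any measure built on type counts comes out higher than it would on lemmas alone.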
> On Nov 6, 2017, at 17:01, Jeanine Treffers-Daller <j.c.treffers-daller at reading.ac.uk> wrote:
>
>
> Hi Leonid and all readers
>
> I am using vocd on the dependent tier and it works fine, but I have added a third switch, +s"*|*~%%", which is not in the manual, to exclude a group of suffixes marked by ~. Perhaps this could be added to the manual?
>
> I also think that computing a lemmatized vocd on the dependent tier leads to some distortions because the grammatical information to the left of the pipe separator is left in the text. For example: "running" is marked as "n:gerund|run" and participles are marked "part|go". Any other forms of these verbs would have different codes to the left of the pipe separator. I assume that in the case of "run" this would be "v|run" and for "go" this would be "v|go". So vocd will treat the two forms of these verbs as different types. That is not what we want in lemmatized analyses of lexical diversity. The vocd score is inflated because of this.
>
> Would it be possible to also erase the information to the left of the pipe separator when running vocd on the mor tier? That would solve the problem. I have tried to create a command to do this, but it doesn't seem to work. I added +s"*%%|*".
>
> The output I am getting without the switch I tried to add is given below. The file itself is also attached.
>
> Another unrelated point I'd like to flag up is that the complementizer "that" is classified as a relative pronoun in the data set. Is it possible to change this perhaps?
>
> thanks a lot for your help!
> best wishes
> Jeanine
>
> vocd +t%mor -t* +s"*|*-%%" +s"*|*&%%" +s"*|*~%%" +s"*|%%*" -s@exclude.cut @
> Mon Nov 06 21:42:01 2017
> vocd (20-Oct-2017) is conducting analyses on:
> ONLY dependent tiers matching: %MOR;
> ****************************************
> From file <c:\DOCUMENTS\GRONINGEN\LLP\NATIVE SPEAKERS\post\E103.mor.pst.cex>
> det|a adj|small n|boy coord|and pro:poss:det|his n|dad cop|be prep|by det|a n|large n|lake adj|throw n|stick prep|in pro|it prep|for pro:poss:det|their n|dog
> det|the adj|little n|boy adj|call n:prop|Sam cop|be adj|happy n:gerund|play prep|with pro:poss:det|his n|dog coord|and det|the n|dog v|swim prep|across det|the n|lake adj|fetch det|the n|stick prep|for pro:obj|him
> det|a adv:int|rather adv|well inf|to v|do n|gentleman v|walk adv|by coord|and v|think rel|that pro:sub|he mod|will v|like inf|to v|join adv|in prep|with n:prop|Sam coord|and pro:poss:det|his n|dad coord|and pro:poss:det|his n|dog
> pro:sub|he v|show pro:poss:det|his n|stick prep|to det|the n|dog
> pro:indef|everybody v|look prep|at pro:obj|him adv:int|rather in#adv|credulous
> conj|but pro:indef|none det|the adj|less pro:sub|he v|decide inf|to v|join prep|in det|the n|fun coord|and pro:sub|he v|throw pro:poss:det|his adj|walk n|stick adv|far adv|away prep|into det|the n|middle prep|of det|the n|lake
> n:prop|Sam coord|and pro:poss:det|his n|dad coord|and pro:poss:det|his n|dog adv|still mod|do adv:int|quite v|understand rel|what part|go adv|on coord|and pro:sub|they v|walk adv|away prep|as det|the coord|and det|the prep|as det|the n|dog mod|do v|want inf|to v|chase det|the n|stick
> det|the adj|poor n|gentleman cop|be adv|left adv|there prep|by pro:refl|himself inf|to v|take prep|off pro:poss:det|his n:pt|clothes coord|and v|go part|swim prep|in det|the n|lake inf|to v|retrieve pro:poss:det|his n|stick
> n:prop|Sam aux|be part|play prep|on det|the n|pavement adj|outside det|a n|bank pro:indef|one n|day
> conj|as pro:sub|he part|play prep|with pro:poss:det|his n|toy det|a n|man v|come n:gerund|run adj|past conj|and prep|into det|the n|bank n:gerund|push n:prop|Sam adv|over adv|as pro:sub|he v|go
> n:prop|Sam v|drop pro:poss:det|his n|toy coord|and cop|be qn|most n|upset
> pro:poss:det|his n|dad v|run adv|over prep|to pro:obj|him inf|to v|see rel|what det|the n|matter
> coord|and n:prop|Sam v|tell pro:obj|him prep|of det|the adj|rude n|man pro:dem|that adv:int|just v|push pro:obj|him adv|over conj|and adv|up coord|and v|spoil pro:poss:det|his n|game
> pro:poss:det|his n|dad n|march prep|into det|the n|bank prep|with n:prop|Sam
> det|the n|man rel|that adv:int|just part|push n:prop|Sam prep|over aux|be part|hold det|the n|bank prep|up adj|use n|gun coord|and pro:indef|everyone prep|in det|the n|bank adj|frighten
> conj|but n:prop|Sam coord|and pro:poss:det|his n|dad mod|do adv:int|even v|notice det|this pro:sub|they co|so n|cross prep|with pro:obj|him
> n:prop|Sam n|dad v|slap det|the n|man
> coord|and pro:sub|he v|drop pro:poss:det|his n|gun coord|and n:prop|Sam adv:int|very adj|happy coord|and part|clap
> n:prop|Sam n|dad co|so n|cross prep|with pro:obj|him rel|that det|the n|bank det|the n|bank n|work cop|be adv|real v|impress rel|that pro:sub|he aux|have aux|be part|frighten prep|of det|the n|gun coord|and n|thing
> pro:indef|everyone v|applaud n:prop|Sam coord|and pro:poss:det|his n|dad prep|for pro:poss:det|their n|bravery
>
> tokens samples ttr st.dev D
> 35 100 0.7886 0.062 51.470
> 36 100 0.7678 0.062 45.692
> 37 100 0.7589 0.064 44.198
> 38 100 0.7758 0.053 51.002
> 39 100 0.7590 0.059 46.604
> 40 100 0.7552 0.067 46.611
> 41 100 0.7524 0.063 46.883
> 42 100 0.7560 0.057 49.174
> 43 100 0.7428 0.068 46.120
> 44 100 0.7416 0.057 46.821
> 45 100 0.7382 0.058 46.841
> 46 100 0.7233 0.050 43.476
> 47 100 0.7426 0.056 50.331
> 48 100 0.7221 0.054 45.027
> 49 100 0.7314 0.053 48.803
> 50 100 0.7238 0.055 47.419
>
> D: average = 47.279; std dev. = 2.250
> D_optimum <47.19; min least sq val = 0.001>
>
> tokens samples ttr st.dev D
> 35 100 0.7674 0.058 44.316
> 36 100 0.7792 0.067 49.484
> 37 100 0.7654 0.067 46.199
> 38 100 0.7692 0.066 48.711
> 39 100 0.7715 0.054 50.809
> 40 100 0.7485 0.064 44.553
> 41 100 0.7456 0.061 44.800
> 42 100 0.7464 0.061 46.142
> 43 100 0.7477 0.057 47.632
> 44 100 0.7307 0.059 43.613
> 45 100 0.7462 0.053 49.370
> 46 100 0.7361 0.062 47.220
> 47 100 0.7306 0.056 46.573
> 48 100 0.7260 0.048 46.180
> 49 100 0.7218 0.065 45.893
> 50 100 0.7182 0.053 45.760
>
> D: average = 46.703; std dev. = 1.981
> D_optimum <46.63; min least sq val = 0.001>
>
> tokens samples ttr st.dev D
> 35 100 0.7711 0.066 45.472
> 36 100 0.7614 0.066 43.732
> 37 100 0.7708 0.059 47.959
> 38 100 0.7479 0.063 42.155
> 39 100 0.7585 0.062 46.442
> 40 100 0.7650 0.064 49.806
> 41 100 0.7493 0.060 45.901
> 42 100 0.7364 0.057 43.210
> 43 100 0.7347 0.065 43.730
> 44 100 0.7350 0.057 44.849
> 45 100 0.7227 0.054 42.370
> 46 100 0.7261 0.065 44.268
> 47 100 0.7357 0.059 48.139
> 48 100 0.7352 0.055 48.992
> 49 100 0.7243 0.060 46.615
> 50 100 0.7162 0.062 45.185
>
> D: average = 45.552; std dev. = 2.244
> D_optimum <45.50; min least sq val = 0.001>
>
> VOCD RESULTS SUMMARY
> ====================
> Types,Tokens,TTR: <138,320,0.431250>
> D_optimum values: <47.19, 46.63, 45.50>
> D_optimum average: 46.44
>
>
> --
> You received this message because you are subscribed to the Google Groups "chibolts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+unsubscribe at googlegroups.com <mailto:chibolts+unsubscribe at googlegroups.com>.
> To post to this group, send email to chibolts at googlegroups.com <mailto:chibolts at googlegroups.com>.
> To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/1e76840e-bddb-4e83-8ecc-9fa1cab95a7e%40googlegroups.com <https://groups.google.com/d/msgid/chibolts/1e76840e-bddb-4e83-8ecc-9fa1cab95a7e%40googlegroups.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout <https://groups.google.com/d/optout>.