gemfreq type/token counts ratios
Leonid Spektor
spektor at andrew.cmu.edu
Thu Jul 2 20:05:28 UTC 2020
Cynthia,
First I need to get more information from you. Do your data files have @ID: headers? Do you want type/token counts and the type/token ratio for speaker words, for morphological analysis words, or for lemmas? Do you want the output in plain readable text format or in Excel format? Different answers to these questions require different commands to get exactly the result you want.
Let me explain the reason for my second question. Suppose you have the following sentence:
*MOT: you can't put it on the table and table it.
If you run FREQ on speaker words, you will get this result:
1 and
1 can't
2 it
1 on
1 put
2 table
1 the
1 you
------------------------------
8 Total number of different item types used
10 Total number of items (tokens)
0.800 Type/Token ratio
If you run FREQ on morphological analysis words, you will get this result:
1 coord|and
1 det:art|the
1 mod|can
1 neg|not
1 n|table
1 prep|on
2 pro:per|it
1 pro:per|you
1 v|put&ZERO
1 v|table
------------------------------
10 Total number of different item types used
11 Total number of items (tokens)
0.909 Type/Token ratio
If you run FREQ on lemmas, you will get this result:
1 and
1 can
2 it
1 not
1 on
1 put
2 table
1 the
1 you
------------------------------
9 Total number of different item types used
11 Total number of items (tokens)
0.818 Type/Token ratio
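To make the arithmetic behind the three results easier to check, here is a minimal Python sketch. It is not how FREQ is implemented; it only recounts the items listed above. The function name type_token_stats is made up for this illustration, and the lemma step (keeping the text after "|" and before "&") is a simplification that happens to work for this one sentence, not a general rule for %mor tiers.

```python
from collections import Counter

def type_token_stats(items):
    """Return (types, tokens, type/token ratio) for a list of items."""
    counts = Counter(items)
    types = len(counts)             # number of different items
    tokens = sum(counts.values())   # total number of items
    ratio = types / tokens if tokens else 0.0
    return types, tokens, ratio

# Speaker words from: *MOT: you can't put it on the table and table it.
speaker_words = ["you", "can't", "put", "it", "on", "the",
                 "table", "and", "table", "it"]

# Morphological analysis words, copied from the FREQ output above.
mor_words = ["pro:per|you", "mod|can", "neg|not", "v|put&ZERO",
             "pro:per|it", "prep|on", "det:art|the", "n|table",
             "coord|and", "v|table", "pro:per|it"]

# Assumption for this example only: the lemma is the part after "|"
# and before any "&" marker.
lemmas = [w.split("|", 1)[1].split("&", 1)[0] for w in mor_words]

for label, items in [("speaker words", speaker_words),
                     ("morphological words", mor_words),
                     ("lemmas", lemmas)]:
    types, tokens, ratio = type_token_stats(items)
    print(f"{label}: {types} types, {tokens} tokens, ratio {ratio:.3f}")
```

The noun and the verb "table" are one type at the speaker-word and lemma levels but two types (n|table and v|table) at the morphological level, and "can't" becomes two tokens once it is analyzed as mod|can plus neg|not. That is why both the totals and the ratio change from one level to the next.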
Leonid.
> On Jul 2, 2020, at 14:13, Cynthia Audisio <cpaudisio at gmail.com> wrote:
>
> Hello Chibolts,
>
> I've got a group of files, each of which has several "gems" with play situations. Is it possible to get separate type/token totals and ratios for each of the gems in a file?
> This is how the file looks:
>
> .
> .
> @Bg: play1
> .
> .
> .
> @Eg: play1
> .
> .
> .
> .
> @Bg: play2
> .
> .
> .
> @Eg: play2
> .
> .
>
> and what I need is individual type/token counts and the ratio for each play situation (play1, play2, etc.). So far I've run gemfreq, which yields a frequency list (not the total number of types and tokens and the type/token ratio, which is what I need).
> Thanks,
>
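For readers who want to see what "type/token totals per gem" means in practice, the following is a rough Python sketch, not a CLAN command and not a replacement for one. It assumes gems are delimited by @Bg: and @Eg: header lines as in the file outline above, counts only speaker-tier lines (those starting with "*"), and uses a crude tokenizer that lowercases words and strips final punctuation. The function name ttr_per_gem and every sample utterance except the first are hypothetical.

```python
from collections import Counter

def ttr_per_gem(lines):
    """Map each gem label to (types, tokens, type/token ratio)."""
    words_by_gem = {}
    current = None  # label of the gem we are inside, if any
    for raw in lines:
        line = raw.strip()
        if line.startswith("@Bg:"):
            current = line[len("@Bg:"):].strip()
            words_by_gem.setdefault(current, [])
        elif line.startswith("@Eg:"):
            current = None
        elif current is not None and line.startswith("*"):
            # Drop the speaker code (e.g. "*MOT:") and keep the utterance.
            _, _, utterance = line.partition(":")
            for word in utterance.split():
                word = word.strip(".,!?").lower()  # crude cleanup, an assumption
                if word:
                    words_by_gem[current].append(word)
    stats = {}
    for gem, words in words_by_gem.items():
        types, tokens = len(Counter(words)), len(words)
        stats[gem] = (types, tokens, types / tokens if tokens else 0.0)
    return stats

# Hypothetical sample file with two gems.
sample = """@Bg:\tplay1
*MOT:\tyou can't put it on the table and table it.
@Eg:\tplay1
*MOT:\tthis line is outside any gem.
@Bg:\tplay2
*CHI:\tput it on the table.
*MOT:\ton the table ?
@Eg:\tplay2
""".splitlines()

for gem, (types, tokens, ratio) in ttr_per_gem(sample).items():
    print(f"{gem}: {types} types, {tokens} tokens, ratio {ratio:.3f}")
```

Real CHAT files have continuation lines, dependent tiers, and special codes that this sketch ignores, so counts for actual transcripts should come from CLAN itself.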