big corpora

Brian MacWhinney macw at cmu.edu
Tue May 11 21:47:32 UTC 1999


Dear Antonella,
  I use CHILDES frequently to run what we call "mega-freqs" on far more
than 1500 pages of data, so I don't think you will have trouble running
the programs on a corpus of that size.  I am assuming that you will break
your transcript up into files corresponding to individual sessions, right?
To analyze sittings together, as you wish, you just use wildcards, as in
   freq *.cha
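
To make concrete what pooling sessions with a wildcard amounts to, here is a
minimal Python sketch that counts words on the speaker tiers of every file
matching a pattern.  It is only an illustration, not how freq itself is
implemented: the function name mega_freq and the simple tokenization
(lowercasing, stripping surrounding punctuation) are assumptions made for
this example, and it relies only on the CHAT conventions that speaker tiers
start with "*", dependent tiers with "%", headers with "@", and continuation
lines with a tab.

```python
import glob
import string
from collections import Counter


def mega_freq(pattern):
    """Pool word counts from the speaker tiers of all files matching pattern.

    Hypothetical helper for illustration; it does not reproduce freq's
    handling of CHAT codes, only a rough token count.
    """
    counts = Counter()
    for path in sorted(glob.glob(pattern)):
        on_main_tier = False
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                if line.startswith("*"):
                    # Speaker tier, e.g. "*CHI:<tab>more cookie ."
                    on_main_tier = True
                    _, _, text = line.partition(":")
                elif line.startswith("\t") and on_main_tier:
                    # Continuation of the preceding speaker tier
                    text = line
                else:
                    # Header (@), dependent tier (%), or anything else: skip
                    on_main_tier = False
                    continue
                for token in text.split():
                    word = token.strip(string.punctuation).lower()
                    if word:
                        counts[word] += 1
    return counts
```

Calling mega_freq("*.cha") in a directory of session files returns a single
Counter for all sessions together; counts.most_common(20) then lists the
twenty most frequent words.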

One way of testing all this would be to download a huge corpus, such as
the Brown or Hall corpora, and run your commands there to see whether
having lots of data can somehow "break" the programs.  I doubt that it will.

If you have other technical questions about the detailed running of CHAT
commands or the editor, let's move the discussion over to the other
mailing list at info-chibolts at childes.psy.cmu.edu.  Good luck.

--Brian MacWhinney


