big corpora
Brian MacWhinney
macw at cmu.edu
Tue May 11 21:47:32 UTC 1999
Dear Antonella,
I use CHILDES frequently to run what we call "mega-freqs" on far more
than 1500 pages of data. I don't think you will have any trouble running
the programs on a corpus of that size. I am assuming that you will break your
transcript up into files corresponding to individual sessions, right? To
analyze sittings together, as you wish, you just use wildcards, as in
freq *.cha
One way of testing all this would be to download a huge corpus, such as
the Brown or Hall corpora and run your commands there to see if having
lots of data can somehow "break" the programs. I doubt that it will.
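For a rough sense of what a pooled count over many session files involves, here is a minimal Python sketch. It is not how freq itself is implemented, only an approximation under stated assumptions: speaker tiers start with "*", continuation lines start with a tab, header (@) and dependent (%) tiers are skipped, and a "word" is any run of lowercase letters or apostrophes. The function name pooled_freq is hypothetical, and the crude tokenizer will not handle CHAT codes or special forms the way freq does.

```python
import glob
import re
from collections import Counter


def pooled_freq(pattern):
    """Pool word counts from speaker tiers across every file matching pattern.

    Assumption: a rough stand-in for running a frequency count over
    wildcard-matched session files, not a reimplementation of freq.
    """
    counts = Counter()
    for path in sorted(glob.glob(pattern)):
        in_speaker_tier = False
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                if line.startswith("*"):
                    # Speaker tier such as "*CHI:\tmore cookie ."
                    in_speaker_tier = True
                    text = line.partition(":")[2]
                elif line.startswith("\t") and in_speaker_tier:
                    # Continuation of the preceding speaker tier
                    text = line
                else:
                    # Header (@) or dependent (%) tier: not counted
                    in_speaker_tier = False
                    continue
                counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


if __name__ == "__main__":
    # Print the ten most frequent words across all sessions in this folder.
    for word, n in pooled_freq("*.cha").most_common(10):
        print(n, word)
```

Comparing a tally like this on one session file against the same tally on the whole folder is one quick way to see that pooling only adds up the per-file counts, whatever the size of the corpus.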
If you have other technical questions about the detailed running of CHAT
commands or the editor, let's move the discussion over to the other
mailing list at info-chibolts at childes.psy.cmu.edu. Good luck.
--Brian MacWhinney