syllables count

Mon Oct 23 15:11:12 UTC 2006

dear everyone,

two weeks ago I asked for ways to count syllables with CLAN. I tried the postcode approach that Brian MacWhinney suggested, but because I need to distinguish "filler syllables" from other syllables, it would leave me with too many different postcodes. Instead I decided to add a %syl: tier that distinguishes filler syllables ($fill) from counted syllables ($syll), and simply to enter one code for every syllable.
This goes quite quickly, although the result doesn't look very tidy.
A typical line looks like:

*ANA:    Yes, I think uh that these uh would not be relevant.
%syl:     $syll $syll $syll $fill $syll $syll $fill $syll $syll $syll $syll $syll $syll

and then counting them with FREQ

freq +t%syl file.cha

giving an output like
$syll    11     
$fill    2

Thanks for your answers; you will find them below.
bye, bye,
Marije
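
For readers who want to cross-check the FREQ tally outside CLAN, here is a minimal Python sketch of the same count. It assumes each %syl: tier fits on a single line (no tab-indented continuation lines) and that codes are separated by whitespace; the function name count_syl_codes is made up for this illustration and is not part of CLAN.

```python
from collections import Counter


def count_syl_codes(chat_text):
    """Tally the codes found on %syl: tiers of a CHAT transcript.

    Assumption: every %syl: tier sits on one line; continuation
    lines are not handled in this sketch.
    """
    counts = Counter()
    for line in chat_text.splitlines():
        if line.startswith("%syl:"):
            # Everything after the tier label is whitespace-separated codes.
            counts.update(line[len("%syl:"):].split())
    return counts


sample = """*ANA:    Yes, I think uh that these uh would not be relevant.
%syl:     $syll $syll $syll $fill $syll $syll $fill $syll $syll $syll $syll $syll $syll
"""

counts = count_syl_codes(sample)
for code, n in counts.most_common():
    print(code, n)
```

On the example utterance this gives 11 for $syll and 2 for $fill, the same totals as the FREQ output above.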

Answer of Brian MacWhinney:
This is the type of analysis that will eventually be best supported by the Phon program that Yvan Rose, Greg Hedlund, and I are developing (see http://childes.psy.cmu.edu/phon/). But Phon can't really do this yet. So, for the time being, hand coding in CLAN is the best method, and the method you propose seems excellent. Obviously, you can get total counts by just multiplying in FREQ. Moreover, you can also check for sequences of monosyllables and such with COMBO. So, I like your solution in the general case. In fact, Phon will be implementing something like this too, I believe. However, if all you cared about was the total number, you could save yourself a bit of work by just appending a postcode such as [+ 6] and then running FREQ on the postcodes. One could imagine a special-purpose version of MOR that was used just to yield syllable counts. You could use the "english translation" field for this. You would get
%mor: pro|I=$1 v|walk=$1 adv|home=$1 adv|tomorrow=$3
Then you would use FREQ to focus on the items after the =. To make this work, you would need to have all words recognized by MOR and you would have to add syllable numbers to all words in the lexicon. Not an easy job, but eventually we will have to do this.

--Brian MacWhinney 
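
To illustrate the MOR idea, here is a minimal Python sketch that pulls the syllable numbers out of a %mor line written in the hypothetical "=$N" form shown in the reply above. That form is only the proposal from the message, not something current MOR lexicons are known to produce, and the helper name syllables_from_mor is invented for this example.

```python
import re
from collections import Counter

# Matches the hypothetical "=$N" syllable field proposed in the message.
SYLL_FIELD = re.compile(r"=\$(\d+)")


def syllables_from_mor(mor_line):
    """Return the per-word syllable counts found on one %mor line."""
    return [int(n) for n in SYLL_FIELD.findall(mor_line)]


mor_line = "%mor: pro|I=$1 v|walk=$1 adv|home=$1 adv|tomorrow=$3"

per_word = syllables_from_mor(mor_line)  # one number per word
total = sum(per_word)                    # syllables in the utterance
distribution = Counter(per_word)         # how many words of each length

print(per_word, total, dict(distribution))
```

For the example line the utterance total is 6, which is the number a [+ 6] postcode would record by hand.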

Answer of Eva Aguilar Mediavilla:
For my coding I use a dependent tier %syl: where I put the number and type of syllables, e.g. (in Spanish) tomato %syl: W-S-W, and then in another dependent tier I code the number of syllables in the word, e.g. %wor: 3. It's a solution similar to yours.
Eva Aguilar Mediavilla
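
With a pattern such as W-S-W, the syllable count on the second tier could in principle be derived from the first. A minimal Python sketch, assuming that syllables are separated by hyphens and that S and W stand for strong and weak syllables (the message does not spell this out); describe_pattern is a made-up name.

```python
def describe_pattern(pattern):
    """Split a hyphen-separated pattern and summarize it.

    Assumption: "S" marks a strong syllable and "W" a weak one.
    """
    syllables = pattern.split("-")
    strong = [i for i, s in enumerate(syllables, start=1) if s == "S"]
    return {"count": len(syllables), "strong_positions": strong}


info = describe_pattern("W-S-W")
print(info)
```

For W-S-W this reports three syllables with the strong syllable in second position, matching the hand-coded %wor: 3.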

Answer of Ann Peters:
I have also wanted to count syllables and have invented my own method which I will describe below. First, however, I need to point out that it is important to be clear what one wants syllable counts for. I suspect that your goal and mine differ. But perhaps seeing how I solved my problem will help you solve yours.
What I really wanted to know was how the child's *utterances* increased in syllable length and what kinds of prosodic patterns were occurring. In particular I wanted to be able to track the appearance of "filler syllables", including where in utterances they were appearing. So I devised a %syl tier that I have hand coded (for all the tapes from 18-1/2 through 22 months). The specific codes I have used are:
L = stressed lexical syllable
l = unstressed lexical syllable
f = filler syllable
g = unstressed grammatical morpheme
X = stressed unglossable syllable
x = unstressed unglossable syllable
S = animal sound or sound effect (stressed)
s = animal sound or sound effect (unstressed)
Words within an utterance are linked with _ [underline]. (On the main line @fs is the suffix for filler syllables. On the %pho tier, ^ signifies main stress.)
Typical early tiers look like this:
*CHI: ready ?
%pho: rEdi
%syl: Ll
*CHI: nono@c ?
%pho: n6^n6
%syl: lL
*CHI: tape .
%pho: tIp
%syl: L
*CHI: xxx ?
%pho: h6^h6
%syl: xX
*CHI: n@fs Daddy ?
%pho: n ^d@di
%syl: f_Ll
*CHI: eee@p !
%pho: ^i:-^i:
%syl: SS

Later tiers look like these:
*CHI: a@fs get diaper ?
%pho: 6 g6 ^dap6
%syl: f_l_Ll
*CHI: a@fs floppy@c ?
%pho: 6~ fa^piy
%syl: f_lL
*CHI: read a@fs book .
%pho: riy6 bUk .
%syl: L_f_L
*CHI: close'it .
%pho: ^kotsIt
%syl: L'g
*CHI: n@fs play pans ?
%pho: 6m ^pey ^p@ns
%syl: f_L_L
*CHI: un@fs brush a@fs teeth ?
%pho: 6m br6sh6 ^tiyf
%syl: f_L_f_L
*CHI: thankyou ,, Daddy .
%pho: ^thanky6 ^d@diy
%syl: Ll_Ll

Note that each %syl tier has only a single "word" in it. (This was Brian's suggestion.) Then I use FREQ to sort the patterns, as follows:
FREQ file.cha +t*chi -t* +t%syl +k +f
This sorts and counts the patterns in my %syl tiers. I then hand-sort the output in order to extract information such as: How many utterances with N syllables did he produce? How many were iambic vs. trochaic? When did medial fillers come in and then disappear? I hope this gives you some ideas. Good luck!
ann
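
The hand-sorting step could be partly automated. Below is a minimal Python sketch that takes the %syl patterns from the examples above and answers the three questions for them. Several things here are assumptions made for the illustration and not part of the coding scheme itself: only two-syllable patterns are labeled iambic or trochaic, a filler counts as "medial" when it is neither the first nor the last syllable of the utterance, and the name analyze is made up.

```python
from collections import Counter

SYLLABLE_CODES = set("LlfgXxSs")  # the eight codes listed above
STRESSED = set("LXS")             # uppercase codes mark stress


def analyze(pattern):
    """Return (syllable count, rhythm label, has medial filler)."""
    # Keep only syllable codes; this drops the "_" and "'" linkers.
    codes = [c for c in pattern if c in SYLLABLE_CODES]
    stress = "".join("S" if c in STRESSED else "w" for c in codes)
    # Assumption: rhythm is labeled only for two-syllable patterns.
    if stress == "Sw":
        rhythm = "trochaic"
    elif stress == "wS":
        rhythm = "iambic"
    else:
        rhythm = "other"
    # Assumption: medial = filler that is neither first nor last.
    medial = any(c == "f" and 0 < i < len(codes) - 1
                 for i, c in enumerate(codes))
    return len(codes), rhythm, medial


patterns = ["Ll", "lL", "L", "xX", "f_Ll", "SS", "f_l_Ll", "f_lL",
            "L_f_L", "L'g", "f_L_L", "f_L_f_L", "Ll_Ll"]

by_length = Counter(analyze(p)[0] for p in patterns)
by_rhythm = Counter(analyze(p)[1] for p in patterns)
medial_fillers = [p for p in patterns if analyze(p)[2]]

print(sorted(by_length.items()))
print(dict(by_rhythm))
print(medial_fillers)
```

Run over real FREQ output, the same function could be applied to each pattern and weighted by its frequency, which would replace most of the manual sorting.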
