Neutralizing ending in word list
SCHNEIDER, Phyllis
Phyllis.Schneider at ualberta.ca
Wed Oct 13 17:43:59 UTC 2004
I wouldn't even call 'he' a 'word root' -- I would consider both 'he' and
'is' as words, one contracted and one not. I would count both words.
Phyllis
-----Original Message-----
From: Leach, Diane (NIH/NICHD) [mailto:leachd at mail.nih.gov]
Sent: Wednesday, October 13, 2004 5:53 AM
To: 'Brian MacWhinney'
Cc: 'info-chibolts at mail.talkbank.org'; 'info-childes at mail.talkbank.org'
Subject: RE: Neutralizing ending in word list
Hi Brian.
Well, that's a good question. I had been trying to get a count of word
roots and I had been essentially ignoring the contracted words. The
definition of word roots still seems to be up for debate. Does anyone else
want to weigh in on whether a count of the number of different word roots
(types) should include both the root word and the contracted word ("he" and
"be&3S" in the case of "he's"), or just the root of the contraction (only
"he")?
Diane
-----Original Message-----
From: Brian MacWhinney [mailto:macw at mac.com]
Sent: Saturday, October 09, 2004 3:26 PM
To: Leach, Diane (NIH/NICHD)
Cc: 'info-chibolts at mail.talkbank.org'
Subject: Re: Neutralizing ending in word list
Dear Diane,
Sorry about the delay in responding. I guess your goal is to treat forms
like "he'd" as if they were versions of "he". You may be right that it will
be tricky to do that using the % symbols. But before going in that
direction, I would like to think through the logic of the analysis with you
and others. Do you really want to say that the pronoun is the root of a
cliticized form? Isn't that going against the idea that there are really two
full words being contracted here? Wouldn't it be better to use the +p option
and treat the tilde ~ for the clitic as a delimiter? That was the original
goal underlying the +p option, and it would seem to apply well in this case.
--Brian MacWhinney
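
The idea of treating the tilde as a delimiter can be illustrated outside CLAN
with a small Python sketch. This is not the +p option itself and makes no
claim about how freq implements it. It only assumes that %mor tokens are
plain strings in which "~" joins a host word to its clitic. The function name
count_words and the POS label "pro" in the sample data are hypothetical,
chosen for illustration.

```python
from collections import Counter


def count_words(mor_tokens):
    """Count every word on a %mor tier, treating "~" as a word delimiter.

    Assumption: each token is a string such as "n|dog-DIM~v|be&3S", where
    "~" separates a host word from a contracted (clitic) word.
    """
    counts = Counter()
    for token in mor_tokens:
        # "n|dog-DIM~v|be&3S" -> ["n|dog-DIM", "v|be&3S"]
        counts.update(token.split("~"))
    return counts


# Hypothetical sample tokens; "pro|he" is an assumed coding for "he".
sample = ["pro|he~v|be&3S", "pro|he", "n|dog-DIM~v|be&3S"]
counts = count_words(sample)
print(counts["pro|he"])   # 2
print(counts["v|be&3S"])  # 2
```

Under this view both halves of a contraction are counted as words in their
own right, so neither is treated as the root of the other.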
On Oct 5, 2004, at 11:58 AM, Leach, Diane (NIH/NICHD) wrote:
Hi folks.
I have been trying to get a list of "root words" by taking my transcripts
and neutralizing the endings on the %mor line. I was using the following
command:
freq +u +k -t* +t*CHI +t*MOT +t%mor +s"%|*" +s"%|*-%%" +s"%|*~%%" "*.mor.pst"
This works well except when I have words in the form
n|word-ENDING~CONTRACTION, such as n|dog-DIM~v|be&3S. In this case, the
result is that n|dog-DIM shows up in the word list (the contraction was
neutralized, but not the diminutive ending). I have tried adding
+s"%|*-%%~%%" to the command line and replacing +s"%|*~%%" with
+s"%|*-%%~%%", but neither of these seems to work. In the first case,
nothing changes, and in the second case, it neutralizes the endings on the
complex forms (e.g., n|dog-DIM~v|be&3S), but then I still have the words
with contractions showing up in the word list (e.g., n|dog~v|be&3S).
Any thoughts about how I could fix this?
Thanks!
Diane
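
For comparison, the result described above (n|dog-DIM~v|be&3S reduced to
n|dog) can be sketched in Python as a post-processing step. This is only an
illustration under stated assumptions, not a replacement for the freq +s
patterns: it assumes "~" always introduces a clitic, "-" always introduces a
suffix, and stems themselves contain neither character. The function name
neutralize is hypothetical.

```python
def neutralize(token):
    """Reduce a %mor token to the POS|stem of its first word.

    Assumptions: "~" marks a contraction, "-" marks a suffix, and
    neither character occurs inside a stem.
    """
    first_word = token.split("~", 1)[0]  # drop the contracted word, if any
    return first_word.split("-", 1)[0]   # drop the suffix, if any


print(neutralize("n|dog-DIM~v|be&3S"))  # n|dog
print(neutralize("n|dog~v|be&3S"))      # n|dog
```

Applying both steps in a fixed order (contraction first, then suffix) is what
lets the combined form collapse to the same entry as the simple forms.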