Neutralizing ending in word list

SCHNEIDER, Phyllis Phyllis.Schneider at ualberta.ca
Wed Oct 13 17:43:59 UTC 2004


I wouldn't even call 'he' a 'word root' -- I would consider both 'he' and
'is' as words, one contracted and one not.  I would count both words.

Phyllis

-----Original Message-----
From: Leach, Diane (NIH/NICHD) [mailto:leachd at mail.nih.gov]
Sent: Wednesday, October 13, 2004 5:53 AM
To: 'Brian MacWhinney'
Cc: 'info-chibolts at mail.talkbank.org'; 'info-childes at mail.talkbank.org'
Subject: RE: Neutralizing ending in word list



Hi Brian.

Well, that's a good question.  I had been trying to get a count of word
roots and I had been essentially ignoring the contracted words.  The
definition of word roots still seems to be up for debate.  Does anyone else
want to weigh in on whether a count of the number of different word roots
(types) should include both the root word and the contracted word ("he" and
"be&3S" in the case of "he's"), or just the root of the contraction (only
"he")?
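
To make the two counting options concrete, here is a minimal Python sketch
(not a CLAN command). It assumes each %mor token joins its host word and its
clitic with "~", as in the examples in this thread. The function name
word_types and the "pro|" part-of-speech tag on the sample tokens are made up
for illustration.

```python
def word_types(tokens, include_clitics=True):
    """Collect the distinct word types found in a list of %mor tokens.

    Assumption: "~" separates a host word from its clitic,
    e.g. "pro|he~v|be&3S" for "he's".
    """
    types = set()
    for token in tokens:
        parts = token.split("~")
        if not include_clitics:
            parts = parts[:1]  # keep only the host word, e.g. "pro|he"
        types.update(parts)
    return types


# Hypothetical sample tokens; the "pro|" tag is an illustrative assumption.
sample = ["pro|he~v|be&3S", "pro|he", "pro|she~v|be&3S"]

both_words = word_types(sample, include_clitics=True)
host_only = word_types(sample, include_clitics=False)
```

Under the first option the contracted verb counts as a type of its own, so
the sample yields three types; under the second option only the two pronouns
are counted.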

Diane

-----Original Message-----
From: Brian MacWhinney [mailto:macw at mac.com]
Sent: Saturday, October 09, 2004 3:26 PM
To: Leach, Diane (NIH/NICHD)
Cc: 'info-chibolts at mail.talkbank.org'
Subject: Re: Neutralizing ending in word list



Dear Diane,

Sorry about the delay in responding. I guess your goal is to treat forms
like "he'd" as if they were versions of "he". You may be right that it will
be tricky to do that using the % symbols. But before going in that
direction, I would like to think through the logic of the analysis with you
and others. Do you really want to say that the pronoun is the root of a
cliticized form? Doesn't that go against the idea that there are really two
full words being contracted here? Wouldn't it be better to use the +p option
and treat the tilde ~ for the clitic as a delimiter? That was the original
goal underlying the +p option, and it would seem to apply well in this case.


--Brian MacWhinney


On Oct 5, 2004, at 11:58 AM, Leach, Diane (NIH/NICHD) wrote:




Hi folks.


I have been trying to get a list of "root words" by taking my transcripts
and neutralizing the endings on the %mor line.  I was using the following
command:


freq +u +k -t* +t*CHI +t*MOT +t%mor +s"%|*" +s"%|*-%%" +s"%|*~%%" "*.mor.pst"


This works well except when I have words in the form
n|word-ENDING~CONTRACTION, such as n|dog-DIM~v|be&3S.  In this case, the
result is that n|dog-DIM shows up in the word list (the contraction was
neutralized, but not the diminutive ending).  I have tried adding
+s"%|*-%%~%%" to the command line and replacing +s"%|*~%%" with
+s"%|*-%%~%%", but neither of these seems to work.  In the first case,
nothing changes, and in the second case, it neutralizes the endings on the
complex forms (e.g., n|dog-DIM~v|be&3S), but then I still have the words
with contractions showing up in the word list (e.g., n|dog~v|be&3S).
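
The result I am after can be sketched outside of CLAN. This is a minimal
Python illustration of the intended neutralization, not a replacement for the
freq command. It assumes "~" introduces a contraction and "-" introduces an
ending, as in the forms above, and that no stem contains either character.
The function name neutralize is hypothetical.

```python
def neutralize(token):
    """Reduce a %mor token to its part of speech and bare stem.

    Assumptions: "~" starts a contraction, "-" starts an ending,
    and neither character occurs inside a stem.
    """
    host = token.split("~", 1)[0]  # drop the contraction, e.g. "~v|be&3S"
    stem = host.split("-", 1)[0]   # drop the ending, e.g. "-DIM"
    return stem


# The forms discussed above all reduce to the same entry.
forms = ["n|dog-DIM~v|be&3S", "n|dog~v|be&3S", "n|dog-DIM", "n|dog"]
word_list = {neutralize(form) for form in forms}
```

Removing the contraction first and the ending second means the order of the
two markers in the token does not matter, which is the case the +s patterns
above do not seem to cover.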


Any thoughts about how I could fix this?


Thanks!

Diane





