<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">

<TITLE>Message</TITLE>

<META content="MSHTML 6.00.2800.1458" name=GENERATOR></HEAD>

<BODY>

<DIV><SPAN class=467434311-13102004><FONT face=Arial color=#0000ff size=2>Hi

Brian.</FONT></SPAN></DIV>

<DIV><SPAN class=467434311-13102004><FONT face=Arial color=#0000ff

size=2></FONT></SPAN> </DIV>

<DIV><SPAN class=467434311-13102004><FONT face=Arial color=#0000ff size=2>Well,

that's a good question.  I had been trying to get a count of word

roots and I had been essentially ignoring the contracted

words.  The definition of word roots still seems to be up for

debate.  Does anyone else want to weigh in about whether a count of the number

of different word roots (types) should include both the root word and the

contracted word ("he" and "be&3S" in the case of "he's"), or just the

root of the contraction (only "he")?</FONT></SPAN></DIV>

<DIV><SPAN class=467434311-13102004><FONT face=Arial color=#0000ff

size=2></FONT></SPAN> </DIV>

<DIV><SPAN class=467434311-13102004><FONT face=Arial color=#0000ff

size=2>Diane</FONT></SPAN></DIV>

<BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px">

  <DIV></DIV>

  <DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left><FONT

  face=Tahoma size=2>-----Original Message-----<BR><B>From:</B> Brian MacWhinney

  [mailto:macw@mac.com] <BR><B>Sent:</B> Saturday, October 09, 2004 3:26

  PM<BR><B>To:</B> Leach, Diane (NIH/NICHD)<BR><B>Cc:</B>

  'info-chibolts@mail.talkbank.org'<BR><B>Subject:</B> Re: Neutralizing ending

  in word list<BR><BR></FONT></DIV>

  <P>Dear Diane, </P>

  <P>Sorry about the delay in responding. I guess your goal is to treat forms

  like "he'd" as if they were versions of "he". You may be right that it will be

  tricky to do that using the % symbols. But before going in that direction, I

  would like to think through the logic of the analysis with you and others. Do

  you really want to say that the pronoun is the root of a cliticized form?

  Isn't that going against the idea that there are really two full words being

  contracted here? Wouldn't it be better to use the +p option and treat the

  tilde ~ for the clitic as a delimiter? That was the original goal underlying

  the +p option, and it would seem to apply well in this case. </P><BR>
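  <P>To make the counting logic concrete, here is a minimal Python sketch of
  the "tilde as delimiter" idea. It is not CLAN code and does not claim to
  reproduce what freq +p does internally; the function names and the token
  pro|he are illustrative assumptions. Each %mor token is split at ~ so that
  both words of a contraction are counted as separate types.</P>

<PRE>
```python
from collections import Counter


def split_clitics(token):
    # Treat "~" as a word delimiter: "n|dog~v|be&3S" gives two words.
    return token.split("~")


def count_types(tokens):
    # Count every word separately, including the cliticized one.
    counts = Counter()
    for token in tokens:
        counts.update(split_clitics(token))
    return counts


# Hypothetical %mor tokens, for illustration only.
tokens = ["n|dog~v|be&3S", "n|dog", "pro|he~v|be&3S"]
counts = count_types(tokens)
for word in sorted(counts):
    print(word, counts[word])
```
</PRE>

  <P>Under this view "he's" contributes two types (the pronoun and the
  verb), rather than one type rooted in the pronoun.</P>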

  <P>--Brian MacWhinney </P><BR>

  <P>On Oct 5, 2004, at 11:58 AM, Leach, Diane (NIH/NICHD) wrote:

  </P><BR><BR><BR>

  <P><FONT face=Arial size=2>Hi folks.</FONT> </P><BR>

  <P><FONT face=Arial size=2>I have been trying to get a list of "root words" by

  taking my transcripts and neutralizing the endings on the %mor line.  I

  was using the following command:</FONT> </P><BR>

  <P><FONT face=Arial size=2>freq +u +k -t* +t*CHI +t*MOT +t%mor +s"%|*"

  +s"%|*-%%" +s"%|*~%%" "*.mor.pst"</FONT> </P><BR>

  <P><FONT face=Arial size=2>This works well except when I have words in the

  form n|word-ENDING~CONTRACTION, such as n|dog-DIM~v|be&3S.  In this

  case, the result is that n|dog-DIM shows up in the word list (the contraction

  was neutralized, but not the diminutive ending).  I have tried

  adding  +s"%|*-%%~%%" to the command line and replacing +s"%|*~%%" with

  +s"%|*-%%~%%", but neither of these seems to work.  In the first case,

  nothing changes, and in the second case, it neutralizes the endings on the

  complex forms (e.g., n|dog-DIM~v|be&3S), but then I still have the words

  with contractions showing up in the word list (e.g.,

  n|dog~v|be&3S). </FONT> </P><BR>
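  <P><FONT face=Arial size=2>For comparison, the result being aimed at can
  be sketched outside CLAN. The following Python fragment is only an
  illustration of the intended neutralization, under the assumption that
  everything after the first "-" or "~" in a %mor token should be dropped;
  the function names are hypothetical and it says nothing about how the +s
  patterns are matched by freq.</FONT> </P>

<PRE>
```python
import re
from collections import Counter


def mor_root(token):
    # Keep only the text before the first "-" (ending) or "~" (contraction).
    return re.split(r"[-~]", token, maxsplit=1)[0]


def count_roots(tokens):
    # Tally root forms after neutralizing endings and contractions.
    return Counter(mor_root(token) for token in tokens)


# Hypothetical %mor tokens, for illustration only.
tokens = ["n|dog-DIM~v|be&3S", "n|dog~v|be&3S", "n|dog-DIM", "n|dog"]
roots = count_roots(tokens)
print(sorted(roots.items()))
```
</PRE>

  <P><FONT face=Arial size=2>With these four tokens, all of the forms
  collapse onto the single entry n|dog, which is the behavior the combined
  +s patterns were meant to produce.</FONT> </P><BR>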

  <P><FONT face=Arial size=2>Any thoughts about how I could fix this?</FONT>

  </P><BR>

  <P><FONT face=Arial size=2>Thanks!</FONT> </P>

  <P><FONT face=Arial size=2>Diane</FONT> </P><BR>

  <P></P></BLOCKQUOTE></BODY></HTML>