problems with disambiguating THAT on MOR tier as complementizer, relative pronoun and demonstrative pronoun

Brian MacWhinney macw at cmu.edu
Wed Jul 25 21:15:48 UTC 2018


Dear Jeanine,

    Currently, "that" has three values:
Result: pro:rel|that^pro:dem|that^det:dem|that
You don't mention the third, which is involved in sentences like "that man is my father".
You are basically asking us to add a fourth reading, which you suggest coding as conj:sub|that.  In fact, we changed all cases of conj:sub to just conj over a year ago, so it would be conj|that.  Perhaps you have an old version of MOR, but this is just a cosmetic renaming in any case.  I agree that it makes good linguistic sense to add conj|that.  If successfully tagged, that form would apply to lines 29 and 89 in your example transcript.

However, for this to have any impact on what you are doing, we would first have to go through the training corpus to see if this usage is even present with any frequency.  If not, adding it will have no effect on POST.  And then there is the question of whether it is present in sufficient quantity to result in good tagging.  This will take time, and it is hard at this point to say whether it would work out reliably.

Going through your example file, it is clear that there are errors in disambiguating "that" beyond just the problem with conj|that.  There are also two cases in which "that" should be tagged as pro:dem, but is tagged instead as pro:rel (lines 69 and 95).  I see in your corrected version that you properly recoded these as pro:dem|that.  In both of these cases, it appears that the demonstrative occurs before the word "is".  I could add the following rule to POSTMORTEM that would cover these cases:

pro:rel|that cop|* => pro:dem|that cop|*

However, to avoid overgeneration, this should really include a "not", so that the rule applies only when "that" is not preceded by a noun, as in
^n|* pro:rel|that cop|* => ^n|* pro:dem|that cop|*
That would probably fix the majority of these problems.  However, this will require a modification to POSTMORTEM.  The good news is that we need to make this modification anyway.  
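
To make the intended effect of that rule concrete, here is a rough Python sketch.  It is not POSTMORTEM and does not use POSTMORTEM's rule syntax; the function name retag_that is made up for illustration, and the sketch assumes the %mor tier has already been split into whitespace-separated tokens (clitic groups joined with "~" are not handled).

```python
def retag_that(mor_tokens):
    """Retag pro:rel|that as pro:dem|that when a copula follows,
    unless the preceding token is a plain noun (n|...).

    Illustration only: this mimics the effect of the proposed rule,
    it is not how POSTMORTEM itself is implemented.
    """
    out = list(mor_tokens)
    for i in range(len(out) - 1):
        if out[i] != "pro:rel|that":
            continue
        next_is_cop = out[i + 1].startswith("cop|")
        # Only n|... counts as a noun here, mirroring n|* in the rule.
        prev_is_noun = i > 0 and out[i - 1].startswith("n|")
        if next_is_cop and not prev_is_noun:
            out[i] = "pro:dem|that"
    return out


# "that was delicious": no noun before "that", so it is retagged.
print(retag_that(["pro:rel|that", "cop|be&PAST&13S", "adj|delicious"]))

# "a cat that was playful": a noun precedes, so pro:rel is kept.
print(retag_that(["det:art|a", "n|cat", "pro:rel|that",
                  "cop|be&PAST&13S", "adj|playful"]))
```

The noun check is what keeps genuine relative clauses from being recoded as demonstratives.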

However, there is no way to use this POSTMORTEM method for adding conj|that.  For this, the best approach is a version of what you are doing: you search for pro:rel|that and revise it to conj|that wherever it is really a complementizer.  This should only take a few minutes for each file.  You could also go a bit faster, perhaps, if you search for pro:rel|that with KWAL and then triple-click on the output to go to the cases where you think a change should be made.
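
If you would rather collect the candidates outside CLAN, the search step could be sketched in Python roughly as follows.  This is only an illustration under some assumptions: the function name find_that_candidates is made up, the transcript is passed in as a list of lines, and continuation lines of a tier are assumed to start with a tab.  It only lists the places to inspect; deciding which ones are really conj|that still has to be done by hand.

```python
def find_that_candidates(lines, target="pro:rel|that"):
    """Return (line_number, text) pairs for %mor lines containing target.

    Hypothetical helper for illustration; it does not replace KWAL.
    """
    hits = []
    in_mor = False
    for number, line in enumerate(lines, start=1):
        if line.startswith("%mor:"):
            in_mor = True
        elif not line.startswith("\t"):
            # Any other tier or speaker line ends the %mor tier.
            in_mor = False
        if not in_mor:
            continue
        # Split clitic groups such as X~Y so each part is checked.
        if any(target in token.split("~") for token in line.split()):
            hits.append((number, line.strip()))
    return hits


sample = [
    "*CHI:\the saw that the cat was there .",
    "%mor:\tpro:sub|he v|see&PAST pro:rel|that det:art|the n|cat "
    "cop|be&PAST&13S adv|there .",
    "*CHI:\tthat was delicious .",
    "%mor:\tpro:dem|that cop|be&PAST&13S adj|delicious .",
]
for number, text in find_that_candidates(sample):
    print(number, text)
```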

I'll address your other question separately.

Best,

--Brian MacWhinney

> On Jul 25, 2018, at 2:47 AM, Jeanine Treffers-Daller <jeanine.daller07 at gmail.com> wrote:
> 
> Dear Leonid and colleagues
> We have come across a problem with the mor tier:
> On the mor tier the word THAT is coded as pro:rel regardless of its actual grammatical role.  I have seen that the MOR manual on the CHILDES website also contains examples with the same mistakes so wondered if this could be addressed. These are the three uses of "that" in our stories:
> 
> 1: that used as a subordinate conjunction (complementizer) as in “he saw that the cat…” Should the code on the mor tier be "conj:sub|that"?
> 2: that used as a demonstrative pronoun  as in “that was delicious” : code "pro:dem|that"?
> 3: that used as a relative pronoun as in “a playful cat that saw a butterfly”. This is coded correctly as "pro:rel|that" on the mor tier.
>  
> I have manually changed this on the mor tier in the attachment "cat.mor.pst.cha" but have also attached the original file with the errors. We  have 1000 children telling these stories so doing all this manually is a bit difficult.
> 
> A second question related to the above is whether we can disambiguate these three types of "that" when we count types on the mor tier. If we run "freq +sm;*,o% @" we get a list of types but the three types of "that" are counted as one type. We'd like to be able to differentiate between these three in our count of types. Is there any way to do this? Thanks a lot for your help!
> best wishes
> Jeanine Treffers-Daller
> 
> -- 
> You received this message because you are subscribed to the Google Groups "chibolts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+unsubscribe at googlegroups.com <mailto:chibolts+unsubscribe at googlegroups.com>.
> To post to this group, send email to chibolts at googlegroups.com <mailto:chibolts at googlegroups.com>.
> To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/CAGprKZUQcnCpkjnoNNxsBZTXUERsZxx%3DkpOdAXq1JKRS23wBgA%40mail.gmail.com <https://groups.google.com/d/msgid/chibolts/CAGprKZUQcnCpkjnoNNxsBZTXUERsZxx%3DkpOdAXq1JKRS23wBgA%40mail.gmail.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout <https://groups.google.com/d/optout>.
> <cat.cha><catERROR.mor.pst.cha><cat.mor.pst.cex>
