using MOR/POST

Brian MacWhinney macw at cmu.edu
Tue Jan 27 04:16:21 UTC 2009


Dear Jamie,
    Yes, those are the correct sources.  Regarding analyzing quotes, I
came to realize in about 1998 that the system of marking quotes in the
main line was not optimal, since it failed to distinguish quoted short
phrases or words from longer quotations, which often occur during book
reading.  To address this, I have spent many long hours replacing the
earlier system with one that uses the +" mark at the beginning of
quoted sentences.  For many corpora, this has been fixed.  For others,
it has not.  You don't specify which corpora you are looking at, so I
can't say more on this matter.  However, if they are your own data,
then you should just fix this yourself.
    If you want to omit [+ bch] material, then just add -s"[+ bch]" to
the command.  Regarding the "post" category, this is designed for
post-qualifiers such as "too" and "also".  It is a small but frequent
class.
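
As a rough way to preview which utterances the -s"[+ bch]" switch would
leave out, the following Python sketch lists the main-tier lines that
carry that postcode.  It is not part of CLAN.  The function name and the
sample transcript (including its %mor tags) are made up for
illustration, and the sketch assumes the usual CHAT layout in which main
lines start with *SPEAKER: and continuation lines start with a tab.

```python
import re

# Matches the postcode with or without a space after the plus sign.
BCH = re.compile(r"\[\+\s*bch\]")

def bch_utterances(chat_text):
    """Return main-tier utterances (continuations joined) marked [+ bch]."""
    utterances = []
    current = None
    for line in chat_text.splitlines():
        if line.startswith("*"):
            # A new main-tier line closes the previous utterance.
            if current is not None:
                utterances.append(current)
            current = line
        elif line.startswith("\t") and current is not None:
            # Continuation of the current main-tier line.
            current += " " + line.strip()
        else:
            # A dependent tier (%...) or header (@...) ends the main line.
            if current is not None:
                utterances.append(current)
            current = None
    if current is not None:
        utterances.append(current)
    return [u for u in utterances if BCH.search(u)]

# Hypothetical sample data, for illustration only.
sample = (
    "*CHI:\tthe frog jumped out .\n"
    "%mor:\tdet|the n|frog v|jump-PAST adv|out .\n"
    "*CHI:\tlook at my shoe . [+ bch]\n"
    "*CHI:\tand then he\n"
    "\tran away . [+bch]\n"
)

for utterance in bch_utterances(sample):
    print(utterance)
```

The pattern accepts both [+ bch] and [+bch], since both spellings appear
in this thread.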
    Regarding your problems with CHSTRING, I think that would be the
subject of a different note, since this note is about MOR and POST.
    Regarding errors in POST disambiguation, we expect something like a
95% accuracy rate at best.  If your input data is not well segmented,
you can expect more errors.  If you have noted something consistent,
then I would like to look into it, but I need more to go on than
"cyber-bafflement".  Do you mean that it inserts a question mark, or
exactly what?
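
To put the 95% figure in concrete terms, here is a small
back-of-the-envelope calculation in Python.  The token counts are
made-up examples, and the sketch assumes the accuracy applies uniformly
to every word, which is a simplification: real errors tend to cluster
on particular ambiguous forms.

```python
def expected_tagging_errors(n_tokens, accuracy=0.95):
    """Expected number of mis-tagged tokens at a given per-token accuracy."""
    return n_tokens * (1.0 - accuracy)

# Hypothetical transcript sizes, for illustration only.
for n in (200, 1000, 5000):
    print(n, "tokens ->", round(expected_tagging_errors(n)), "expected errors")
```

Even at the best-case rate, a transcript of a thousand words would be
expected to contain dozens of tags that need hand correction.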
    Regarding "decided" and "couldn't", I would need input text to see
exactly what is going on.  However, for "out" I can give you a simple
answer.  I agree that having "out" as a noun seems pretty useless.  I
should probably remove it.  It may have been in there for the "ins and
outs" of something.  So, go ahead and remove it.  Or you can add
n|out "out"
to ex.cut
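
Before editing the lexicon, it can help to count how often each tag is
assigned to a word on the %mor tier.  The same tally, together with the
matching input lines, is the kind of concrete evidence requested above
for "decided" and "couldn't".  The Python sketch below is not a CLAN
tool.  The function name and the sample transcript are invented, and it
assumes that %mor items have the form category|stem with any suffixes
following "-", "&" or "~", and that alternatives left unresolved are
separated by "^".

```python
from collections import Counter
import re

def tag_counts(chat_text, stem):
    """Count the part-of-speech tags given to `stem` on %mor tiers."""
    counts = Counter()
    in_mor = False
    for line in chat_text.splitlines():
        if line.startswith("%mor:"):
            in_mor = True
            items = line.split()[1:]      # drop the "%mor:" label
        elif line.startswith("\t") and in_mor:
            items = line.split()          # continuation of a %mor tier
        else:
            in_mor = False
            continue
        for item in items:
            # Assumption: unresolved alternatives are joined with "^".
            for alt in item.split("^"):
                if "|" not in alt:
                    continue              # punctuation such as "."
                tag, rest = alt.split("|", 1)
                word = re.split(r"[-&~]", rest, maxsplit=1)[0]
                if word == stem:
                    counts[tag] += 1
    return counts

# Hypothetical sample data, for illustration only.
sample = (
    "*CHI:\tthe dog went out .\n"
    "%mor:\tdet|the n|dog v|go&PAST n|out .\n"
    "*CHI:\tget out .\n"
    "%mor:\tv|get adv|out .\n"
    "*CHI:\tout the door .\n"
    "%mor:\tn|out det|the n|door .\n"
)

print(tag_counts(sample, "out"))
```

Running the tally before and after a lexicon change gives a quick check
that the noun reading of "out" has actually disappeared.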

Good luck with your work on this,

-- Brian MacWhinney

On Jan 26, 2009, at 10:41 PM, Jamie Smith wrote:

>
> I have several questions about MOR after working my way through a
> batch of transcripts. The most basic question is whether I have
> overlooked any documentation for beginners, other than the CLAN manual
> and the article linked here:
>
> http://childes.psy.cmu.edu/morgrams/morphosyntax.doc
>
> If so, perhaps I can find answers to these questions on my own. I'll
> go ahead and ask the questions, hoping that they don't have
> straightforward answers in a spot I've overlooked. Please let me know
> if there's a better place to talk about getting started with MOR.
>
> First, is there a flag that will tell MOR to analyze the contents of
> quotes marked with the ["] code? I'm looking at children's narratives
> and they frequently include quotations from the characters. I've been
> filling in the contents of the quotes by hand, but this is not very
> efficient. On a similar note, MOR parses the contents of lines marked
> with the [+bch] postcode, and I'd rather omit those.
>
> Second, is there a listing of all the categories somewhere handy? The
> linked article lists many of them, but I'm still stumped about the
> "post" category.
>
> Third, I attempted to use CHSTRING to make some changes across a batch
> of .mor files, but it didn't work as smoothly as I expected it to.
> Only some of the requested changes actually happened, which I find
> puzzling.
>
> Fourth, I'm also puzzled by some inconsistencies in POST's
> performance. It will sometimes know that "decided" should be
> v|decide-PAST, but four times out of five it will throw up its
> cyber-hands in bafflement. "Couldn't" has an even lower success rate.
> Can I improve its hit rate? (I'm wondering if that's the kind of thing
> for which POSTTRAIN was designed.)
>
> Last, POST keeps mislabeling "out" as a noun -- over and over and
> over. I'm hoping I might be able to modify that, but if not I am
> considering deleting the lexicon entry for "out" as a noun because I
> am tired of changing it to be an adverb or a preposition. Would that
> present any problems I'm not foreseeing?
>
> Thanks for any help you can provide.
>
> Jamie in Illinois

