Capital letters in written L2 data?
Brian MacWhinney
macw at cmu.edu
Wed May 20 13:04:22 UTC 2009
Dear Riikka,
Let me check with Leonid on this one. In theory, the sf.cut file in
MOR should let you control whether capitalized words are treated as
proper nouns, but I think this may not work with the default setup.
However, my guess is that the only real trouble comes from the
capitalized words at the beginnings of sentences. For MOR to work
properly, you should run LOWCASE with the +c option. To further
control this, you can use the +d option. Once this is done, MOR will
run more smoothly. It would be great to have a Finnish MOR too, but I
suppose the bulk of your L2 data is in English anyway.
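
To make the idea concrete, here is a rough Python sketch of lowercasing
only the first word of each main tier while protecting words that must
stay capitalized. It illustrates the general idea only. It is not how
LOWCASE is implemented and does not reproduce its options. The function
name lower_initial_word and the KEEP_CAPITALIZED list are hypothetical,
and the sketch assumes one utterance per main tier.

```python
import re

# Hypothetical list of words that must keep their capital letter
# (the pronoun "I" and any names known to occur in the data).
KEEP_CAPITALIZED = {"I", "Finland", "Jyvaskyla"}

# Assumed shape of a CHAT main tier: "*SPK:<whitespace>utterance text ."
MAIN_TIER = re.compile(r"^(\*[A-Za-z0-9]+:\s+)(\S+)(.*)$")


def lower_initial_word(line, keep=KEEP_CAPITALIZED):
    """Lowercase the first word of a main tier unless it is protected.

    `line` is one line of a transcript without its trailing newline.
    Header lines (@...) and dependent tiers (%...) are returned unchanged.
    """
    match = MAIN_TIER.match(line)
    if match is None:
        return line
    prefix, first, rest = match.groups()
    # Compare only the part before an apostrophe, so "I'm" counts as "I".
    if first.split("'")[0] not in keep:
        first = first[0].lower() + first[1:]
    return prefix + first + rest


if __name__ == "__main__":
    print(lower_initial_word("*STU:\tThe dog saw Mary ."))
```

Words inside the utterance are left alone, so a name in the middle of
a sentence keeps its capital; only a sentence-initial word outside the
protected list is changed.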
--Brian MacWhinney
On May 20, 2009, at 2:14 PM, Riikka wrote:
>
> Dear all,
>
> We're using a somewhat modified form of CHAT to transcribe Finnish/
> English L2 written data (modified for coding purposes and because the
> system was originally developed for spoken language data).
>
> Although we cannot use MOR in CLAN for Finnish L2 data, we're going
> to try to use it for English L2 data.
>
> The problem is that in our transcribed data set we've retained upper
> case letters exactly as they were used in the original hand-written
> data. Of course, MOR interprets all words with the initial letter in
> upper case as proper nouns. I was wondering, is there a clever way to
> make MOR ignore at least the sentence initial upper case letters? Or
> do we just have to prepare another data set, with upper case letters
> edited out?
>
> Best,
> Riikka from Jyvaskyla, Finland