Filled pauses and parts of speech
Kevin Donnelly
kevin at dotmon.com
Mon Nov 4 21:13:57 UTC 2013
Hi Shelley
On Monday 04 November 2013 Brian MacWhinney said:
> There may be other ways of doing this that rely on the XML structure, but
> nothing all that easy.
As Brian says, if you want to keep the existing MOR output, which is probably
desirable, you will need to manipulate the text directly.
One way of doing this is to read into a database table each line of the CHAT
file that contains @fp, and any tiers (in this case MOR) attached to it. The
aim is to convert the "horizontal" text in the CHAT file into "vertical" text.
Think of it like a spreadsheet, where each tier reads down instead of across,
and each cell has a word (or MOR lexeme) in it, eg
Think | verb
of | preposition
it | pronoun.obj
like | conjunction
um@fp |
a | article.indef
spreadsheet | noun
Then all you need to do is run down the text column to find instances of @fp,
and then read the lexeme column in the following row to get the following
lexeme (in other words: find the lexeme following any blank cell in the lexeme
column).
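As a rough illustration of that lookup, here is a minimal sketch in Python using an in-memory SQLite table. The table name `words`, the column names, the `pos` counter and the function name are all made up for the example, and the rows simply repeat the toy spreadsheet above rather than real MOR output.

```python
import sqlite3

# Hypothetical rows mirroring the toy example: (position, word, lexeme).
# The filled pause has an empty lexeme cell.
ROWS = [
    (1, "Think", "verb"),
    (2, "of", "preposition"),
    (3, "it", "pronoun.obj"),
    (4, "like", "conjunction"),
    (5, "um@fp", ""),
    (6, "a", "article.indef"),
    (7, "spreadsheet", "noun"),
]


def lexemes_after_filled_pauses(rows):
    """Return (filled pause, following word, following lexeme) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE words (pos INTEGER PRIMARY KEY, word TEXT, lexeme TEXT)"
    )
    conn.executemany("INSERT INTO words VALUES (?, ?, ?)", rows)
    # For each @fp row, pick the nearest later row whose lexeme cell is not
    # blank, so that two filled pauses in a row are also handled.
    query = """
        SELECT fp.word, nxt.word, nxt.lexeme
        FROM words AS fp, words AS nxt
        WHERE fp.word LIKE '%@fp'
          AND nxt.pos = (SELECT MIN(w.pos) FROM words AS w
                         WHERE w.pos > fp.pos AND w.lexeme <> '')
        ORDER BY fp.pos
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(lexemes_after_filled_pauses(ROWS))
```

With real data you would probably also want a column identifying the utterance, so that the lookup does not run on from the end of one utterance into the start of the next.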
So finding the info you need is trivial once it has been imported. However,
the import itself might require some experimentation. For instance, in my
experience, there are quite a few typos in "completed" CHAT files, and very
often they have not even been CHECKed. That complicates the import. Another,
less serious, issue would be that you would need to make sure the import puts
a blank in the lexeme column every time you get an @fp tag, or the lexemes
will be off.
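The alignment step might look something like the sketch below. It assumes, purely for illustration, that the main tier and the MOR tier have already been split into whitespace-separated tokens and that they match one-to-one apart from the @fp words; real CHAT files will need more care than that (punctuation, clitics, and so on). The function name `align_tiers` is hypothetical, and the lexeme labels are the placeholder ones used above.

```python
def align_tiers(main_words, mor_lexemes):
    """Pair each main-tier word with its lexeme, leaving @fp words blank."""
    rows = []
    lexemes = iter(mor_lexemes)
    for word in main_words:
        if word.endswith("@fp"):
            # Filled pause: put a blank in the lexeme column so that the
            # lexemes after it stay in step with their words.
            rows.append((word, ""))
        else:
            rows.append((word, next(lexemes, None)))
    leftover = list(lexemes)
    if leftover or any(lexeme is None for _, lexeme in rows):
        # A mismatch in length usually points to a typo in the transcript.
        raise ValueError("tiers do not line up; check the transcript")
    return rows


words = "Think of it like um@fp a spreadsheet".split()
lexemes = "verb preposition pronoun.obj conjunction article.indef noun".split()
for word, lexeme in align_tiers(words, lexemes):
    print(word, "|", lexeme)
```

Raising an error when the two tiers differ in length is a deliberate choice here: it flags the problem utterances for checking by hand instead of silently shifting every lexeme after the mismatch.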
If your student has only a couple of hours of transcriptions to do, the
easiest way is, frankly, to find the @fp utterances as you have already done,
and do the following lexeme by hand. If they have more than, say, 5 hours,
that would justify devoting effort to an import, because it could also be used
for other things.
--
Pob hwyl / Best wishes
Kevin Donnelly
kevindonnelly.org.uk
bangortalk.org.uk