Filled pauses and parts of speech

Kevin Donnelly kevin at dotmon.com
Mon Nov 4 21:13:57 UTC 2013


Hi Shelley

::::On Monday 04 November 2013 Brian MacWhinney said::::
> There may be other ways of doing this that rely on the XML structure, but
> nothing all that easy.

As Brian says, if you want to keep the existing MOR output, which is probably 
desirable, you will need to manipulate the text directly.

One way of doing this is to read into a database table each line of the CHAT 
file that contains @fp, and any tiers (in this case MOR) attached to it.  The 
aim is to convert the "horizontal" text in the CHAT file into "vertical" text.  
Think of it like a spreadsheet, where each tier reads down instead of across, 
and each cell has a word (or MOR lexeme) in it, e.g.:
Think        | verb
of           | preposition
it           | pronoun.obj
like         | conjunction
um@fp        |
a            | article.indef
spreadsheet  | noun

Then all you need to do is run down the text column to find instances of @fp, 
and then read the next non-blank cell further down the lexeme column to get 
the following lexeme (i.e. find the lexeme following any blank cell in the 
lexeme column).
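
As a rough illustration of that lookup, here is a minimal Python sketch.  It 
assumes the vertical table is already available as a list of (word, lexeme) 
pairs, with an empty lexeme for each filled pause.  The function name 
lexemes_after_fp is made up for this example, and the labels are the 
simplified ones from the table above, not real MOR codes.

```python
# Rows of the "vertical" table: (word, lexeme).
# Filled pauses carry an empty string in the lexeme column.
rows = [
    ("Think", "verb"),
    ("of", "preposition"),
    ("it", "pronoun.obj"),
    ("like", "conjunction"),
    ("um@fp", ""),
    ("a", "article.indef"),
    ("spreadsheet", "noun"),
]


def lexemes_after_fp(rows):
    """Return (filled pause, following lexeme) pairs.

    The following lexeme is None when nothing follows the filled pause.
    """
    results = []
    for i, (word, _) in enumerate(rows):
        if not word.endswith("@fp"):
            continue
        # Take the first non-blank lexeme below this row; this also skips
        # over consecutive filled pauses such as "um@fp er@fp".
        following = next((lex for _, lex in rows[i + 1:] if lex), None)
        results.append((word, following))
    return results


print(lexemes_after_fp(rows))  # [('um@fp', 'article.indef')]
```

A filled pause at the very end of an utterance comes back paired with None, 
which is probably worth counting separately.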

So finding the info you need is trivial once it has been imported.  However, 
the import itself might require some experimentation.  For instance, in my 
experience, there are quite a few typos in "completed" CHAT files, and very 
often they have not even been CHECKed.  That complicates the import.  Another, 
less serious, issue is that the import must put a blank in the lexeme column 
every time it meets an @fp tag; otherwise every lexeme after that point will 
be shifted up by one row.
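
To make the blank-cell point concrete, here is a hedged Python sketch of such 
an import using the standard sqlite3 module.  It assumes a lot: that each 
tier has already been split into tokens, that speaker codes and punctuation 
have been stripped, and that the MOR tier has exactly one item per word 
except for @fp words, which have none.  Real CHAT files will need more 
cleaning than this.  The table name tokens and the functions align and 
import_utterance are invented for the example, and the labels are the 
simplified ones from the table above rather than real MOR codes.

```python
import sqlite3


def align(words, lexemes):
    """Pair each word with its lexeme, leaving a blank for @fp words.

    Raises ValueError if the two tiers do not line up, which is a cheap way
    of catching typos in the transcript before they reach the table.
    """
    remaining = iter(lexemes)
    rows = []
    for word in words:
        if word.endswith("@fp"):
            rows.append((word, ""))  # blank cell keeps later rows aligned
        else:
            rows.append((word, next(remaining, None)))
    leftover = list(remaining)
    if leftover or any(lex is None for _, lex in rows):
        raise ValueError("main tier and MOR tier do not line up")
    return rows


def import_utterance(conn, words, lexemes):
    """Append one utterance to the (hypothetical) tokens table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tokens "
        "(pos INTEGER PRIMARY KEY, word TEXT, lexeme TEXT)"
    )
    conn.executemany(
        "INSERT INTO tokens (word, lexeme) VALUES (?, ?)",
        align(words, lexemes),
    )
    conn.commit()


words = "Think of it like um@fp a spreadsheet".split()
lexemes = "verb preposition pronoun.obj conjunction article.indef noun".split()

conn = sqlite3.connect(":memory:")
import_utterance(conn, words, lexemes)
for row in conn.execute("SELECT pos, word, lexeme FROM tokens ORDER BY pos"):
    print(row)
```

Raising an error on a mismatch, rather than padding silently, means a badly 
aligned utterance is flagged for hand-checking instead of shifting every 
lexeme after it.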

If your student has only a couple of hours of transcriptions to do, the 
easiest way is, frankly, to find the @fp utterances as you have already done, 
and note the following lexeme by hand.  If they have more than, say, 5 hours, 
that would justify devoting effort to an import, because it could also be used 
for other things.

-- 
Pob hwyl / Best wishes

Kevin Donnelly
kevindonnelly.org.uk
bangortalk.org.uk
