CLAN: Text Extraction
Snigdha Khanna
snkhanna at iu.edu
Tue Feb 6 21:14:57 UTC 2024
I want to remove all annotations, such as the gesture and error codes, so
I would like a plain-text (.txt) version that contains only the transcribed
words.
Any idea how to do that?
On Tuesday, February 6, 2024 at 4:10:32 PM UTC-5 macw wrote:
> CLAN’s FLO program does most of this. Alternatively, you could grab all
> the <w> tags from the XML version of the database.
>
> What kind of NLP do you want to use? You could apply Universal
> Dependencies directly.
>
> — Brian MacWhinney
> Teresa Heinz Professor of Cognitive Psychology,
> Language Technologies and Modern Languages, CMU
>
> > On Feb 6, 2024, at 3:08 PM, Snigdha Khanna <snkh... at iu.edu> wrote:
> >
> > Hello!
> >
> > I am trying to extract "clean" text from annotated transcripts that I
> > have. Is there any way to use CLAN to export a txt file format, or a
> > simpler method to remove annotations from the transcripts, so that I can
> > parse it using NLP?
> >
> > Any help is appreciated!
> >
> > Thanks,
> > Snigdha
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "chibolts" group.
> > To unsubscribe from this group and stop receiving emails from it, send
> > an email to chibolts+u... at googlegroups.com.
> > To view this discussion on the web visit
> > https://groups.google.com/d/msgid/chibolts/237e8996-63ba-4476-859f-4b1e6841ab3an%40googlegroups.com.
>
>
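For the XML route mentioned above (grabbing the <w> tags), here is a minimal Python sketch. It is not a CLAN feature, just one way it might be done with the standard library. It assumes each utterance is a <u> element whose words are <w> elements, and it keeps only the text at the start of each <w>, so anything nested inside a word (for example morphology) is ignored. The function name utterance_words, the sample snippet, and the namespace URI in the sample are illustrative assumptions; the code matches on local tag names and does not depend on the namespace.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written sample standing in for a TalkBank XML file (assumed layout).
SAMPLE = """
<CHAT xmlns="http://www.talkbank.org/ns/talkbank">
  <u who="CHI" uID="u0"><w>I</w><w>want</w><e><action/></e><w>cookie</w><t type="p"/></u>
  <u who="MOT" uID="u1"><w>no</w><w>more</w><t type="p"/></u>
</CHAT>
"""


def local_name(tag):
    """Drop a leading '{namespace}' from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def utterance_words(xml_text):
    """Return one plain-text string per <u> element, built from its <w> elements."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "u":
            continue
        words = []
        for child in elem.iter():
            if local_name(child.tag) == "w":
                # Only the leading text of <w>; nested annotation elements are skipped.
                text = (child.text or "").strip()
                if text:
                    words.append(text)
        lines.append(" ".join(words))
    return lines


if __name__ == "__main__":
    for line in utterance_words(SAMPLE):
        print(line)
```

For a real file, read its contents and pass them to utterance_words, then write the returned lines to a .txt file. Words with nested markup (such as shortenings or replacements) may need extra handling that this sketch does not attempt.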