CLAN: Text Extraction
Leonid Spektor
spektor at andrew.cmu.edu
Tue Feb 6 21:38:59 UTC 2024
The command flo +ca +t* *.cha should work.
Leonid.
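
[Illustrative note, not part of Leonid's reply.] The quoted message below also mentions pulling the <w> tags out of the XML version of the database. A minimal Python sketch of that idea follows. It makes several assumptions: the XML wraps each utterance in a <u> element that contains <w> elements, the sample document and its namespace URI are invented purely for illustration, and only the text directly inside each <w> is kept (any nested markup inside a word is ignored, which may truncate some words in real data). The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def words_per_utterance(xml_text):
    """Return one space-joined string of <w> texts per <u> element.

    Simplification: only the text directly inside each <w> is used;
    child elements nested inside a word are not inspected.
    """
    root = ET.fromstring(xml_text)
    utterances = []
    for u in root.iter():
        if local_name(u.tag) != "u":
            continue
        words = [
            w.text.strip()
            for w in u.iter()
            if local_name(w.tag) == "w" and w.text and w.text.strip()
        ]
        utterances.append(" ".join(words))
    return utterances


# Hypothetical sample; the namespace URI and content are made up.
SAMPLE = """<CHAT xmlns="http://example.org/hypothetical-ns">
  <u who="CHI"><w>more</w><w>cookie</w><t type="p"/></u>
  <u who="MOT"><w>you</w><w>want</w><w>more</w><t type="q"/></u>
</CHAT>"""

if __name__ == "__main__":
    for line in words_per_utterance(SAMPLE):
        print(line)
```

Matching on the local tag name avoids hard-coding a namespace, so the same sketch should work whether or not the real files declare one. Check the element names against an actual file before relying on it.
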
> On Feb 6, 2024, at 16:14, Snigdha Khanna <snkhanna at iu.edu> wrote:
>
> I want to remove all annotations like the gestures and errors. Hence, I would like to use the txt format of just the transcribed text without annotations.
>
> Any idea how to do that?
>
>
> On Tuesday, February 6, 2024 at 4:10:32 PM UTC-5 macw wrote:
>> CLAN’s FLO program does most of this. Alternatively, you could grab all the <w> tags from the XML version of the database.
>>
>> What kind of NLP do you want to use? You could apply Universal Dependencies directly.
>>
>> — Brian MacWhinney
>> Teresa Heinz Professor of Cognitive Psychology,
>> Language Technologies and Modern Languages, CMU
>>
>> > On Feb 6, 2024, at 3:08 PM, Snigdha Khanna <snkh... at iu.edu> wrote:
>> >
>> > Hello!
>> >
>> > I am trying to extract "clean" text from annotated transcripts that I have. Is there any way to use CLAN to export a txt file format, or a simpler method to remove annotations from the transcripts, so that I can parse it using NLP?
>> >
>> > Any help is appreciated!
>> >
>> > Thanks,
>> > Snigdha
>> >
>> > --
>> > You received this message because you are subscribed to the Google Groups "chibolts" group.
>> > To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+u... at googlegroups.com.
>> > To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/237e8996-63ba-4476-859f-4b1e6841ab3an%40googlegroups.com.
>>
>
>