CLAN: Text Extraction
Brian MacWhinney
macw at cmu.edu
Tue Feb 6 21:10:19 UTC 2024
CLAN’s FLO program does most of this. Alternatively, you could extract the text of all the <w> elements from the XML version of the database.
What kind of NLP do you want to use? You could apply Universal Dependencies directly.
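
For the XML route, here is a minimal Python sketch of pulling the word text out of the <w> elements, one line per utterance. It assumes words sit inside <u> (utterance) elements and that the surface form is the leading text of each <w>; the SAMPLE document and the helper names (local_name, utterance_words) are made up for illustration and are not part of CLAN or the database tooling. Real transcripts contain more element types than this handles, so treat it as a starting point.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def utterance_words(xml_text):
    """Return one list of word strings per <u> element (assumed layout)."""
    root = ET.fromstring(xml_text)
    utterances = []
    for elem in root.iter():
        if local_name(elem.tag) != "u":
            continue
        words = []
        for child in elem.iter():
            # Keep only the leading text of <w>; nested annotation such as
            # <mor> lives in child elements and is skipped.
            if local_name(child.tag) == "w" and child.text and child.text.strip():
                words.append(child.text.strip())
        utterances.append(words)
    return utterances


# Hypothetical, heavily simplified sample; not copied from a real transcript.
SAMPLE = """<CHAT xmlns="http://www.talkbank.org/ns/talkbank">
  <u who="CHI" uID="u0">
    <w>want<mor type="mor"><mw><pos><c>v</c></pos><stem>want</stem></mw></mor></w>
    <w>cookie</w>
    <t type="p"/>
  </u>
  <u who="MOT" uID="u1">
    <w>okay</w>
    <t type="p"/>
  </u>
</CHAT>"""

if __name__ == "__main__":
    for words in utterance_words(SAMPLE):
        print(" ".join(words))
```

Matching on the local tag name rather than a hard-coded namespace keeps the sketch working even if the namespace URI in your files differs from the one in the sample.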
— Brian MacWhinney
Teresa Heinz Professor of Cognitive Psychology,
Language Technologies and Modern Languages, CMU
> On Feb 6, 2024, at 3:08 PM, Snigdha Khanna <snkhanna at iu.edu> wrote:
>
> Hello!
>
> I am trying to extract "clean" text from annotated transcripts that I have. Is there any way to use CLAN to export them to a txt file, or a simpler method to remove the annotations from the transcripts, so that I can parse them using NLP?
>
> Any help is appreciated!
>
> Thanks,
> Snigdha
>
> --
> You received this message because you are subscribed to the Google Groups "chibolts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+unsubscribe at googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/237e8996-63ba-4476-859f-4b1e6841ab3an%40googlegroups.com.