extracting utterances from specified tier ID
'Mingyu Yuan' via chibolts
chibolts at googlegroups.com
Mon Jul 7 19:58:03 UTC 2025
Hi everyone,
I have a question about extracting participants' utterances using CLAN
commands and was wondering if I'm thinking along the right lines. I'd
appreciate it if you could take a look. Thanks!
I'm working with DementiaBank, specifically the ADReSS dataset, a subset of
the Pitt corpus. I used the following command to extract the 'flow' tier of
participants' utterances: `flo +cr +tPAR*`. Here, the asterisk follows the
PAR identifier. However, in the CLAN manual the asterisk typically precedes
it, as in `+t*PAR`.
I got the following output after running the command with `+t*PAR`:
flo (13-Apr-2023) is conducting analyses on:
ONLY speaker main tiers matching: *PAR;
And here's the output after running it with `+tPAR*`:
flo (13-Apr-2023) is conducting analyses on:
ONLY speaker main tiers matching: *PAR*;
It looks like the asterisk acts as a wildcard when matching tier IDs. Since
all my files contain only INV and PAR tiers, I assume both patterns end up
selecting only the PAR tier. I also used a Python function to verify that
the utterances extracted by these two commands were identical (attached
below, in case it's helpful).
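To make my guess concrete, here is a minimal sketch of how I imagine the matching works. It only imitates the patterns CLAN printed, using Python's `fnmatch`; it is not CLAN's actual implementation. I am assuming that the leading asterisk is the literal main-tier marker and that a trailing asterisk is a shell-style wildcard. The helper `matching_tiers` and the tier ID `*PAR2` are hypothetical, made up for illustration.

```python
from fnmatch import fnmatchcase

# Speaker tier IDs present in my files.
tier_ids = ["*PAR", "*INV"]

# Patterns as reported by CLAN ("*PAR" and "*PAR*"). "[*]" makes the leading
# asterisk a literal character; the trailing "*" stays a wildcard.
# ASSUMPTION: this glob-style reading is my guess, not documented CLAN behavior.
patterns = {"*PAR": "[*]PAR", "*PAR*": "[*]PAR*"}

def matching_tiers(glob, tiers):
    """Return the tier IDs that match a shell-style glob (hypothetical helper)."""
    return [t for t in tiers if fnmatchcase(t, glob)]

for shown, glob in patterns.items():
    print(shown, "->", matching_tiers(glob, tier_ids))

# With a hypothetical extra tier "*PAR2", only the wildcard pattern picks it up.
with_extra = tier_ids + ["*PAR2"]
for shown, glob in patterns.items():
    print(shown, "->", matching_tiers(glob, with_extra))
```

If this reading is right, the two commands can only differ when a file contains a speaker ID that begins with PAR but is longer than PAR, which is not the case in my data.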
Both commands appear to work, but I don't fully understand why. Please let
me know your thoughts. Thank you very much!
Best,
Mingyu
def check_clan_command(id, file_old, file_new):
    # Read the .cex file created by the old command (i.e. with +tPAR*)
    with open(file_old, 'r') as file_old_cmd:
        file_o = file_old_cmd.read().splitlines()
    # Read the .cex file created by the new command (i.e. with +t*PAR)
    with open(file_new, 'r') as file_new_cmd:
        file_n = file_new_cmd.read().splitlines()
    # Print the file ID and whether the two outputs match line for line
    print(id, file_o == file_n)