Quote marks affecting language tagging

Brian MacWhinney macw at andrew.cmu.edu
Thu Jun 27 16:10:53 UTC 2019


Dear Cathy,
    I created a full CHAT file to replicate this and confirmed that it passes CHECK. With that file I was able to replicate the problem. I will check with Leonid about this.
Also, could you please do me and other readers a favor and add your last name to your email?

Best,

-- Brian MacWhinney

> On Jun 27, 2019, at 11:15 AM, 'Cathy Lonngren' via chibolts <chibolts at googlegroups.com> wrote:
> 
> Hi, 
> 
> I am testing my bilingual CHAT files to check whether I have coded the material accurately. I'm doing this by running freq commands like the following: 
> freq @ +t*MOT +l -s"*@s" +u +o
> 
> This gives me a list of all the English words used and highlights any Portuguese words that I have failed to code appropriately (as they shouldn't be there!). However, something appears to go wrong when words are enclosed in quote marks. For example, in the following excerpt, both of MOT's quoted "cinco@s" end up being tagged as English (cinco@s@s:eng). When I tried adding a space between the @s and the closing quote mark (see *JAM: "cinco@s "), it seemed to work (i.e. the word no longer appeared on the 'English' word list for JAM). So, is this what I should do in all such examples? As quoted material in Portuguese occurs quite a lot in the corpus, I wanted to check whether this is the best method.
>  
> *JAM:	[- por] é dez.
> %add:	MOT
> *MOT:	no, I said “cinco@s”. [+ ep]
> %add:	JAM
> *JAM:	“cinco@s ”?
> %add:	MOT
> *MOT:	yeah, how do you say “cinco@s” in English? [+ epe]
> %add:	JAM
> *JAM:	[- por] inglês é?
> %add:	MOT
> *MOT:	uhuh.
> %add:	JAM
> *MOT:	count o(lha)@s +...
> %add:	JAM
> *JAM:	cinco@s she's[?][*] “five[/] five”. [+ pe]
> 
> Many thanks, 
> 
> Cathy
> 
> -- 
> You received this message because you are subscribed to the Google Groups "chibolts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+unsubscribe at googlegroups.com <mailto:chibolts+unsubscribe at googlegroups.com>.
> To post to this group, send email to chibolts at googlegroups.com <mailto:chibolts at googlegroups.com>.
> To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/CAP0r51TsNyykhczheJ55nqwt%2BzWNmhVESa4Pj5piq9L2oMP6kg%40mail.gmail.com <https://groups.google.com/d/msgid/chibolts/CAP0r51TsNyykhczheJ55nqwt%2BzWNmhVESa4Pj5piq9L2oMP6kg%40mail.gmail.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout <https://groups.google.com/d/optout>.
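
For readers who want to see how often this pattern occurs in their own files, the following is a minimal Python sketch that lists @s-marked words followed directly by a closing quote mark, as in the excerpt quoted above. It is not part of CLAN and does not reproduce how FREQ parses words. The function name find_quoted_switches, the regular expression, and the handling of an optional ":code" suffix after @s are all assumptions made for illustration.

```python
import re

# Hypothetical helper, not part of CLAN. It matches a word ending in @s
# (optionally followed by an assumed ":code" suffix) only when the very
# next character is a closing quote mark, curly or straight.
PATTERN = re.compile(r'([^\s“”"]+@s(?::\w+)?)(?=[”"])')


def find_quoted_switches(lines):
    """Return (line_number, word) pairs for @s words that touch a closing quote."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if not line.startswith("*"):
            continue  # skip headers (@...) and dependent tiers (%...)
        for match in PATTERN.finditer(line):
            hits.append((number, match.group(1)))
    return hits


sample = [
    "*JAM:\t[- por] é dez.",
    "*MOT:\tno, I said “cinco@s”. [+ ep]",
    "*JAM:\t“cinco@s ”?",
    "*MOT:\tcount o(lha)@s +...",
]
print(find_quoted_switches(sample))
```

On the sample lines only MOT's “cinco@s” is reported; the JAM line with a space before the closing quote mark and the unquoted o(lha)@s are not. To scan a real file, pass the result of open(path, encoding="utf-8").read().splitlines() instead of sample.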

-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.cha
Type: application/octet-stream
Size: 492 bytes
Desc: not available
URL: <http://listserv.linguistlist.org/pipermail/chibolts/attachments/20190627/7c994be0/attachment-0002.obj>