Stumped by error on "#" prefix marking

Leonid Spektor spektor at andrew.cmu.edu
Sat Apr 5 05:24:39 UTC 2014


Bruno,

	I am afraid there is no good way to do this. Having a '#' character inside a word is such a bad idea from our perspective that we don't allow it even if someone does put *_#_* into depfile.cut.

	Of course, this being CLAN, you can do whatever you want, but it will be at your own risk. You can add the keyword "legacy" to the @Options: header in depfile.cut, so that the @Options: line will read:

	@Options:	heritage num sign IPA CA multi caps bullets legacy

and then add an @Options: header to your data files like this:

@Options:	legacy

But if you are going to go to all this trouble, then what is the point of running CHECK anyway? You can put '#' inside a word and simply not run CHECK on that data file. The other side effect of @Options:	legacy is that CHECK will not check whether words start with capitalized letters.
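
For illustration only, here is a minimal Python sketch of the distinction discussed in this thread: a word that merely ends in '#' (like "ha#") versus a word with '#' at the beginning or in the middle (like "re#bueno"). This is not CLAN's code and does not reproduce how CHECK reads depfile.cut. The function names, the treatment of a bare "#" as illegal, and the stripping of a trailing ".", "?" or "!" from tokens are all assumptions made for the example. It could be used to list the affected words in a file before deciding whether to skip CHECK on it.

```python
def classify_hash(word: str) -> str:
    """Classify a token by where '#' occurs (hypothetical helper).

    'none'     : no '#' in the token
    'trailing' : exactly one '#', at the very end, with text before it
                 (the shape the thread describes for *_*#, e.g. "ha#")
    'illegal'  : '#' at the beginning or in the middle (e.g. "re#bueno")
    """
    if "#" not in word:
        return "none"
    # Assumption: a bare "#" with nothing before it does not count as trailing.
    if word.endswith("#") and word.count("#") == 1 and len(word) > 1:
        return "trailing"
    return "illegal"


def find_internal_hash(lines):
    """Return (line_number, word) pairs for main-tier words CHECK would likely reject.

    Only lines starting with '*' (speaker tiers) are scanned; headers ('@'),
    dependent tiers ('%') and tab-indented continuation lines are skipped.
    """
    hits = []
    for lineno, line in enumerate(lines, start=1):
        if not line.startswith("*"):
            continue
        # Everything after the first ':' is the utterance text.
        _, _, utterance = line.partition(":")
        for token in utterance.split():
            # Assumption: drop a final utterance terminator glued to the word.
            word = token.rstrip(".?!")
            if classify_hash(word) == "illegal":
                hits.append((lineno, word))
    return hits


if __name__ == "__main__":
    sample = ["@Begin", "*CHI:\tre#bueno.", "*MOT:\tha# bayit.", "@End"]
    for lineno, word in find_internal_hash(sample):
        print(f"line {lineno}: '#' inside or at start of word: {word}")
```
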


Leonid.



On Apr 4, 2014, at 15:07 , Bruno Estigarribia <brunilda at gmail.com> wrote:

> Thank you Leonid,
> 
> Is there any way to create an "in-house" depfile.cut that will allow uses of # for non-separable prefixes like:
> *CHI: re#bueno.
> I tried tweaking it by adding  *_#_*  to the *: line, but I can't get it to work. (This requires at least one character before and after the #, right?)
> (I need to do a first pass morphemicizing on the main line for reasons internal to the project I am working on...)
> Thanks
> Bruno
> 
> On Thursday, April 3, 2014 3:09:56 PM UTC-4, Spektor, Leonid: CMU wrote:
> Bruno,
> 
> 	The  *_*#  in depfile.cut is for words that end with the '#' character only. In languages like Hebrew, prefixes can be separate from the stem word, and they are marked with a '#' sign at the end, for example prefixes like "ha#" and "ba#". If you have a '#' character in the middle or at the beginning of the word, then CHECK will complain.
> 
> Leonid.
> 
> 
> 
> On Apr 3, 2014, at 14:25, Bruno Estigarribia <brun... at gmail.com> wrote:
> 
>> Hello,
>> 
>> I have a .cha file where I have marked prefixes using #. CHECK doesn't like this (error message: "Illegal character(s) '#' found.(48)").
>> I know, because Brian has said this to me before, that I should be morphologizing directly on the %mor tier. I understand this recommendation (it is repeated several times in section 6 of the CHAT manual). However, when I look at the depfile, the option for using # is still there:
>> *:    *  , ,, [x _*]  [- _*] [+ _*]  [^ *] *~_* *_*# *-_* 
>> [and it goes on...]
>> So why is CHECK choking on it? 
>> Thanks
>> Bruno
>> 
>> -- 
>> You received this message because you are subscribed to the Google Groups "chibolts" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+u... at googlegroups.com.
>> To post to this group, send email to chib... at googlegroups.com.
>> To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/eb1a1fb4-d31f-41ac-a9c9-a4174a992f97%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
> 
> 
