Stumped by error on "#" prefix marking
Leonid Spektor
spektor at andrew.cmu.edu
Sat Apr 5 05:24:39 UTC 2014
Bruno,
I am afraid there is no good way to do this. From our perspective, having a '#' character inside a word is such a bad idea that we don't allow it even if someone does put *_#_* into depfile.cut.
Of course, this being CLAN, you can do whatever you want, but it will be at your own risk. You can add the keyword "legacy" to the @Options: header in depfile.cut, so that the @Options: line reads:
@Options: heritage num sign IPA CA multi caps bullets legacy
and then add @Options: header to your data files like this:
@Options: legacy
But if you are going to go through all this trouble, then what is the point of running CHECK anyway? You could just add '#' inside a word and not run CHECK on that data file. Another side effect of @Options: legacy is that CHECK will not check whether words start with capital letters.
Leonid.
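
For anyone who skips CHECK on such files, a rough pre-check along the following lines might still catch a '#' placed where it was not intended. This is only a sketch under stated assumptions: misplaced_hash_words is a hypothetical helper, not part of CLAN, and it only approximates the rule described in this thread ('#' is accepted at the end of a word, not in the middle or at the beginning). It does not reproduce CHECK's actual parsing of depfile.cut, and the second sample utterance below is made up for illustration.

```python
# Hypothetical helper, NOT part of CLAN: it only approximates the rule
# discussed in this thread ('#' is fine at the end of a word, not elsewhere).
def misplaced_hash_words(line):
    """Return words on a main-tier line that have '#' before their last character."""
    # Assumption: main-tier lines start with '*', then a speaker code, then ':'.
    if not line.startswith("*"):
        return []
    _, _, utterance = line.partition(":")
    flagged = []
    for token in utterance.split():
        # Simplification: strip only common trailing punctuation.
        word = token.rstrip(".?!,")
        # A '#' anywhere except the final position is reported.
        if "#" in word[:-1]:
            flagged.append(word)
    return flagged


if __name__ == "__main__":
    for sample in ("*CHI:\tre#bueno.", "*CHI:\tha# sefer."):
        print(repr(sample), misplaced_hash_words(sample))
```

Header lines (@...) and dependent tiers (%...) are ignored, so only the main line is inspected.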
On Apr 4, 2014, at 15:07 , Bruno Estigarribia <brunilda at gmail.com> wrote:
> Thank you Leonid,
>
> Is there any way to create an "in-house" depfile.cut that will allow the use of # for non-separable prefixes like:
> *CHI: re#bueno.
> I tried tweaking it adding *_#_* to the *: line but I can't get it to work. (This requires at least one character before and after the #, right?)
> (I need to do a first pass morphemicizing on the main line for reasons internal to the project I am working on...)
> Thanks
> Bruno
>
> On Thursday, April 3, 2014 3:09:56 PM UTC-4, Spektor, Leonid: CMU wrote:
> Bruno,
>
> The *_*# in depfile.cut is only for words that end with the '#' character. In languages like Hebrew, prefixes can be separate from the stem word, and they are marked with a '#' sign at the end, for example "ha#" and "ba#". If you have a '#' character in the middle or at the beginning of the word, then CHECK will complain.
>
> Leonid.
>
>
>
> On Apr 3, 2014, at 14:25, Bruno Estigarribia <brun... at gmail.com> wrote:
>
>> Hello,
>>
>> I have a .cha file where I have marked prefixes using #. CHECK doesn't like this (error message: "Illegal character(s) '#' found.(48)").
>> I know, because Brian has said this to me before, that I should be morphologizing directly on the %mor tier. I understand this recommendation (it is repeated several times in section 6 of the CHAT manual). However, when I look at the depfile, the option for using # is still there:
>> *: * , ,, [x _*] [- _*] [+ _*] [^ *] *~_* *_*# *-_*
>> [and it goes on...]
>> So why is CHECK choking on it?
>> Thanks
>> Bruno
>>
>> --
>> You received this message because you are subscribed to the Google Groups "chibolts" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+u... at googlegroups.com.
>> To post to this group, send email to chib... at googlegroups.com.
>> To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/eb1a1fb4-d31f-41ac-a9c9-a4174a992f97%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.