<html><head><meta http-equiv="Content-Type" content="text/html charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">Bruno,<div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>I am afraid there is no good way to do this. Having a '#' character inside a word is such a bad idea from our perspective that we don't allow it even if someone does put *_#_* into depfile.cut. </div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>Of course, this being CLAN, you can do whatever you want, but it will be at your own risk. You can add the keyword "legacy" to the @Options: header in depfile.cut, so that the @Options: line reads:</div><div><br></div><div><div><span class="Apple-tab-span" style="white-space:pre"> </span>@Options:<span class="Apple-tab-span" style="white-space:pre"> </span>heritage num sign IPA CA multi caps bullets legacy</div><div><br></div><div>and then add an @Options: header to your data files like this:</div><div><br></div><div><div><div>@Options:<span class="Apple-tab-span" style="white-space:pre"> </span>legacy</div></div></div><div><br></div><div>But if you are going to go through all this trouble, then what is the point of running CHECK anyway? You can put '#' inside a word and simply not run CHECK on that data file. The other side effect of @Options:<span class="Apple-tab-span" style="white-space: pre;"> </span>legacy is that CHECK will no longer check whether words start with capital letters.</div><div><br></div><div>
<div><br class="Apple-interchange-newline">Leonid.</div><div><br></div><br class="Apple-interchange-newline">
</div>
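<div>For anyone who wants to locate the offending words before running CHECK, here is a minimal Python sketch of the rule described in this thread: a word ending in '#' is the prefix form that *_*# covers, while a '#' at the beginning or in the middle of a word is what CHECK rejects. This is not part of CLAN and does not read depfile.cut; the function names, the sample lines, and the handling of a lone '#' are assumptions made for illustration.</div>
<pre>
```python
def hash_position(word):
    """Classify where '#' occurs in a word: 'none', 'final', or 'internal'."""
    if "#" not in word:
        return "none"
    # Exactly one '#', as the last character, with something before it.
    # Treating a lone "#" as 'internal' is an assumption of this sketch.
    if word.count("#") == 1 and word.endswith("#") and len(word) != 1:
        return "final"
    return "internal"


def flag_main_tier(lines):
    """Return (line_number, word) pairs for main-tier words with a non-final '#'."""
    flagged = []
    for number, line in enumerate(lines, start=1):
        if not line.startswith("*"):
            continue  # skip headers (@...) and dependent tiers (%...)
        # Drop the speaker code, e.g. "*CHI:", then split on whitespace.
        _, _, utterance = line.partition(":")
        for token in utterance.split():
            word = token.rstrip(".?!")  # strip a trailing utterance terminator
            if hash_position(word) == "internal":
                flagged.append((number, word))
    return flagged


# Made-up sample lines; only "re#bueno" and "ha#" come from the thread.
sample = ["@Begin", "*CHI:\tre#bueno.", "*CHI:\tha# bayit.", "%mor:\tx#y"]
print(flag_main_tier(sample))
```
</pre>
<div>On the sample it reports only re#bueno on line 2; ha# is accepted as a prefix form, and the %mor line is ignored because only main-tier lines (those starting with *) are scanned.</div>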
<br><div><div>On Apr 4, 2014, at 15:07, Bruno Estigarribia &lt;<a href="mailto:brunilda@gmail.com">brunilda@gmail.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div dir="ltr">Thank you Leonid,<br><br>Is there any way to create an "in-house" depfile.cut that will allow the use of # for non-separable prefixes like:<br>*CHI: re#bueno.<br>I tried tweaking it by adding *_#_* to the *: line, but I can't get it to work. (This requires at least one character before and after the #, right?)<br>(I need to do a first pass morphemicizing on the main line for reasons internal to the project I am working on...)<br>Thanks<br>Bruno<br><br>On Thursday, April 3, 2014 3:09:56 PM UTC-4, Spektor, Leonid: CMU wrote:<blockquote class="gmail_quote" style="margin: 0;margin-left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;"><div style="word-wrap:break-word">Bruno,<div><br></div><div><span style="white-space:pre"> </span>The *_*# in depfile.cut is for words that end with the ‘#’ character only. In languages like Hebrew, prefixes can be separate from the stem word, and they are marked with a ‘#’ sign at the end, for example prefixes like “ha#” and “ba#”. If you have a ‘#’ character in the middle or at the beginning of the word, then CHECK will complain.<br><div>
<div><br>Leonid.</div><div><br></div><br>
</div>
<br><div><div>On Apr 3, 2014, at 14:25, Bruno Estigarribia &lt;brun...@gmail.com&gt; wrote:</div><br><blockquote type="cite"><div dir="ltr">Hello,<br><br>I have a .cha file where I have marked prefixes using #. CHECK doesn't like this (error message: "Illegal character(s) '#' found.(48)").<br>I know, because Brian has said this to me before, that I should be morphologizing directly on the %mor tier. I understand this recommendation (it is repeated several times in section 6 of the CHAT manual). However, when I look at the depfile, the option for using # is still there:<br>*: * , ,, [x _*] [- _*] [+ _*] [^ *] *~_* *_*# *-_* <br>[and it goes on...]<br>So why is CHECK choking on it? <br>Thanks<br>Bruno<br></div><div><br></div>
-- <br>
You received this message because you are subscribed to the Google Groups "chibolts" group.<br>
To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+u...@googlegroups.com.<br>
To post to this group, send email to chib...@googlegroups.com.<br>
To view this discussion on the web visit <a href="https://groups.google.com/d/msgid/chibolts/eb1a1fb4-d31f-41ac-a9c9-a4174a992f97%40googlegroups.com?utm_medium=email&utm_source=footer" target="_blank">https://groups.google.com/d/msgid/chibolts/eb1a1fb4-d31f-41ac-a9c9-a4174a992f97%40googlegroups.com</a>.<br>
For more options, visit <a href="https://groups.google.com/d/optout" target="_blank">https://groups.google.com/d/optout</a>.<br>
</blockquote></div><br></div></div></blockquote></div><div><br class="webkit-block-placeholder"></div>
</blockquote></div><br></div></body></html>