Secondary entries (was Re: [Lexicog] Query on how to deal with coined words)
Vincent `Bentong` S. Isles
bentong.isles at GMAIL.COM
Thu Apr 12 08:36:51 UTC 2007
Hi Mike,
I had hoped to use CC to automate the production of what I call the
"spelling rationale" (\sr) field.
A single entry may have at most one of these fields: \et, \bw, \mr.
It never has two or three of them.
The logic is this:
* If \et is present then it becomes \sr.
* If \mr is present and it is different from \lx after stripping \lx
of hyphens, then it becomes \sr.
* If \bw is present:
** If the first parameter is "en":
*** If \ge != \lx then \sr = "en" + \ge; else \sr = "en"
** else \sr = \bw
I hope you did not get lost with that logic, because I was not able to
translate it into CC statements. I guess I'm not very good with computers :(
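For anyone who wants to try the logic outside CC, here is a minimal Python
sketch of the three rules listed above. It is only an illustration under
stated assumptions, not a CC table: the function name spelling_rationale
and the dict-per-entry layout are made up for the example, "en" and \ge are
joined with a single space (the rule only says "en" + ge), a missing \ge is
treated the same as \ge equal to \lx, and the \mr comparison is done
literally against \lx with its hyphens removed. It returns the raw \sr
value and does not attempt the bracketed display format.

```python
from typing import Dict, Optional


def spelling_rationale(entry: Dict[str, str]) -> Optional[str]:
    """Return the \\sr value for one entry, or None if no \\sr is needed.

    `entry` maps marker names without the backslash ("lx", "et", "mr",
    "bw", "ge") to their field contents. This layout is an assumption
    made for the sketch.
    """
    lx = entry.get("lx", "")

    # Rule 1: an etymology field is copied as is.
    if "et" in entry:
        return entry["et"]

    # Rule 2: \mr is used only when it differs from \lx with the
    # hyphens stripped out of \lx.
    if "mr" in entry:
        mr = entry["mr"]
        return mr if mr != lx.replace("-", "") else None

    # Rule 3: borrowed words. The first whitespace-separated token of
    # \bw is taken to be the "first parameter".
    if "bw" in entry:
        bw = entry["bw"]
        params = bw.split()
        if params and params[0] == "en":
            ge = entry.get("ge", "")
            # Assumption: a missing \ge counts as "same as \lx".
            if ge and ge != lx:
                return "en " + ge
            return "en"
        return bw

    # Native word: none of \et, \mr, \bw is present.
    return None
```

With hypothetical field values, spelling_rationale({"lx": "amatyur",
"bw": "en", "ge": "amateur"}) gives "en amateur", and an entry with only
\lx gives None.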
The \sr field gets formatted as follows:
aba (native word: no \sr field)
abaka [Tag] (Tagalog word borrowed as is)
amatyur [Eng fi:amateur] (English word borrowed with change in sp)
kasub-anan [ka-SUBO-anan] (derived word with the root not so obvious)
This is for the "Cebuano Spelling Dictionary", a stripped-down version
of the "Modern Cebuano Dictionary".
Thanks for the offer on the parser, but I don't think I have the cash
for the Xerox tools.
For now I've stopped work on that part of the project.
Thanks for the information. :)
--Bentong Isles
--- In lexicographylist at yahoogroups.com, Mike Maxwell <maxwell at ...> wrote:
>
> Vincent `Bentong` S. Isles wrote:
> > I would like to know that "complex procedure". I had spent half of
> > yesterday and the whole of today trying to understand the Consistent
> > Changes program, and I do very well think of CC when you wrote
> > "complex" :)
>
> It might be useful to say what you're hoping to use cc for. Cebuano
> morphology is very complex, with reduplication and infixing, and if
> you're trying to go from stems or inflected words to roots, I don't
> think I would recommend cc. What you really need is a morphological
> parser.
>
> It happens that I wrote a morphological parser for Cebuano several
> years ago. I won't claim that it does everything right--I basically
> wrote it in a couple hours, and tweaked it a little after that--but it
> would probably work better than one could do with cc. I did it while I
> was working for the Linguistic Data Consortium (LDC), and I'll have to
> ask them whether it's sharable. It uses the Xerox finite state tools,
> for which you have to pay $40 (the CD comes in a book from U of Chicago
> press). If that's of interest, let me know and I'll see what the LDC
> says.
>
> On the other hand, if you're trying to fix spelling errors, a parser
> won't help you fix them (it might help you find them). A program like
> cc might be usable to fix some classes of errors, provided you have
> some notion of what common errors are (substituting a 'c' for a 'k',
> for example).
>
> Also, there are other programs that do more or less the same thing that
> cc does. These come largely from the Unix/Linux world, but are
> available on DOS and in the Windows command prompt. These are programs
> like sed and awk (and more recently, Unicode-compatible versions of
> such programs, often written in Perl). Depending on where you are (at a
> university, for example), you may be able to find people who can help
> you with these programs more easily than with cc.
> --
> Mike Maxwell
> maxwell at ...
>