<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.2800.1106" name=GENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=#ffffff>
<DIV><FONT face=Arial size=2>Hi Dick,</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>The idea of a wordform inventory with links to the
roots is a good one. In my opinion, at least this much attention to the
actual wordforms is necessary. My opinion is, of course, biased by my
experience as a language learner many years ago. I was trying to get a feel
for the language by reading parts of the New Testament in a related
language. This wasn't easy, so I turned to the dictionary of that language
for help. It turned out to be almost no help, because it followed the custom
of listing only roots, with no indication of their possible inflections or
derivations.</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>Maintenance of links doesn't need to be a
problem. If I just keep a list of occurring wordforms in the entry for
each root, then whenever links need to be updated, the computer should be able
to simply regenerate the whole inventory from those entries. What I'm
picturing here is analogous to the automatic generation of an English index,
as Shoebox does for MDF dictionaries.</FONT></DIV>
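<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>As a rough sketch of that regeneration step (this
is not how Shoebox does it; the function name, the data layout, and the sample
Tagalog forms are assumptions chosen only for illustration), the inventory can
be rebuilt from scratch out of the wordform lists stored under each
root:</FONT></DIV>
<PRE>
```python
from collections import defaultdict


def regenerate_inventory(root_entries):
    """Rebuild the wordform inventory from the per-root wordform lists.

    root_entries maps each root to the list of wordforms recorded in
    its dictionary entry. The result maps each wordform to the sorted
    list of roots it links to. Nothing is updated in place: the whole
    inventory is regenerated on every call.
    """
    inventory = defaultdict(list)
    for root in sorted(root_entries):
        for wordform in root_entries[root]:
            inventory[wordform].append(root)
    # Return a plain dict ordered alphabetically by wordform.
    return dict(sorted(inventory.items()))


# Hypothetical sample entries, not taken from any real dictionary file.
sample_entries = {
    "sulat": ["sumulat", "sinulat"],
    "bili": ["bumili", "binili", "bibili"],
}

sample_inventory = regenerate_inventory(sample_entries)
for wordform, roots in sample_inventory.items():
    print(wordform, "links to", ", ".join(roots))
```
</PRE>
<DIV><FONT face=Arial size=2>Because the inventory is derived entirely from
the root entries, a correction made in one entry shows up in the links the next
time the inventory is regenerated.</FONT></DIV>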
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>And how much work is it to create these lists of
wordforms that I would like to keep in the entry for each root? Not as
much as might be expected, if you can make good use of a parser to semi-automate
the process of identifying the root for each wordform. I've done this with
the word list from a New Testament, and it took about 2 months of work to
complete.</FONT></DIV>
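<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>A minimal sketch of what "semi-automate" could
mean here, assuming a toy parser that strips at most one prefix and one suffix
(the affix lists, root list, and function names are hypothetical, and a real
parser for a Philippine language would also have to handle infixes and
reduplication). Wordforms with exactly one candidate root are accepted
automatically, and everything else is set aside for a person to
decide:</FONT></DIV>
<PRE>
```python
def candidate_roots(wordform, roots, prefixes, suffixes):
    """Return every known root reachable by stripping at most one
    prefix and at most one suffix from the wordform."""
    found = set()
    for prefix in [""] + list(prefixes):
        if not wordform.startswith(prefix):
            continue
        rest = wordform[len(prefix):]
        for suffix in [""] + list(suffixes):
            if suffix and not rest.endswith(suffix):
                continue
            stem = rest[:len(rest) - len(suffix)]
            if stem in roots:
                found.add(stem)
    return sorted(found)


def triage(wordforms, roots, prefixes, suffixes):
    """Split wordforms into those resolved automatically (exactly one
    candidate root) and those needing manual review (none or several)."""
    resolved, review = {}, {}
    for wordform in wordforms:
        candidates = candidate_roots(wordform, roots, prefixes, suffixes)
        if len(candidates) == 1:
            resolved[wordform] = candidates[0]
        else:
            review[wordform] = candidates
    return resolved, review


# Hypothetical toy data for illustration only.
known_roots = {"bili", "sulat"}
toy_prefixes = ["mag", "nag"]
toy_suffixes = ["an", "in"]

resolved, review = triage(
    ["magsulat", "sulatan", "bumili"], known_roots, toy_prefixes, toy_suffixes
)
print("resolved:", resolved)
print("needs review:", review)
```
</PRE>
<DIV><FONT face=Arial size=2>In this toy run the infixed form "bumili" finds no
candidate and lands in the review pile, which is the part of the work that
stays manual.</FONT></DIV>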
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>I do need to admit that I probably don't fully
grasp the size of the problem you're describing for highly agglutinating
languages. My experience is only with Philippine languages, and the ones I've
worked with use only 10,000 to 20,000 different wordforms in a New
Testament.</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>Allan</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<BLOCKQUOTE
style="PADDING-RIGHT: 0px; PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000000 2px solid; MARGIN-RIGHT: 0px">
<DIV style="FONT: 10pt arial">----- Original Message ----- </DIV>
<DIV
style="BACKGROUND: #e4e4e4; FONT: 10pt arial; font-color: black"><B>From:</B>
<A title=Dick_Watson@gial.edu
href="mailto:Dick_Watson@gial.edu">Dick_Watson@gial.edu</A> </DIV>
<DIV style="FONT: 10pt arial"><B>To:</B> <A
title=lexicographylist@yahoogroups.com
href="mailto:lexicographylist@yahoogroups.com">lexicographylist@yahoogroups.com</A>
</DIV>
<DIV style="FONT: 10pt arial"><B>Sent:</B> Thursday, May 26, 2005 6:47
AM</DIV>
<DIV style="FONT: 10pt arial"><B>Subject:</B> Re: [Lexicog] Digest Number
353</DIV>
<DIV><BR></DIV><BR><FONT face="Courier New" size=2>From:
Dick Watson &lt;<A
href="mailto:dick_watson@gial.edu">dick_watson@gial.edu</A>&gt;<BR>Subject: Re:
Digest Number 343</FONT> <BR><BR><FONT face="Courier New" size=2>My response
to the desire for all wordforms in an electronic dictionary is that you would
have a monstrosity if the language were highly agglutinating. You would
not only have huge redundancy; you would also have all the work of dealing
with each and every entry, deciding how much information to include in each
one (most of which would be redundant), and forever maintaining all of the
additions and corrections needed to keep up such a huge database. Would you
limit your
wordforms to those actually found in a corpus or would you also run through
paradigms of all possible forms? The latter would run into all kinds of
problems with derivations, many of which would never occur or would not
necessarily have the same meaning as that predicted, besides the sheer
enormity of the task.</FONT> <BR><FONT face="Courier New" size=2>It could be
more practical to have a separate simple wordform inventory with links to the
roots, stems or citation forms in the dictionary, but even the maintenance of
all those links would keep you from more important lexicographic tasks, not to
mention taking time out to meet your grandchildren, if there had been time to
have children.</FONT> <BR><BR><FONT face="Courier New" size=2>Dick</FONT>
</BLOCKQUOTE>
</BODY></HTML>