<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">

<META content="MSHTML 6.00.2800.1106" name=GENERATOR>

<STYLE></STYLE>

</HEAD>

<BODY bgColor=#ffffff>

<DIV><FONT face=Arial size=2>Hi Dick,</FONT></DIV>

<DIV><FONT face=Arial size=2></FONT> </DIV>

<DIV><FONT face=Arial size=2>The idea of a wordform inventory with links to the

roots is a good one.  In my opinion, at least this much attention to the

actual wordforms is necessary.  My opinion is, of course, biased by my

experience as a language learner many years ago.  I was

trying to get a feel for the language by reading parts of the New Testament in a

related language.  This wasn't easy, so I went to the dictionary of that

language for help.  But it turned out to be almost no help, because this

dictionary had followed a custom of listing only roots, with no indication of

possible inflections or derivations.</FONT></DIV>

<DIV><FONT face=Arial size=2></FONT> </DIV>

<DIV><FONT face=Arial size=2>Maintenance of links doesn't need to be a

problem.  If I just keep a list of occurring wordforms in the entry for

each root, then whenever links need to be updated, the computer should be able

to simply regenerate the whole list.  What I'm picturing here is analogous

to the automatic generation of an English index, as Shoebox does for MDF

dictionaries.</FONT></DIV>

<DIV><FONT face=Arial size=2></FONT> </DIV>

<DIV><FONT face=Arial size=2>And how much work is it to create these lists of

wordforms that I would like to keep in the entry for each root?  Not as

much as might be expected, if you can make good use of a parser to semi-automate

the process of identifying the root for each wordform.  I've done this with

the word list from a New Testament, and it took about two months of work to

complete.</FONT></DIV>

<DIV><FONT face=Arial size=2></FONT> </DIV>

<DIV><FONT face=Arial size=2>I do need to admit that I probably don't fully

grasp the size of the problem you're describing for highly agglutinating

languages.  My experience is just with Philippine languages, and the ones

I've worked with use only 10,000 to 20,000 different wordforms in a New

Testament.</FONT></DIV>

<DIV><FONT face=Arial size=2></FONT> </DIV>

<DIV><FONT face=Arial size=2>Allan</FONT></DIV>

<DIV><FONT face=Arial size=2></FONT> </DIV>

<DIV><FONT face=Arial size=2></FONT> </DIV>

<BLOCKQUOTE

style="PADDING-RIGHT: 0px; PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000000 2px solid; MARGIN-RIGHT: 0px">

  <DIV style="FONT: 10pt arial">----- Original Message ----- </DIV>

  <DIV

  style="BACKGROUND: #e4e4e4; FONT: 10pt arial; font-color: black"><B>From:</B>

  <A title=Dick_Watson@gial.edu

  href="mailto:Dick_Watson@gial.edu">Dick_Watson@gial.edu</A> </DIV>

  <DIV style="FONT: 10pt arial"><B>To:</B> <A

  title=lexicographylist@yahoogroups.com

  href="mailto:lexicographylist@yahoogroups.com">lexicographylist@yahoogroups.com</A>

  </DIV>

  <DIV style="FONT: 10pt arial"><B>Sent:</B> Thursday, May 26, 2005 6:47

AM</DIV>

  <DIV style="FONT: 10pt arial"><B>Subject:</B> Re: [Lexicog] Digest Number

  353</DIV>

  <DIV><BR></DIV><BR><FONT face="Courier New" size=2>From:

  Dick Watson &lt;<A

  href="mailto:dick_watson@gial.edu">dick_watson@gial.edu</A>&gt;<BR>Subject: Re:

  Digest Number 343</FONT> <BR><BR><FONT face="Courier New" size=2>My response

  to the desire for all wordforms in an electronic dictionary is that you would

  have a monstrosity if the language were highly agglutinating.  You would

  not only have huge redundancy; you would also have all the work of dealing with

  each and every entry, deciding how much information to include in each one

  (most of which would be redundant), and forever maintaining all of the additions

  and corrections to keep up such a huge database.  Would you limit your

  wordforms to those actually found in a corpus or would you also run through

  paradigms of all possible forms?  The latter would run into all kinds of

  problems with derivations, many of which would never occur or would not

  necessarily have the same meaning as that predicted, besides the sheer

  enormity of the task.</FONT> <BR><FONT face="Courier New" size=2>It could be

  more practical to have a separate simple wordform inventory with links to the

  roots, stems or citation forms in the dictionary, but even the maintenance of

  all those links would keep you from more important lexicographic tasks, not to

  mention taking time out to meet your grandchildren, if there had been time to

  have children.</FONT> <BR><BR><FONT face="Courier New" size=2>Dick</FONT>

</BLOCKQUOTE>


</BODY></HTML>