<html><body>


Dear Bill and Mike,<br><br>Considering all the trouble caused by multiple forms of the same word, for both verbs and nouns, would it not be better to use the headword list of a solid printed English dictionary such as the OED, Webster's, or even the Random House Dictionary? It is also possible to use large translation dictionaries, depending on the languages in which the compiler is fluent. The advantage of their lists is that they indicate both the character of each verb and the gender of each noun.<br>    For the classical languages, Greek and Latin, it is easier, because words are usually listed there in a representative form with an indication of conjugation or declension.<br>    One should recognize that it is not always practical to work from a corpus when compiling a dictionary, especially an Urdu-English dictionary.<br>    In that case I would use large English-Persian and English-Arabic dictionaries.<br><br>Hayim Sheynin<br><br><b><i>Mike Maxwell
 &lt;maxwell@ldc.upenn.edu&gt;</i></b> wrote:<blockquote class="replbq" style="border-left: 2px solid rgb(16, 16, 255); margin-left: 5px; padding-left: 5px;">     <!-- Network content -->           <div id="ygrp-text">             <div><a href="mailto:billposer%40alum.mit.edu">billposer@alum.mit.edu</a> wrote:<br> > If you're just looking for a large wordlist, one such list<br> > is the list that is distributed with many GNU/Linux systems,<br> > usually in /usr/share/dict/words. The older list is only about 45,000<br> > words, but some systems have a longer list of over 200,000 words.<br> <br> That will be wordforms (including inflected forms), won't it, Bill? So <br> nearly every noun will have two forms (or three, if you count <br> possessives--but maybe the GNU/Linux apps are smart enough that they <br> don't need those).<br>  <br>
 Speaking of the size of word lists, I saw what had to be one of the <br> dumbest reasons to make English, rather than French, the main language <br> (in some sense, I don't recall what) of the EU: English has more words. <br> I forget the counts--something like 450k for English vs. 200k for <br> French. Unfortunately I don't recall the citation. I'm guessing that <br> they were talking about lexemes, else the many inflected forms of French <br> verbs would, I would have thought, have increased the French number. No <br> idea whether they included English particle verbs, or how they drew the <br> line between which compound nouns to include and which not to include.<br> <br> > Of course another way of obtaining a wordlist is simply to acquire<br>  > a big chunk of English text (say some combination of internet<br>  > posts and books from Project Gutenberg) and extract from it a<br>  > list of the unique words.<br> <br>
 Don't forget to include Canterbury Tales if you're doing books, or <br> <a href="http://houseoffame.blogspot.com/">http://houseoffame.blogspot.com/</a> if you're doing internet posts :-).<br> -- <br>  Mike Maxwell<br>  <a href="mailto:maxwell%40ldc.upenn.edu">maxwell@ldc.upenn.edu</a><br> </div>     </div>          <!--End group email -->  </blockquote><br><p>




    




</body></html>