<html><head>
</head>
<body>
Dear Mike:<br><br>The example you give of how this works (for Arabic), at <br> <a href="http://projects.ldc.upenn.edu/art/reader/source/Al-Kitaab.01.">http://projects.<wbr>ldc.upenn.<wbr>edu/art/reader/<wbr>source/Al-<wbr>Kitaab.01.</a> shows only an empty form. Could you send several actual examples of Arabic lexemes from this database, e.g. ta`arif or suhuniyyat(un)? It is difficult to understand the structure of the database without actual lexemes. I also do not know whether the examples will be displayed in readable form in the email.<br><br>Hayim Sheynin<br><br><b><i>maxwell@ldc.upenn.edu</i></b> wrote:<blockquote class="replbq" style="border-left: 2px solid rgb(16, 16, 255); margin-left: 5px; padding-left: 5px;"> <div
id="ygrp-text"> <div>Quoting Heather Souter &lt;<a href="mailto:hsouter%40gmail.com">hsouter@gmail.<wbr>com</a>&gt;:<br> &gt; I, too, am very interested in learning about dictionary development<br> &gt; for languages with complex morphologies. ...<br> &gt; Any insight into how to create dictionaries that are useful to<br> &gt; speakers and learners and not only language specialists would be<br> &gt; especially welcomed!<br> <br> One "solution" (quote marks explained at the end of this msg) is to <br> give people a computer program that allows them to look up words <br> regardless of the inflected form that they type in. For the simple <br> cases, this can often be done by just looking for a substring of the <br> typed-in word. For a purely suffixing language, the substring would <br> begin at the first letter of the typed-in word.<br> <br> Of course, the simple cases are not the ones where people need the most <br> help. The complex cases--where there is
prefixing (or worse, both <br> prefixing and suffixing), or infixing, or reduplication, or lots of <br> stem allomorphy--<wbr>are the ones where people need help, and where the <br> simple solutions don't work. For these morphologically complex <br> languages, there needs to be a morphological parser between the user <br> and the electronic dictionary per se. The parser's job is to remove <br> all the affixes, undo any stem allomorphy, convert the stem into a <br> dictionary citation form, and finally look up the citation form in the <br> actual dictionary.<br> <br> One project that is building such tools in a generic fashion (i.e. in a <br> way that should be portable to more languages, as opposed to a <br> proprietary way that just works for French, say) is a Department of <br> Education-funded project at the Linguistic Data Consortium (LDC). <br> There's an example of how this works (for Arabic) at <br> <a
href="http://projects.ldc.upenn.edu/art/reader/source/Al-Kitaab.01.">http://projects.<wbr>ldc.upenn.<wbr>edu/art/reader/<wbr>source/Al-<wbr>Kitaab.01.</a> In this <br> case, the lookup is limited to the text shown there, but a simple <br> modification would allow the user to type in words to be looked up. <br> The project is also demonstrating lookup with the same tool on (a <br> dialect of) Nahuatl, a morphologically complex language of Mexico. <br> (Disclaimer: I'm a consultant on this project, hence biased :-).)<br> <br> There are of course other reasons (besides morphology) that make it <br> hard for people to look up words in dictionaries, such as spelling. <br> One can imagine inserting a spell corrector between the user and the <br> electronic dictionary. For morphologically complex languages, such a <br> spell corrector will almost certainly have to be based on a <br> morphological parser.<br> <br> And of course my whole long-winded answer presupposes that
electronic <br> dictionaries (and the computers that they run on) are a reasonable <br> solution for the language speakers. For speakers of languages in <br> California, that's probably true; for speakers in the Amazon, that may <br> not be a solution at all.<br> <br> Mike Maxwell<br> CASL/ U MD<br> <br> </div> </div> <!--End group email --> </blockquote><br><BR><BR>Dr. Hayim Y. Sheynin<p>
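<p>A minimal sketch of the parser-in-front-of-the-dictionary idea described in the quoted message: strip affixes, undo stem allomorphy, map the stem to a citation form, then look that form up. Everything in it is an assumption made for illustration. The affix lists, the allomorph table, the lexicon, and the names <code>candidate_stems</code> and <code>lookup</code> are invented English toy data, not the LDC project's tool or its Arabic database, and a real parser for Arabic or Nahuatl would need far richer rules (infixing, reduplication, and so on).</p>
<pre>
```python
from itertools import product

# Hypothetical toy data (English, invented for this sketch only).
PREFIXES = ["", "un", "re"]               # "" means "no prefix"
SUFFIXES = ["", "ing", "ed", "es", "s"]   # "" means "no suffix"
STEM_ALLOMORPHS = {"citi": "city", "ran": "run"}  # surface stem -> citation form
LEXICON = {
    "lock": "to fasten with a key",
    "city": "a large town",
    "run": "to move fast on foot",
}


def candidate_stems(word):
    """Yield each non-empty stem left after removing one optional prefix
    and one optional suffix from the typed-in word."""
    for prefix, suffix in product(PREFIXES, SUFFIXES):
        if word.startswith(prefix) and word.endswith(suffix):
            stem = word[len(prefix):len(word) - len(suffix)]
            if stem:
                yield stem


def lookup(word):
    """Return (citation_form, gloss) for the first candidate stem whose
    citation form is in the lexicon, or None if no analysis succeeds."""
    for stem in candidate_stems(word.lower()):
        # Undo stem allomorphy; stems without an entry are already citation forms.
        citation = STEM_ALLOMORPHS.get(stem, stem)
        if citation in LEXICON:
            return citation, LEXICON[citation]
    return None
```
</pre>
<p>With this toy data, <code>lookup("unlocked")</code> finds the entry for "lock" and <code>lookup("cities")</code> finds the entry for "city", while an unknown form such as <code>lookup("xyz")</code> returns <code>None</code>. Trying every prefix and suffix combination is only workable because the lists are tiny; it is a stand-in for a real morphological parser, not a substitute for one.</p>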
</body></html>