[Lexicog] Digital Glossarization
Mike Maxwell
maxwell at LDC.UPENN.EDU
Fri May 9 14:28:39 UTC 2008
Jimm GoodTracks wrote:
> I believe that there is value in a "Back of the Book" glossary...
>
> ...For Native American Languages, the verb complex is the most
> important and complex element of a sentence... Certainly a reader can
> be written in the most simplest format, which involves no prefixes,
> suffixes, infixes, conjugations, nor additional grammatical elements
> to be added on to the verb in a Native American sentence. In this
> case, the reader would only be able to speak in the 3rd person
> singular, namely: He/ she/ it...
>
> When other voices are introduced, namely -- I, you, we, they and
> dual or plural elements -- the verb complex begins to build via
> prefixes and suffixes which have no meaning when detached apart from
> the verb.
This same problem arises in many morphologically complex languages,
of which Arabic and Nahuatl (a language of Mexico) are examples. The
difficulty is compounded (no pun intended) because most of the words a
reader meets in a text cannot be looked up in the dictionary: they are
literally not in the dictionary, at least not in that form.
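To make the lookup problem concrete, here is a toy sketch in Python. The lexicon, the affixes, and their glosses are all invented for illustration and belong to no real language, and the function names are hypothetical. It only strips one prefix and one suffix; a real morphological parser would also have to deal with stem changes, infixes, stacked affixes, and ambiguity.

```python
# Toy citation-form dictionary. All forms are invented for illustration.
LEXICON = {"taka": "hit", "miru": "see"}

# Hypothetical affixes and rough plain-language glosses (assumptions).
PREFIXES = {"ni": "1st person subject", "ti": "2nd person subject"}
SUFFIXES = {"k": "past", "me": "plural"}


def candidate_analyses(word):
    """Return (prefix, stem, suffix) splits whose stem is a dictionary entry."""
    results = []
    for prefix in [""] + list(PREFIXES):
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in [""] + list(SUFFIXES):
            if suffix and not rest.endswith(suffix):
                continue
            stem = rest[:len(rest) - len(suffix)] if suffix else rest
            if stem in LEXICON:
                results.append((prefix, stem, suffix))
    return results


def lookup(word):
    """Map an inflected word to the dictionary entries it could belong to."""
    return [
        {
            "headword": stem,
            "meaning": LEXICON[stem],
            "prefix": PREFIXES.get(prefix),  # None when no prefix was stripped
            "suffix": SUFFIXES.get(suffix),  # None when no suffix was stripped
        }
        for prefix, stem, suffix in candidate_analyses(word)
    ]


if __name__ == "__main__":
    # "nitakame" is not a headword, but stripping "ni-" and "-me" finds "taka".
    for entry in lookup("nitakame"):
        print(entry)
```

The point of the sketch is only that the reader types the form found in the text, and the software, not the reader, works out which headword to open.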
I recently posted to this group (probably before the original poster
joined this list) about a project to do assisted dictionary lookup
using a morphological parser. I won't repeat here what I said, but if
you look in the archives, my posting was on 2 May at 2:55 PM, in the
thread "Collaborative lexicography."
> In addition, direct and indirect discourse, prepositional elements,
> probability and more can all factor in. And to this end, the literal
> translation provided in a more layman's terms rather than a
> professional linguistic rendition seems to be the most helpful to the
> language student.
This is also an issue, and not one we can claim to have solved in the
above-mentioned project. One can imagine that most readers would not be
helped by a gloss like 'IncompletiveAspect-2Ergative-hit-1Absolutive'.
If you are producing a semi-literal translation by hand, then of course
you can give a more helpful (if perhaps less accurate) translation. But
it's hard to know how to do this automatically, in a way that is both
general (within a specific language) and accurate. (I don't know for
sure, but I suspect this would be the downfall of a statistical MT
program, even for languages where there are sufficient parallel corpora
to make that possible.) Our proposed solution is to link the morpheme
glosses to a grammar, but that can be cumbersome.
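As a rough picture of what linking morpheme glosses to a grammar might look like, here is a small Python sketch. It is not the project's implementation. The plain-language paraphrases and the grammar section names are my own placeholders, and the function name is hypothetical. Any part of the gloss that is not a known grammatical tag is simply treated as the stem's dictionary meaning.

```python
# Hypothetical table: gloss tag -> (plain-language paraphrase, grammar section).
# Both the paraphrases and the section names are illustrative assumptions.
GLOSS_TAGS = {
    "IncompletiveAspect": ("the action is not yet completed", "Aspect"),
    "2Ergative": ("'you' is the one doing the action", "Person marking"),
    "1Absolutive": ("'me' is the one affected by the action", "Person marking"),
}


def explain_gloss(gloss):
    """Turn a hyphen-separated morpheme gloss into reader-friendly lines."""
    lines = []
    for part in gloss.split("-"):
        if part in GLOSS_TAGS:
            plain, section = GLOSS_TAGS[part]
            lines.append(f"{part}: {plain} (see grammar section '{section}')")
        else:
            # Anything not in the tag table is assumed to be the stem gloss.
            lines.append(f"{part}: dictionary meaning of the stem")
    return lines


if __name__ == "__main__":
    for line in explain_gloss("IncompletiveAspect-2Ergative-hit-1Absolutive"):
        print(line)
```

This keeps the accurate linguistic label available while giving the reader something more approachable, but it also shows why the approach is cumbersome: the reader still has to assemble four separate notes into one meaning.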
--
Mike Maxwell
maxwell at ldc.upenn.edu