[Lexicog] Re: Citation forms in Prefixing Languages

Mike Maxwell maxwell at LDC.UPENN.EDU
Wed Jan 21 02:23:51 UTC 2004


Koontz John E wrote:
> This calls to mind trying to find entries in a Classical Greek lexicon
> starting with an inflected irregular non-present stem form from text,
> though at least student versions sometimes list the first person of
> common irregular stems.

I think the crucial issue is how many difficult-to-parse forms there are,
where difficulty can be caused by irregular forms (as here), or by opacity
(as in many Philippine languages), or by sheer number of prefixes (as in
Bantu), or by some combination of these (as in Athabaskan languages).  If
there aren't too many difficult forms, you can list them among the other
entries (i.e. as minor entries).  But if they're overwhelming (as I would
imagine the Bantu ones are--after all, these are agglutinating languages),
then listing is probably not an option, because the vast majority of the
entries would be these minor entries.

It would be interesting to see how bad it would be to list all or most of
the prefixed forms in these other languages.  My impression (from very
limited experience) is that Cebuano (Philippine) would not be too bad,
Tagalog would be somewhat worse, and Athabaskan languages would be nearly as
bad as the Bantu languages (in terms of the number of prefixed forms that
would be listed--my bias is that Athabaskan is much worse in terms of
opacity and complexity of derivation).

Ron Moe wrote:
> So if you encounter 'minadag' you would have to look under
> 'adag' 'padag' 'badag' and 'madag', hoping that one of them
> was the word you were looking for. If you wanted to find 'manadag',
> you might find it under 'adag' 'tadag' 'dadag' 'nadag' or 'sadag'.
> And that's assuming you were familiar enough with the language
> to even know where to look, and analytical enough to recognize
> potential prefixes and figure out what they might be hiding.

OK, so suppose we assume the dictionary user doesn't know the language well
enough to do this (or is too unsophisticated to think of the problem in this
way).  Can we help him out by listing in the printed dictionary all the
possible "prefix strings"--where by "prefix string" I mean the string of
characters up to the first invariable letter or so of the stem?  This would
then be cross referenced to the various possible citation forms.  So for
this example, we would have pseudo-lex entries like

    mina... See a..., pa..., ba..., ma....
    mine... See e..., pe..., be..., me....
and
    mana... See a..., ta..., da..., na..., sa....
etc.
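
To make the idea concrete, here is a rough Python sketch of how such
pseudo-lex entries might be generated mechanically.  The rule table
(HIDDEN_INITIALS), the vowel list, and the function names are all my own
assumptions for illustration; the consonant sets are simply read off the
'minadag' and 'manadag' examples quoted above, not taken from a real
grammar.

```python
# Hypothetical rule table: surface prefix -> stem-initial consonants it may
# hide.  "" means the stem really does begin with the vowel.  The values are
# read off the 'minadag' / 'manadag' examples, not from an actual grammar.
HIDDEN_INITIALS = {
    "min": ["", "p", "b", "m"],
    "man": ["", "t", "d", "n", "s"],
}

# Assumed vowel inventory; a real language would need its own list.
VOWELS = "aeiou"


def pseudo_entries(hidden=HIDDEN_INITIALS, vowels=VOWELS):
    """Map each prefix string (prefix + first stem vowel) to the
    beginnings of the citation forms it should be cross-referenced to."""
    entries = {}
    for prefix, initials in hidden.items():
        for vowel in vowels:
            entries[prefix + vowel] = [c + vowel for c in initials]
    return entries


def format_entry(head, targets):
    """Render one pseudo-lex entry in the style used above."""
    return f"{head}... See {', '.join(t + '...' for t in targets)}."


if __name__ == "__main__":
    for head, targets in pseudo_entries().items():
        print(format_entry(head, targets))
```

With these two made-up rules and five vowels, the table comes to ten
pseudo-entries in all.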

So the number of these pseudo-entries would be on the order of
(number of prefix sequences) * (number of letters that can follow the opaque
prefixes), rather than (number of prefix sequences) * (number of roots of the
appropriate morphosyntactic class).  A much smaller number, I would think,
and perhaps manageable in some cases.
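
As a back-of-the-envelope check on that comparison, the two products can be
written out as below.  The figures plugged in are invented purely for
illustration (I have no counts for any particular language), and the
function names are hypothetical.

```python
def pseudo_entry_count(prefix_sequences: int, following_letters: int) -> int:
    """Rough size of the pseudo-lex listing: one entry per prefix sequence
    per letter that can follow an opaque prefix."""
    return prefix_sequences * following_letters


def full_listing_count(prefix_sequences: int, roots: int) -> int:
    """Rough size of an exhaustive listing: one minor entry per prefix
    sequence per root of the appropriate class."""
    return prefix_sequences * roots


# Invented figures, for illustration only.
PREFIX_SEQUENCES = 50
FOLLOWING_LETTERS = 5
ROOTS = 2000

if __name__ == "__main__":
    print("pseudo-entries:", pseudo_entry_count(PREFIX_SEQUENCES, FOLLOWING_LETTERS))
    print("full listing:  ", full_listing_count(PREFIX_SEQUENCES, ROOTS))
```

Both numbers are upper bounds of a sort, since not every prefix sequence
will combine with every letter or every root.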

The use of pseudo-lex entries would take some training, but it might be
better than trying to teach the opaque (in the linguistic sense, although
it's likely to seem opaque in the other sense!) phonological and
morphological processes.

Of course the real answer is a computer program that parses a wordform and
gives you a pointer to the root(s), and a pocket computer that this will run
on.  Some day.  (Actually, we're working on this, but it is apt to be
impractical in many cases.  And you have to build the parser for each
language.)
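
Just to show the shape of the thing, and emphatically not the parser we are
working on, here is a toy Python sketch that undoes a prefix and proposes
citation forms to look up.  The rule table, the tiny lexicon, and the
function names are all assumptions made up for this example; the forms come
from the 'minadag' / 'manadag' discussion above.  A real parser would need
the actual phonological and morphological rules of the language.

```python
# Hypothetical rule table: surface prefix -> stem-initial consonants it may
# hide ("" = the stem begins with whatever follows the prefix).
HIDDEN_INITIALS = {
    "min": ["", "p", "b", "m"],
    "man": ["", "t", "d", "n", "s"],
}


def candidate_citation_forms(wordform, hidden=HIDDEN_INITIALS):
    """Return every citation form the wordform might be filed under,
    according to the (assumed) rule table."""
    candidates = []
    for prefix, initials in hidden.items():
        if wordform.startswith(prefix):
            rest = wordform[len(prefix):]
            candidates.extend(c + rest for c in initials)
    return candidates


def lookup(wordform, lexicon, hidden=HIDDEN_INITIALS):
    """Keep only the candidates that actually occur in the lexicon."""
    return [c for c in candidate_citation_forms(wordform, hidden) if c in lexicon]


if __name__ == "__main__":
    toy_lexicon = {"padag", "tadag"}  # invented entries, for illustration
    print(candidate_citation_forms("minadag"))
    print(lookup("minadag", toy_lexicon))
```

The unfiltered candidate list is exactly the guesswork the user is otherwise
left to do by hand; checking it against the lexicon is what the program adds.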

    Mike Maxwell
    Linguistic Data Consortium
    maxwell at ldc.upenn.edu



