16.821, Diss: Comp Ling/Lexicography: Dabiru: 'Preparing ...'

Thu Mar 17 18:51:11 UTC 2005

LINGUIST List: Vol-16-821. Thu Mar 17 2005. ISSN: 1068 - 4875.

Subject: 16.821, Diss: Comp Ling/Lexicography: Dabiru: 'Preparing ...'

Moderators: Anthony Aristar, Wayne State U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>

Reviews (reviews at linguistlist.org)
        Sheila Collberg, U of Arizona
        Terry Langendoen, U of Arizona

Homepage: http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.

Editor for this issue: Takako Matsui <tako at linguistlist.org>
================================================================


===========================Directory==============================

1)
Date: 17-Mar-2005
From: Takako Matsui < tako at linguistlist.org >
Subject: Preparing A Bi-lingual Electronic Verb Paradigm Dictionary (Telugu-Hindi)

-------------------------Message 1 ----------------------------------
Date: Thu, 17 Mar 2005 13:48:07
From: Takako Matsui < tako at linguistlist.org >
Subject: Preparing A Bi-lingual Electronic Verb Paradigm Dictionary (Telugu-Hindi)

Institution: University of Hyderabad
Program: Centre for Applied Linguistics and Translation Studies
Dissertation Status: Completed
Degree Date: 2004

Author: Sirisha Dabiru

Dissertation Title: Preparing A Bi-lingual Electronic Verb Paradigm Dictionary
(Telugu-Hindi)

Linguistic Field(s): Computational Linguistics
                     Lexicography

Subject Language(s): Hindi (HND)
                     Telugu (TCW)

Dissertation Director(s):
Padmakar Dadegaonkar

Dissertation Abstract:

Computerized dictionaries have become very common in this digital age. Their
presentation and aims vary according to need and interest. A set or list of all
the inflectional forms of a word is called a
paradigm. Paradigm first appeared in English in the 15th century, meaning 'an
example or pattern,' and it still bears this meaning today: Their company is a
paradigm of the small high-tech firms that have recently sprung up in this area.
For nearly 400 years paradigm has also been applied to the patterns of
inflections that are used to sort the verbs, nouns, and other parts of speech of
a language into groups that are more easily studied. A good bi-lingual
dictionary usually contains the source-language word, its grammatical category,
its gender (wherever applicable), and its counterpart in the target language.
It is very rare for a dictionary to contain the paradigm of each word and its
equivalents in the target language. Depending on the language, the paradigm of
a verb ranges from tens to thousands of forms. In Greek, the paradigm of a verb
has more than 200 forms. In Sanskrit, the total paradigm of a verb has 1350
forms. It is therefore not practical for a lexicographer to include all such
forms in his/her dictionary. The main aim of this study is to examine the
paradigm of the Telugu verb and supply Hindi equivalents for the inflected
forms. As an example, let
us look at the verb koVttu (= to beat; Hindi mAra). The paradigm list for this
verb contains various forms such as koVttAnu, koVttiMDi, koVttAdu, koVttanu,
etc., which carry different meanings: koVttAnu corresponds to (mEne) mArA ThA,
and koVttanu to nahIM mArUMgA. A general dictionary contains only the root word
koVttu. On close observation, it is the inflectional endings (Anu of koVttAnu,
anu of koVttanu) that make the difference.
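
The idea of pairing each inflected form with its target-language equivalent can
be illustrated with a minimal Python sketch. This is not the dissertation's
implementation, which the abstract does not describe. Only the two
form/equivalent pairs come from the text above; the stem koVtt (the root with
its final u removed), the data layout, and the names build_paradigm and lookup
are assumptions made for illustration.

```python
# Hypothetical sketch of a paradigm lookup for one Telugu verb.
# Only the two form/equivalent pairs below come from the abstract;
# the stem, the data layout, and the function names are assumptions.

ROOT = "koVttu"  # Telugu root, 'to beat' (Hindi root: mAra)
STEM = "koVtt"   # assumed stem, inferred from "Anu of koVttAnu"

# Inflectional ending -> Hindi equivalent of the resulting inflected form.
# Keys are case-sensitive: "Anu" and "anu" are different endings.
SUFFIX_EQUIVALENTS = {
    "Anu": "(mEne) mArA ThA",
    "anu": "nahIM mArUMgA",
}


def build_paradigm(stem, suffix_equivalents):
    """Return a dict mapping each inflected form to its Hindi equivalent."""
    return {stem + suffix: hindi for suffix, hindi in suffix_equivalents.items()}


def lookup(form, paradigm):
    """Return the Hindi equivalent of an inflected form, or None if unlisted."""
    return paradigm.get(form)


PARADIGM = build_paradigm(STEM, SUFFIX_EQUIVALENTS)

if __name__ == "__main__":
    # koVttAdu is mentioned in the abstract without an equivalent,
    # so the sketch reports it as unlisted (None).
    for form in ("koVttAnu", "koVttanu", "koVttAdu"):
        print(form, "->", lookup(form, PARADIGM))
```

A full paradigm dictionary would need many more endings per verb; the sketch
only shows why storing the root koVttu alone is not enough to recover the
equivalents of its inflected forms.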

Because this work belongs to computational lexicography, the purpose of the
study is to make these inflected forms and their equivalents available to the
user by presenting the observations programmatically.

The first chapter explains the following concepts:
1.	What is computational linguistics/computational lexicography
2.	Aims of the study and its computational perspective
3.	Evolution of computational lexicography (International and Indian scenario)
4.	Limitations of computational lexicography

The second chapter deals with the following things:
1.	How the data was collected
2.	Criteria for data collection
3.	How the data was analyzed
4.	Difficulties faced during analysis

The third chapter explains, or rather gives a clear picture of, how the data
was analyzed. The fourth chapter deals with the computational implementation of
the analyzed data. The fifth chapter is the concluding chapter and explains how
this study can be continued further. An appendix follows the last chapter and
the bibliography.
