[Lexicog] Glossary template

Mon Jan 11 19:56:05 UTC 2010

I've begun work on a new project to produce a template for a bilingual or
trilingual glossary. A glossary is also sometimes called a "vocabulary."
Usually a trilingual glossary/vocabulary is presented in a column format,
each language being presented in a column and each lexeme presented in a
row. A trilingual glossary would have three sections. Assuming the three
languages are the vernacular (V), analysis language 1 (AL1), and analysis
language 2 (AL2), the three sections would be V-AL1-AL2 (in three-column
format), AL1-V (or AL1-V-AL2), and AL2-V (or AL2-V-AL1). Usually such
glossaries are limited to single words.
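
As a rough illustration of the three-section layout described above, here is a
minimal Python sketch. The names Entry, build_sections, and format_rows are
made up for this example, the vernacular forms are invented placeholders (not
real words), and the sort is plain code-point order, which a real vernacular
would probably need to replace with its own alphabetical order.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    v: str    # vernacular (V)
    al1: str  # analysis language 1 (AL1)
    al2: str  # analysis language 2 (AL2)


def build_sections(entries):
    """Return the three sections, each sorted by its first column."""
    main = sorted((e.v, e.al1, e.al2) for e in entries)      # V-AL1-AL2
    index1 = sorted((e.al1, e.v, e.al2) for e in entries)    # AL1-V-AL2
    index2 = sorted((e.al2, e.v, e.al1) for e in entries)    # AL2-V-AL1
    return main, index1, index2


def format_rows(rows, width=14):
    """Lay out each row in fixed-width columns, one lexeme per line."""
    return "\n".join(
        "".join(cell.ljust(width) for cell in row).rstrip() for row in rows
    )


# Placeholder vernacular forms; English and Tagalog as the analysis languages.
sample = [
    Entry("mivu", "water", "tubig"),
    Entry("kala", "house", "bahay"),
    Entry("sato", "dog", "aso"),
]

main, index1, index2 = build_sections(sample)
for title, rows in (("V-AL1-AL2", main), ("AL1-V-AL2", index1), ("AL2-V-AL1", index2)):
    print(title)
    print(format_rows(rows))
    print()
```

Each section is the same set of rows with the columns reordered and re-sorted,
so one database can feed all three.
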

If you would like to collaborate in the production of the template, I would
appreciate the help. I especially need help in picking words to include,
ranking them by frequency and universal usefulness, giving brief definitions
(and possibly example sentences) in order to clarify which sense of the word
is intended, and translating the list of words into other major languages.

If anyone has published a bilingual or trilingual glossary, we could use it
to help produce the template. If you would be willing to let us use your
database, we could harvest the information in it. For instance, I have an
electronic copy of a Maguindanaon-English-Tagalog glossary. We could harvest
the English-Tagalog correspondences from it. Someone may know of other
(online?) resources for setting up translation equivalences between major
languages.

I am proposing the following:
1. We call the project "Glost" (Glossary Template). Other suggestions are
welcome.
2. We maintain the database in Toolbox. Toolbox is free. It runs on just
about any computer. It supports Unicode and therefore just about any script.
It can handle odd databases such as the one we would be developing.
3. We create a database of perhaps 5,000 entries. We rank the entries by
frequency and general usefulness. The end user would decide how many entries
he wants and pick which entries he wants. For instance, he might start by
translating the 3,000 most frequent words in English and then pick 2,500 of
them to include in the publication.
4. We provide translations in multiple major languages so that the end user
can pick which languages he wants in his publication.
5. We index the entries to a list of English words (for convenience's sake and
for general usefulness). However, the end user could pick any of the major
languages for his publication, including or excluding English. It would not
be necessary for the end user to know English.
6. We index the entries to the DDP list of semantic domains. This would
enable the end user to sort the entries by domain, making it easier to
translate the words. It would also enable the end user to include a
thesaurus as one section of the publication.
7. The project is open source and freeware, meaning that there are no
copyright restrictions on its use, it is freely distributed, and others can
adapt it as needed. For instance we might want regional variants (for
Africa, Asia, South America, etc.).
8. We develop easy publication paths, perhaps using Lexique Pro. This might
include tools for checking the spelling of the vernacular words. (I know
some of you computer guys already have such tools.)
9. We provide consultant help to set up a project for the end user and
prepare the document for printing and publication, so that the end user can
produce and publish the document quickly and easily. For
instance SIL might be willing to assign someone in a particular country to
be the consultant and to help produce these glossaries for each language in
the country.
10. We use a "shell book" approach for the front and back matter. Each
country might need its own consultant to develop front and back matter
appropriate to the country. For instance the copyright requirements for the
published glossary will vary from country to country. Therefore the
copyright page would need to be designed for a particular country.
11. We develop easy ways to convert the glossary database into a lexical
database that can be used in Toolbox, FieldWorks, and other software to
further develop the data into a full dictionary.
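
To make points 3, 6, and 11 a little more concrete, here is a minimal Python
sketch of ranking and selecting entries, grouping them by semantic domain, and
writing a record in standard format. Everything in it is an assumption for
illustration: the TemplateEntry fields, the rank values, and the domain labels
are placeholders (not actual DDP domain numbers), and the markers \lx, \ge,
\gn, and \sd follow the common MDF convention, which a given Toolbox project
may or may not use.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class TemplateEntry:
    english: str          # index word (point 5)
    rank: int             # 1 = most frequent/useful; hypothetical ranking
    domain: str           # placeholder label, not a real DDP domain number
    glosses: dict = field(default_factory=dict)  # language code -> translation
    vernacular: str = ""  # filled in later by the end user


def select_top(entries, n):
    """Point 3: keep the n best-ranked entries."""
    return sorted(entries, key=lambda e: e.rank)[:n]


def group_by_domain(entries):
    """Point 6: group entries by semantic domain for translation or a thesaurus."""
    groups = defaultdict(list)
    for e in entries:
        groups[e.domain].append(e)
    return dict(groups)


def to_sfm(entry, national="tl"):
    """Point 11: one standard-format record (MDF-style markers assumed)."""
    lines = [f"\\lx {entry.vernacular}", f"\\ge {entry.english}"]
    if national in entry.glosses:
        lines.append(f"\\gn {entry.glosses[national]}")
    lines.append(f"\\sd {entry.domain}")
    return "\n".join(lines)


template = [
    TemplateEntry("cat", 4, "animal", {"tl": "pusa"}),
    TemplateEntry("water", 1, "water", {"tl": "tubig"}),
    TemplateEntry("dog", 3, "animal", {"tl": "aso"}),
    TemplateEntry("house", 2, "building", {"tl": "bahay"}),
]

chosen = select_top(template, 3)
by_domain = group_by_domain(chosen)
chosen[0].vernacular = "mivu"  # invented placeholder form, not a real word
print(to_sfm(chosen[0]))
```

Keeping the rank and the domain on every entry means the same database can be
cut down to any size and sorted either alphabetically or by domain.
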

Here is how the system might work. A potential end user contacts us, asking
for help in producing a glossary for language X. One of us meets with him
(or communicates via email) to help him set up the project. We help him
decide which languages to include, how many entries, how many copies, etc.
We extract the analysis languages from the database and send him a file (or
printout). He translates the words into language X. He checks the spelling
and semantic equivalents between languages to ensure accuracy. He sends us
the file (or printout). We prepare the document for publication and send him
a file ready for printing. (Alternatively, we produce the camera-ready copy.)
He takes it to a printer and gets it printed.
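
The round trip described above (we send a file, the end user fills in
language X, and the file comes back) could be handled along these lines. This
is only a sketch under assumed conventions: the CSV layout, the "id" and
"vernacular" column names, and the function names export_worksheet and
merge_returned are all hypothetical, and the vernacular form is an invented
placeholder.

```python
import csv
import io


def export_worksheet(rows, languages):
    """Write the chosen analysis languages plus an empty vernacular column."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["id", *languages, "vernacular"])
    for row in rows:
        writer.writerow([row["id"], *(row[lang] for lang in languages), ""])
    return buf.getvalue()


def merge_returned(text):
    """Split returned rows into completed entries and ids still untranslated."""
    done, missing = [], []
    for row in csv.DictReader(io.StringIO(text)):
        if row["vernacular"].strip():
            done.append(row)
        else:
            missing.append(row["id"])
    return done, missing


database = [
    {"id": "0001", "en": "water", "tl": "tubig"},
    {"id": "0002", "en": "house", "tl": "bahay"},
]

worksheet = export_worksheet(database, ["en", "tl"])

# Simulate the end user filling in one row before sending the file back.
returned = worksheet.replace("0001,water,tubig,", "0001,water,tubig,mivu")

done, missing = merge_returned(returned)
print(len(done), "translated;", "still missing:", missing)
```

Reporting the untranslated ids gives the consultant a simple checklist to go
over with the end user before the document is prepared for printing.
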

I know of a case where a state government was willing to fund, supervise,
support, and publish such glossaries for all eight languages in the state.
All they needed was someone to provide the technical expertise. If we work
out the system, the technical requirements will be greatly reduced for the
end user (and consultant).

Please let me know if you can help. Thanks.

Ron Moe
