6.291 FYI: Software for an experimental MT system

The Linguist List linguist at tam2000.tamu.edu
Thu Feb 23 20:01:57 UTC 1995


----------------------------------------------------------------------
LINGUIST List:  Vol-6-291. Thu 23 Feb 1995. ISSN: 1068-4875. Lines: 205
 
Subject: 6.291 FYI: Software for an experimental MT system
 
Moderators: Anthony Rodrigues Aristar: Texas A&M U. <aristar at tam2000.tamu.edu>
            Helen Dry: Eastern Michigan U. <hdry at emunix.emich.edu>
 
Asst. Editors: Ron Reck <rreck at emunix.emich.edu>
               Ann Dizdar <dizdar at tam2000.tamu.edu>
               Ljuba Veselinova <lveselin at emunix.emich.edu>
 
-------------------------Directory-------------------------------------
 
1)
Date: Wed, 22 Feb 1995 15:38:09 +0100
From: Wilhelm Weisweber (ww at cs.tu-berlin.de)
Subject: Software for an experimental MT system
 
-------------------------Messages--------------------------------------
1)
 
Hi subscribers of LINGUIST List,
 
Information about the experimental MT system of the KIT-FAST project at
the Technical University of Berlin, together with the software itself,
is now available via WWW and FTP. The information below is available via
 
    WWW: http://www.cs.tu-berlin.de/~ww/mtsystem.html
 
This WWW document contains all the hypertext links needed to obtain the
software, documentation and further information. The experimental MT
system is implemented in Prolog and runs on AT-compatible PCs as well as
Sun workstations (see below).
 
The experimental MT System of the Project KIT-FAST
==================================================
 
An experimental MT system has been developed and implemented by the project
FAST within the project group KIT. The transfer-based system translates German
texts into English sentence by sentence. The translation of a sentence
consists of morphological, syntactic, semantic and conceptual analysis,
transfer, generation and morphological synthesis. The semantic and conceptual
analysis, the transfer and the generation are all realized by a single
algorithm based on term-rewriting (known from the automated proving of
equations). The MT system also includes a module for the evaluation of
anaphoric relations in the source language and the KL-ONE-based knowledge
representation system BACK. The BACK system represents background knowledge
in its TBox and the text content in its ABox. The evaluation algorithm uses
the representation of the text content to check the semantic consistency of
possible antecedents for anaphoric pronouns. This factor and others are
defined as parameters of the evaluation algorithm.
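
To illustrate what rewriting a term means, here is a minimal sketch in
Python (not the Prolog of the actual system). Everything in it is an
assumption made for illustration: the tuple encoding of terms, the "?"
prefix for variables, and the three toy rules are invented and are not taken
from the KIT-FAST rule sets. It always applies the first matching rule, so it
does not explore alternative rewrite sequences.

```python
# Terms are nested tuples (functor, arg1, ...); atoms are plain strings.
# Strings starting with "?" are pattern variables (a hypothetical convention).

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def match(pattern, term, binding):
    """Return an extended binding if pattern matches term, else None."""
    if is_var(pattern):
        if pattern in binding:
            return binding if binding[pattern] == term else None
        return {**binding, pattern: term}
    if isinstance(pattern, tuple) and isinstance(term, tuple):
        if len(pattern) != len(term):
            return None
        for p, t in zip(pattern, term):
            binding = match(p, t, binding)
            if binding is None:
                return None
        return binding
    return binding if pattern == term else None

def substitute(template, binding):
    """Instantiate a rule's right-hand side with the matched bindings."""
    if is_var(template):
        return binding[template]
    if isinstance(template, tuple):
        return tuple(substitute(x, binding) for x in template)
    return template

def rewrite_once(term, rules):
    """Rewrite at the root if possible, else in the leftmost argument.
    Returns the new term, or None if no rule applies anywhere."""
    for lhs, rhs in rules:
        b = match(lhs, term, {})
        if b is not None:
            return substitute(rhs, b)
    if isinstance(term, tuple):
        for i in range(1, len(term)):  # index 0 is the functor, skip it
            new = rewrite_once(term[i], rules)
            if new is not None:
                return term[:i] + (new,) + term[i + 1:]
    return None

def normalize(term, rules, max_steps=1000):
    """Rewrite until no rule applies (a normal form) or give up."""
    for _ in range(max_steps):
        new = rewrite_once(term, rules)
        if new is None:
            return term
        term = new
    raise RuntimeError("no normal form within max_steps")

# Toy German -> English transfer rules, invented for this example.
RULES = [
    (("n", "haus"), ("n", "house")),
    (("det", "das"), ("det", "the")),
    (("pp", ("p", "von"), "?np"), ("pp", ("p", "of"), "?np")),
]

result = normalize(("np", ("det", "das"), ("n", "haus")), RULES)
```

With the toy rules above, the noun phrase term is rewritten step by step
until no rule applies any more.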
 
The components of the MT system
+++++++++++++++++++++++++++++++
 
 o morphological analyser based on the SUTRA system
 o GPSG parser for direct interpretation of ID rules, LP statements and
   metarules
 o term-rewrite rule interpreter for semantic and conceptual analysis,
   transfer and generation
 o morphological synthesizer based on the SUTRA system
 o module for the evaluation of anaphoric relations
 o the knowledge representation system BACK
 o tools for the development of lexicons, grammars and term-rewrite systems
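
For readers unfamiliar with the ID/LP format used in GPSG, the following
Python sketch shows what the two rule types express: an ID (immediate
dominance) rule lists the daughters of a category without fixing their
order, and LP (linear precedence) statements restrict the permitted orders.
The rule and the LP statement below are invented examples, and enumerating
permutations is only an illustration; it says nothing about how the parser
in the MT system interprets such rules directly.

```python
from itertools import permutations

def linearizations(daughters, lp_statements):
    """Yield every ordering of `daughters` that violates no LP statement.

    An LP statement (a, b) means: a must precede b whenever both occur.
    Assumes the daughters of one rule are pairwise distinct categories.
    """
    for order in permutations(daughters):
        position = {cat: i for i, cat in enumerate(order)}
        if all(position[a] < position[b]
               for a, b in lp_statements
               if a in position and b in position):
            yield order

# Hypothetical ID rule  VP -> V, NP, PP  and LP statement  NP < PP.
id_rule = ("VP", ["V", "NP", "PP"])
lp = [("NP", "PP")]

orders = list(linearizations(id_rule[1], lp))
```

Of the six possible orders of the three daughters, only those in which NP
comes before PP remain.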
 
Linguistic Data
+++++++++++++++
 
Linguistic data was developed in order to translate one German text, "The
Proposal of the European Commission for the ESPRIT Programme". About 100
sentences have been tested successfully with the MT system. The linguistic
data comprises:
 
 o a German grammar (GPSG):
    - 22 main categories, 34 features
    - 22 aliases
    - 76 ID rules
    - 23 LP statements
    - 5 metarules
    - 23 FCRs
    - 265 lexical entries (stem forms)
 o 134 term-rewrite rules for semantic analysis (German)
 o 37 term-rewrite rules for conceptual analysis (German)
 o 248 term-rewrite rules for transfer (German --> English)
 o 182 term-rewrite rules for generation (English)
 o 8 factors for the evaluation of anaphoric relations in German:
    1. agreement
    2. binding
    3. proximity
    4. preference for the semantic subject
    5. topic preference
    6. identity of roles
    7. negative preference for free adjuncts
    8. conceptual consistency
 o the predefined background knowledge comprises selectional restrictions
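
The announcement names the eight factors but does not say how they are
combined. One possible reading, sketched here in Python purely as an
assumption, is a weighted sum per candidate antecedent. The weights, the
0/1 factor values and the two candidates are all hypothetical; only the
factor names come from the list above.

```python
# Hypothetical weights; the negative weight models the "negative preference
# for free adjuncts". None of these numbers come from the MT system.
WEIGHTS = {
    "agreement": 5,
    "binding": 5,
    "proximity": 1,
    "semantic_subject": 2,
    "topic": 2,
    "role_identity": 1,
    "free_adjunct": -2,
    "conceptual_consistency": 4,
}

def score(factor_values, weights=WEIGHTS):
    """Weighted sum over the factors present for one candidate."""
    return sum(weights[name] * value for name, value in factor_values.items())

def rank(candidates, weights=WEIGHTS):
    """Return candidate names ordered from best to worst score."""
    return sorted(candidates,
                  key=lambda name: score(candidates[name], weights),
                  reverse=True)

# Two invented candidate antecedents for a feminine pronoun such as "sie".
candidates = {
    "Kommission": {"agreement": 1, "binding": 1, "proximity": 0,
                   "semantic_subject": 1, "conceptual_consistency": 1},
    "Programm": {"agreement": 0, "binding": 1, "proximity": 1,
                 "free_adjunct": 1, "conceptual_consistency": 1},
}

ranking = rank(candidates)
```

Because the factors are parameters, changing the weights in such a scheme
changes which antecedent is preferred without touching the rest of the code.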
 
Implementation
++++++++++++++
 
The MT system is implemented in Quintus-Prolog 3.1 (commercial software) and
SWI-Prolog 1.9.5 (public domain software). Both Prolog dialects run on Sun
workstations under SunOS and on AT-compatible PCs under DOS (Windows 3.1).
The MT system has been tested with Quintus- and SWI-Prolog under SunOS and
with SWI-Prolog under Windows 3.1; it needs about 10 MB of hard disk space.
 
In order to get the software for the MT system running on AT compatible PCs
under DOS (Windows 3.1) see http://www.cs.tu-berlin.de/~ww/mtdos.html.
 
If you are interested in receiving the software for the MT system for Sun
workstations under SunOS see http://www.cs.tu-berlin.de/~ww/mtsun.html.
 
Documents related to the MT system
++++++++++++++++++++++++++++++++++
 
 o Birte Schmitz, Susanne Preuß, Christa Hauenschild
   "Textrepräsentation und Hintergrundwissen für die Anaphernresolution im
   Maschinellen Übersetzungssystem KIT-FAST"
   KIT-Report 93, Institute for Software and Theoretical CS, Technical
   University of Berlin 1992 and in: M. Kohrt, Ch. Küper (eds.), "Probleme der
   Übersetzungswissenschaft", Working Papers in Linguistics, Department for
   Linguistics, Technical University of Berlin 1991, p. 39-81
 o Christa Hauenschild
   "Anapherninterpretation in der Maschinellen Übersetzung"
   KIT-Report 94, Institute for Software and Theoretical CS, Technical
   University of Berlin 1992 and Zeitschrift für Literaturwissenschaft und
   Linguistik 84 (1991), Vandenhoeck & Ruprecht, p. 50-66
 o Susanne Preuß, Birte Schmitz, Christa Hauenschild
   "Anaphora Resolution Based on Semantic and Conceptual Knowledge"
   in: Susanne Preuß, Birte Schmitz, "Workshop on Textrepresentation and
   Domain Modelling - Ideas from Linguistics and AI", KIT-Report 97, Institute
   for Software and Theoretical CS, Technical University of Berlin 1992, p.
   1-13
 o Wilhelm Weisweber
   "Transfer in Machine Translation by Non-Confluent Term-Rewrite Systems"
   Proceedings of the GWAI-89, Eringerfeld 1989, p. 264-269
 o Wilhelm Weisweber, Christa Hauenschild
   "A Model of Multi-Level Transfer for Machine Translation and Its Partial
   Realization"
   KIT-Report 77, Institute for Software and Theoretical CS, Technical
   University of Berlin 1990 and to appear in: Proceedings of the Seminar
   "Computers & Translation '89", Tiflis 1989
 o Wilhelm Weisweber
   "Term-Rewriting as a Basis for a Uniform Architecture in Machine
   Translation"
   Proceedings of the Coling-92, Nantes 1992, p. 777-783 and extended version
   in KIT-Report 101, Institute for Software and Theoretical CS, Technical
   University of Berlin 1992
 o Christa Hauenschild, Stephan Busemann
   "A constructive Version of GPSG for Machine Translation"
   in: Erich Steiner, Paul Schmidt, Cornelia Zellinsky-Wibbelt (eds.), "From
   Syntax to Semantics - Insights from Machine Translation", Frances Pinter,
   London 1988, p. 216-238
 o Wilhelm Weisweber
   "Ein Dominanz-Chart-Parser für generalisierte Phrasenstrukturgrammatiken"
   KIT-Report 45, Institute for Software and Theoretical CS, Technical
   University of Berlin 1987
 o Wilhelm Weisweber, Susanne Preuß
   "Direct Parsing with Metarules"
   Proceedings of the Coling-92, Nantes 1992, p. 1111-1115 and extended
   version in KIT-Report 102, Institute for Software and Theoretical CS,
   Technical University of Berlin 1992
 o Wilhelm Weisweber
   "Termersetzung als Basis für eine einheitliche Architektur in der
   maschinellen Sprachübersetzung"
   Sprache und Information Band 28, Niemeyer, Tübingen 1994
 o Wilhelm Weisweber
   "The experimental MT System of the Project KIT-FAST"
   Proceedings of the International Conference "Machine Translation: Ten Years
   On", Cranfield 1994, p. 12.1-12.19
 
User and System documentation:
 
 o Wilhelm Weisweber
   "Implementierungs- und Benutzerhandbuch des experimentellen Berliner
   MÜ-Systems"
   KIT-Report 116, Institute for Software and Theoretical CS, Technical
   University of Berlin 1994
 
The list of available KIT reports can be found at
http://www.cs.tu-berlin.de/~kit/reportliste/kitlistehtml.html.
 
Further Information
+++++++++++++++++++
 
Wilhelm Weisweber
Technical University of Berlin
Department of Computer Sciences
Institute for Software and Theoretical Computer Sciences (ISTI)
Functional and Logic Programming (FLP)
Sekr.: FR 6-10
Franklinstr. 28/29
D-10587 Berlin-Charlottenburg
Federal Republic of Germany
 
Phone: +49-30-314-73608
Fax: +49-30-314-73622
E-mail: ww at cs.tu-berlin.de
WWW: http://www.cs.tu-berlin.de/~ww/
 
--------------------------------------------------------------------------
LINGUIST List: Vol-6-291.


