[Corpora-List] term extraction info

KERREMANS, Koen Koen.Kerremans at ehb.be
Fri Oct 4 07:36:46 UTC 2002


Hello,

These are the references I got in answer to my question concerning "term
extraction" (see below). Each reference is preceded by the name of the
person who gave me the information. Feel free to add more info to this list.

Regards,

Koen Kerremans

1. Books:

-Jerome Richalot: Pearson, J. (1998). Terms in Context. John Benjamins
Publishing Company.

2. Articles:

-Piklu Gupta: Heid, U. (1999). "Extracting terminologically relevant
collocations from German technical texts". [search via Google]
-Klaus Fleischmann: L'Homme, Benali, Bertrand, Laudique. (1996). "Definition
of an evaluation grid for term extraction software". In Terminology 3:2.
Benjamins Publishing Co.
-Chantal Enguehard: Enguehard, C., Pantéra, L., "Automatic Natural
Acquisition of a Terminology", Journal of quantitative linguistics, vol.2,
n°1, pp.27-32, 1995.
-Chantal Enguehard: C. Enguehard, B. Daille, E. Morin, “Tools for
Terminology Processing”, The Indo-European Conference on Multilingual
Communications Technologies (IEMCT), R. K. Arora, M. Kulkarni, H. Darbari
(editors), Tata McGraw-Hill, pp.218-229, Pune, India, June 2002.

3. Websites:

-Johan Haller: http://www.iai.uni-sb.de/de/pub.html
-John Kohl: http://www.xplanation.com [xplanation has a term-extraction tool
that is part of the MT system that this company uses. It is pretty good at
identifying noun phrases. They are located in Leuven. They also have
controlled-English software]
-Ross Smith: http://www.mkms.xerox.com [XEROX have a terminology management
program called XTS which contains an extraction function]
-François Rousselot:
http://www-ensais.u-strasbg.fr/liia/LIIA_Products_Installers/install.htm
[this tool is based on repeated segments: some brief English
documentation is included with the program]
-Scheiden: http://www.biomath.jussieu.fr/ATALA/outil/ [section "Extraction
de termes"]
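
A note on the "repeated segments" idea mentioned for François Rousselot's
tool: the sketch below only illustrates the general notion, namely collecting
word sequences that occur more than once in a text. It is not the algorithm
of that tool, which I have not seen. The function name, the parameters and
the sample sentence are all made up for illustration.

```python
from collections import Counter

def repeated_segments(tokens, min_len=2, max_len=4, min_count=2):
    """Return word n-grams (as tuples) occurring at least min_count times.

    Hypothetical helper for illustration only; real tools usually add
    stop-word filtering and remove segments nested in longer ones.
    """
    counts = Counter()
    for n in range(min_len, max_len + 1):
        # Slide a window of n tokens over the text and count each segment.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return {seg: c for seg, c in counts.items() if c >= min_count}

# Made-up sample text, already lowercased and split on whitespace.
tokens = "term extraction tools compare term extraction methods".split()
segments = repeated_segments(tokens)
print(segments)
```

On this sample the only repeated segment is the pair "term extraction".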

4. Notes:

-Sabine Kirchmeier-Andersen
(http://www.id.cbs.dk/medarbej/ska/sabine_da.shtml) recommends WordSmith
Tools and Quirk, both of which can use an LGP corpus to automatically
identify frequent LSP candidates. She thinks the latest articles by
Beatrice Daille et al. about term extraction describe the most efficient
methodology.
-Antal van den Bosch (http://ilk.kub.nl/~antalb/) experimented with a
memory-based shallow parser, after which he extracted terms using the
tf*idf statistical measure.
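
For readers unfamiliar with tf*idf, here is a minimal sketch of the weighting
itself. It assumes the common variant (relative term frequency times the
natural log of N/df) and plain whitespace tokens; it does not reproduce
Antal van den Bosch's setup, and the shallow-parsing step is left out
entirely. The function name and the three sample "documents" are invented
for illustration.

```python
import math
from collections import Counter

def tfidf_scores(documents):
    """Return one {term: tf*idf} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(documents)
    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for doc in documents:
        df.update(set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

# Invented sample documents, already lowercased and tokenized.
docs = [
    "the parser extracts noun phrases from the text".split(),
    "the corpus contains the technical text".split(),
    "the term extraction tool ranks the candidate terms".split(),
]
scores = tfidf_scores(docs)
# Rank the terms of the first document, highest weight first.
ranked = sorted(scores[0].items(), key=lambda kv: kv[1], reverse=True)
print(ranked[:3])
```

A word that occurs in every document (here "the") gets weight 0, while words
confined to a single document score highest, which is why the measure is
used to rank term candidates.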

> -----Original Message-----
> Hello,
>
> Does anyone know of good term extraction tools/methods? My purpose is to
> compare some of the existing methodologies to one another and to evaluate
> their performances on domain-specific texts. Good references or surveys of
> term extraction tools/methods are welcome as well.
>
> Regards,
>
> Koen Kerremans
>


