19.1824, Qs: Representing Numerical Expressions

Sat Jun 7 19:18:32 UTC 2008

LINGUIST List: Vol-19-1824. Sat Jun 07 2008. ISSN: 1068 - 4875.

Subject: 19.1824, Qs: Representing Numerical Expressions

Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>
Reviews: Randall Eggert, U of Utah  
         <reviews at linguistlist.org> 

Homepage: http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, 
and donations from subscribers and publishers.

Editor for this issue: Catherine Adams <catherin at linguistlist.org>

We'd like to remind readers that the responses to queries are usually
best posted to the individual asking the question. That individual is
then strongly encouraged to post a summary to the list. This policy was
instituted to help control the huge volume of mail on LINGUIST; so we
would appreciate your cooperating with it whenever it seems appropriate.

In addition to posting a summary, we'd like to remind people that it
is usually a good idea to personally thank those individuals who have
taken the trouble to respond to the query.


Date: 06-Jun-2008
From: Sandra Williams < s.h.williams at open.ac.uk >
Subject: Representing Numerical Expressions


-------------------------Message 1 ---------------------------------- 
Date: Sat, 07 Jun 2008 15:17:17
From: Sandra Williams [s.h.williams at open.ac.uk]
Subject: Representing Numerical Expressions

Knowledge Representations for Numerical Data in Generation and Information
Extraction

I am currently working on generating numerical expressions in English,
especially on producing variations in proportions (decimals, fractions,
percentages, etc.). I am collecting a corpus of sets of newspaper and
magazine articles that report on the same underlying numerical data (for
instance, a dozen articles about the recent study on the decrease in the
puffin population on the Isle of May in Scotland). I am investigating how
different authors vary when expressing the same underlying numerical
information, e.g. 'The puffin population off Scotland's East coast has
dropped by nearly a third in less than five years'. If anyone knows of
similar research on numerical expressions, I would be grateful to hear of
it (I am aware of Veronique Moriceau's work on generating numerical answers
in a question-answering system).
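
To make the kind of variation concrete, here is a minimal sketch of how one
underlying proportion might be rendered as a decimal, a percentage, and a
simple fraction. It is only an illustration under stated assumptions: the
function name render_variants, the max_denominator parameter, and the value
0.31 are hypothetical and are not taken from the puffin study or from any
existing system.

```python
from fractions import Fraction


def render_variants(p: Fraction, max_denominator: int = 4) -> dict:
    """Render one proportion in three surface forms (hypothetical helper)."""
    # Closest fraction with a small denominator, e.g. 31/100 -> 1/3.
    approx = p.limit_denominator(max_denominator)
    return {
        "decimal": f"{float(p):.2f}",
        "percentage": f"{float(p * 100):.0f}%",
        "fraction": f"{approx.numerator}/{approx.denominator}",
    }


# Illustrative value only, not a figure from any of the articles.
variants = render_variants(Fraction(31, 100))
```

A generator would still have to choose a hedge such as 'nearly' or 'about'
for the rounded fraction; that choice is not modelled here.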

My main question, however, concerns knowledge representations for basic
numerical data, such as would be necessary for Information Extraction or
Natural Language Generation. Does anyone know of research on representing
proportions, e.g. as the cardinalities of two sets of entities together
with their associated time and location information? In the sentence above,
presumably one set of puffins is located off the East coast of Scotland at
time T1 and another set is located in the same place at time T2 (less than
five years after T1), with the ratio of the cardinality at T1 to the
cardinality at T2 approximately equal to 3 to 2.
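
As a rough illustration of the two-set representation described above, the
following sketch stores each set with its entity type, location, time point,
and cardinality, and derives the ratio from the pair. The class and field
names (EntitySet, Proportion, and so on) are hypothetical, and the counts
are invented solely to produce a 3 to 2 ratio; they are not survey data.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class EntitySet:
    """A set of entities at one place and time (hypothetical schema)."""
    entity_type: str   # e.g. "puffin"
    location: str      # e.g. "East coast of Scotland"
    time: str          # symbolic time point, e.g. "T1"
    cardinality: int   # number of entities in the set


@dataclass(frozen=True)
class Proportion:
    """Relates an earlier set to a later set of the same entity type."""
    earlier: EntitySet
    later: EntitySet

    def ratio(self) -> Fraction:
        # Exact ratio of the earlier cardinality to the later one.
        return Fraction(self.earlier.cardinality, self.later.cardinality)

    def relative_change(self) -> Fraction:
        # Negative values mean a decrease from the earlier to the later set.
        return Fraction(
            self.later.cardinality - self.earlier.cardinality,
            self.earlier.cardinality,
        )


# Invented counts, chosen only so that the ratio comes out as 3 to 2.
t1 = EntitySet("puffin", "East coast of Scotland", "T1", 30000)
t2 = EntitySet("puffin", "East coast of Scotland", "T2", 20000)
drop = Proportion(t1, t2)
```

With these counts, drop.ratio() is 3/2 and drop.relative_change() is -1/3,
which corresponds to 'dropped by a third'. The time constraint (T2 less than
five years after T1) is left as a symbolic label here rather than modelled.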

If you know of Information Extraction systems, NLG systems, ontologies, or
linguistic studies that would be relevant to my research, please do let me
know.

Thank you for your help. I will post a summary of answers. 

Linguistic Field(s): Computational Linguistics
                     Text/Corpus Linguistics
