[Corpora-List] Short message service (SMS) corpus publicly available

Min-Yen Kan kanmy at comp.nus.edu.sg
Mon May 3 12:46:50 UTC 2004


Dear researchers:

	We are pleased to make publicly available a small corpus of short
message service (SMS) messages.  

**** National University of Singapore Short Message Service Corpus ****

These messages were collected and used in a final-year undergraduate project
analyzing the efficiency of SMS input.  The corpus contains messages mostly
in English.  The message contributors were mainly university students in
Singapore.

Over 10,000 messages were collected, representing over 100 different users.
The corpus is made available under a modified Open Directory Project
license.  Please see the webpage for the corpus for more details.  More
comprehensive documentation on the (on-going) project will be made available
as time and demand allow.

http://www.comp.nus.edu.sg/~rpnlpir/downloads/corpora/smsCorpus/

We hope the community will find this corpus useful as a small benchmark for
gauging the efficiency of SMS message entry as well as for SMS / chat log
language analysis.  These messages are provided as an XML file that
validates against a document-internal DTD.
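
For a first look at an XML file of this kind, a minimal Python sketch might
simply count how often each element occurs. Nothing here is taken from the
corpus distribution: the sample document and its element names (corpus,
message) are hypothetical stand-ins, and the real file defines its own names
in its internal DTD. Note also that the standard-library parser used below
reads past the internal DTD but does not validate against it; a validating
parser would be needed for that check.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_counts(xml_text: str) -> Counter:
    """Count how often each element tag occurs in an XML document."""
    root = ET.fromstring(xml_text)
    # root.iter() walks the root element and all of its descendants.
    return Counter(elem.tag for elem in root.iter())


# Hypothetical stand-in document with an internal DTD subset.
# The element names are assumptions, not the corpus's actual schema.
SAMPLE = """<!DOCTYPE corpus [
  <!ELEMENT corpus (message*)>
  <!ELEMENT message (#PCDATA)>
]>
<corpus>
  <message>see you at 8</message>
  <message>running late</message>
</corpus>"""

counts = tag_counts(SAMPLE)
for tag, n in counts.items():
    print(tag, n)
```

To apply the same idea to the downloaded file, the text would be read from
disk and passed to tag_counts in place of SAMPLE.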

Regards,

Min-Yen KAN
Assistant Professor
Department of Computer Science, School of Computing
National University of Singapore, Singapore 117543
Office: S15-05-05
Tel: ++ (65) 6874-1885
Fax: ++ (65) 6779-4580
kanmy at comp.nus.edu.sg
http://www.comp.nus.edu.sg/~kanmy


