[Corpora-List] SMS corpora?

Yunqing Xia yqxia at tsinghua.edu.cn
Sun Sep 4 04:04:30 UTC 2011


Hi Yorick,

We have been working on short Chinese chat message analysis for a
while.  The links below should be useful for finding information about
English SMS corpora and related work.

NUS SMS Corpus and related work
http://www.comp.nus.edu.sg/~rpnlpir/downloads/corpora/smsCorpus/

SMS Spam Corpus v.0.1
http://www.esp.uem.es/jmgomez/smsspamcorpus/

PhD thesis: A corpus linguistics study of SMS text messaging
http://etheses.bham.ac.uk/253/1/Tagg09PhD.pdf

Jeunghyun Byun, Seung-Wook Lee, Young-In Song, Hae-Chang Rim. 2008.
Two Phase Model for SMS Text Messages Refinement. In Proceedings of
AAAI Workshop on Enhanced Messaging.
http://www.aaai.org/Papers/Workshops/2008/WS-08-04/WS08-04-002.pdf

AiTi Aw, Min Zhang, Juan Xiao, Jian Su, A phrase-based statistical
model for SMS text normalization, Proceedings of the COLING/ACL on
Main conference poster sessions, p.33-40, July 17-18, 2006, Sydney,
Australia

Catherine Kobus, François Yvon, Géraldine Damnati, Normalizing SMS:
are two metaphors better than one?, Proceedings of the 22nd
International Conference on Computational Linguistics, p.441-448,
August 18-22, 2008, Manchester, United Kingdom

Kam-Fai Wong, Yunqing Xia: Normalization of Chinese chat language.
Language Resources and Evaluation, Volume 42, Number 2:  219-242


yunqing

--
On 4 September 2011 06:05, Yorick Wilks <ywilks at ihmc.us> wrote:
> Is anyone aware of an easily obtained corpus of (semi!!)English SMS messages?
> I'd be grateful for pointers.
> Yorick Wilks
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>

