<span class="Apple-style-span" style="border-collapse: collapse; font-family: arial, sans-serif; font-size: 13px; ">Dear members of the corpora community:<br><br>We are seeking your help to enlarge a freely available public corpus<br>
of SMS messages. Over the last few months, at the National University<br>of Singapore (NUS), we have been working to collect a live corpus of<br>SMS (Short Message Service) messages. Previously, in 2004, we<br>made a corpus of messages (~10,000 messages in English, mostly<br>
from Singaporeans) available to the public for study.<br><br>We restarted the 2004 project last October, aiming to<br>enlarge the corpus in both depth and breadth. We are collecting<br>better demographic information, timestamps, recipient and sender<br>
identity (appropriately anonymized) and including this with the<br>corpus' messages. Up to now, we have collected over 21,000 new<br>English messages and 10,000 Chinese messages. Most messages are<br>tagged with metadata about the sender's profile (gender, age, country,<br>
years of using SMS, number of SMS sent daily, etc.). Because<br>collection is live and the corpus is growing, it is versioned and<br>released on a monthly basis, and is free for all communities to use. For<br>
detailed information about our corpus, please visit our NUS SMS Corpus<br>site at: <a href="http://wing.comp.nus.edu.sg:8080/SMSCorpus" target="_blank" style="color: rgb(51, 51, 204); ">http://wing.comp.nus.edu.sg:8080/SMSCorpus</a>.<br>
<br>We are writing to ask for your help, directly or indirectly,<br>in building this public resource. SMS<br>messages continue to be a vital and sensitive vehicle<br>for personal communication that many of us use on a daily basis. Until<br>
now, scholars have not had access to a large, freely available SMS<br>corpus to study, and most research on SMS has been done in<br>collaboration with private companies that have strict non-disclosure<br>agreements, making comparative SMS research impossible.<br>
<br>As SMS are potentially sensitive and identity-revealing, our<br>collection framework tries to anonymize sensitive data in messages,<br>such as telephone numbers, email addresses and other identifiers,<div>before accepting them into the corpus. This is a legitimate attempt to</div>
<div>collect and enlarge an SMS corpus for the public good, and if you are </div><div>concerned about the legitimacy of our project, please visit our webpage</div><div> first. Additionally, this study has been exempted from NUS' institutional </div>
<div>review board (IRB) panel for human studies protocols.<div><br>Such a public corpus needs your contribution, as most of us are<br>senders of SMS. With a larger base of contributors and a growing<br>number of messages archived, the corpus will grow in depth and utility<br>
to scholars everywhere.<br><br></div><div><br>Currently, there are three methods for you to contribute SMS messages<br>to the public corpus. Please refer to the "Contribution" page on our<br>project site at <a href="http://wing.comp.nus.edu.sg:8080/SMSCorpus/" target="_blank" style="color: rgb(51, 51, 204); ">http://wing.comp.nus.edu.sg:8080/SMSCorpus/</a> for<br>
detailed information. We summarize them below.<br><br>* Android phone owners - Please install our App "SMS Collection for<br>Corpus" from the Android market (authored by Web IR/ NLP Group @ NUS).<br>Follow the app's instructions to submit SMS to us. The software will<br>
create a draft message with your SMSes to send to us; you will have a<br>chance to censor or delete messages that you do not want to<br>contribute.<br><br>* Nokia phone owners - Please use Nokia PC Suite to export SMS as a CSV<br>
file. The PC Suite software is available from our project page. Then<br>send the file to <a href="mailto:SMS.Donation@gmail.com" target="_blank" style="color: rgb(51, 51, 204); ">SMS.Donation@gmail.com</a>.<br><br>* Owners of other phone brands - You can type your messages in the<br>
contribution site's web page. Alternatively, export your SMS as a file (e.g. a CSV<br>file) if you know of software that can do so, then send the file<br>to <a href="mailto:SMS.Donation@gmail.com" target="_blank" style="color: rgb(51, 51, 204); ">SMS.Donation@gmail.com</a>.<br>
<br>(We currently do not have an automated donation method for the iPhone, sorry!)</div></div><div><br></div><div><br></div><div><div>If you have any questions or suggestions, please feel free to contact</div><div>me. We sincerely appreciate your suggestions and contributions!</div>
</div></span><br>-- <br><div><span style="border-collapse:collapse;color:rgb(80, 0, 80);font-family:arial, sans-serif;font-size:13px">Tao CHEN</span></div><div><span style="border-collapse:collapse;color:rgb(80, 0, 80);font-family:arial, sans-serif;font-size:13px"><br>
</span></div><div><span style="border-collapse:collapse;color:rgb(80, 0, 80);font-family:arial, sans-serif;font-size:13px"><div><span style="border-collapse:collapse;color:rgb(80, 0, 80);font-family:arial, sans-serif;font-size:13px">PhD Candidate</span></div>
Web IR / NLP Group (WING), School of Computing</span><div><span style="border-collapse:collapse;color:rgb(80, 0, 80);font-family:arial, sans-serif;font-size:13px">National University of Singapore</span></div></div><br>