[Corpora-List] All English Text Messaging Corpus?
Rich Cooper
rich at englishlogickernel.com
Sat Apr 9 20:41:45 UTC 2011
Hi Laura,
I don't know of any text message sources exactly like what you are
describing. But there is a huge, partially structured text database of US
patent documents, nearly all in English I suppose, which have all been
critiqued by expert examiners and edited in the process of negotiating a
patent claim set. You can create databases of patent
documents on your desktop by downloading the free web client software Elk
for Patents (EfP), which is built on the English Logic Kernel (Elk), as
described in US Patent 7,209,923. The patent is posted on the web site as
well. It teaches ways to combine corpus analysis methods with relational
and object oriented database technologies. See my website to download and
try the free program.
EnglishLogicKernel dot com
One advantage of choosing the patent database is that every document is
constrained by the patenting process and by experts in each patent's specific
technologies, and its vocabulary of words is defined modus ponens after careful
debate and crafting of each claim sentence. For example, no really
effective syntax parser for English has reached widespread usage, with the
best of the performers being the Link Grammar Processor (LGP), IMHO. Using
the vocabulary of non-noise words defined in patent claims, the English
analyst can relate those claim words and phrases to specific objects as they
have been described by sentences in the much more verbose specification part
of the patent document. This provides an ideal, large, partially structured
database and processing environment in which to analyze the English of claim
language.
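As a rough illustration of relating claim words to specification sentences,
here is a minimal Python sketch. It is not how Elk or EfP works internally;
the noise-word list, the function names, and the sample claim and
specification text are all made up for the example, and the sentence
splitter is deliberately naive.

```python
import re

# Hypothetical, very small noise-word list; a real analysis would use a fuller one.
NOISE_WORDS = {
    "a", "an", "the", "of", "to", "and", "or", "in", "on", "for", "with",
    "by", "is", "are", "said", "wherein", "comprising", "that", "which",
}


def content_words(text):
    """Return the lower-cased non-noise words of text, in order of first appearance."""
    seen = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word not in NOISE_WORDS and word not in seen:
            seen.append(word)
    return seen


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def link_claim_to_spec(claim, specification):
    """Map each non-noise claim word to the specification sentences containing it."""
    sentences = split_sentences(specification)
    links = {}
    for word in content_words(claim):
        pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
        links[word] = [s for s in sentences if pattern.search(s)]
    return links


# Made-up example text, not taken from any real patent.
claim = "A fastener comprising a threaded shaft and a locking collar."
specification = (
    "The fastener shown in the first figure has a threaded shaft made of steel. "
    "A locking collar slides over the shaft. "
    "The collar prevents rotation."
)

links = link_claim_to_spec(claim, specification)
for word, sentences in links.items():
    print(word, "->", len(sentences), "sentence(s)")
```

Exact whole-word matching is the simplest choice here; stemming or phrase
matching would catch more of the specification but is left out to keep the
sketch short.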
HTH,
-Rich
Sincerely,
Rich Cooper
EnglishLogicKernel.com
Rich AT EnglishLogicKernel DOT com
9 4 9 \ 5 2 5 - 5 7 1 2
-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of
Christopherson, Laura
Sent: Saturday, April 09, 2011 12:35 PM
To: corpora at uib.no
Subject: [Corpora-List] All English Text Messaging Corpus?
Hi All,
Do any of you know of a text messaging corpus only in English that is not
a collection of someone's personal (and/or family/friends') messages?
Thanks,
Laura
_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora