[Corpora-List] All English Text Messaging Corpus?

Rich Cooper rich at englishlogickernel.com
Mon Apr 11 16:43:20 UTC 2011


Hi Trevor,

Yes, the PTO backlog is estimated at about one million unprocessed
applications.  The director was referring to the way his funds have been
diverted from the PTO examiner staff to Congress's political slush funds.

English is English, whether in text messaging or in other kinds of corpora,
so while a text messaging corpus may be useful to study for those
purposes, the real issue is how English is structured in actual usage.  The
PTO has documents that were very carefully reviewed, which is almost
certainly not the case in text messaging.  Therefore the patent database
makes a great corpus for finding specialized language as used to describe
reality within the specification of each patent.

JMHO,
-Rich
 
Sincerely,
Rich Cooper
EnglishLogicKernel.com
Rich AT EnglishLogicKernel DOT com
9 4 9 \ 5 2 5 - 5 7 1 2

-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of
Trevor Jenkins
Sent: Monday, April 11, 2011 5:42 AM
To: Corpora list
Subject: Re: [Corpora-List] All English Text Messaging Corpus?

On Sat, 9 Apr 2011, Rich Cooper <rich at englishlogickernel.com> wrote:

> Hi Laura,
>
> I don't know of any text message sources exactly like what you are
> describing.  But there is a huge, partially structured text database for
> US patent documents, nearly all in English I suppose, ...

Surely a highly specific genre of English; all about claims and prior art.
Nothing like English as she is spoke by the people likely to be sending
text messages.

> One advantage of choosing the patent database is that every document is
> constrained by the patenting process by experts in each patent's specific
> technologies, ...

Um, not so true. The (possibly once) head of the USPTO admitted that many
dubious patents were being granted because the organisation did not have
the necessary expertise to evaluate the claims. This comment was
specifically made in relation to software patents, which are in any case a
highly contentious area.

Regards, Trevor

<>< Re: deemed!


_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora

