<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:st1="urn:schemas-microsoft-com:office:smarttags" xmlns="http://www.w3.org/TR/REC-html40">


<head>

<meta http-equiv=Content-Type content="text/html; charset=us-ascii">

<meta name=Generator content="Microsoft Word 11 (filtered medium)">

<!--[if !mso]>

<style>

v\:* {behavior:url(#default#VML);}

o\:* {behavior:url(#default#VML);}

w\:* {behavior:url(#default#VML);}

.shape {behavior:url(#default#VML);}

</style>

<![endif]--><o:SmartTagType

 namespaceuri="urn:schemas-microsoft-com:office:smarttags" name="place"/>

<!--[if !mso]>

<style>

st1\:*{behavior:url(#default#ieooui) }

</style>

<![endif]-->

<style>

<!--

 /* Font Definitions */

 @font-face

        {font-family:Tahoma;

        panose-1:2 11 6 4 3 5 4 4 2 4;}

@font-face

        {font-family:Verdana;

        panose-1:2 11 6 4 3 5 4 4 2 4;}

 /* Style Definitions */

 p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0in;

        margin-bottom:.0001pt;

        font-size:12.0pt;

        font-family:"Times New Roman";}

a:link, span.MsoHyperlink

        {color:blue;

        text-decoration:underline;}

a:visited, span.MsoHyperlinkFollowed

        {color:blue;

        text-decoration:underline;}

span.EmailStyle17

        {mso-style-type:personal-reply;

        font-family:Arial;

        color:blue;

        font-weight:normal;

        font-style:normal;

        text-decoration:none none;}

@page Section1

        {size:8.5in 11.0in;

        margin:1.0in 1.25in 1.0in 1.25in;}

div.Section1

        {page:Section1;}

-->

</style>


</head>


<body lang=EN-US link=blue vlink=blue>


<div class=Section1>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'>Dear Siddhartha,<o:p></o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'>Metadata in an RDBMS is usually stored in
tables (as you may already know) that can be queried or updated.  My patent
7,209,923 shows how a program can represent and use metadata in tables that
way:<o:p></o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><a

href="http://www.englishlogickernel.com/Patent-7-209-923-B1.pdf">http://www.englishlogickernel.com/Patent-7-209-923-B1.pdf</a><o:p></o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'>That view of the database tables, columns

and rows is ideal for reasoning tasks and still takes full advantage of the

RDBMS features, as shown in the text and figures of that patent.  Figure 2

shows a generic view of metadata arranged in tables for representing symbols

and text strings, including tokens and phrases.  A copy is below, if it

gets through the email:<o:p></o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><img border=0 width=363 height=244

id="_x0000_i1027" src="cid:image001.jpg@01CCB66A.722E7E60"><o:p></o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'>Relationships can be modeled most

obviously by creating tables named for the relationship, with rows that contain

the constants and variables you want to place into the relationship.  I

use text names for constants and variables, with variables distinctively

starting with underscores (“_”), much like Prolog does.  <o:p></o:p></span></font></p>
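<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:10.0pt;font-family:Arial;color:blue'>A minimal sketch of a relationship table, assuming Python's built-in sqlite3 module as a stand-in RDBMS.  The relation name part_of, its column names, and the sample rows are hypothetical and are not taken from the patent or the example program; only the underscore convention for variables comes from the description above.</span></font></p>

<pre>
```python
import sqlite3


def is_variable(name):
    """Variables are text names that start with an underscore."""
    return name.startswith("_")


conn = sqlite3.connect(":memory:")

# One table per relationship; "part_of" is a made-up relation name.
conn.execute("CREATE TABLE part_of (part TEXT, whole TEXT)")

# Each row places constants and/or variables into the relationship.
conn.executemany(
    "INSERT INTO part_of VALUES (?, ?)",
    [("wheel", "car"), ("_x", "car")],
)

rows = conn.execute(
    "SELECT part, whole FROM part_of ORDER BY rowid"
).fetchall()

# Collect the distinct variable names that appear anywhere in the relation.
variables = sorted({cell for row in rows for cell in row if is_variable(cell)})
```
</pre>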


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'>I keep one table locked in memory for the

symbol table, which binds a unique arrival ID (an integer that grows with each

new symbol definition) to a unique text string.  <o:p></o:p></span></font></p>
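<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:10.0pt;font-family:Arial;color:blue'>As an illustration only, an in-memory symbol table of this kind might look like the Python sketch below.  The class and method names are hypothetical, and starting the arrival IDs at 0 is an assumption; the sketch is not the code from the patent or the example program.</span></font></p>

<pre>
```python
class SymbolTable:
    """Binds each unique text string to a unique, growing arrival ID."""

    def __init__(self):
        self._id_by_text = {}   # text string -> arrival ID
        self._text_by_id = []   # arrival ID -> text string

    def intern(self, text):
        """Return the arrival ID for text, assigning the next ID if it is new."""
        existing = self._id_by_text.get(text)
        if existing is not None:
            return existing
        arrival_id = len(self._text_by_id)
        self._id_by_text[text] = arrival_id
        self._text_by_id.append(text)
        return arrival_id

    def text_of(self, arrival_id):
        """Look up the text string that was bound to an arrival ID."""
        return self._text_by_id[arrival_id]
```
</pre>

<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:10.0pt;font-family:Arial;color:blue'>Interning the same string twice returns the same ID, so other tables only ever need to store the integer.</span></font></p>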


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'>The metadata tables relate to the symbol

table by storing just the indexed arrival ID for that string, whether the

string is a symbol or a phrase extracted from a text source.  Unification

is very fast given that representation because the integer indexes are adequate

for calculating unifications.  <o:p></o:p></span></font></p>
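<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:10.0pt;font-family:Arial;color:blue'>A rough sketch of why integer IDs are enough: the Python function below does one-way matching of a pattern row against a row of constants, which is a simplification of full unification.  The function name, the row layout, and the idea of passing in a set of variable IDs are assumptions made for the illustration.</span></font></p>

<pre>
```python
def unify_rows(pattern, fact, variable_ids):
    """Match a pattern row against a fact row using only integer IDs.

    Returns a dict {variable ID: constant ID} on success, or None on failure.
    IDs listed in variable_ids are treated as variables; all others are constants.
    """
    if len(pattern) != len(fact):
        return None
    bindings = {}
    for p, f in zip(pattern, fact):
        if p in variable_ids:
            # A variable must bind to the same constant everywhere it occurs.
            if bindings.setdefault(p, f) != f:
                return None
        elif p != f:
            # Two constants match only if their IDs are equal.
            return None
    return bindings


# Hypothetical IDs: 0 = "wheel", 1 = "car", 2 = "_x"
match = unify_rows((2, 1), (0, 1), variable_ids={2})
no_match = unify_rows((2, 2), (0, 1), variable_ids={2})
```
</pre>

<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:10.0pt;font-family:Arial;color:blue'>No string comparison happens during matching; the text is only needed again when results are shown to a person.</span></font></p>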


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'>My particular NLP interest of the moment

is in examining patent specifications, which contain unstructured text fields

within a formulaic overall outline that can be dissected algorithmically. 

Patent claims are phrases that bind the variable X in the sentence “I claim X”,
so that each claim phrase can be substituted for X.  <o:p></o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'>I also use an inverted text method for
separating phrases (approximately sentences) out of texts.  Each patent
document is read in as text and inverted to enumerate its phrases.  Each
indexed phrase from the inverted document is then tokenized, and its tokens
are interned uniquely into the symbol table.  <o:p></o:p></span></font></p>
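<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:10.0pt;font-family:Arial;color:blue'>The pipeline above could be sketched in Python roughly as follows.  The punctuation-based phrase splitter and the lowercase word tokenizer are naive stand-ins chosen for the illustration, not the inverted text method itself, and the function names are hypothetical.</span></font></p>

<pre>
```python
import re


def enumerate_phrases(text):
    """Split text into phrases (approximately sentences) at ., ! or ?"""
    candidates = re.findall(r"[^.!?]+[.!?]?", text)
    return [c.strip() for c in candidates if c.strip()]


def intern_tokens(phrases):
    """Tokenize each phrase and intern its tokens as integer arrival IDs.

    Returns (symbol_ids, rows): symbol_ids maps token text to its ID, and
    rows holds one list of IDs per phrase, in phrase order.
    """
    symbol_ids = {}
    rows = []
    for phrase in phrases:
        row = []
        for token in re.findall(r"\w+", phrase.lower()):
            # A new token gets the next ID; a known token keeps its old one.
            row.append(symbol_ids.setdefault(token, len(symbol_ids)))
        rows.append(row)
    return symbol_ids, rows


phrases = enumerate_phrases("I claim a widget. The widget has a wheel.")
symbol_ids, rows = intern_tokens(phrases)
```
</pre>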


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'>Quantifiers are transformed into sequences

of symbol table arrival IDs (integers), and the sequences are stored as rows in

the relationship modeling table.  Since quantifiers can be either

constants or variables, rules can be generalized from instance data in the

claim phrase or the specification phrases.  Because every symbol is stored as
its arrival ID, all stored relations, other than metadata tables, have rows
whose cells contain only integers.  That is why unification and search are so
fast with this representation.  <o:p></o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'>There is an example program you can
download.  It doesn’t work with Windows 7 yet, but you can run it
on a <st1:place w:st="on">Vista</st1:place> or an XP box.  It can
be downloaded from:<o:p></o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><a

href="http://www.englishlogickernel.com/setup.exe">http://www.englishlogickernel.com/setup.exe</a><o:p></o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'>I promise it won’t screw up your

computer; I use the program on a daily basis and it helps enormously in my

business of patent analysis.  <o:p></o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'>It isn’t a general tool, but an

application of NLP analysis.  I am planning a more general analysis tool,

but that won’t be ready for quite a while yet.  Once I have solved

all operational problems for the patent analysis task, I will reorganize the

software components to provide the general capability.  For now, this is

as much as I can handle with the available resources.  <o:p></o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'>Please feel free to ask questions if any

of the above isn’t clear.  <o:p></o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><o:p> </o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'>-Rich<o:p></o:p></span></font></p>


<p class=MsoNormal><font size=2 color=blue face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:blue'><o:p> </o:p></span></font></p>


<div>


<p class=MsoNormal><font size=3 color=black face="Times New Roman"><span

style='font-size:12.0pt;color:black'>Sincerely,<o:p></o:p></span></font></p>


<p class=MsoNormal><font size=3 color=black face="Times New Roman"><span

style='font-size:12.0pt;color:black'>Rich Cooper<o:p></o:p></span></font></p>


<p class=MsoNormal><font size=3 color=black face="Times New Roman"><span

style='font-size:12.0pt;color:black'>EnglishLogicKernel.com</span></font><font

color=blue><span style='color:blue'><o:p></o:p></span></font></p>


<p class=MsoNormal><font size=3 color=black face="Times New Roman"><span

style='font-size:12.0pt;color:black'>Rich AT EnglishLogicKernel DOT com</span></font><font

color=blue><span style='color:blue'><o:p></o:p></span></font></p>


<p class=MsoNormal><font size=3 color=black face="Times New Roman"><span

style='font-size:12.0pt;color:black'>9 4 9 \ 5 2 5 - 5 7 1 2</span></font><o:p></o:p></p>


</div>


<div>


<div class=MsoNormal align=center style='text-align:center'><font size=3

face="Times New Roman"><span style='font-size:12.0pt'>


<hr size=3 width="100%" align=center tabindex=-1>


</span></font></div>


<p class=MsoNormal><b><font size=2 face=Tahoma><span style='font-size:10.0pt;

font-family:Tahoma;font-weight:bold'>From:</span></font></b><font size=2

face=Tahoma><span style='font-size:10.0pt;font-family:Tahoma'> Siddhartha

Jonnalagadda [mailto:sid.kgp@gmail.com] <br>

<b><span style='font-weight:bold'>Sent:</span></b> Friday, December 09, 2011

11:03 AM<br>

<b><span style='font-weight:bold'>To:</span></b> Rich Cooper<br>

<b><span style='font-weight:bold'>Cc:</span></b>

nlp2rdf@lists.informatik.uni-leipzig.de; CORPORA List; Jens Lehmann<br>

<b><span style='font-weight:bold'>Subject:</span></b> Re: [Corpora-List]

[NLP2RDF] Announcement: NLP Interchange Format(NIF)</span></font><o:p></o:p></p>


</div>


<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'><o:p> </o:p></span></font></p>


<p class=MsoNormal style='margin-bottom:12.0pt'><font size=3 face=Verdana><span

style='font-size:12.0pt;font-family:Verdana'>Hey Rich,<br>

<br>

RDBMS is an industry standard that works well for some things such as storing

the extracted metadata, but might not be optimal for performing reasoning over

it. That might be one reason some people use other representations such as

RDF/SPARQL for higher-level tasks. In general, storing everything in the Common

Analysis Structure defined by UIMA's type system works for me, and where needed I
could write it into a database. What is the optimal way to represent the

metadata for reasoning tasks? How could I transfer my UIMA CAS into that

"thing"?<br>

<br clear=all>

Sincerely,<br>

Siddhartha Jonnalagadda, </span></font>Ph.D.<font face=Verdana><span

style='font-family:Verdana'><br>

</span></font><a href="http://sjonnalagadda.wordpress.com" target="_blank"><font

face=Verdana><span style='font-family:Verdana'>sjonnalagadda.wordpress.com</span></font></a><font

face=Verdana><span style='font-family:Verdana'><br>

<br>

</span></font><br>

<br>

<o:p></o:p></p>


<div>


<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'>On Fri, Dec 9, 2011 at 11:56 AM, Rich Cooper &lt;<a
href="mailto:rich@englishlogickernel.com">rich@englishlogickernel.com</a>&gt;

wrote:<o:p></o:p></span></font></p>


<div link=blue vlink=blue>


<div>


<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><font

size=2 color=blue face=Arial><span style='font-size:10.0pt;font-family:Arial;

color:blue'>Dear Siddhartha,</span></font><o:p></o:p></p>


<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><font

size=2 color=blue face=Arial><span style='font-size:10.0pt;font-family:Arial;

color:blue'> </span></font><o:p></o:p></p>


<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><font

size=2 color=blue face=Arial><span style='font-size:10.0pt;font-family:Arial;

color:blue'>Could you please provide more detail about what you need in the way

of “more computer-interpretable than RDBMS”?  I use the RDBMS

columns with unstructured text, analyze the text in software, and populate new

columns to store the analyzed NLP information.  By iteratively aggregating

RDBMS columns, I am able to process NLP quite well using the RDBMS capabilities

plus software functionality for interpretation.  </span></font><o:p></o:p></p>


<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><font

size=2 color=blue face=Arial><span style='font-size:10.0pt;font-family:Arial;

color:blue'> </span></font><o:p></o:p></p>


<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><font

size=2 color=blue face=Arial><span style='font-size:10.0pt;font-family:Arial;

color:blue'>More information would be useful,</span></font><o:p></o:p></p>


<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><font

size=2 color=blue face=Arial><span style='font-size:10.0pt;font-family:Arial;

color:blue'>-Rich</span></font><o:p></o:p></p>


<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><font

size=2 color=blue face=Arial><span style='font-size:10.0pt;font-family:Arial;

color:blue'> </span></font><o:p></o:p></p>


<div>


<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><font

size=3 color=black face="Times New Roman"><span style='font-size:12.0pt;

color:black'>Sincerely,</span></font><o:p></o:p></p>


<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><font

size=3 color=black face="Times New Roman"><span style='font-size:12.0pt;

color:black'>Rich Cooper</span></font><o:p></o:p></p>


<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><font

size=3 color=black face="Times New Roman"><span style='font-size:12.0pt;

color:black'>EnglishLogicKernel.com</span></font><o:p></o:p></p>


<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><font

size=3 color=black face="Times New Roman"><span style='font-size:12.0pt;

color:black'>Rich AT EnglishLogicKernel DOT com</span></font><o:p></o:p></p>


<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><font

size=3 color=black face="Times New Roman"><span style='font-size:12.0pt;

color:black'>9 4 9 \ 5 2 5 - 5 7 1 2</span></font><o:p></o:p></p>


</div>


<div>


<div class=MsoNormal align=center style='text-align:center'><font size=3

face="Times New Roman"><span style='font-size:12.0pt'>


<hr size=3 width="100%" align=center>


</span></font></div>


<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><b><font

size=2 face=Tahoma><span style='font-size:10.0pt;font-family:Tahoma;font-weight:

bold'>From:</span></font></b><font size=2 face=Tahoma><span style='font-size:

10.0pt;font-family:Tahoma'> <a href="mailto:corpora-bounces@uib.no"

target="_blank">corpora-bounces@uib.no</a> [mailto:<a

href="mailto:corpora-bounces@uib.no" target="_blank">corpora-bounces@uib.no</a>]

<b><span style='font-weight:bold'>On Behalf Of </span></b>Siddhartha

Jonnalagadda<br>

<b><span style='font-weight:bold'>Sent:</span></b> Friday, December 09, 2011

9:07 AM<br>

<b><span style='font-weight:bold'>To:</span></b> <a

href="mailto:nlp2rdf@lists.informatik.uni-leipzig.de" target="_blank">nlp2rdf@lists.informatik.uni-leipzig.de</a>;

CORPORA List<br>

<b><span style='font-weight:bold'>Cc:</span></b> Jens Lehmann<br>

<b><span style='font-weight:bold'>Subject:</span></b> Re: [Corpora-List]

[NLP2RDF] Announcement: NLP Interchange Format(NIF)</span></font><o:p></o:p></p>


</div>


<div>


<div>


<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><font

size=3 face="Times New Roman"><span style='font-size:12.0pt'> <o:p></o:p></span></font></p>


<p class=MsoNormal style='mso-margin-top-alt:auto;margin-bottom:12.0pt'><font

size=3 face=Verdana><span style='font-size:12.0pt;font-family:Verdana'>Somewhat

related issue:<br>

Since UIMA is seeing increasing use within the NLP community (both information
extraction and other areas such as question answering), I wonder why we need another
standard as opposed to an interface between the UIMA type system and one of the
many existing standards. In other words, is there some work on representing the

information we extract in a way more computer-interpretable than RDBMS?<br>

<br clear=all>

Sincerely,<br>

Siddhartha Jonnalagadda, </span></font>Ph.D.<font face=Verdana><span

style='font-family:Verdana'><br>

</span></font><a href="http://sjonnalagadda.wordpress.com" target="_blank"><font

face=Verdana><span style='font-family:Verdana'>sjonnalagadda.wordpress.com</span></font></a><font

face=Verdana><span style='font-family:Verdana'><br>

<br>

<br>

</span></font><o:p></o:p></p>


<div>


<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><font

size=3 face="Times New Roman"><span style='font-size:12.0pt'>On Fri, Dec 9,

2011 at 10:39 AM, John F. Sowa &lt;<a href="mailto:sowa@bestweb.net"
target="_blank">sowa@bestweb.net</a>&gt; wrote:<o:p></o:p></span></font></p>


<div>


<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><font

size=3 face="Times New Roman"><span style='font-size:12.0pt'>Before making a

firm commitment to any notation as a standard for NLP,<br>

I suggest that you poll computational linguists and ask them what they<br>

would prefer for their work.  Among the questions you could ask is to<br>

look at those five serializations and check which one(s) they prefer.<br>

<br>

Corpora List is a good place to start such a poll.<o:p></o:p></span></font></p>


</div>


</div>


<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><font

size=3 face="Times New Roman"><span style='font-size:12.0pt'> <o:p></o:p></span></font></p>


</div>


</div>


</div>


</div>


</div>


<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'><o:p> </o:p></span></font></p>


</div>


</body>


</html>