[Corpora-List] TextGraphs-6 and semantic networks

John F. Sowa sowa at bestweb.net
Sun Mar 13 19:09:05 UTC 2011


On 3/12/2011 4:15 PM, Nathan Hu wrote:
> The bottleneck currently is how can we get the high-precision results of
> coreference resolution to build completed Conceptual Graphs from texts.

That is the bottleneck that has plagued every version of formal
semantics from Montague to the present.  Logicians publish papers
with toy sentences like "John seeks a unicorn".  But the fatal flaw
is that they usually assume Frege's principle:

    The meaning of a sentence is completely determined
    by the meaning of its symbols and the syntax for
    combining those symbols.

Natural languages violate Frege's principle in multiple ways.
In general, the meaning of a sentence depends critically on context-
dependent factors, such as the time and place of utterance, the
speaker (or writer), the listener (or reader), their background
knowledge, their intentions, the speaker's guess about the listener's
knowledge and intentions, the listener's guess about the speaker, etc.

Linguists and logicians have published many excellent analyses
of each of those issues.  But NL texts that occur "in the wild"
violate Frege's principle in many different and highly creative
ways -- sometimes several different ways in a single sentence.

As one of Alan Perlis's famous epigrams on programming puts it:

   "One can't proceed from the informal to the formal by formal means."

There are some well-written texts that are sufficiently precise that
they can be translated to a formal logic.  For example, Naproche
(NAtural-language PROof CHEcker) maps a mathematical proof stated
in English to logic and checks the proof ( http://naproche.net/ ).

What makes that English precise is that the author (a mathematician)
(1) has a precise formal semantics in mind and (2) makes an effort
to describe it clearly.  Very few texts meet both of those criteria.

Programming languages are just as formal as mathematics, but programmers
are notoriously lazy about documenting what they do in any language.
When they do write documentation, the results look like Slide 27 in
the talk I mentioned:

    http://www.jfsowa.com/talks/pursue.pdf

That language cannot be translated to the formal language of the
original program (COBOL, in this case).  However, if you *start*
with the COBOL and map it to a formal notation (in this case to
conceptual graphs), you can generate precise, formal graphs.

With the usual methods of NLP, you can map informal English to graphs
that are just as informal as the English.  Then with suitable graph-
matching algorithms, you can find an approximate match of the informal
graphs to the formal graphs (assuming, of course, that you have some
independent source for the formal graphs -- and that's a very big
assumption that is often difficult or impossible to satisfy).

Note that this method does not violate the epigram by Perlis:
the precision does *not* come from English, but from COBOL
(or some other source for the formal graphs).
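
As a rough illustration of the approximate-matching step, here is a
minimal Python sketch.  It assumes each graph has been reduced to a set
of (concept, relation, concept) triples and scores similarity by Jaccard
overlap.  The triples, the graph names, and the scoring rule are all
hypothetical choices made for brevity; they are not the matching method
behind the slides.

```python
# Hypothetical sketch: graphs as sets of (source, relation, target) triples.
# Jaccard overlap is an assumed scoring rule, chosen only for simplicity.

def jaccard(a, b):
    """Overlap between two triple sets: 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def best_match(informal, formal_graphs):
    """Return (name, score) of the formal graph closest to the informal one."""
    return max(
        ((name, jaccard(informal, g)) for name, g in formal_graphs.items()),
        key=lambda pair: pair[1],
    )

# Formal graphs, e.g. derived from program source (contents are made up).
formal_graphs = {
    "update-balance": frozenset({
        ("Update", "agent", "Program"),
        ("Update", "object", "Balance"),
        ("Update", "source", "Payment"),
    }),
    "print-report": frozenset({
        ("Print", "agent", "Program"),
        ("Print", "object", "Report"),
    }),
}

# Informal graph, e.g. from parsing a documentation sentence (also made up).
informal = frozenset({
    ("Update", "object", "Balance"),
    ("Update", "source", "Payment"),
    ("Update", "time", "Monthly"),
})

name, score = best_match(informal, formal_graphs)
print(name, score)
```

The informal graph shares two of four distinct triples with
"update-balance" and none with "print-report", so the former is chosen.
A realistic matcher would also need to align synonymous labels and
tolerate structural differences, which exact triple overlap does not.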

> As my understanding, you build an index for conceptual graphs.
> Does this method work for sub-graph matching?

Yes.  There are many different but related algorithms for doing
such indexing and searching.  I cited some based on methods for
chemical graphs.  Those applications *require* searching and
finding subgraphs.

For pharmaceutical applications, they want to find chemicals that
have the same active subgraph (the critical part for some drug) but
may have different molecular structures attached to that subgraph.
That's very similar to the requirements for NLP.
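
To make the index-then-search idea concrete, here is a small Python
sketch.  It is an assumption-laden toy, not any of the cited chemical-
graph algorithms: an inverted index over label-level edge signatures
prunes the candidates, and a plain backtracking search then checks for
an injective, label-preserving embedding (a non-induced subgraph match).
The Graph class, the function names, and the example graphs are all
hypothetical.

```python
from collections import defaultdict

class Graph:
    def __init__(self, labels, edges):
        self.labels = labels        # node id -> concept label
        self.edges = set(edges)     # (source id, relation, target id)

    def signatures(self):
        """Label-level edge signatures, used as index keys."""
        return {(self.labels[u], r, self.labels[v]) for u, r, v in self.edges}

def build_index(graphs):
    """Inverted index: edge signature -> names of graphs containing it."""
    index = defaultdict(set)
    for name, g in graphs.items():
        for sig in g.signatures():
            index[sig].add(name)
    return index

def candidates(index, query, graphs):
    """Graphs holding every query signature (necessary, not sufficient)."""
    names = set(graphs)
    for sig in query.signatures():
        names &= index.get(sig, set())
    return names

def is_subgraph(query, target):
    """Backtracking search for an injective, label-preserving embedding."""
    q_nodes = list(query.labels)

    def consistent(mapping):
        # Every query edge whose endpoints are both mapped must exist in target.
        return all(
            (mapping[u], r, mapping[v]) in target.edges
            for u, r, v in query.edges
            if u in mapping and v in mapping
        )

    def extend(i, mapping):
        if i == len(q_nodes):
            return True
        q = q_nodes[i]
        for t, label in target.labels.items():
            if label == query.labels[q] and t not in mapping.values():
                mapping[q] = t
                if consistent(mapping) and extend(i + 1, mapping):
                    return True
                del mapping[q]
        return False

    return extend(0, {})

def search(query, graphs, index):
    """Names of graphs that contain the query as a subgraph."""
    return sorted(n for n in candidates(index, query, graphs)
                  if is_subgraph(query, graphs[n]))

graphs = {
    "g1": Graph({1: "Cat", 2: "Chase", 3: "Mouse", 4: "Garden"},
                [(2, "agent", 1), (2, "theme", 3), (2, "location", 4)]),
    "g2": Graph({1: "Dog", 2: "Chase", 3: "Cat"},
                [(2, "agent", 1), (2, "theme", 3)]),
    # Same edge signatures as the query, but spread over two Chase nodes.
    "g3": Graph({1: "Cat", 2: "Chase", 3: "Mouse", 4: "Chase", 5: "Dog"},
                [(2, "agent", 1), (4, "theme", 3), (4, "agent", 5)]),
}
query = Graph({"a": "Chase", "b": "Cat", "c": "Mouse"},
              [("a", "agent", "b"), ("a", "theme", "c")])

index = build_index(graphs)
print(search(query, graphs, index))
```

Here the index admits g1 and g3 as candidates, and the structural check
keeps only g1, since g3 has the right edge labels but on different
nodes.  The same split between a cheap filter and an exact check is what
lets the "active subgraph plus different attachments" search scale.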

John

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora