[Corpora-List] adaptable FSM (finite state machine) for NLP

Albretch Mueller lbrtchx at gmail.com
Thu Jul 3 18:27:29 UTC 2008


 As we well know, many languages are very similar. In many cases they
share, largely if not fully, the same alphabet, syntax rules and even
phonemes.

 So, since parsers are essentially fed text sequentially (and
naturally so), I wonder what approaches have been developed out there
for parsers built from pluggable parts: a parsing strategy (depth- or
breadth-first), some lexicon (which does not have to be complete) and
rules describing the generative possibilities of this lexicon.
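
 To make the idea concrete, here is a minimal sketch of what I mean,
in Python. Everything in it is an assumption made up for illustration:
the toy RULES and LEXICON tables, the recognize() function and its
"strategy" parameter are hypothetical and not taken from any existing
parser or grammar resource. It is a plain top-down recognizer whose
search order is selected by a flag, with the grammar and lexicon kept
as data separate from the algorithm.

```python
from collections import deque

# Hypothetical toy grammar: each non-terminal maps to its possible expansions.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["N"]],
    "VP": [["V", "NP"], ["V"]],
}

# Hypothetical toy lexicon: each word maps to the categories it can have.
# It is deliberately incomplete; unknown words simply match no category.
LEXICON = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "sees": {"V"},
    "sleeps": {"V"},
}


def recognize(words, strategy="depth", start="S"):
    """Return True if `words` can be derived from `start`.

    Each search state is (symbols still to match, position in the input).
    strategy="depth" treats the agenda as a stack (depth-first);
    any other value treats it as a queue (breadth-first).
    Left-recursive rules would make this naive search loop forever.
    """
    agenda = deque([((start,), 0)])
    while agenda:
        symbols, pos = agenda.pop() if strategy == "depth" else agenda.popleft()
        if not symbols:
            if pos == len(words):
                return True  # all symbols matched and all input consumed
            continue
        head, rest = symbols[0], symbols[1:]
        if head in RULES:
            # Non-terminal: queue one new state per possible expansion.
            for expansion in RULES[head]:
                agenda.append((tuple(expansion) + rest, pos))
        elif pos < len(words) and head in LEXICON.get(words[pos], set()):
            # Lexical category: consume one word if the lexicon allows it.
            agenda.append((rest, pos + 1))
    return False


if __name__ == "__main__":
    sentence = "the dog sees the cat".split()
    print(recognize(sentence, strategy="depth"))
    print(recognize(sentence, strategy="breadth"))
```

 In this sketch, switching to a closely related language would mean
replacing only the two tables, while the search code stays unchanged.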

 As you can tell, I am not a linguist myself, but I have read James
Allen's Natural Language Understanding, in which the theory is general
but the many examples are exclusively in English. After reading it, I
think such an "English grammar file" may not be that difficult to
devise, and if it can be done for English, I can easily imagine that
such files exist for other natural languages that are less
fractured/more homogeneous.

 Where can I find an actual well-formed, declarative description of
some natural-language grammar, including language features, constraints
and everything (possible ;-)), in XML format or Backus-Naur form, or
in-depth theoretical studies along these lines?

 thanks
 lbrtchx



