26.3643, Software: English; Portuguese; Spanish; Computational Linguistics; Semantics; Syntax; Text/Corpus Linguistics: SentiLecto 2.4

The LINGUIST List via LINGUIST linguist at listserv.linguistlist.org
Fri Aug 14 16:49:15 UTC 2015


LINGUIST List: Vol-26-3643. Fri Aug 14 2015. ISSN: 1069 - 4875.

Subject: 26.3643, Software: English; Portuguese; Spanish; Computational Linguistics; Semantics; Syntax; Text/Corpus Linguistics: SentiLecto 2.4

Moderators: linguist at linguistlist.org (Damir Cavar, Malgorzata E. Cavar)
Reviews: reviews at linguistlist.org (Anthony Aristar, Helen Aristar-Dry, Sara Couture)
Homepage: http://linguistlist.org

*****************    LINGUIST List Support    *****************
Please support the LL editors and operation with a donation at:
              http://funddrive.linguistlist.org/donate/

Editor for this issue: Andrew Lamont <alamont at linguistlist.org>
================================================================


Date: Fri, 14 Aug 2015 12:49:05
From: Fernando Balbachan [fernandobalbachan at gmail.com]
Subject: English; Portuguese; Spanish; Computational Linguistics; Semantics; Syntax; Text/Corpus Linguistics: SentiLecto 2.4

 
SentiLecto demo: http://dev.natural.do/sentilecto 

SentiLecto is an NLU engine that yields a highly fine-grained representation of complex texts. The pipeline starts by splitting text into sentences and clauses, then maps clauses onto SVO slots, much the way a native speaker would understand natural language. SentiLecto relies on linguistic features such as passive/active voice transformation, negation scope, anaphora resolution and co-reference chains, modality treatment, semantic features (animacy and others), and accurate verbal frames for all Spanish verbs. These frames cover 'se-impersonal' usages ('se mostraron retratos' = 'alguien mostró retratos' = 'somebody showed portraits') and 'se-clitic' usages (for example, the plain action 'mostrar', 'to show something', vs. 'mostrarSE', 'to show oneself', namely to display some feeling in the face of a situation). 
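To make the SVO idea concrete, here is a minimal Python sketch of how a 'se-impersonal' or passive clause could be normalized into active SVO slots. It is an illustration only, not SentiLecto's actual code or API: the `Clause` class, the two helper functions, and the placeholder subject "somebody" are all hypothetical names chosen for this example, and the clauses are built by hand rather than parsed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Clause:
    """One clause reduced to SVO slots (hypothetical representation)."""
    subject: Optional[str]
    verb: str
    obj: Optional[str]
    negated: bool = False


def from_passive(patient: str, verb: str, agent: Optional[str] = None,
                 negated: bool = False) -> Clause:
    """Normalize a passive clause: the patient becomes the object,
    and a missing agent is filled with a generic 'somebody'."""
    return Clause(subject=agent or "somebody", verb=verb,
                  obj=patient, negated=negated)


def from_se_impersonal(verb: str, patient: str) -> Clause:
    """Treat 'se + verb + patient' as an active clause with an
    unspecified subject, as in 'se mostraron retratos'."""
    return from_passive(patient=patient, verb=verb)


# 'se mostraron retratos' and 'alguien mostró retratos' end up with
# the same verb and object; only the subject filler differs.
impersonal = from_se_impersonal("mostrar", "retratos")
active = Clause(subject="alguien", verb="mostrar", obj="retratos")
```

Under these assumptions, `impersonal` and `active` share the same verb and object slots, which is the sense in which the two Spanish sentences above are treated as equivalent.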

SentiLecto can also flawlessly identify whether or not an utterance is a real fact (fact mining) over which an opinion could span, and it can recognize and classify named entities (NERC) with identity matching. 

Finally, SentiLecto is better suited to the entity-based Sentiment Analysis paradigm. Unlike other approaches, it can deal with polarity shifting within the same sentence ('I like chocolate but I hate strawberry ice-cream'), within embedded clauses ('Norwegians, who are an aggressive people, export the exquisite herring'), or even within a single word ('Somebody who wasted a chance to do something' means that the person did something bad with something good). SentiLecto builds on the premise that the entities involved in an opinion are syntactically mapped onto SVO (subject-verb-object) slots for sentiment assignment: 'Mary hates John' (2 entities, but only the object has a negative presentation) vs. 'Mary defames John' (the same 2 entities, but only the subject has a negative presentation). 
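The slot-based sentiment assignment described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not SentiLecto's implementation: the `VERB_FRAMES` lexicon, its polarity values, and the `entity_sentiment` function are hypothetical, and the input clauses are already given as (subject, verb, object) triples instead of being extracted from text.

```python
from typing import Dict, List, Tuple

# Hypothetical verb frames: the polarity each verb projects onto its
# (subject, object) slots. Values are made up for illustration.
VERB_FRAMES: Dict[str, Tuple[int, int]] = {
    "like": (0, +1),     # the object is presented positively
    "hate": (0, -1),     # the object is presented negatively
    "defame": (-1, 0),   # the subject is presented negatively
}


def entity_sentiment(clauses: List[Tuple[str, str, str]]) -> Dict[str, int]:
    """Sum, per entity, the polarity projected by each clause's verb
    onto the slot that the entity occupies."""
    scores: Dict[str, int] = {}
    for subject, verb, obj in clauses:
        # Unknown verbs are treated as neutral for both slots.
        subj_pol, obj_pol = VERB_FRAMES.get(verb, (0, 0))
        scores[subject] = scores.get(subject, 0) + subj_pol
        scores[obj] = scores.get(obj, 0) + obj_pol
    return scores


hates = entity_sentiment([("Mary", "hate", "John")])
defames = entity_sentiment([("Mary", "defame", "John")])
# One sentence, two clauses, opposite polarities on different entities.
mixed = entity_sentiment([
    ("I", "like", "chocolate"),
    ("I", "hate", "strawberry ice-cream"),
])
```

With this lexicon, 'Mary hates John' marks only John as negative, 'Mary defames John' marks only Mary as negative, and the chocolate/ice-cream sentence keeps a separate polarity for each object because the score is attached to the entity and not to the sentence as a whole.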

SentiLecto is being used to automatically generate the blog http://entretenimientobit.com, with more than 300 high-quality posts per day, rewriting and enriching content and, more interestingly, merging news items that cover the same facts. This is just a showcase of SentiLecto's NLU capabilities. 

SentiLecto currently works only for Spanish, but it will soon be available for Brazilian Portuguese (in about 1 month) and English (in about 3 months). 

Looking forward to hearing linguists' feedback. 

Dr. Fernando Balbachan, Ph.D.
fernandobalbachan at gmail.com

Linguistic Field(s): Computational Linguistics
                     Semantics
                     Syntax
                     Text/Corpus Linguistics

Subject Language(s): English (eng)
                     Portuguese (por)
                     Spanish (spa)


