[Corpora-List] Need POST system for French and English

Michele Filannino michele.filannino at cs.manchester.ac.uk
Wed May 9 13:47:27 UTC 2012


Dear Imad,

voilà: http://norvig.com/spell-correct.html
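
In case it helps before you read the page, here is a minimal sketch of the general idea as I understand it: generate every string within one or two edits of the input and keep the most frequent known word. The tiny `WORD_COUNTS` table is a made-up stand-in for counts taken from a real corpus, and the function names are only illustrative, not a copy of the code at the link.

```python
from collections import Counter

# Toy word counts (assumption): a real corrector would count words in a large corpus.
WORD_COUNTS = Counter({"computer": 50, "compute": 20, "commuter": 5, "you": 100})

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return all strings one delete, transpose, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in LETTERS]
    inserts = [left + c + right for left, right in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def known(words):
    """Keep only the candidates that appear in the word-count table."""
    return {w for w in words if w in WORD_COUNTS}


def correct(word):
    """Pick the most frequent known word within two edits, else return the input."""
    candidates = (
        known([word])
        or known(edits1(word))
        or known(e2 for e1 in edits1(word) for e2 in edits1(e1))
        or {word}
    )
    return max(candidates, key=lambda w: WORD_COUNTS[w])


print(correct("compyouter"))  # computer
```

"compyouter" is two deletions away from "computer", so it is only found at the second edit level. For French you would need French word counts and accented letters in `LETTERS`.
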

Let me know.

Bye,
Michele Filannino.

CDT PhD student in Computer Science
Room IT301 - IT Building
The University of Manchester
filannim at cs.manchester.ac.uk

On Wed, May 9, 2012 at 2:19 PM, imad eddin Jerbi <jerbi.imad.eddin at gmail.com> wrote:

> *Dear Michele,*
>
> Thank you for your help in this matter. Could you please give me more
> information about how to write a spelling corrector and how it works?
> Thank you in advance.
> Have a nice day.
>
> *Best regards.*
>
> 2012/5/9 Michele Filannino <michele.filannino at cs.manchester.ac.uk>
>
>> If you want to solve that kind of problem, you could easily write a
>> spelling corrector using a language model that considers subparts of
>> each word. The pattern "you -> u" will emerge. Alternatively, if you have a
>> constrained vocabulary, you could use the Damerau-Levenshtein distance
>> <http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance> between
>> words.
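
A rough sketch of the constrained-vocabulary option, under some assumptions: it uses the optimal string alignment variant of Damerau-Levenshtein (adjacent transpositions count as one edit, but no substring is edited twice), and `VOCABULARY` is a hypothetical word list standing in for your real lexicon.

```python
def osa_distance(a, b):
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "ca" -> "ac"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


# Hypothetical constrained vocabulary (assumption, for illustration only).
VOCABULARY = ["computer", "commuter", "compiler", "telephone"]


def nearest_word(word, vocabulary=VOCABULARY):
    """Return the vocabulary entry with the smallest distance to word."""
    return min(vocabulary, key=lambda v: osa_distance(word, v))


print(nearest_word("compyouter"))  # computer
```

This compares the input against every vocabulary entry, which is fine for a small lexicon; a large one would probably need an index or a distance cut-off.
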
>>
>> Bye,
>> Michele Filannino.
>>
>> CDT PhD student in Computer Science
>> Room IT301 - IT Building
>> The University of Manchester
>> filannim at cs.manchester.ac.uk
>>
>>
>> On Wed, May 9, 2012 at 9:04 AM, Renaud Richardet <
>> renaud.richardet at epfl.ch> wrote:
>>
>>> Dear Imad,
>>>
>>> You can ask Nicolas Hernandez (see
>>> http://www.mail-archive.com/opennlp-users@incubator.apache.org/msg00564.html)
>>> for POS taggers in French.
>>>
>>> Regarding "compyouter", that might be more difficult to map…
>>>
>>> All the best, Renaud
>>>
>>>
>>> --
>>> Renaud Richardet
>>> Blue Brain Project  PhD candidate
>>> EPFL  Station 15
>>> CH-1015 Lausanne
>>>
>>>
>>> On Wed, May 9, 2012 at 4:35 AM, imad eddin Jerbi <
>>> jerbi.imad.eddin at gmail.com> wrote:
>>>
>>>> *Dear Corpora Subscribers,*
>>>>
>>>> My name is Imad Eddin Jerbi, and I am doing my master's thesis at the
>>>> Faculty of Economics and Management of Sfax, Tunisia.
>>>> I am working on the construction and morphosyntactic annotation of a
>>>> Tunisian dialect corpus.
>>>> I need a free and open-source (Java) part-of-speech tagging system for
>>>> French and English.
>>>> This system has to perform spelling correction first, because the input
>>>> could be a misspelled word.
>>>> *Example:*
>>>> Arabic dialect: “كَمْبْيُوتَرْ”. This word is originally English;
>>>> I converted it to Latin characters using SAMPA for Arabic: “compyouter”.
>>>> So the system has to correct the input word “compyouter” to “computer”
>>>> and then output the possible morphosyntactic annotations.
>>>> I would be very grateful if you could give me a list of the best
>>>> available systems.
>>>> Thank you in advance.
>>>> Email: jerbi.imad.eddin at gmail.com
>>>>
>>>> *Best regards, *
>>>>
>>>> --
>>>>
>>>> Imad Eddin JERBI
>>>>
>>>> Student at Faculty of Economics and Management of Sfax
>>>>
>>>> http://www.fsegs.rnu.tn/
>>>>
>>>>
>>>> ANLP Research Group
>>>> http://sites.google.com/site/anlprg
>>>>
>>>> MIRACL Laboratory
>>>> www.miracl.rnu.tn
>>>>
>>>>
>>>> Web page: https://sites.google.com/site/jerbiimadeddinanlp/
>>>> Email: jerbi.imad.eddin at gmail.com
>>>> Address: El Wahheb, Chebba : 5170 - Mahdia - TUNISIE.
>>>> Gsm:  +216 55688555
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>>>> Corpora mailing list
>>>> Corpora at uib.no
>>>> http://mailman.uib.no/listinfo/corpora
>>>>
>>>>
>>>