<html><head><style type="text/css"><!-- DIV {margin:0px;} --></style></head><body><div style="font-family:times new roman,new york,times,serif;font-size:12pt"><div>Hi,<br><br>There is no error now. I switched to the Latin-1 parameter file. Thank you very much.<br>(I had not noticed the Latin-1 parameter file before.)<br><br><br>Regards<br>Samir<br></div><div style="font-family: times new roman,new york,times,serif; font-size: 12pt;"><br><div style="font-family: arial,helvetica,sans-serif; font-size: 13px;"><font face="Tahoma" size="2"><hr size="1"><b><span style="font-weight: bold;">From:</span></b> Alberto Simões &lt;albie@alfarrabio.di.uminho.pt&gt;<br><b><span style="font-weight: bold;">To:</span></b> Samir Bilal &lt;samirbilal2@yahoo.fr&gt;<br><b><span style="font-weight: bold;">Cc:</span></b> corpora@uib.no<br><b><span style="font-weight: bold;">Sent:</span></b> Mon 27 December 2010, 23:17:04<br><b><span
style="font-weight: bold;">Subject:</span></b> Re: Re : [Corpora-List] Invalid UTF8 character encountered! with Treetagger french parameter file<br></font><br><br><br>On 27/12/2010 22:15, Samir Bilal wrote:<br>> Hi,<br>><br>> I opened the file with Notepad++; it detects ANSI encoding.<br><br>Then try the other parameter file available on the TreeTagger website <br>(the one without 'utf8' in its name).<br><br>Or force Notepad++ to save the file as UTF-8 (use Save As and, as a precaution, <br>save under a different name).<br><br>Cheers<br><br>><br>> Regards<br>><br>> ------------------------------------------------------------------------<br>> *From:* Alberto Simões <<a ymailto="mailto:albie@alfarrabio.di.uminho.pt" href="mailto:albie@alfarrabio.di.uminho.pt">albie@alfarrabio.di.uminho.pt</a>><br>> *To:* <a ymailto="mailto:corpora@uib.no" href="mailto:corpora@uib.no">corpora@uib.no</a><br>> *Sent:* Mon 27
December 2010, 22:53:12<br>> *Subject:* Re: [Corpora-List] Invalid UTF8 character encountered! with<br>> Treetagger french parameter file<br>><br>> A first suggestion would be to check whether your input file is in UTF-8<br>> encoding.<br>><br>> Try opening the text file in an editor like Notepad++ and check what<br>> encoding it detects.<br>><br>> cheers<br>><br>> On 27/12/2010 21:43, Samir Bilal wrote:<br>> > Hi,<br>> ><br>> > I am testing the current POS taggers for the French language. With<br>> > TreeTagger I get an error in some cases.<br>> > For a sentence with an accent (for example: "l' étiqueteur se bloque"), I<br>> > encounter this error:<br>> ><br>> > Invalid UTF8 character encountered!<br>> > because of the accented é.<br>> > But if the sentence has no accented characters, the tagger
works well.<br>> ><br>> > I use the French parameter file at<br>> ><br>> > <a href="ftp://ftp.ims.uni-stuttgart.de/pub/corpora/french-par-linux-3.2-utf8.bin.gz" target="_blank">ftp://ftp.ims.uni-stuttgart.de/pub/corpora/french-par-linux-3.2-utf8.bin.gz</a>.<br>> > My OS is Windows XP.<br>> ><br>> > Can anybody help me?<br>> ><br>> > Regards<br>> > Samir<br>> ><br>> ><br>> ><br>> > ------------------------------------------------------------------------<br>> > *From:* DJamé Seddah <<a ymailto="mailto:djame.seddah@free.fr" href="mailto:djame.seddah@free.fr">djame.seddah@free.fr</a> <mailto:<a ymailto="mailto:djame.seddah@free.fr" href="mailto:djame.seddah@free.fr">djame.seddah@free.fr</a>>><br>> > *To:* Samir Bilal <<a
ymailto="mailto:samirbilal2@yahoo.fr" href="mailto:samirbilal2@yahoo.fr">samirbilal2@yahoo.fr</a> <mailto:<a ymailto="mailto:samirbilal2@yahoo.fr" href="mailto:samirbilal2@yahoo.fr">samirbilal2@yahoo.fr</a>>><br>> > *Sent:* Sun 26 December 2010, 00:47:27<br>> > *Subject:* Re: Re : [Corpora-List] Looking for free french POS tagger.<br>> ><br>> > Hi, in that case I'd recommend using<br>> > Morfette, as it provides Windows binaries and pretrained models.<br>> ><br>> > Input format (Unix line separators):<br>> > one word per line,<br>> > one blank line to separate sentences,<br>> > and everything in UTF-8.<br>> ><br>> > Use this command:<br>> > c:\wherever\morfette predict MODELNAME &lt; input &gt; output.tagged<br>> ><br>> ><br>> > Djamé<br>>
><br>> ><br>> ><br>> > On 25 Dec 2010, at 23:43, Samir Bilal wrote:<br>> ><br>> > > Hi,<br>> > ><br>> > > Thank you very much. My operating system is Windows XP. I have not<br>> > > managed to run MeLT on it yet. Could you please help me?<br>> > > It would be wonderful if I could also use it from a Python program.<br>> > ><br>> > ><br>> > > Many thanks<br>> > > Samir<br>> > ><br>> > ><br>> > ><br>> > ><br>> > > ________________________________<br>> > > From: DJamé Seddah <<a ymailto="mailto:djame.seddah@free.fr" href="mailto:djame.seddah@free.fr">djame.seddah@free.fr</a><br>> <mailto:<a ymailto="mailto:djame.seddah@free.fr"
href="mailto:djame.seddah@free.fr">djame.seddah@free.fr</a>> <mailto:<a ymailto="mailto:djame.seddah@free.fr" href="mailto:djame.seddah@free.fr">djame.seddah@free.fr</a><br>> <mailto:<a ymailto="mailto:djame.seddah@free.fr" href="mailto:djame.seddah@free.fr">djame.seddah@free.fr</a>>>><br>> > > To: <a ymailto="mailto:corpora@uib.no" href="mailto:corpora@uib.no">corpora@uib.no</a> <mailto:<a ymailto="mailto:corpora@uib.no" href="mailto:corpora@uib.no">corpora@uib.no</a>> <mailto:<a ymailto="mailto:corpora@uib.no" href="mailto:corpora@uib.no">corpora@uib.no</a><br>> <mailto:<a ymailto="mailto:corpora@uib.no" href="mailto:corpora@uib.no">corpora@uib.no</a>>><br>> > > Sent: Sat 25 December 2010, 22:54:42<br>> > > Subject: Re: [Corpora-List] Looking for free french POS tagger.<br>> > ><br>> > > Hi,<br>> > > There are
also two state-of-the-art data-driven POS taggers available:<br>> > ><br>> > > MeLT<br>> > > <a href="https://gforge.inria.fr/frs/download.php/27240/melt-0.6.tar.gz" target="_blank">https://gforge.inria.fr/frs/download.php/27240/melt-0.6.tar.gz</a><br>> > > and<br>> > > Morfette (which also provides a data-driven lemmatizer)<br>> > > <a href="http://sites.google.com/site/morfetteweb/" target="_blank">http://sites.google.com/site/morfetteweb/</a><br>> > ><br>> > > Both provide models trained on the French Treebank, for the CC tagset<br>> > > (around 97.6-98% accuracy; the one to use for statistical parsing) and for a richer<br>> > > tagset (tagset max, around 92-94%).<br>> > ><br>> > ><br>> > > Best,<br>> >
><br>> > > Djamé<br>> > ><br>> > ><br>> > ><br>> > > On 25 Dec 2010, at 19:35, Samir Bilal wrote:<br>> > ><br>> > >> Hi everybody,<br>> > >><br>> > >> I am looking for a free French POS tagger.<br>> > >><br>> > >> Thank you<br>> > >> Samir<br>> > >><br>> > >><br>> > >> _______________________________________________<br>> > >> Corpora mailing list<br>> > >> <a ymailto="mailto:Corpora@uib.no" href="mailto:Corpora@uib.no">Corpora@uib.no</a> <mailto:<a ymailto="mailto:Corpora@uib.no" href="mailto:Corpora@uib.no">Corpora@uib.no</a>> <mailto:<a ymailto="mailto:Corpora@uib.no" href="mailto:Corpora@uib.no">Corpora@uib.no</a><br>> <mailto:<a
ymailto="mailto:Corpora@uib.no" href="mailto:Corpora@uib.no">Corpora@uib.no</a>>><br>> > >> <a href="http://mailman.uib.no/listinfo/corpora" target="_blank">http://mailman.uib.no/listinfo/corpora</a><br>> > ><br>> > ><br>> > > _______________________________________________<br>> > > Corpora mailing list<br>> > > <a ymailto="mailto:Corpora@uib.no" href="mailto:Corpora@uib.no">Corpora@uib.no</a> <mailto:<a ymailto="mailto:Corpora@uib.no" href="mailto:Corpora@uib.no">Corpora@uib.no</a>> <mailto:<a ymailto="mailto:Corpora@uib.no" href="mailto:Corpora@uib.no">Corpora@uib.no</a><br>> <mailto:<a ymailto="mailto:Corpora@uib.no" href="mailto:Corpora@uib.no">Corpora@uib.no</a>>><br>> > > <a href="http://mailman.uib.no/listinfo/corpora" target="_blank">http://mailman.uib.no/listinfo/corpora</a><br>> > ><br>>
> ><br>> > ><br>> ><br>> ><br>> ><br>> ><br>> > _______________________________________________<br>> > Corpora mailing list<br>> > <a ymailto="mailto:Corpora@uib.no" href="mailto:Corpora@uib.no">Corpora@uib.no</a> <mailto:<a ymailto="mailto:Corpora@uib.no" href="mailto:Corpora@uib.no">Corpora@uib.no</a>><br>> > <a href="http://mailman.uib.no/listinfo/corpora" target="_blank">http://mailman.uib.no/listinfo/corpora</a><br>><br>> --<br>> Alberto Simões<br>><br>> _______________________________________________<br>> Corpora mailing list<br>> <a ymailto="mailto:Corpora@uib.no" href="mailto:Corpora@uib.no">Corpora@uib.no</a> <mailto:<a ymailto="mailto:Corpora@uib.no" href="mailto:Corpora@uib.no">Corpora@uib.no</a>><br>> <a href="http://mailman.uib.no/listinfo/corpora"
target="_blank">http://mailman.uib.no/listinfo/corpora</a><br>><br><br>-- <br>Alberto Simões<br></div></div>
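<div><br>For anyone who hits the same "Invalid UTF8 character encountered!" message, the encoding check and conversion suggested in this thread can be roughly sketched in Python. This is only a sketch under assumptions: it treats what Notepad++ reports as "ANSI" as Windows-1252 (the usual Western European code page, which agrees with Latin-1 for French accented letters), and the function names are made up for illustration.<br><br></div>
<pre>
```python
def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(data, source_encoding="cp1252"):
    """Re-encode bytes as UTF-8; bytes that are already valid UTF-8 are returned unchanged.

    source_encoding is an assumption about the input (Windows "ANSI").
    """
    if is_valid_utf8(data):
        return data
    return data.decode(source_encoding).encode("utf-8")


# The sentence from the thread, encoded as Latin-1 as an "ANSI" editor would save it.
# "\u00e9" is the accented e.
sample = "l' \u00e9tiqueteur se bloque".encode("latin-1")
converted = to_utf8(sample)

print(is_valid_utf8(sample))     # the lone 0xE9 byte is not valid UTF-8
print(is_valid_utf8(converted))  # after conversion the bytes decode cleanly
```
</pre>
<div>To apply this to a file, read it in binary mode, pass the bytes through to_utf8, and write the result under a different name, in line with the precaution suggested above. The alternative that worked here was to leave the input as it is and use the parameter file without 'utf8' in its name.<br></div>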
</div><br>
</body></html>