[Corpora-List] To segment HTML document?
Chris Jordan
cjordan at cs.dal.ca
Tue Oct 25 11:35:44 UTC 2005
Hey Imen,
Sounds like you are writing a crawler in Java. If so, why reinvent the
wheel? There are plenty of open source crawlers lying around.
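
If a full crawler is more than you need, here is a rough sketch of both
pieces using only the standard Java library. Treat it as a starting point
under a few assumptions, not a finished tool: the class name PageSegmenter
and the methods fetch and segment are made up for illustration, the page is
assumed to be UTF-8, and "segmenting" is taken to mean cutting the page at
common block-level tags. The regular expressions are a crude stand-in for a
real HTML parser and will trip over malformed markup.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of any existing library.
public class PageSegmenter {

    // Download the raw HTML of a page given its URL.
    // Assumption: the page is encoded as UTF-8.
    static String fetch(String address) throws IOException {
        StringBuilder html = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                URI.create(address).toURL().openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                html.append(line).append('\n');
            }
        }
        return html.toString();
    }

    // Naive segmentation: cut the page at common block-level tags,
    // strip any remaining inline tags, and keep the non-empty pieces.
    static List<String> segment(String html) {
        // Drop script and style elements, whose content is not page text.
        String body = html.replaceAll("(?is)<(script|style)\\b.*?</\\1\\s*>", " ");
        String[] chunks = body.split(
            "(?i)</?(p|div|h[1-6]|li|ul|ol|table|tr|td|th|br|hr|body|html|head|title|blockquote|pre)\\b[^>]*>");
        List<String> segments = new ArrayList<>();
        for (String chunk : chunks) {
            String text = chunk.replaceAll("<[^>]*>", " ")
                               .replaceAll("\\s+", " ")
                               .trim();
            if (!text.isEmpty()) {
                segments.add(text);
            }
        }
        return segments;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PageSegmenter <url>");
            return;
        }
        // Print one segment per line.
        for (String s : segment(fetch(args[0]))) {
            System.out.println(s);
        }
    }
}
```

Run it as "java PageSegmenter <url>" with the address of the page you want.
For anything beyond a quick experiment, an existing crawler or HTML parser
will cope much better with redirects, character encodings, and broken markup.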
ismi.touati wrote:
> Dear all,
>
> Does anyone know of:
> - a program to segment HTML documents (web pages),
> - a Java command that can connect to a web page on the Internet
> given its URL?
>
> Thanks
>
> All the best
>
> Imen.
>
> //****************************//
> Imen Touati
> Master's student at the Faculty of Economic Sciences and Management of Sfax,
> Tunisia.
> LARIS laboratory
> Address: LARIS, FSEGS, BP 1088, 3018 Sfax, Tunisia
> Tel : (216) 74 27 87 77
> e-mail : ismi.touati at laposte.net