Corpora: HTML Concordancing?

John Milton lcjohn at ust.hk
Tue May 9 11:24:55 UTC 2000


I've been using a shareware program (HTMASC) to strip HTML tags before
concordancing; it can be downloaded from http://www.bitenbyte.com. If you're
looking for a concordancer that's wordlist-driven and includes text-to-speech
(good for students), try WordPilot, available at http://www.compulang.com/
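
For anyone who would rather script the tag-stripping step, here is a minimal
sketch in Python using only the standard library. It is not how HTMASC or
WordPilot work internally (I don't know their internals); the names
TextExtractor, strip_html and concordance are made up for illustration, and
the tokenizer is a crude assumption that only matches ASCII letters and
apostrophes.

```python
from html.parser import HTMLParser
import re


class TextExtractor(HTMLParser):
    """Collect text content, skipping the bodies of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode &amp; etc. to text
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def strip_html(html):
    """Return the visible text of an HTML string with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join fragments with a space so adjacent block elements do not fuse,
    # then collapse the resulting runs of whitespace. (Assumption: tags
    # rarely split a single word; a word split by inline markup would be
    # broken in two by this choice.)
    return " ".join(" ".join(parser.parts).split())


def concordance(text, keyword, width=3):
    """Return (left context, keyword, right context) for each hit."""
    tokens = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


if __name__ == "__main__":
    page = "<p>Speech <b>recognition</b> needs a lexicon.</p>"
    for left, kw, right in concordance(strip_html(page), "lexicon", width=2):
        print(f"{left:>20}  [{kw}]  {right}")
```

The output of strip_html can also be saved as plain text and fed to any
ordinary concordancer.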

.......................
John Milton
lcjohn at ust.hk

On Tue, 9 May 2000, Maritza vd Heuvel wrote:

> Hi
>
> Let me start off by introducing myself. I'm a postgrad researcher
> working on a lexicon for the speech recognition component of a
> spoken dialogue system. The electronic material available for use
> in corpora and for concordancing purposes is very limited and one
> of our options is using web sites containing relevant information to
> generate word lists. Does anyone know of a concordancing tool
> that allows concordancing of files that contain html tags without
> first requiring conversion of the html into a text format?
>
> Thanks!
> Maritza van den Heuvel
>
> *****************************************
>
> Maritza van den Heuvel
> Research Unit for Experimental Phonology (RUEPUS)
> University of Stellenbosch
>
> Tel: 021-808 3974
> Fax: 021-808 3975
>
>


