Corpora: Html Concordancing?

John Milton lcjohn at
Tue May 9 11:24:55 UTC 2000

I've been using a shareware program (HTMASC) to strip HTML tags before
concordancing: download from If you're looking
for a concordancer that's wordlist-driven and includes text to speech
(good for students), try WordPilot: download at
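
For anyone who would rather script the tag-stripping step, here is a minimal sketch in Python using only the standard library. It is not how HTMASC or WordPilot work internally; the names TextExtractor, strip_tags and concordance are made up for illustration, and the tokenizer is a deliberately crude assumption (ASCII letters and apostrophes only).

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text, skipping <script>/<style> bodies."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip = 0  # depth inside script/style elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def strip_tags(html):
    """Return the visible text of an HTML string with whitespace collapsed.

    Text chunks are joined with a space so that words in adjacent block
    elements do not run together; the cost is that a word split by an
    inline tag (e.g. <b>H</b>ello) comes out as two tokens.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()


def concordance(text, keyword, width=3):
    """Return (left context, keyword, right context) for each match."""
    tokens = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits
```

Calling strip_tags on a saved page and passing the result to concordance gives simple keyword-in-context lines; the same token list could also feed a word-frequency count for a lexicon.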

John Milton
lcjohn at

On Tue, 9 May 2000, Maritza vd Heuvel wrote:

> Hi
> Let me start off by introducing myself. I'm a postgrad researcher
> working on a lexicon for the speech recognition component of a
> spoken dialogue system. The electronic material available for use
> in corpora and for concordancing purposes is very limited and one
> of our options is using web sites containing relevant information to
> generate word lists. Does anyone know of a concordancing tool
> that allows concordancing of files that contain HTML tags without
> first requiring conversion of the HTML into a text format?
> Thanks!
> Maritza van den Heuvel
> *****************************************
> Maritza van den Heuvel
> Research Unit for Experimental Phonology (RUEPUS)
> University of Stellenbosch
> Tel: 021-808 3974
> Fax: 021-808 3975
