[Corpora-List] SCANNED TEXTS ARE VALID FOR CORPORA PURPOSES?

williams geoffrey.williams at wanadoo.fr
Fri Aug 1 10:26:55 UTC 2008


Dear J.L

If they are not OCRed, I fail to see how you could run a concordancer
on them, and such tools are the mainstay of corpus linguistics. In some
senses of the word they could be considered a 'corpus', that is, a
collection of texts, but in corpus linguistics a corpus needs to be
queryable with a concordancer.
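
To make the point concrete: a concordancer matches character data, so
it needs the text layer that OCR (or manual keying) provides, and a page
image offers nothing to match against. Below is a minimal
keyword-in-context sketch in Python. The function name `kwic`, the
`width` parameter, and the sample sentence are illustrative assumptions,
not the interface of any particular concordancing tool.

```python
import re


def kwic(text, keyword, width=30):
    """Return keyword-in-context lines for each whole-word match of keyword.

    `text` must be machine-readable character data; a scanned page image
    without an OCR text layer cannot be passed to a function like this.
    """
    pattern = r"\b" + re.escape(keyword) + r"\b"
    lines = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        # Take up to `width` characters on either side of the match.
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right:<{width}}")
    return lines


# Hypothetical sample text, for illustration only.
sample = (
    "A corpus is a collection of texts. "
    "In corpus linguistics, a corpus must be searchable."
)

for line in kwic(sample, "corpus", width=20):
    print(line)
```

Running a query like this over Gallica-style page images would first
require an OCR step to produce the string passed in as `text`.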

Best

Geoffrey


On Thursday 31 July 2008 at 13:38 -0700, J.L. DeLucca wrote:
> Dear friends,
> 
> In the digital world there are digital libraries such as Gallica, the
> digital library of the Bibliothèque nationale de France, which work
> with scanned texts without OCR treatment, and e-book projects, which
> work with full texts. I would like to know whether you would consider
> scanned texts without OCR treatment to be digital corpora, especially
> in the case of the oldest texts.
> 
> Thank you for your advice.
> 
> J.L. De Lucca
> 
> 
> 
> 
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora





