[Corpora-List] help on document comparison for historians

António Branco Antonio.Branco at di.fc.ul.pt
Mon Jun 6 10:56:39 UTC 2011


One more reply, from:

Sobha <sobhanair at yahoo.com>

"We have a similarity Analyser which can identify identical 
text/paragraph given two documents."
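
For anyone who wants to try the idea before picking a tool, here is a
minimal sketch of what "identify identical text given two documents" can
mean in practice. It is not Sobha's analyser or any of the tools listed
below. It assumes plain-text input, compares whitespace-separated words
with Python's standard difflib module, and uses a hypothetical helper
name (shared_passages) and an arbitrary threshold (min_words).

```python
from difflib import SequenceMatcher


def shared_passages(doc_a, doc_b, min_words=5):
    """Return word sequences of at least min_words that occur verbatim in both texts.

    Illustrative helper only; the name and the threshold are assumptions.
    """
    words_a = doc_a.split()
    words_b = doc_b.split()
    # autojunk=False stops difflib from ignoring very frequent words in long texts.
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    passages = []
    for block in matcher.get_matching_blocks():
        # The final block is a zero-length sentinel; the size check skips it.
        if block.size >= min_words:
            passages.append(" ".join(words_a[block.a:block.a + block.size]))
    return passages


if __name__ == "__main__":
    a = "In the year of our Lord the king granted the abbey its lands and rights forever"
    b = "The chronicle says the king granted the abbey its lands and rights in that year"
    for passage in shared_passages(a, b):
        print(passage)
```

Note that difflib aligns the two texts in order, so a passage that was
moved to a different position in the second document may be missed, and
the comparison is sensitive to case and spelling variation. The sequence
alignment work cited below addresses those cases.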





On 6/6/11 11:49 AM, António Branco wrote:
>
>
>
> Dear all,
>
>
> Following my previous message concerning the subject above,
> there were a number of replies from:
>
> Dina Demner Fushman <ddemner at mail.nih.gov>
>
> Eric Ringger <ringger at cs.byu.edu>
>
> Serge Heiden <slh at ens-lyon.fr>
>
> Paul D Clough <p.d.clough at sheffield.ac.uk>
>
> Tony McEnery <eiaamme at exchange.lancs.ac.uk>
>
>
> I would like to thank you all for your help.
> The compilation of your suggestions follows below.
>
> Best regards,
>
> António
>
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
> = Tools suggested
>
>
> - Dina Demner Fushman <ddemner at mail.nih.gov>
>
> eTBLAST
> http://etest.vbi.vt.edu/etblast3/
>
>
>
> - Eric Ringger <ringger at cs.byu.edu>
>
> If running on Windows, download WinMerge from SourceForge.
> Similar tools exist for Mac and Linux/Unix.
>
>
>
>
> = Publications suggested
>
>
> - Serge Heiden <slh at ens-lyon.fr>
>
> Russell Horton, Mark Olsen, and Glenn Roe, "Something Borrowed: Sequence
> Alignment and the Identification of Similar Passages in Large Text
> Collections", Digital Studies / Le Champ numérique (Forthcoming 2010).
>
> Russell Horton and Les Henderson, "Sequence Alignment and Similarity in
> Biology and the Humanities", Chicago Colloquium on Digital Humanities
> and Computer Science (DHCS), Northwestern University, November 2010.
>
> Mark Olsen, "From Words to Works: Machine Learning, Sequence Alignment
> and Text Mining at ARTFL", Computation Institute, University of Chicago,
> June 2010.
>
> Glenn Roe, "Encyclopedic Intertextuality: Identifying Intertextual
> Relationships in the Encyclopédie using Sequence Alignment", "Knowledge
> Production, Technology, and Cultural Change: Colloquium on the Digital
> Encyclopédie" - University of Minnesota, April 23-24, 2009.
>
> Russell Horton and Mark Olsen, "Sequence Alignment, Shared Services, and
> Digital Humanities", Project Bamboo Workshop, Tucson, Arizona, January
> 2009.
>
> All from http://artfl-project.uchicago.edu/content/papers-and-presentations
>
>
>
> - Paul D Clough <p.d.clough at sheffield.ac.uk>
>
> Matthew Spencer and Christopher J. Howe, "Collating Texts Using
> Progressive Multiple Alignment", Computers and the Humanities,
> Vol. 38, No. 3 (Aug. 2004), pp. 253-270.
> http://www.jstor.org/pss/30204940
>
> And this:
> http://opus.bibliothek.uni-wuerzburg.de/volltexte/2011/5660/pdf/Nassourou_DesignArchitectureCollationSystem.pdf
>
>
>
>
> - Tony McEnery <eiaamme at exchange.lancs.ac.uk>
>
> A couple of papers that you may find of interest looking at this very
> issue are listed below. The work was done using a tool developed by
> Scott Piao (based on work he was involved in at Sheffield):
>
> Hardie, A, McEnery, T, and Piao, S. (2010) "A corpus-based approach to
> text reuse in the newsbooks of the Commonwealth", in Dooley, B (ed.) The
> Dissemination of News and the Emergence of Contemporaneity in Early
> Modern Europe, Ashgate, Farnham, pp 251-286.
>
> Hardie, A and McEnery, T (2009) "Corpus linguistics and
> historical contexts: text reuse and the expression of bias in early
> modern English journalism", in R. Bowen, M. Mobärg and S. Ohlander
> (eds) Corpora and discourse - and stuff: papers in honour of Karin
> Aijmer, Gothenburg Studies in English 96, Acta Universitatis
> Gothoburgensis, Göteborg, pp. 59-92.
>
>
>
>
>
> On 01/06/2011 19:02, António Branco wrote:
>>
>>
>>
>> Dear all,
>>
>> A friend of mine is working on medieval history and would
>> like to find a (user-friendly) tool that could help her with
>> the following functionality: one enters different documents and
>> the tool will deliver the excerpts (possibly several paragraphs
>> long) that are identical across documents.
>>
>> Any hint or help will be most welcome. Please reply to me.
>> I'll post a summary.
>>
>> Kind regards,
>>
>> António Branco
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>> Corpora mailing list
>> Corpora at uib.no
>> http://mailman.uib.no/listinfo/corpora
>




