[Corpora-List] discussion on reproducibility at ACL 2011 business meeting

Mike Maxwell maxwell at umiacs.umd.edu
Mon Jul 4 02:45:35 UTC 2011


On 7/3/2011 9:46 PM, Ruvan Weerasinghe wrote:
> May be we can address some of the issues raised by talking to the
> Biology (Bioinformatics) people who seem to make publishing data and
> code a precondition for publication?

Speaking of code+data publication in other fields, should anyone want to 
track down some of this, here's an excerpt from a paper I have submitted 
(but which has not yet been accepted for publication).  It addresses 
publishing data and code from the perspective of Literate Programming, 
which of course is not the only way this could be done.  (And yes, Ted 
Pedersen's "Empiricism Is Not a Matter of Faith" gets mentioned too :-).)
-----------------
Literate Programming has been used to publish reproducible research in 
geophysics (Claerbout and Karrenbach 1992), bioinformatics (Gentleman 
2004, Hothorn 2011), epidemiology (Peng, Dominici and Zeger 2006), 
signal processing (Buckheit and Donoho 1995; Vandewalle, Kovačević, and 
Vetterli 2009), statistics (Leisch 2002, Donoho et al 2009, Lenth and 
Højsgaard 2011), econometrics (Koenker and Zeileis 2009), and other fields.
...
Some journals have... encourag[ed] authors to submit their articles as 
true reproducible research using Literate Programming: the Annals of 
Internal Medicine (Laine et al 2007); Biostatistics (Peng 2009); The 
Insight Journal (see http://www.insight-journal.org/); Computing in 
Science and Engineering (Fomel and Claerbout 2009); and IEEE 
Transactions on Signal Processing (see 
http://www.signalprocessingsociety.org/publications/periodicals/tsp/). 
The publisher Elsevier has sponsored the “Executable Paper Grand 
Challenge” at the International Conference on Computational Science 
in Singapore in June 2011. For linguistics, the new Journal of 
Experimental Linguistics 
(http://www.elanguage.net/journals/index.php/jel/index), part of the 
LSA’s eLanguage initiative, could begin to address these issues.
------------------

Here are the citations mentioned above (if I leave one out, ask and it 
shall be given):
------------------
Claerbout, Jon, and Martin Karrenbach. 1992. Electronic Documents Give 
Reproducible Research a New Meaning. In Proc. 62nd Ann. Int. Meeting of 
the Society of Exploration Geophysicists, 601–604. 
http://sepwww.stanford.edu/doku.php?id=sep:research:reproducible:seg92.

Gentleman, Robert, and Duncan Temple Lang. 2004. “Statistical Analyses 
and Reproducible Research.” Bioconductor Project Working Papers, Working 
Paper 2. http://www.bepress.com/bioconductor/paper2.

Hothorn, Torsten, and Friedrich Leisch. 2011. “Case studies in 
reproducibility.” Briefings in Bioinformatics. doi:10.1093/bib/bbq084. 
http://bib.oxfordjournals.org/content/early/2011/01/28/bib.bbq084.abstract.

Peng, Roger D, Francesca Dominici, and Scott L Zeger. 2006. 
“Reproducible Epidemiologic Research.” American Journal of Epidemiology 
163: 783–789. doi:10.1093/aje/kwj093.

Buckheit, Jonathan, and David L. Donoho. 1995. WaveLab and Reproducible 
Research. In , 55–81. Springer-Verlag. 
http://www-stat.stanford.edu/~wavelab/Wavelab_850/wavelab.pdf.

Vandewalle, Patrick, Jelena Kovačević, and Martin Vetterli. 2009. 
“Reproducible Research in Signal Processing—What, why, and how.” IEEE 
Signal Processing Magazine 26: 37–47. doi:10.1109/MSP.2009.932122.

Leisch, Friedrich. 2002. Sweave: Dynamic Generation of Statistical 
Reports Using Literate Data Analysis. In Compstat 2002 — Proceedings in 
Computational Statistics, ed. Wolfgang Härdle and Bernd Rönz, 575–580. 
Physica Verlag, Heidelberg. http://www.stat.uni-muenchen.de/~leisch/Sweave.

Donoho, David L., Arian Maleki, Inam Ur Rahman, Morteza Shahram, and 
Victoria Stodden. 2009. “Reproducible Research in Computational Harmonic 
Analysis.” Computing in Science and Engineering 11: 8–18. 
doi:10.1109/MCSE.2009.15.

Lenth, Russell, and Søren Højsgaard. 2011. “Reproducible statistical 
analysis with multiple languages.” Computational Statistics: 1–8. 
doi:10.1007/s00180-011-0245-5.

Koenker, Roger, and Achim Zeileis. 2009. “On reproducible econometric 
research.” Journal of Applied Econometrics 24: 833–847. 
doi:10.1002/jae.1083.

Laine, Christine, Steven N Goodman, Michael E Griswold, and Harold C 
Sox. 2007. “Reproducible Research: Moving toward Research the Public Can 
Really Trust.” Annals of Internal Medicine 146: 450–453.

Peng, Roger D. 2009. “Reproducible research and Biostatistics.” 
Biostatistics 10: 405–408. doi:10.1093/biostatistics/kxp014.

Fomel, S., and J. F Claerbout. 2009. “Guest Editors’ Introduction: 
Reproducible Research.” Computing in Science and Engineering 11: 5–7. 
doi:10.1109/MCSE.2009.14.
------------------
-- 
	Mike Maxwell
	maxwell at umiacs.umd.edu
	"My definition of an interesting universe is
	one that has the capacity to study itself."
         --Stephen Eastmond
