[Corpora-List] discussion on reproducibility at ACL 2011 business meeting

John F. Sowa sowa at bestweb.net
Mon Jul 4 04:00:06 UTC 2011


> Maybe we can address some of the issues raised by talking to the
> Biology (Bioinformatics) people who seem to make publishing data and
> code a precondition for publication?

Encouraging publication of an executable version is desirable.
But sometimes an article addresses some aspect of a system that
is too large to be published in its entirety.

For example, the IBM Watson system for Jeopardy is a large, complex
system that runs on a supercomputer.  Very few people would be able
to run the programs.  Furthermore, it's not clear what they could
learn from running them.  The following article describes
the overall design:

http://www.cccblog.org/2011/06/07/watsons-lead-developer-deep-analysis-speed-and-results/

I would like to read a collection of articles that describe each aspect
of the design, some of which would require more detail than others.
But I can't imagine what most readers would do with the software,
if they could download and run it.

They could try typing Jeopardy questions into it, but that wouldn't
give them much more insight than watching the game show.  And the
various components wouldn't give much insight if run separately.

I believe that people who submit a paper should be encouraged
to publish the programs, when it is practical to do so.  But for
many kinds of systems, readers can learn more from a well-written
article than they could by running the programs or plowing through
the source code.

The question of whether publishing the source code would be useful
should be decided on a case-by-case basis by the reviewers of a paper.

John Sowa



