Corpora: Peer evaluation of (web-based) corpora and other materials

Mark Davies mdavies at ilstu.edu
Wed Apr 5 22:19:48 UTC 2000


I hope this post is not too far off-topic, although it does pertain to the
type of (web-based) corpora that many of us are creating for use inside and
outside of academia. I also apologize for any cross-postings; I am sending
this to two other lists as well.

------------------

We are in the process of revising the promotion and evaluation procedures
in all of the departments at Illinois State University, and I have been
asked by a committee to get input from individuals at other institutions
concerning how non-peer-reviewed, web-based materials could/should be
evaluated by the institution.

Perhaps I can provide some concrete examples of the type of issues that the
committee is looking at.  In my case, I have created several online corpora
that have been used by researchers and students at other
institutions.  These include a "Polyglot Bible"
(http://mdavies.for.ilstu.edu/bible) that allows users to search for a word
in the entire Gospel of Luke in one of thirty languages and see all of the
hits, along with (most importantly) the parallel passages for other related
languages (e.g. Gothic, Old English, Icelandic, German, etc.), which allows
cross-linguistic comparison. (An expanded version of this is also
available for just Latin, Old Spanish, and Modern Spanish
(http://mdavies.for.ilstu.edu/bible/span3.htm), and includes nearly the
entire Bible.)

More important for the type of issues the committee is looking at, I have
created a searchable, web-based corpus of 3,000,000 words of historical
Spanish texts (1200s-1900s) (http://mdavies.for.ilstu.edu/corpus), and I
will soon start work on a web-based 100,000,000-word corpus of historical
Spanish, based in large part on other available electronic corpora, but
with enhanced search features and tied in with other linguistic tools (word
frequencies, dictionaries, bibliographical information, etc.).  In each
case, the materials have been used by many researchers and students at
other institutions.

In the evaluation of materials such as these, the committee wants to know
what the procedures and policies are at other institutions.  For example:

1a) In general terms, are materials that are not peer-reviewed at the
outset (but rather are simply something that a researcher has created and
put on the web, and that only later receive some type of external
validation) considered for promotion and evaluation?

1b) If so, at what level are they considered -- that of books, journal
articles, book reviews, or potentially any of these levels, depending on
the quality of the materials?

2a) Since they are not peer-reviewed at the outset, is the faculty member
expected to provide documentation to show how they have been used and
accepted by peers at other institutions?

2b) If so, what form would this documentation take -- logfiles showing the
number of hits, email from many different users, comments from a selected
set of peers, etc.?

3) Many of these materials would be used by both researchers _and_ students
at other institutions -- probably much more than a journal article, which
would be primarily used by other researchers.  Therefore, how can one avoid
"double-dipping", i.e. including these materials in both the "scholarly" and
the "teaching" categories, for those institutions that organize things
thusly?  In other words, would developers need to document and prove that
one group or the other (scholars / students) is the main user of the
resource?

I would very much appreciate your comments on any of these questions
(mdavies at ilstu.edu).  Although I will most likely just be summarizing the
responses for presentation to the committee, please feel free to indicate
if you would like your comments to be anonymous.

Thanks in advance for your help.

Mark Davies


=======================================
Mark Davies, Associate Professor, Spanish Linguistics
Dept. of Foreign Languages, Illinois State University
Normal, IL 61790-4300

Voice:309/438-7975      email:mdavies at ilstu.edu
Fax:309/438-8038          http://mdavies.for.ilstu.edu/personal/
=======================================


