[Corpora-List] POS-tagger maintenance and improvement

Andras Kornai andras at kornai.com
Thu Feb 26 20:50:18 UTC 2009


On Thu, Feb 26, 2009 at 09:33:45PM +0100, Francis Tyers wrote:
> It does not allow derivative works. So for example if I want to take the 
> corpus and add some fancy new markup to it, I could not redistribute it[1] under a
> free software licence (BSD, LGPL, GPL, ...) for others to benefit.
> 
> 1. For example put it in a public revision control system.

Yes, absolutely true, you can't redistribute LDC corpora. (I think we
actually retained the right to distribute Hunglish ourselves, but so
far have had no reason to exercise it.) However, to get back to the
main point, if you spotted errors and created diffs, a clearinghouse
could hold the diffs in CVS (this is your work, and is clearly de
minimis, so you can LGPL or BSD license it), making it trivial for
future users to pull them down and patch the corpus. A repository of
this sort makes good sense. If you (all) have patches you are willing
to contribute, drop me a line; maybe we will set something up after
all.

Andras Kornai

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


