[Corpora-List] How to create a corpus

Martin Reynaert reynaert at uvt.nl
Tue Oct 9 12:13:19 UTC 2012


On 10/09/2012 01:37 PM, Krishnamurthy, Ramesh wrote:
> I think you need to:
>
> a) consider copyright issues, if you intend to use the corpus for commercial purposes
>    
Hi Meganathan,

A corpus is only really useful if it can be shared, at least for 
research purposes, if not also commercial ones.

To be able to do that, Intellectual Property Rights (IPR) issues need to be 
settled.

How we went about that for SoNaR, the 500-million-word reference corpus of 
contemporary written Dutch, is detailed in:

De Clercq O. and Reynaert M., (2010), SoNaR Acquisition Manual 
<http://taalunieversum.org/taal/technologie/stevin/documenten/sonar_manual.pdf>, 
LT3 Technical Report LT3 10-02, Hogeschool Gent, Gent, Belgium

http://lt3.hogent.be/media/uploads/publications/2010/DeClercq2010a.pdf

Good luck!

Martin