[Corpora-List] corpus of mathematical equations

Jason Eisner jason at cs.jhu.edu
Thu Jan 10 16:25:41 UTC 2008


Here is a small corpus of automatically generated formal mathematical
proofs paired with their "verbalizations" into English (I believe):
   http://www.cs.cornell.edu/Info/Projects/NuPrl/html/nlp/

Also, you might be able to get a corpus of papers that contain TeX
equations, if the TeX markup language itself constitutes sufficient
markup for your purposes.  (It reveals the recursive subconstituents
of a formula, although it doesn't attach any semantics to them.  So
it's certainly a lot more informative than a scanned image of an
equation!)  For example, the digital library at arXiv.org used to ask
authors to submit their original TeX / LaTeX / AMSTeX files when
adding a paper.
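
To make the "recursive subconstituents" point concrete, here is a minimal sketch in Python of how the brace grouping in a TeX formula can be read off as a tree. The function names (tokenize, parse) and the token pattern are my own assumptions for illustration, not part of any TeX tool or of the arXiv submission process, and the sketch only handles {...} groups; it does not know that \frac takes two arguments or that ^ binds to what follows.

```python
import re

# Hypothetical token pattern: a control word (\frac), a control symbol (\{),
# a brace, or any other single non-space character.
TOKEN = re.compile(r"\\[A-Za-z]+|\\.|[{}]|[^\s{}\\]")


def tokenize(tex):
    """Split a TeX math string into a flat list of tokens."""
    return TOKEN.findall(tex)


def parse(tokens):
    """Turn a flat token list into nested lists, one list per {...} group."""
    stack = [[]]  # stack[-1] is the group currently being filled
    for tok in tokens:
        if tok == "{":
            stack.append([])  # open a new subconstituent
        elif tok == "}":
            if len(stack) == 1:
                raise ValueError("unbalanced '}'")
            group = stack.pop()
            stack[-1].append(group)  # attach the finished group to its parent
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced '{'")
    return stack[0]


tree = parse(tokenize(r"\frac{a+b}{c^{2}}"))
print(tree)
```

The nesting of the resulting lists mirrors the nesting of the formula's parts (numerator, denominator, exponent), but the tree carries no labels saying what those parts mean, which is the limitation noted above.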

-cheers, jason

On Jan 10, 2008 9:07 AM, Mary Hearne <mhearne at computing.dcu.ie> wrote:
> Hi all,
>
> on behalf of my colleague, Dónal Fitzpatrick:
>
> Do you know of any kind of corpus of mathematical equations where the constituent parts are tagged
> in any meaningful way?  I am uncertain as to:
> 1.  How the parts of an equation could be tagged
> or
> 2.  whether this has been done before.
>
> If you would like to contact him directly, Dónal's e-mail address is dfitzpat at computing.dcu.ie.
>
> Best regards,
> Mary Hearne
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>

