[Corpora-List] Simple instructions to scale a java application?

Ashish Almeida ashishfa at gmail.com
Mon May 23 05:46:12 UTC 2011


Hi Siddhartha,
You could use Hadoop MapReduce to solve your problem.

In MapReduce, your code becomes the map step, and you can use the default
(identity) reduce. It is easy to use.
If you want a quicker solution, do not use the Hadoop Java API; instead use
Python with Hadoop Streaming, which pipes data to your script through stdin
and stdout.

Please refer to this tutorial:
http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/
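
As a rough illustration of the streaming approach, here is a minimal mapper
sketch, assuming one input sentence per line. The function names
process_sentence and run_mapper are hypothetical, and the word count merely
stands in for the real NLP step (for example, a call out to your existing
Java program). It is not taken from the tutorial.

```python
#!/usr/bin/env python3
"""Sketch of a Hadoop Streaming mapper: one sentence per line on stdin."""
import sys


def process_sentence(sentence):
    # Hypothetical placeholder for the real NLP step. Here it only counts
    # whitespace-separated tokens so that the sketch is runnable.
    return str(len(sentence.split()))


def run_mapper(lines, out):
    # Hadoop Streaming treats each output line as "key<TAB>value".
    for line in lines:
        # Tabs inside the sentence would be mistaken for the separator.
        sentence = line.strip().replace("\t", " ")
        if not sentence:
            continue  # skip blank lines
        out.write(f"{sentence}\t{process_sentence(sentence)}\n")


# Only consume stdin when input is actually piped in, as Hadoop Streaming does.
if __name__ == "__main__" and not sys.stdin.isatty():
    run_mapper(sys.stdin, sys.stdout)
```

You can try it locally before touching a cluster with something like
"cat sentences.txt | python3 mapper.py" (both file names are placeholders).
With no reduce step, the output files of the individual map tasks are the
pieces you would concatenate at the end.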


> Date: Sat, 21 May 2011 15:14:15 -0700
> From: Siddhartha Jonnalagadda <sid.kgp at gmail.com>
> Subject: [Corpora-List] Simple instructions to scale a java application?
> To: corpora <corpora at uib.no>
>
> I have a single-threaded Java (NLP) application that processes 1000
> sentences in 1 hour. I obviously can't wait 1000 hours to process a
> million sentences. Are there any simple instructions to make my program run
> on 100 servers at a time? This involves migrating the project workspace
> into each of them (or creating them from a snapshot that contains it) and
> concatenating the output that each server produces.
>
> Any quick pointers, please? I spent a couple of hours browsing through the
> Amazon MapReduce documentation, but that didn't take me far...
>
> Since I don't own shares in Amazon, I am open to non-Amazon solutions too.
>
> Sincerely,
> Siddhartha Jonnalagadda,
> Text mining Researcher, Lnx Research, LLC, Orange, CA
> sjonnalagadda.wordpress.com


-- 
Ashish Almeida
---------------------------------