[Corpora-List] Interlanguage links in Wikipedia

Michele Filannino michele.filannino at cs.manchester.ac.uk
Sun Jun 17 16:58:26 UTC 2012


Hi Nasrin,

I am replying in the hope that I am not too late. If you just need to collect
the links to the other language versions of each Wikipedia article, I suggest
writing a simple script in your favourite programming language.

Starting from the HTML code of a page, you can easily find the DIV
element whose ID is "p-lang". You can then extract the href
attributes of all the A elements inside it.
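
The idea above could be sketched in Python roughly as follows. This is only
a minimal illustration using the standard library: the names
InterlanguageLinkParser and extract_interlanguage_links are made up for
this example, the sample HTML is invented, and it assumes the page still
wraps its language links in a DIV with ID "p-lang". Downloading the page is
left out.

```python
from html.parser import HTMLParser


class InterlanguageLinkParser(HTMLParser):
    """Collect href values of A elements nested inside <div id="p-lang">.

    Hypothetical helper written for this example, not part of any library.
    """

    def __init__(self):
        super().__init__()
        # DIV nesting depth relative to the target DIV; 0 means "outside it".
        self.depth = 0
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self.depth > 0:
                # A nested DIV inside the target: track it so we know
                # which closing tag ends the target DIV itself.
                self.depth += 1
            elif attrs.get("id") == "p-lang":
                self.depth = 1
        elif tag == "a" and self.depth > 0 and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1


def extract_interlanguage_links(html):
    """Return the href of every A element found inside the "p-lang" DIV."""
    parser = InterlanguageLinkParser()
    parser.feed(html)
    return parser.links


# Invented sample markup, assumed to resemble the relevant part of a page.
sample = """
<div id="content"><a href="/wiki/Other">Other</a></div>
<div id="p-lang">
  <h3>Languages</h3>
  <div class="body"><ul>
    <li><a href="//de.wikipedia.org/wiki/Korpus">Deutsch</a></li>
    <li><a href="//it.wikipedia.org/wiki/Corpus">Italiano</a></li>
  </ul></div>
</div>
<div id="footer"><a href="/about">About</a></div>
"""

print(extract_interlanguage_links(sample))
```

Links outside the "p-lang" DIV are ignored, so only the two language links
of the sample are collected. If the page markup differs from what is assumed
here, the ID check would need to be adapted.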

I hope this will be useful for you.

Best wishes,
Michele.

On Fri, Jun 15, 2012 at 7:49 AM, Nasrin Baratali
<nasrin.baratali at gmail.com> wrote:

> To whom it may concern,
>
> I want to access Wikipedia pages in different languages whose content is
> nearly or exactly equivalent. It seems that interlanguage links have enough
> information for me. However, I do not know how I could extract these links
> or the equivalent pages. I would appreciate it if anyone could help me.
>
> Regards,
>
> Nasrin Baratalipour
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>


-- 
Michele Filannino

CDT PhD student in Computer Science
Room IT301 - IT Building
The University of Manchester
filannim at cs.manchester.ac.uk