
I'm trying to get hold of the Wikidata inter-language links (sitelinks, see http://www.wikidata.org/wiki/Wikidata:Glossary#Sitelink ), i.e. a file listing every article together with the articles it is linked to in other languages.

It seems like this should exist, and such a file seems available in the Wikidata data dumps: http://dumps.wikimedia.org/wikidatawiki/

I would have thought the one I want is "...langlinks.sql.gz", which is described as "Wiki interlanguage link records." That file is only 32 KB, though, and the SQL it contains amounts to only a few thousand lines.

What am I not understanding? Is it possible to get a list of these links?

MHG

2 Answers


The official Wikidata dumps are still in a very early stage.

At the moment, you might find these processed files helpful, especially the one that ends with links.ttl.gz (currently http://semanticweb.org/RDF/Wikidata/turtle-20130801-links.ttl.gz), which is a Turtle file that provides the inter-language links extracted from the Wikidata dump.

The export scripts are available on GitHub, and you might also find the corresponding announcement on the mailing list interesting.

Please also have a look at the related question "How can I download the complete Wikidata database?"

Patrick Hoefler

To get the inter-language links (aka sitelinks) for a limited set of items, use the API, for instance for items Q1 and Q2:

https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q1|Q2&props=sitelinks%2Furls
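For scripted use, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the function names and the User-Agent string are made up for illustration, `format=json` is added so that the reply is machine-readable, and the parsing assumes the usual wbgetentities reply shape (an "entities" object keyed by item ID, each entry holding a "sitelinks" object keyed by site ID).

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.wikidata.org/w/api.php"


def build_query_url(ids):
    """Build a wbgetentities URL asking for the sitelinks of the given item IDs."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(ids),        # several IDs are separated by "|"
        "props": "sitelinks/urls",   # sitelinks including their full URLs
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def extract_sitelinks(response):
    """Map each item ID to a {site ID: page title} dictionary.

    Assumes the reply shape described above; items without
    sitelinks simply map to an empty dictionary.
    """
    result = {}
    for item_id, entity in response.get("entities", {}).items():
        sitelinks = entity.get("sitelinks", {})
        result[item_id] = {site: link["title"] for site, link in sitelinks.items()}
    return result


def fetch_sitelinks(ids):
    """Perform the HTTP request (needs network access) and parse the reply."""
    request = urllib.request.Request(
        build_query_url(ids),
        # Placeholder User-Agent; replace it with something identifying your script.
        headers={"User-Agent": "sitelinks-example/0.1"},
    )
    with urllib.request.urlopen(request) as reply:
        return extract_sitelinks(json.load(reply))
```

Calling `fetch_sitelinks(["Q1", "Q2"])` should then return one dictionary per item, mapping site IDs such as `enwiki` to the page title on that wiki.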

As answered on StackExchange, you can also query by site and page title instead of by item ID:

https://www.wikidata.org/w/api.php?action=wbgetentities&sites=enwiki&titles=Ore%20Mountains&languages=cs|de|es|fr|it|pl|pt|ru&props=sitelinks%2Furls
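A title-based lookup can be sketched the same way. The helper names below are hypothetical. The language filtering is done on the client side, so the sketch does not rely on any server-side filter, and it assumes that a Wikipedia site ID is the language code followed by "wiki" (true for the languages in the example, but not for every Wikipedia).

```python
import urllib.parse

API_URL = "https://www.wikidata.org/w/api.php"


def build_title_query_url(site, title):
    """Build a wbgetentities URL that looks an item up by site ID and page title."""
    params = {
        "action": "wbgetentities",
        "sites": site,               # e.g. "enwiki"
        "titles": title,             # e.g. "Ore Mountains"
        "props": "sitelinks/urls",
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def titles_in_languages(response, languages):
    """Return {language code: page title} for the requested languages.

    Assumes Wikipedia site IDs of the form "<language>wiki"; entities
    without a "sitelinks" object (for example a title that was not
    found) contribute nothing.
    """
    wanted = {lang + "wiki": lang for lang in languages}
    found = {}
    for entity in response.get("entities", {}).values():
        for site, link in entity.get("sitelinks", {}).items():
            if site in wanted:
                found[wanted[site]] = link["title"]
    return found
```

Given the parsed JSON reply for `build_title_query_url("enwiki", "Ore Mountains")`, `titles_in_languages(reply, ["cs", "de"])` would yield the Czech and German article titles, if those sitelinks exist.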

Jakob