
I am looking for the best method to map a given Wikipedia/Wikisource/Wikibooks/... URL to the corresponding Wikidata entity id (similar to this question but via API calls instead of a full data dump). The URL can be in many forms, e.g.

Methods I tried so far:

The most reliable (but nasty) method seems to be to emulate a browser: make an HTTP request to the URL, follow all redirects, and parse the resulting HTML page to get the Wikidata ID.

Jakob

1 Answer


As a general rule, Wikidata won't know about redirects - they're not part of the model there. They only record the canonical page title (leaving aside a few rare cases where the Wikidata sitelink is a redirect, which is deprecated but does occasionally happen). So any 'de-redirecting' will need to be done on the Wikipedia side.

If you use the local API with &redirects, you can get the Wikidata entity ID in a single query whether or not there's a redirect:

https://en.wikipedia.org/w/api.php?action=query&prop=pageprops&titles=Element_18&redirects

https://en.wikipedia.org/w/api.php?action=query&prop=pageprops&titles=Argon&redirects

Both responses contain "wikibase_item": "Q696".

You'll still have to process the URL a bit to get the API call, though.
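As a rough illustration of that URL processing, here is a minimal Python sketch. The helper names `build_pageprops_url` and `extract_wikibase_item` are made up for this example, and it assumes the URL uses either the `/wiki/Title` or the `/w/index.php?title=Title` form on a host that serves its API at `/w/api.php`. It adds `ppprop=wikibase_item` and `format=json` to the query shown above to keep the response small and machine-readable. Mobile hosts, short URLs and other variants are not handled.

```python
from urllib.parse import urlsplit, parse_qs, unquote, urlencode


def build_pageprops_url(page_url):
    """Turn a wiki page URL into a pageprops API URL (hypothetical helper)."""
    parts = urlsplit(page_url)
    query = parse_qs(parts.query)
    if "title" in query:
        # Form: /w/index.php?title=Argon (parse_qs already percent-decodes)
        title = query["title"][0]
    elif parts.path.startswith("/wiki/"):
        # Form: /wiki/Argon (decode percent-escapes such as K%C3%B6ln)
        title = unquote(parts.path[len("/wiki/"):])
    else:
        raise ValueError(f"Unrecognised wiki URL form: {page_url}")
    params = {
        "action": "query",
        "prop": "pageprops",
        "ppprop": "wikibase_item",  # only return the Wikidata ID
        "titles": title,
        "redirects": 1,             # resolve redirects on the wiki side
        "format": "json",
    }
    # Assumption: the API lives at /w/api.php on the same host.
    return f"https://{parts.netloc}/w/api.php?{urlencode(params)}"


def extract_wikibase_item(response):
    """Pull the entity ID out of a decoded JSON response, or return None."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        item = page.get("pageprops", {}).get("wikibase_item")
        if item:
            return item
    return None
```

Fetch the URL returned by `build_pageprops_url` with any HTTP client, decode the JSON body, and pass the resulting dict to `extract_wikibase_item`. A `None` result means the page is missing or has no linked Wikidata item.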

Andrew is gone