
I need to collect all Arabic Wikipedia articles about countries and continents. How do I find these articles once I have downloaded the latest wiki dump?


  • I wrote an answer about ar-wiki dumps, but you may have better luck using Wikidata. Maybe a Wikidata/SPARQL user can write an answer here, or consider formulating your question and asking here: https://www.wikidata.org/wiki/Wikidata:Request_a_query – philshem Jan 28 '20 at 09:42

1 Answer


Here's the main page for Wiki{p,m}edia dumps:

https://dumps.wikimedia.org/

and the index of dumps:

https://dumps.wikimedia.org/backup-index.html

and for Arabic Wikipedia you'd select "ar" and "wiki", for example:

https://dumps.wikimedia.org/arwiki/20200120/ (note the datestamp)

(if you wanted Arabic Wiktionary instead, it would be another dump: https://dumps.wikimedia.org/arwiktionary/20200120/)
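The URL pattern in the two examples above can be sketched as a small helper. The function name `dump_url` and its parameters are my own, and the pattern is only inferred from those two URLs, so check the result against the dump index before relying on it.

```python
def dump_url(lang: str, project: str, datestamp: str) -> str:
    """Build a dump directory URL such as .../arwiki/20200120/."""
    # Pattern inferred from the example URLs: <lang><project>/<datestamp>/
    return f"https://dumps.wikimedia.org/{lang}{project}/{datestamp}/"


print(dump_url("ar", "wiki", "20200120"))        # Arabic Wikipedia
print(dump_url("ar", "wiktionary", "20200120"))  # Arabic Wiktionary
```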

and the full dump of Arabic Wikipedia is here:

[screenshot of the dump page]


Read here about how to parse the dumps:

https://meta.wikimedia.org/wiki/Data_dumps
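As a rough starting point, here is a minimal sketch of streaming through a downloaded articles dump and keeping pages whose wikitext contains a given string, for example a category link. It assumes the dump is a bz2-compressed MediaWiki XML export with `page`, `title`, and `text` elements; the helper names are mine, not part of any Wikimedia tool. Plain text matching will miss pages that get their categories from templates, so treat it as a first pass only.

```python
import bz2
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def pages_containing(xml_file, marker):
    """Yield (title, wikitext) for every page whose wikitext contains marker."""
    title, text = None, ""
    # iterparse reads the file incrementally instead of loading it whole
    for _event, elem in ET.iterparse(xml_file, events=("end",)):
        name = local(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            if marker in text:
                yield title, text
            title, text = None, ""
            elem.clear()  # drop the finished page's contents to limit memory use


def print_matching_titles(dump_path, marker):
    """Print the titles of matching pages in a .xml.bz2 dump (path is an assumption)."""
    with bz2.open(dump_path, "rb") as f:
        for title, _text in pages_containing(f, marker):
            print(title)
```

You would call something like `print_matching_titles("arwiki-20200120-pages-articles.xml.bz2", marker)`, where the file name is an assumption based on the usual dump naming and `marker` is a category link you have looked up on Arabic Wikipedia (I believe the category prefix there is `[[تصنيف:`, but verify it on a real article).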

Or search through this forum for tips, for example:

How can I download the complete Wikidata database?

philshem