Wikipedia has pages containing lists of towns and cities with 100,000 or more inhabitants. I'd like this same data in a machine-readable format, with country and city names.
5 Answers
I have compiled a list of all the cities of the world, collated from the USGS GNIS database (US) and the NGA GNS Server (other countries) into a CSV file. I've made the list available for free as a public domain dataset.
Metadata
Column 1: ISO 3166-1 alpha-2 country code.
Column 2: US FIPS 5-2 1st level administrative division code (e.g., state/province).
Column 3: NGA GNS Feature Description (DSG) code.
Column 4: NGA GNS Unique Feature Identifier (UFI).
Column 5: ISO 639-1 alpha-2/3 code for language corresponding to the feature name.
Column 6: Language script (e.g., latin, arabic, chinese, etc) corresponding to the feature name.
Column 7: Feature name.
Column 8: Latitude coordinate of the area centroid.
Column 9: Longitude coordinate of the area centroid.
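For anyone who wants to load the file, here is a minimal Python sketch of reading it according to the column layout above. It assumes the CSV has no header row and is comma-delimited, which the metadata does not state. The column labels and the sample row are made up purely for illustration.

```python
import csv
import io

# Labels chosen here for illustration; the order follows the metadata above.
COLUMNS = [
    "country_code", "admin1_code", "feature_code", "ufi",
    "language_code", "script", "name", "latitude", "longitude",
]

def read_cities(handle):
    """Yield one dict per row, assuming a header-less, 9-column CSV."""
    for row in csv.reader(handle):
        if len(row) != len(COLUMNS):
            continue  # skip rows that do not match the expected layout
        record = dict(zip(COLUMNS, row))
        record["latitude"] = float(record["latitude"])
        record["longitude"] = float(record["longitude"])
        yield record

# Made-up sample row, only to show the shape of the data.
sample = io.StringIO("XX,01,PPL,12345,en,latin,Exampleville,12.5,-45.25\n")
cities = list(read_cities(sample))
print(cities[0]["name"], cities[0]["latitude"], cities[0]["longitude"])
```

To read the real file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`. The encoding is also an assumption.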
Freebase will have this data.
Here is an example query:
[{
"type": "/location/citytown",
"limit": 2,
"name": null,
"/location/statistical_region/population": [{
"number": null,
"number>": 100000
}],
"/location/location/geolocation": [{
"latitude": null,
"longitude": null
}]
}]
I'm limiting it to 2 results here; you can change that to get more.
In this case I'm also pulling in longitude and latitude, just to show that you can get a lot of stuff from freebase!
The population data is annoyingly broken down by year, so this is actually getting every city that has ever had more than 100k people living there, rather than ones that do now.
I'm not sure how to deal with that.
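One possible way to deal with the per-year breakdown is to filter on the client side: fetch every population entry for each city and keep only cities whose most recent entry is above the threshold. The Python sketch below assumes the query is changed to drop the `"number>"` constraint and to request `"year": null` alongside `"number"`, so that each entry carries its year. The `year` field name, the result shape, and the sample cities are assumptions for illustration, not verified against a live response.

```python
POPULATION_KEY = "/location/statistical_region/population"

def latest_population(entries):
    """Return the population number for the most recent year, or None."""
    dated = [e for e in entries
             if e.get("year") and e.get("number") is not None]
    if not dated:
        return None
    # Assumes years are ISO-style strings ("2010", "2010-04"), which sort
    # chronologically when compared as text.
    return max(dated, key=lambda e: e["year"])["number"]

def currently_large(results, threshold=100000):
    """Names of cities whose latest population entry exceeds the threshold."""
    names = []
    for city in results:
        population = latest_population(city.get(POPULATION_KEY, []))
        if population is not None and population > threshold:
            names.append(city["name"])
    return names

# Hypothetical results in the assumed shape, with made-up numbers.
sample = [
    {"name": "Growville", POPULATION_KEY: [
        {"year": "1990", "number": 80000},
        {"year": "2010", "number": 150000}]},
    {"name": "Shrinkton", POPULATION_KEY: [
        {"year": "1990", "number": 120000},
        {"year": "2010", "number": 90000}]},
]
print(currently_large(sample))
```

With the original query, Shrinkton would have matched because of its 1990 figure; here only its latest entry counts.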
Clicking on the link button on the top right gives you an option for an MQLRead Link - that is the endpoint you'd query normally.
This is the extent of my knowledge - hopefully it's enough to pique your interest though - Freebase looks like it could be absolutely incredible once you've got your head round how to query it.
By the way, if anyone knows how to improve this query, feel free to edit it. I'd also be very interested in having a quick informal chat with anyone who is very familiar with querying Freebase. – Rich Bradshaw Mar 11 '14 at 16:44
Your own link includes a reference to the UN-stats office (population density and urbanization). The link was dead but I was able to remove the final extension and then re-find it.
Population of capital cities and cities of 100 000 or more inhabitants: latest available year, 1993-2012 (Excel link)
I put together a quick SPARQL query to query DBpedia:
select *
where {
?place <http://dbpedia.org/ontology/populationTotal> ?population;
<http://dbpedia.org/property/name> ?name;
<http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat;
<http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long.
}
order by DESC(?population)
limit 20
offset 0
Alas, the data in DBpedia is dodgy. I've reported the issue to DBpedia.
If the data weren't broken, you could run this SPARQL query against the DBpedia endpoint to retrieve it, modifying the limit and offset to page through the results as you would with SQL.
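As a rough sketch of that paging, the Python snippet below builds the request URL for a given page by substituting the limit and offset into the query above. It only constructs URLs and makes no network call. The `page_url` helper and the page size are made up for illustration, and it assumes the endpoint at `https://dbpedia.org/sparql` accepts the query in a `query` parameter, as the SPARQL protocol describes.

```python
from urllib.parse import urlencode

ENDPOINT = "https://dbpedia.org/sparql"  # assumed public DBpedia endpoint

# Same query as above; doubled braces are literal braces for str.format.
QUERY_TEMPLATE = """select *
where {{
  ?place <http://dbpedia.org/ontology/populationTotal> ?population;
         <http://dbpedia.org/property/name> ?name;
         <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat;
         <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long.
}}
order by DESC(?population)
limit {limit}
offset {offset}
"""

def page_url(page, page_size=20):
    """Build the request URL for a zero-based page number (hypothetical helper)."""
    query = QUERY_TEMPLATE.format(limit=page_size, offset=page * page_size)
    return ENDPOINT + "?" + urlencode({"query": query})

# First three pages: offsets 0, 20 and 40.
for page in range(3):
    print(page_url(page))
```

Each URL can then be fetched with any HTTP client. Sending an `Accept: application/sparql-results+json` header should ask the endpoint for JSON results.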
Wikidata may have this data in the near future. It'd be nice if there were also a public SPARQL endpoint...