27

Is there a free database, in CSV, XML, or some other format, of all cities (or at least the 20-50 biggest) for each country in the world?

Patrick Hoefler
Incerteza

5 Answers

23

My open data project (I am a co-founder) has a free list of all the cities in the world, along with their area centroid (lat/lng), as a CSV file. It is compiled from the USGS/GNIS (US) and NGA/GNS (non-US) databases.

http://www.opengeocode.org/download.php#cities

As an alternative source, the United Nations Statistics Division publishes an annual yearbook of world statistics. Table 8 lists the population of cities with more than 100,000 inhabitants:

http://unstats.un.org/unsd/demographic/products/dyb/dyb2012/Table08.xls

We have a version of it converted to our Linked CSV format/vocabulary:

http://www.opengeocode.org/cude1.1/UN/UNSD/dyb2012-pop100k.zip

METADATA (dyb2012-pop2k)

  1. (Empty)
  2. ISO 3166-1 alpha-2 country code (e.g., US => United States)
  3. National Geospatial-Intelligence Agency (NGA) GEOnet Names Server (GNS) Feature Code (e.g., P = Populated Place Type Feature)
  4. NGA/GNS Feature Designation Code (e.g. PPL = Populated Place (incorporated))
  5. Extended Feature Description (e.g., city, capital)
  6. Total Area in Square Kilometers
  7. ISO 639-1 language code for language that name field is in (e.g., lc = local language native to the country)
  8. Language Script for name fields (e.g., latin, arabic, chinese)
  9. Short Name (Gazetteer) for City
  10. Year of Population Statistics
  11. Total Population (e.g., within city proper)
  12. Urban Population (e.g., within agglomerated area of city)
  13. Total Male Population
  14. Total Female Population
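The column layout above can be used to pull the largest cities per country. The following is a minimal sketch, not part of the dataset's tooling: it assumes the file is plain comma-separated text with the 14 fields in the order listed, and that every city has a row whose language code is `lc`. The function name `top_cities` and the two `XX` rows are made up for illustration; the `NZ` rows are the ones quoted in the comments below.

```python
import csv
from collections import defaultdict

# 0-based positions of the fields in the metadata list (field 1 is empty).
COUNTRY, LANG, NAME, TOTAL_POP = 1, 6, 8, 10

def top_cities(lines, n=20, lang="lc"):
    """Return {country_code: [(city, population), ...]}, largest first."""
    best = defaultdict(dict)
    for row in csv.reader(lines):
        # Skip short rows and rows in other languages (avoids duplicates).
        if len(row) <= TOTAL_POP or row[LANG] != lang:
            continue
        try:
            pop = int(row[TOTAL_POP])
        except ValueError:
            continue  # blank or non-numeric population
        cities = best[row[COUNTRY]]
        # If a city appears more than once, keep its largest figure.
        if pop > cities.get(row[NAME], -1):
            cities[row[NAME]] = pop
    return {
        cc: sorted(cities.items(), key=lambda kv: kv[1], reverse=True)[:n]
        for cc, cities in best.items()
    }

# The XX rows are hypothetical and exist only to show the ranking.
SAMPLE = """\
,NZ,P,PPL,city,4938,lc,latin,Auckland,2012,1507600,,740900,766700
,NZ,P,PPL,city,4938,en,english,Auckland,2012,1507600,,740900,766700,
,XX,P,PPL,city,,lc,latin,Smallville,2012,150000,,,
,XX,P,PPL,city,,lc,latin,Bigtown,2012,900000,,,
"""

for country, cities in top_cities(SAMPLE.splitlines(), n=1).items():
    print(country, cities)
```

To run it on the real download, pass an open file object instead of `SAMPLE.splitlines()`; the name of the CSV inside the zip archive is not stated here, so check it after unpacking.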
Andrew - OpenGeoCode
  • How do I get the top X cities for each country? – Incerteza Oct 18 '14 at 17:58
  • @alexander - I would use the UN dataset I listed above to identify cities per country > 100K. You can use the ISO 3166-1 alpha-2 code in the UN and my datasets to aggregate records for the same country if you need to. – Andrew - OpenGeoCode Oct 18 '14 at 21:06
  • I downloaded http://www.opengeocode.org/cude1.1/UN/UNSD/dyb2012-pop100k.zip but I can't get along with it, could you explain how to do that? For instance, this ,EG,P,PPL,city,,lc,latin,Aswan,2010,543,396,277,732,265,664, doesn't make sense to me. – Incerteza Oct 19 '14 at 03:49
  • @alexander - I added the metadata to my answer. The line in question is sadly a bad data line. The dataset was generated using an older version of our ETL and did not detect commas in numbers. I regenerated/uploaded the dataset using our newer parser and verified that these lines are correct. I would download the updated version. – Andrew - OpenGeoCode Oct 19 '14 at 16:08
  • Which field actually holds the name of a city? If it's field 9, then where is the normal (not short) name? – Incerteza Oct 25 '14 at 12:09
  • For example, why do these two rows have the same city name: ,NZ,P,PPL,city,4938,lc,latin,Auckland,2012,1507600,,740900,766700 and ,NZ,P,PPL,city,4938,en,english,Auckland,2012,1507600,,740900,766700,? – Incerteza Oct 25 '14 at 12:11
  • @AlexanderSupertramp - The UN dataset only has the short names of cities. I believe when you say normal name you mean the formal name. Generally, the formal name is something like 'City of Auckland', 'Township of ..'. The difference between line 1 and 2 is the 'lc,latin' and 'en,english'. The first line says the name is spelled in the local language using latin (romanized) script. The second line is the UN English version of the name spelled in English character script. Since New Zealand is English speaking, both are the same. – Andrew - OpenGeoCode Oct 25 '14 at 15:40
  • Andrew, what happened to opengeocode? I get "account suspended" message when I try to download the file. I am interested in the dataset you point to and describe. Is it still available somewhere? – marfi Feb 19 '17 at 09:53
  • @Andrew-OpenGeoCode - See comment from marfi above – Martin Hügi May 22 '17 at 09:38
  • Still "account suspended". Is the file published elsewhere? – Per Lundberg Jan 18 '18 at 20:21
11

Consider geonames, which probably has the largest collection of place names anywhere (excluding street names, which is the purview of openstreetmap.org):

http://download.geonames.org/export/dump/

From the directory above, you can download a list of "large" cities (or every single placename geonames knows about), and "readme.txt" in that directory explains further.
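As a rough illustration of how such a dump could be reduced to the biggest places per country, here is a minimal sketch. It assumes the tab-separated layout described in "readme.txt", with the name in the 2nd field, the ISO country code in the 9th and the population in the 15th; verify those positions against the readme before relying on them. The names `largest_places` and `fake_row`, and the sample rows, are made up for the example and are not geonames data.

```python
import csv
import heapq
from collections import defaultdict

# 0-based field positions, assumed from the layout in readme.txt.
NAME, COUNTRY, POPULATION = 1, 8, 14

def largest_places(lines, n=20):
    """Group tab-separated rows by country and keep the n most populous."""
    by_country = defaultdict(list)
    # QUOTE_NONE: the dump is plain tab-delimited text, quotes are literal.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) <= POPULATION or not row[POPULATION].isdigit():
            continue  # malformed row or missing population
        by_country[row[COUNTRY]].append((int(row[POPULATION]), row[NAME]))
    return {cc: heapq.nlargest(n, places) for cc, places in by_country.items()}

def fake_row(name, country, population):
    """Build a made-up 19-field row for demonstration only."""
    fields = [""] * 19
    fields[NAME], fields[COUNTRY], fields[POPULATION] = name, country, str(population)
    return "\t".join(fields)

rows = [
    fake_row("Alpha", "XX", 500),
    fake_row("Beta", "XX", 900),
    fake_row("Gamma", "YY", 70),
]
print(largest_places(rows, n=1))
```

For real data, pass a file opened with `encoding="utf-8"` and `newline=""` in place of `rows`; which of the city files in the directory to unpack depends on the population cutoff you want.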

6

UN/LOCODE includes over 103,034 locations in 249 countries and installations in international waters. It is used by most major shipping companies, by freight forwarders and in the manufacturing industry around the world. It is also applied by national governments and in trade related activities, such as statistics where it is used by the European Union, by the UPU for certain postal services, etc.

http://www.unece.org/cefact/locode/welcome.html

1

You can query the GlobalWeather API at WebserviceX.net, specifically the GetCitiesByCountry call. You would have to input a list of country names, but these are easily obtainable.

0

The highest-quality dataset that I've seen is Natural Earth. It has about 7,000 cities in total, so you would need to filter it down if you don't want that many.

https://www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-populated-places/

jaksco