2

I am interested in understanding which US airports are used by various airlines. One could visualize this with a spreadsheet:

enter image description here ​ ​

This examples shows that Southwest flies out of Midway, but not O'Hare. Delta flies out of O'Hare, but not Midway. Both fly out of Baltimore.

I'm wondering if there is a clever way to tease out a matrix from openflights.org.

A spreadsheet would be preferable although, I suspect that if the data is out there, it is stored in a RDBMS.

Jeanne Holm
  • 4,447
  • 1
  • 18
  • 40
gatorback
  • 121
  • 3
  • 1
    the 'teasing' is actually just programming, so I'm marking this to close because it's actually a programming question. More details below – philshem Jul 30 '16 at 07:12
  • i'm confused because gatorback is asking for a spreadsheet. i'm assuming it would take programming to make said spreadsheet? – albert Jul 31 '16 at 19:16
  • The spreadsheet is merely a visualization tool to present data examples. Other useful visualization \ analytical tools include MATLAB. That data is really the focus and the encoding of the data (CSV, .mat, etc). This is not a programming question of how to visualize the data. I am indifferent to how data is stored / encoded, though standard encoding simplifies loading data into spreadsheets and other analytical tools – gatorback Aug 01 '16 at 05:38

1 Answers1

3

The data you are looking for is already at openflights.org/data. The reason it's not displayed as a matrix (pivot table, non-normalized, crosstab) is that there are 9'541 listed airports and 19'845 listed airlines and creating a sparse matrix is neither good for storage space, performance, user-queries, etc. One reason is that you've used a piece of data, the airport name, as a column name. So you can't easily make a dynamic filter.

Anyway, to create a matrix you'll need to know some programming, or a spreadsheet tool like Excel.

First, download the airlines.dat file. Each airline is given a 3-letter code (key).

Then, download the routes.dat file (2 MB) and use the airlines.dat 3-letter code to join the airline name to the airline code.

Then you need to de-normalize the data, which is a fancy way to make a pivot. Each SQL-flavor, excel, and programming languages can do this in their own way.


In my opinion, because the data is easily found and with a friendly license, I'm going to suggest closing the question. I'd recommend starting with a programming or Excel solution and then ask in the appropriate forum if you need support.

philshem
  • 17,647
  • 7
  • 68
  • 170