Most Popular

1500 questions
174
votes
14 answers

Is there a global database of all products with EAN 13 barcodes?

EAN 13 is an international system. Is there an API or database that contains all items that have these barcodes, such as all the food and goods you can buy in a regular convenience store? Is there a global open database for this?
bogen
  • 1,913
  • 4
  • 13
  • 10
158
votes
29 answers

A database of open databases?

While there are many open databases available, is there a database or project that catalogs information about such databases? In other words, is there an open meta-database of open databases?
user139
114
votes
21 answers

Is there a Git for data?

I know you can put CSV or XML in Git already, but there are issues for those wishing to collaborate on creating a dataset, cleaning it up, handling pull requests, etc. Are there alternative version control systems that suit data better? The key…
D Read
  • 2,361
  • 2
  • 16
  • 22
93
votes
21 answers

How can I work with a 4GB csv file?

What's the best way to access a 4GB csv file? I would like to get a 'cut' of this open data set: Full Replacement Monthly NPI File, available here. Specifically, I want only the rows for hospitals, though one might want the rows for healthcare…
user1453
  • 949
  • 1
  • 7
  • 3
76
votes
5 answers

Open API for nutritional information and/or food barcodes?

I've seen many applications (mobile and web) that use a database of nutritional information and barcodes to track daily food consumption. Smartphones have the ability to scan barcodes, and many mobile applications have started to include a barcode…
CaesiumFarmer
  • 2,088
  • 1
  • 14
  • 26
53
votes
6 answers

A Python guide for open data file formats

If you are an open data researcher you will need to handle a lot of different file formats from datasets. Sadly, most of the time, you don’t have the opportunity to choose which file format is the best for your project, but you have to comport with…
Tasos
  • 4,714
  • 3
  • 20
  • 43
51
votes
16 answers

Multinational list of popular first names and surnames?

Is there a database containing a list of the most popular first names and surnames (with occurrence counts, or at least sorted by popularity) for many nations/countries? I need such data for generating a sample customer database. Customers…
user139
46
votes
9 answers

Benefits of using CC0 over CC-BY for data

I've heard a few times that for data one should use the CC0 (Creative Commons - Public Domain Dedication) license rather than CC-BY (Creative Commons - Attribution). What is the reason for that? (Granted, one loses the requirement of attribution.)
Piotr Migdal
  • 824
  • 6
  • 14
43
votes
8 answers

Should I approach an agency unofficially before FOIAing them?

Are there drawbacks to FOIAing (Freedom of Information Act, either federally or equivalent state/local laws) an agency? For example, can it lead to an adversarial relationship?
Ben Sheldon
  • 1,278
  • 9
  • 18
41
votes
7 answers

Cryptocurrency historical prices

I'm looking for cryptocurrency historical data, including prices and market cap (either from exchanges or average price) of the main cryptocurrencies, namely: Bitcoin, Ripple, Litecoin, Ethereum, Dash. So far I've only been able to find this source…
wacax
  • 1,042
  • 1
  • 8
  • 23
40
votes
6 answers

Let's suppose I have potentially interesting data. How to distribute?

This is a very simple question: Suppose that I have some sort of specialized data, perhaps that I've collected myself or been a part of the collection. And suppose that nothing prevents me from handing this data out to people. In what method should…
davidlowryduda
  • 501
  • 3
  • 7
39
votes
7 answers

Are there any good libraries available for doing normalization of company names?

I've run into a number of use cases where I need to normalize company names in a database before running automated and manual matching. We've usually ended up writing a specific script with endless subtleties for each application, but I'm curious if…
skyebend
  • 903
  • 8
  • 11
38
votes
11 answers

Good tools to parse repetitive unstructured data

I'm looking to parse a large number of lines of repetitive but unstructured data. This is a task that happens at least once every project, in my experience, so I'm looking for a tool to transform fairly standard text into structured data. Right now…
RCA
  • 943
  • 1
  • 7
  • 14
38
votes
10 answers

Downloadable archive of weather conditions for Europe?

Is there a downloadable archive of weather conditions in Europe, including temperature, humidity, and precipitation by location? Such a database would make it possible, for example, to draw a chart of average temperatures in February or May in Berlin Or…
user139
34
votes
10 answers

Open API for SEC data?

Is there any free API for programmatically grabbing SEC filing data, such as company financials or insider trading? It seems ironic that the EDGAR search gives you information in a nice tabular form, but there isn't any obvious way to get the raw…
BrenBarn
  • 809
  • 1
  • 7
  • 16