Most Popular

1500 questions
33
votes
3 answers

Bulk download Sci-Hub papers

I wonder whether it is possible to bulk download all papers stored in Sci-Hub. I am aware of the questions "Is there a more user-friendly way to download multiple articles from arXiv?" and "Bulk download of arXiv (or other publication data set) with…
Franck Dernoncourt
33
votes
12 answers

Is there a list of all US Government agencies and sub agencies and is it available via API?

Specifically, I'm looking for: agency canonical names; agency abbreviations/acronyms; hierarchy of agencies (e.g., the Census Bureau is part of the Department of Commerce); agency logos; website/social media accounts; HQ address/contact info.
Dmitry Kachaev
32
votes
4 answers

Open downloadable recipe database?

I'm interested in doing some analysis of recipes for fun. Ideally, I would like to obtain open recipe database(s) behind {foodily, allrecipes, recipes, bigoven, cooking, cooks}.com or something like that. I am interested in databases which have…
respectPotentialEnergy
30
votes
4 answers

List of public holidays by countries?

When working across country borders, it helps to know in advance whether people in a specific country will be on a public holiday on a given day. While some calendar applications offer to import this information, is there an open list of…
Auberon Vacher
30
votes
8 answers

How can I download the complete Wikidata database?

Wikidata is a new Wikimedia project: It centralizes access to and management of structured data, such as interwiki references and statistical information. This data would be of enormous interest to the Open Data community. Does anyone know if it…
Patrick Hoefler
30
votes
19 answers

Are there any open datasets for soccer statistics?

I'm mainly interested in soccer-related statistics. There are quite a few different APIs relating to soccer, but most of them are commercial and far, far out of my price range. I've looked at DBpedia, but a lot of their data is quite out of date.
Luke Hansford
29
votes
12 answers

What are the most common ways that users find out about new data sets?

I am interested in how typical open data users, such as journalists, researchers, companies, developers, and others, find out about new open data sets today. For example, how do sources like search engines, re-distributors of open data, tech media,…
Sophie Raseman
29
votes
12 answers

Metadata standards and best practices for data dictionaries for CSV files/data

We publish most of our machine-readable open data in CSV format. What are best practices and/or standards to publish data dictionaries (e.g. definitions of columns in CSV files including human-readable names, data types, possible values and their…
Dmitry Kachaev
27
votes
3 answers

Are there any regular Open Data conferences?

I'm interested in any Open Data conferences which are held on a regular basis (e.g., yearly). Are there any such conferences?
Sicco
27
votes
7 answers

Evidence for the economic impact of open data?

Open Data is discussed in many contexts, ranging from transparency and government accountability to its economic benefits. Are there up-to-date studies and publications analyzing the real and expected value for the various economic sectors…
giohappy
27
votes
5 answers

A list of cities of each country

Is there a free database, in CSV, XML, or some other format, of all (or at least the 20-50 biggest) cities for each country in the world?
Incerteza
27
votes
9 answers

Free database or API of all North American businesses

I'm looking for a free/open source downloadable database/API of all the businesses in North America. The sort of data that I'm looking for includes: name, address, industry, email address, website, ... The license should preferably be compatible…
TabithaVas
26
votes
7 answers

Twitter open datasets

Where can I get Twitter datasets available for analysis? I found two: the May 2011 Calufa Twitter Scrape, and the Cheng-Caverlee-Lee September 2009 - January 2010 Twitter Scrape.
Anton Tarasenko
26
votes
7 answers

Extracting tables from multiple PDFs

What's the best practice for extracting tables from a large number of PDFs, which may be formatted differently? For example, I have a series of PDFs like this one, and I would like to extract the tables and save them in a more machine-readable format…
Andreas Blaesus
26
votes
6 answers

Sources of weather data

Barry, could you abuse this site's "answer your own question" feature to create a community wiki answer for sources of weather data (both current and historical), since it gets asked so often?
user3856