I have CSV and RDF files that I would like to upload somewhere so that everyone can download them.

What is the most appropriate website for this?
I need at least:

  1. Easy upload: select the license, enter some info, and upload the file, with no need to wait for anyone's approval
  2. Ability to replace the file with a new version without the URL changing (just upload a file to overwrite it; online line-by-line editing features are NOT needed)
  3. Accepts most text-based structured formats: CSV, RDF, XML, TXT
  4. List of the files, and download button for each
  5. Direct HTTP link so that the data can be downloaded by a script
  6. Anyone can download without any registration
  7. File download statistics are visible to everybody

Bonus for format-aware features:

  • Preview of the data (e.g. first 3 lines if CSV, first 3 triples if RDF)
  • Data browser
  • For RDF, generate hyperlinks to other linked data
  • Re-uploading the file does not reset the statistics
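
To illustrate requirement 5 and the CSV preview bonus, here is a minimal sketch of the kind of script that would consume a direct link. The URL is a placeholder, and `download_text` and `preview_csv` are hypothetical helper names for illustration, not part of any hosting site's API.

```python
import csv
import io
import urllib.request

# Placeholder (assumption): stands in for the stable direct link a host would provide.
DATA_URL = "https://example.org/data/latest.csv"


def download_text(url, encoding="utf-8"):
    """Fetch a text file over plain HTTP(S), with no login or API key."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode(encoding)


def preview_csv(text, n=3):
    """Return the header row and up to n data rows of a CSV document."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])  # empty list if the file has no rows at all
    rows = [row for _, row in zip(range(n), reader)]
    return header, rows
```

Calling `preview_csv(download_text(DATA_URL))` would give the header plus the first 3 rows, provided the URL stays the same after each re-upload.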

Tried but failed:

  • Dropbox does not show download statistics
  • SourceForge is for source code, and I am not sure how well they tolerate large amounts of pure data
Patrick Hoefler
Nicolas Raoul
  • Quick list of links for you which I'll try and turn into an answer in the future: http://opendata.socrata.com, http://datahub.io/, http://www.quandl.com/, and http://github.com – Mark Silverberg Aug 29 '14 at 12:43
  • related : http://opendata.stackexchange.com/q/980/263 – Joe Sep 02 '14 at 00:56
  • Also, is this a static release (i.e., won't be modified after publication), or is this something that you will be updating in the future? GitHub actually links to Zenodo if you want to fix it to a specific version / edition. – Joe Sep 02 '14 at 00:58
  • I also ran across Google Fusion Tables. It doesn't provide the download stats, but will host your files.

    https://sites.google.com/site/fusiontablestalks/stories

    – Sun Sep 03 '14 at 19:08
  • @Joe: We generate a new version every 2 weeks. It could be considered an update, but it is actually more like overwriting. – Nicolas Raoul Sep 10 '14 at 06:43
  • @NicolasRaoul : as the blob of data to be stored/served is changing, most archives would consider it to not be static -- even if individual data points aren't changing and you're just appending to the end, as a whole it's changing. – Joe Sep 10 '14 at 13:36
  • @Joe: Thanks for your insight! It is actually data extracted from Wikivoyage. Hotels/restaurants/attractions get added/updated all the time, and we just extract. There is zero update of data that has been extracted; it is just overwritten at the next extraction. – Nicolas Raoul Sep 11 '14 at 02:34
  • @NicolasRaoul : ah ... so you don't need tools at the repository for updating the data ... but the data is being updated in bulk, so many of the sites that are intended for archiving likely wouldn't work for you, unless they had provisions for replacement editions. – Joe Sep 11 '14 at 03:01

2 Answers

9

My comment received a number of upvotes, which I take as a signal of quality, so I am posting the links here to make them more visible to future visitors:

  • opendata.socrata.com - you can upload a number of different file types here, create visualizations, link to them, and take advantage of a very mature set of APIs for data consumption and publishing
  • datahub.io - like opendata.socrata.com, datahub.io is an open data portal, but it runs CKAN, which is free and open source, unlike Socrata. CKAN is modular and you can write plugins for it, but in my experience, out of the box, Socrata has better visualization and mapping tools
  • github.com - You mention SourceForge, and I would recommend taking a look at GitHub instead. You will have limited analytics compared to the first two sites, which are built specifically for what you are looking for, but GitHub has a lot of advantages, like the ability for folks to fork your repository, and it now renders CSV and GeoJSON documents nicely. Also take a look at this blog post by OKFN (the folks behind CKAN) for some suggestions on how to use Git (and GitHub) for data
  • quandl.com and modeanalytics.com - these sites seem promising, but I have not had a chance to work with them as much, and they do not appear to be as feature-complete or widely adopted
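
As a rough illustration of what scripted access to a CKAN-based portal could look like, here is a minimal sketch built on CKAN's `package_show` action. It assumes the portal exposes the standard CKAN action API under `/api/3/action/`; the portal URL and dataset id passed in are placeholders you would supply, and the function names are hypothetical helpers, not part of CKAN itself.

```python
import json
import urllib.parse
import urllib.request


def package_show_url(portal, dataset_id):
    """Build the CKAN action-API URL that describes one dataset."""
    query = urllib.parse.urlencode({"id": dataset_id})
    return f"{portal.rstrip('/')}/api/3/action/package_show?{query}"


def resource_links(package_show_response):
    """Extract (format, url) pairs from a parsed package_show response."""
    if not package_show_response.get("success"):
        return []
    resources = package_show_response.get("result", {}).get("resources", [])
    # Skip resources that carry no download URL.
    return [(r.get("format", ""), r["url"]) for r in resources if r.get("url")]


def fetch_resource_links(portal, dataset_id):
    """Query the portal (network access needed) and list its file links."""
    with urllib.request.urlopen(package_show_url(portal, dataset_id)) as resp:
        return resource_links(json.load(resp))
```

The parsing is kept separate from the network call so it can be checked against a saved response before pointing it at a live portal.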
Mark Silverberg
3

Knoema is a free-to-use public and open data platform for users interested in statistics and data analysis, visual storytelling, infographics, and data-driven presentations.

sergey