
I was looking for a dataset of links between web pages and found these options:

  • Common Crawl: Public web crawling data. It includes "the HTTP headers returned and the links (including the type of link) listed on the page." AFAIK, it's the most comprehensive and up-to-date source.
  • SNAP: Small samples of the web graph (up to 7M edges), including Google's version.
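
For context on what I mean by "links between web pages": a minimal sketch of reading a SNAP-style edge list into an adjacency map. It assumes the plain-text layout of one `source target` pair per line, with `#` marking comment lines. The function name `parse_edge_list` and the sample rows are made up for illustration and are not taken from any real file.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def parse_edge_list(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Build an adjacency map from 'source<whitespace>target' lines."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#' comment/header lines.
        if not line or line.startswith("#"):
            continue
        source, target = line.split()[:2]
        graph[source].add(target)
    return dict(graph)


# Hypothetical sample rows, not real data.
sample = [
    "# FromNodeId\tToNodeId",
    "0\t1",
    "0\t2",
    "1\t2",
]
graph = parse_edge_list(sample)
edge_count = sum(len(targets) for targets in graph.values())
```

For a real file, the same function can be fed an open file handle instead of the sample list, since it only iterates over lines.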

Are you aware of other sources of similar data? I'd prefer comprehensive sources with broad coverage over frequently updated ones.

Anton Tarasenko
  • Is there anything unusable with the sources you link to, or are you just looking for a bigger list? – philshem Apr 29 '15 at 09:45

0 Answers