I was looking for a dataset of links between web pages and found these options:
- Common Crawl: Public web crawling data. It includes "the HTTP headers returned and the links (including the type of link) listed on the page." AFAIK, it's the most comprehensive and up-to-date source.
- SNAP: Small samples of the web graph (up to 7M edges), including Google's version.
Are you aware of other sources of similar data? I'd prefer comprehensive sources with broad coverage over frequently updated ones.