9

I want to know about recently available datasets for fake news analysis

4 Answers

5

BuzzFeed News has been working on this and has published data on fake news, news patterns, and social media patterns on its GitHub: https://github.com/BuzzFeedNews/everything. It might be a good repo to browse.

wirefire
2

Here are some of the datasets available for fake news detection:

LIAR dataset: https://www.cs.ucsb.edu/william/data/liar_dataset.zip

BS Detector: https://github.com/bs-detector/bs-detector

Scorpio
1

You should check out the Observatory on Social Media (OSoMe) at Indiana University. The team has been archiving 10% of public activity on Twitter for the last 10 years. The data isn't directly available to people not affiliated with the University, but they have a number of algorithms and visualization tools that you can run against the data.

  • They have a service called 'BotSlayer' which you can set up yourself on a free AWS instance and track certain hashtags and key phrases.
  • There is also 'Botometer', which will assess any Twitter username and score it based on how 'bot-like' it is.
  • Finally, they have a tool called 'Hoaxy' which allows you to visualize the spread of a news or fake-news story across Twitter to see which accounts are sharing/re-tweeting it.
Matt
0

Kaggle hosts a dataset whose CSV has an id, title, author, text, and a label marking each article as "reliable" or "unreliable":

https://www.kaggle.com/c/fake-news/data

id: unique id for a news article

title: the title of a news article

author: author of the news article

text: the text of the article; could be incomplete

label: a label that marks the article as potentially unreliable

1: unreliable

0: reliable

Accessing the data requires registration.
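
As a rough illustration of working with the columns listed above, here is a minimal sketch that tallies articles per label using only the Python standard library. The function name `count_labels` and the sample rows are made up for this example; the column names and the 1/0 meaning are the ones described in this answer.

```python
import csv
import io
from collections import Counter

# Label meanings as listed in this answer.
LABEL_NAMES = {"1": "unreliable", "0": "reliable"}


def count_labels(csv_file):
    """Count articles per label.

    `csv_file` is any open text file (or file-like object) whose header
    row contains at least a `label` column.
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        # Map "1"/"0" to readable names; keep anything unexpected as-is.
        label = row["label"].strip()
        counts[LABEL_NAMES.get(label, label)] += 1
    return counts


# Tiny made-up sample standing in for the real download (hypothetical rows).
sample = io.StringIO(
    "id,title,author,text,label\n"
    "0,Some headline,Jane Doe,Body of the first article,1\n"
    "1,Another headline,John Roe,Body of the second article,0\n"
    "2,Third headline,,Body with no listed author,1\n"
)
print(count_labels(sample))
```

For the real data you would pass an opened file instead, for example `open("train.csv", newline="", encoding="utf-8")`, where the file name is an assumption about what the download is called. If some article texts are very long, the `csv` module's field size limit may need raising with `csv.field_size_limit`.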

philshem