Computers

MIT tackles fake news with machine learning system that classifies accuracy and bias

A new machine learning system has been developed to classify entire news sources on both factual accuracy and political bias
A new machine learning system has been developed to classify entire news sources on both factual accuracy and political bias
View 2 Images
An example of how the system tracks linguistic features in a given article to evaulate accuracy and bias in a source
1/2
An example of how the system tracks linguistic features in a given article to evaulate accuracy and bias in a source
A new machine learning system has been developed to classify entire news sources on both factual accuracy and political bias
2/2
A new machine learning system has been developed to classify entire news sources on both factual accuracy and political bias

Fake news is a problem. This is inarguable. However, exactly how to identify and deal with it is a challenge with no really good solution. Google, Facebook and other tech giants are struggling with exactly how to moderate the spread of both outright misinformation, and extremely biased news reporting. Individual fact-checking and moderating of specific articles is an insanely time-consuming process, and one still frustratingly subject to the bias of individual moderators. So, while Facebook, for example, currently has thousands of human moderators working to keep tabs on the content shooting around its platform, it is clear that some kind of technological resource will play an increasingly significant role in the solution.

A team of researchers from MIT's Computer Science and Artificial Intelligence Lab and the Qatar Computing Research Institute has set out to develop a new machine learning system designed to evaluate not only individual articles, but entire news sources. The system is programmed to classify news sources for general accuracy and political bias.

"If a website has published fake news before, there's a good chance they'll do it again," says Ramy Baly, explaining why the team targeted the veracity of entire sources instead of individual stories. "By automatically scraping data about these sites, the hope is that our system can help figure out which ones are likely to do it in the first place."

The first step in developing the system was feeding it data from a source called Media Bias/Fact Check (MBFC). This independent, non-partisan resource classifies news sources based on political bias and accuracy. On top of this dataset, the system was trained to classify the bias and accuracy of a source based on five features: textual, syntactic and semantic article analysis; its Wikipedia page; Twitter account; URL structure; and web traffic.

Perhaps the most comprehensive of these analytical tools is the system's ability to examine individual articles using 141 different values to classify accuracy and bias. These include linguistic features such as hyperbolic or subjective language, and lexicon classifiers that signify bias or complexity. A source's Wikipedia page is also leveraged for signs related to bias, while the lack of a Wikipedia page could suggest a reduction in credibility.

An example of how the system tracks linguistic features in a given article to evaulate accuracy and bias in a source
An example of how the system tracks linguistic features in a given article to evaulate accuracy and bias in a source

A news source's factual accuracy is ultimately classified using a three point scale (Low, Mixed or High), while political bias is measured on a seven point scale (Extreme-Left, Left, Center-Left, Center, Center-Right, Right, and Extreme Right).

Once the system was up and running it only needed 150 articles from a given source to be able to reliably detect accuracy and bias. At this stage the researchers report it is 65 percent accurate in detecting factual accuracy and 70 percent accurate at detecting political bias, compared to similar human-checked data from MBFC.

It's suggested that despite potential improvements in predictive accuracy that are sure to be developed by improving the algorithm, it is essentially designed to work in tandem with human fact-checking sources.

"If outlets report differently on a particular topic, a site like Politifact could instantly look at our fake news scores for those outlets to determine how much validity to give to different perspectives," explains Preslav Nakov, co-author on the new research.

One outcome the researchers hope to generate from this system is a new app that offers users accurate news stories spanning a broad political spectrum. In being able to effectively separate factual accuracy from political bias, the app could hypothetically present an individual user with articles on a given story that may be from a different political angle than they are commonly exposed to.

"It's interesting to think about new ways to present the news to people," says Nakov. "Tools like this could help people give a bit more thought to issues and explore other perspectives that they might not have otherwise considered."

Absolute trust in the algorithm is not really the ultimate goal here. After all, as the recent bias in facial recognition controversy illustrated, an algorithm will essentially reflect the bias of the humans that create it. So, a machine learning system such as this one is by no means a magic bullet to the problem of fake or biased news, but it certainly presents us with a valuable tool to help manage the ongoing issue.

The new study is available on the preprint journal website Arxiv.

Source: MIT

6 comments
fb36
IMHO, the main problem causing "extremely biased news reporting" is that, biased/exaggerated (so more interesting/exciting) news reporting always gets more clicks/viewers/readers/listeners, and that brings more advertising revenue! So any media outlet not doing it finds itself in a big disadvantage against any/all other media outlets! And that means, mass media can never fix itself, even if it really wanted, IMHO! Only realistic solution, IMHO, would be to make a national/global law that makes separation of news/facts and personal opinion mandatory for all (mass) media outlets!
JustJim
Return to teaching critical thinking and how to learn in schools. We really don't want or need anyone helping us review the output of those "reporters" and opinion generators. I don't want the government protecting me from "wrong speak". While we are at it, get rid of the entire attitude of "politically correct".
otto17
Skynet** is brought to reality. ...and when news reports gravitate toward "truth" as they should when reporting is motivated by the search for truth and NOT "more advertising revenue" as fb36 previously pointed out, then will this new "fake-news-machine-learning" "bias detector" decide for us that truthful news, because it has become more prevalent, must be biased and must be quashed? The opinions of a well-educated public will be biased toward the truth and so those opinions will be suppressed in favor of falsehoods by this artificial intelligence so that reporting is "not biased"? So how do you qualify to be one of the programmers who will determine what the truth detector is going to be today, the next day? Will the truth/bias detector rewrite history when it later learns IT was wrong? (note the clever wordplay IT, InfoTech, vs. it, pronoun for the incorrect historical reporting decisions made). Just how is someone who is curious about the truth going to research it? I see this as another university government grant scam. Just go back to building cars that know better how to drive than I do. ** (Skynet [see Terminator] is a fictional neural net-based conscious group mind and artificial general intelligence )
Kpar
JustJim is right on the money. Critical thinking and logic have not been taught in (government) schools for decades. Turning over to an algorithm the responsibility for determining truth or falsity is the ultimate surrender. The state will have won (along with those self-nominated 'truth detectors", no doubt aligned with the state).
Anne Ominous
This is just ludicrous. It isn't possible to do this. One person's "bias" is another person's research.
toyhouse
It's curious how so many people are now suddenly worried about the accuracy of other people's sources of information. And they almost never question their own sources. In the past, that was left up to the individual. But it's all being done in the name of eliminating fake news. Prediction; if a.i. filtering or labeling takes-off, it will be the new standard employed by governments and industry giants. One can decide for themselves whether or not that's a good thing. Apparently, the masses can't be trusted to decide for themselves without someone else stepping in to help them, (a machine a.i. is a ruse),. And many folks, will welcome that scenario.