Truth-Seeking in the Post-Truth Era: Tutorial at EMNLP 2020
In the post-truth era, how can we use NLP to tackle fact-checking and fake news?
One of the tutorials at EMNLP 2020 was “Fact-Checking, Fake News, Propaganda, and Media Bias: Truth Seeking in the Post-Truth Era.” I attended it because I had done some previous research in this area, and I found it fascinating. Here are the main points I took away, which I think you’ll find interesting too.
Fake News is a Problem. Fake news has become a major issue for several compounding reasons: the public has lost confidence in traditional media, levels of critical thinking and news literacy are low, and malicious actors have built business models around financial or political gain. Studies have even shown that false news spreads roughly six times faster than real news and can go viral within minutes. Thus, it’s important to be able to identify fake news quickly and accurately.
Information Incorrectness. When classifying the information within fake news articles, there are three primary types, which vary in factual accuracy and intent to harm.
- Misinformation: unintentional mistakes; the content is factually incorrect, but it is shared without malicious intent. Examples include incorrect image captions or taking a satirical article (like this one from The Onion) seriously.
- Disinformation: fabricated or manipulated content that is factually incorrect and created with malicious intent. Examples include manipulated graphs or incorrect statistics.
- Malinformation: private information (which may or may not be factually accurate) published with malicious intent. Examples include the publication of revenge porn or someone’s voting history.
AI for Assistance in Fact-Checking. Now that we have a basic understanding of the problem, how can we tackle it? There is always the path of manual fact-checking, such as when news organizations fact-check presidential candidates during a debate. At the opposite extreme is fully automatic fact-checking, which relies on artificial intelligence to verify claims and analyze news articles on its own. While automatic fact-checking is considered by many to be the long-term goal, people currently don’t trust machines to fact-check accurately and prefer humans to do the job. Thus, the middle ground is to use technology to assist human fact-checkers. There are two primary ways of doing this:
- Identify a claim’s check-worthiness. In other words, use machines to rank how much each claim merits checking, so that humans can spend their time on the actual fact-checking. But how can machines judge what’s important and what isn’t? Current research uses four questions as a rubric for a claim’s check-worthiness:
(1) Does the statement contain a verifiable factual claim? If a claim is impossible to verify, there’s no point in attempting to fact-check it.
(2) Is the claim likely to be false? If there’s only a tiny probability that it’s false, time is better spent on claims that are more likely to be false.
(3) Is the claim of interest to the general public? We shouldn’t spend time checking trivial claims like a tweet saying “today I called my mom.”
(4) Does the claim have the potential to harm someone? If it isn’t dangerous, time is better spent checking more harmful claims.
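To make the rubric concrete, here is a minimal sketch of how the four questions might be combined into a single priority score. This is my own toy formulation, not something from the tutorial: in a real system each input would come from a trained classifier, whereas here they are hand-set illustrative probabilities, and the averaging scheme is an arbitrary choice.

```python
def check_worthiness(verifiable: float,
                     likely_false: float,
                     public_interest: float,
                     potential_harm: float) -> float:
    """Return a check-worthiness priority score in [0, 1].

    Each argument is the estimated probability that the answer to the
    corresponding rubric question is "yes" (hypothetical values here).
    """
    # Rubric question (1) acts as a gate: an unverifiable statement
    # gets zero priority no matter how interesting it looks.
    if verifiable < 0.5:
        return 0.0
    # Simple average of the remaining signals; a real system would
    # learn these weights from annotated data.
    return (likely_false + public_interest + potential_harm) / 3

# A dubious viral claim scores high...
high = check_worthiness(verifiable=0.9, likely_false=0.8,
                        public_interest=0.9, potential_harm=0.7)
# ...while "today I called my mom" scores zero (nothing to verify).
low = check_worthiness(verifiable=0.1, likely_false=0.0,
                       public_interest=0.0, potential_harm=0.0)
```

Note how question (1) is treated differently from the other three: it filters rather than weights, since fact-checking an unverifiable statement is wasted effort by definition.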
- Ensure that facts aren’t verified twice. Another crucial area where artificial intelligence can assist humans is in identifying when someone repeats a claim that has already been checked. By flagging such repeats, the AI ensures that time isn’t wasted verifying the same claim twice, since we want to maximize the number of distinct claims being checked. This is a complex problem that could be tackled with methods ranging from simple cosine similarity between claims to deep neural networks that capture similarity in underlying intent and implications between claims.
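As a toy illustration of the simple end of that spectrum, here is a bag-of-words cosine-similarity matcher in plain Python. No real system would stop at word overlap, and the 0.6 threshold is an arbitrary choice for the sketch:

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two claims, in [0, 1]."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def already_checked(new_claim, checked_claims, threshold=0.6):
    """Return the best-matching previously checked claim, or None."""
    best = max(checked_claims,
               key=lambda c: cosine_similarity(new_claim, c),
               default=None)
    if best is not None and cosine_similarity(new_claim, best) >= threshold:
        return best
    return None

checked = ["the earth is flat", "vaccines cause autism"]
match = already_checked("scientists say the earth is flat", checked)
print(match)  # prints "the earth is flat"
```

Word-overlap matching like this misses paraphrases (“the world is not round” shares no words with “the earth is flat”), which is exactly why the tutorial points toward neural models that compare intent and implications rather than surface tokens.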
Automated Fact-Checking. If, eventually, we decide that an automatic fact-checking AI is acceptable, how can we implement it? There are two main approaches:
- Evidence-based fact-checking. The obvious method is to cross-reference a claim against a database of known facts to verify whether it is true. This is accurate and highly explainable, but it comes at significant cost. Not only does it assume that the claim is checkable at all (e.g., what if a world leader declared that “the entire universe was created last Thursday”?), but it also requires a huge database of evidence, since the AI needs coverage of many different domains to verify an acceptable proportion of claims.
- Contextual fake-news detection. The alternative is to use the context of a claim to infer whether it’s true. For example, if the claim was made by The Onion, we can probably assume it isn’t. Or, if a claim has been denied by multiple large media outlets like CNN and Fox News, it can be presumed false. Of course, an AI could look much deeper than these surface signals to maximize the accuracy of its inferences. This method is less accurate and less explainable, but it removes the costly requirement of explicit evidence.
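As a toy sketch of the contextual idea, the snippet below scores a claim purely from metadata: a prior credibility for the source, discounted for each denial by a major outlet. All the priors, the source names, and the odds-halving rule are invented for illustration; a real detector would learn such signals from data.

```python
# Hypothetical prior probability that a source's claims are intended
# as factual reporting (all values made up for this sketch).
SOURCE_PRIOR = {
    "the-onion": 0.01,     # satire site: claims are jokes by design
    "reuters": 0.95,
    "unknown-blog": 0.40,
}

def contextual_score(source: str, denials_by_major_outlets: int) -> float:
    """Estimate P(claim is true) from context alone -- no evidence lookup."""
    prior = SOURCE_PRIOR.get(source, 0.5)  # unknown sources get 0.5
    # Each denial by a large outlet halves the score
    # (an arbitrary discount chosen for the sketch).
    return prior * (0.5 ** denials_by_major_outlets)

print(contextual_score("the-onion", 0))    # prints 0.01
print(contextual_score("unknown-blog", 2))  # prints 0.1
```

The sketch makes the trade-off from the bullet above visible: the score never touches the content of the claim, so no evidence database is needed, but a reliable source repeating a falsehood would sail through unchecked.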
Concluding Thoughts. Fake news is a challenging and extremely important problem of our time, yet it arguably does not get enough attention from the public. It is exciting, however, to see how AI in general, and NLP in particular, is attempting to help tackle it. Hopefully, with enough time and research, we can find ways to quickly and accurately identify fake news so that we don’t continue to suffer the growing consequences of being misinformed.
Truth-Seeking in the Post-Truth Era: Tutorial at EMNLP 2020 was originally published in Towards Data Science on Medium.