How AI Accelerates the Fight Against Fake News
Microchips in coronavirus vaccines. Pedophile rings in pizza restaurants. Jewish space lasers. The Internet–bless its heart–has always suffered from its share of wacky wingnuts and conspiracy theories. But in the wake of the November 3 presidential election and the January 6 Capitol riot, social media platforms and governments are stepping up their efforts to crack down on the most problematic content, and AI plays a leading role.
The concern is not academic. The January 6 Capitol riot is a painful example of how misinformation spread online can drive citizens to real-world violence. Weeks after the riot, Facebook and Twitter took action by banning large numbers of people and groups they identified as sources or propagators of misinformation.
That wasn’t fast enough for Kristen Clarke, president and executive director of the Lawyers’ Committee for Civil Rights Under Law.
“It shouldn’t take several weeks for a company to formally decide to remove conspiracy theories, lies, and false election rumors, especially when these disinformation campaigns have resulted in real-life threats of violence, caused public confusion and undermined democracy,” she said.
The question, then, becomes about speed: How quickly can misinformation and fake news be detected? Social media platforms traditionally have relied on human moderators for much of this work. But with billions of posts per day, the volume of content is simply too great for human eyes alone. Automation is the only viable approach, which means a greater reliance on AI.
One company on the cutting edge of using AI to combat fake news is Logically. The UK firm has developed a sophisticated solution–available as a mobile app or a Chrome plug-in–that can classify a given piece of content as fake news or misinformation before the user has a chance to read it.
The magic of Logically’s solution happens behind the scenes. According to Anil Bandhakavi, who heads up the company’s data science initiative, Logically’s blended approach gives it an advantage over competitive offerings.
“We have a three-pronged approach that broadly looks at the origins of a given piece of content, the content itself, and also the associated metadata that provides us additional information and clues about a given piece of content,” he says.
From a technical standpoint, the company makes extensive use of machine learning, NLP, network theory, knowledge graphs, and traditional rules-based decisioning to automatically identify and classify large amounts of suspicious content. It also relies extensively on human experts, not only to provide a check on the algorithms for iffy edge-cases that could go either way, but also for training the algorithms and making them better. Natural language generation is also used to explain why the system determined a given piece of content is fake or harmful.
AI technologies and techniques are used at several stages of Logically’s solution. It uses deep learning techniques to bolster its capability to understand the meaning behind text and other content. Specifically in this vein, it uses a custom-trained, multi-billion-parameter language model based on BERT (Bidirectional Encoder Representations from Transformers), a transformer-based NLP model that was released by Google in 2018, Bandhakavi says.
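To make the classification step concrete, here is a deliberately tiny sketch of supervised text classification: a bag-of-words naive Bayes classifier in plain Python. This is emphatically not Logically’s model (the company uses a custom multi-billion-parameter BERT variant), and the training examples and labels below are invented purely for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy corpus of (text, label) pairs. All examples are invented for illustration.
TRAIN = [
    ("shocking secret cure doctors hate", "fake"),
    ("miracle vaccine microchip exposed", "fake"),
    ("you won't believe this hoax", "fake"),
    ("senate passes infrastructure bill", "real"),
    ("study finds vaccine reduces hospitalization", "real"),
    ("local election results certified", "real"),
]

def train(corpus):
    """Count word frequencies per class for a naive Bayes model."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in corpus:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    return word_counts, class_counts

def classify(text, word_counts, class_counts):
    """Return the most likely label, using add-one (Laplace) smoothing."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        log_prob = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.split():
            log_prob += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = log_prob
    return max(scores, key=scores.get)

word_counts, class_counts = train(TRAIN)
print(classify("shocking miracle cure exposed", word_counts, class_counts))  # fake
```

A production system replaces the bag-of-words counts with learned contextual embeddings, which is precisely what a fine-tuned BERT model provides, but the supervised train-then-classify loop is the same shape.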
Natural language understanding is a critical and technically challenging aspect of what Logically does, but it forms just part of the overall solution. According to Bandhakavi, it’s also important to track with whom a particular piece of content or a news article originated and the networks across which it spreads (which is where its knowledge graphs come in handy).
“We built technology to understand the context in which misinformation is embedded, and how it spreads in a network, the Internet, or social media,” Bandhakavi says. “[We also built technology] to understand the interaction patterns of communities and users with fake news and misinformation.”
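As a rough illustration of the network-analysis idea, the sketch below models a share graph as a plain dictionary and walks it breadth-first to see how far a flagged post traveled from its origin. All account names are hypothetical, and Logically’s actual graph tooling is far richer than this.

```python
from collections import deque

# Hypothetical share graph: each account maps to the accounts that reshared
# its post. All account names are invented for illustration.
SHARES = {
    "origin_account": ["amplifier_1", "amplifier_2"],
    "amplifier_1": ["user_a", "user_b"],
    "amplifier_2": ["user_b", "user_c"],
    "user_a": [],
    "user_b": ["user_d"],
    "user_c": [],
    "user_d": [],
}

def spread(graph, source):
    """Breadth-first traversal: every account a post reached, with hop distance."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in dist:       # first time we see this account
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

reach = spread(SHARES, "origin_account")
print(len(reach) - 1)  # accounts reached beyond the origin: 6
```

Hop distance is a crude proxy, but even this simple traversal reveals amplification structure (which accounts sit one hop from the origin, how deep a story penetrates) that content analysis alone cannot see.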
Having a multi-pronged approach to detecting fake news and misinformation is critical to success, says Joel Mercer, who heads up product design for Logically.
“The difficult thing is basically taking one point–whether that’s credibility or the content level or sentiment–on its own to indicate that it’s a piece of problematic content by itself,” Mercer says. “Generally, what we’re doing is taking a multitude of different signals, whether that’s simple stuff like automation scoring and bot scoring, in combination with technical analysis of the content, and then the accounts themselves, to really build a broader picture.”
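A minimal sketch of the kind of signal fusion Mercer describes: several per-post scores combined into one weighted risk score. The signal names and weights here are assumptions chosen for illustration, not Logically’s actual scoring scheme.

```python
# Hypothetical signal weights; Logically's real signals and weights are not public.
WEIGHTS = {"bot_score": 0.3, "content_score": 0.5, "source_credibility": 0.2}

def risk_score(signals, weights=WEIGHTS):
    """Combine per-signal scores (each in [0, 1]) into one weighted risk score.

    High source credibility should lower risk, so that signal is inverted
    before weighting.
    """
    adjusted = dict(signals)
    adjusted["source_credibility"] = 1.0 - adjusted["source_credibility"]
    return sum(weights[name] * adjusted[name] for name in weights)

# A post from a low-credibility source, pushed by likely bots, with dubious content:
score = risk_score({"bot_score": 0.9, "content_score": 0.8, "source_credibility": 0.1})
print(round(score, 2))  # 0.85
```

The point of fusing signals is exactly what Mercer notes: no single score is decisive on its own, but a post that is simultaneously bot-amplified, textually dubious, and from a weak source stands out clearly in the combined score.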
One of the challenges is how linguistic patterns constantly change. The Internet is a wellspring of new ideas and new words, and that requires Logically to retrain its language model very frequently. But even that alone is not enough to ensure the highest possible accuracy.
“Just looking at content and its origin might not give you the full picture about the nature of the content,” Bandhakavi tells Datanami. “We look beyond these two aspects as we model a lot of metadata and [conduct] the network analysis. And our in-house experts help us to constantly improve our model.”
The in-house experts are a critical element of Logically’s approach, as they provide a check against the AI models for late-breaking news and fast-changing information types. The humans also provide a valuable pool of knowledge upon which to train the NLU models.
“AI alone can get very stale when it comes to understanding the evolution of content, so it’s always important and useful to have expert intelligence input,” Bandhakavi says. “We’ve realized the value of coupling AI with expert intelligence. And we’re able to adopt our own unique way in which human-in-the-loop frameworks can be leveraged.”
According to internal benchmarks, Logically’s system demonstrates an accuracy rate in the mid-90s. In other words, it will misidentify a piece of real news as fake (or identify a piece of fake news as real) about five times out of 100. That is not perfect, by any means, but it’s currently the state of the art, Bandhakavi says. “Our technology is constantly advancing and improving, especially with our investment in human-in-the-loop AI,” he says.
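For readers keeping score, accuracy is simply the share of correct calls over all calls, which is where “mid-90s accuracy means roughly five errors per hundred” comes from. The confusion-matrix numbers below are illustrative, not Logically’s actual benchmark figures.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of items classified correctly.

    tp: fake items correctly flagged      tn: real items correctly passed
    fp: real items wrongly flagged        fn: fake items wrongly passed
    """
    return (tp + tn) / (tp + tn + fp + fn)

# Out of 100 hypothetical items: 50 fakes caught, 45 real stories passed,
# 3 real stories wrongly flagged, and 2 fakes missed.
print(accuracy(tp=50, tn=45, fp=3, fn=2))  # 0.95
```

Note that accuracy lumps both error types together; in practice the two matter differently, since wrongly flagging real news (a false positive) carries free-speech costs that missing a fake story does not.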
Getting this technology to the front lines in the battle against misinformation is critical. During the 2020 presidential election, Logically worked with the electoral commission of a major battleground state to fight fake news. It also worked with the government of India for its 2019 general election. The company is working with major social media platforms, although it won’t say which ones.
Facebook and Twitter users are accustomed to seeing content that is flagged as false or misleading. But just having a given piece of content flagged isn’t enough to stop the tsunami of bad data, according to a recent report, titled “Tailoring heuristics and timing AI interventions for supporting news veracity assessments,” published in Computers in Human Behavior Reports.
The researchers found that, when news consumers hold strong prior beliefs about a particular topic, they are less willing to accept the advice offered by the AI system. But when they don’t hold strong beliefs—such as with a novel news story—then they’re more receptive to AI.
“It’s not enough to build a good tool that will accurately determine if a news story is fake,” Dorit Nevo, an associate professor at Rensselaer Polytechnic Institute (RPI) and a co-author of the report, says in a press release. “People actually have to believe the explanation and advice the AI gives them, which is why we are looking at tailoring the advice to specific heuristics. If we can get to people early on when the story breaks and use specific rationales to explain why the AI is making the judgment, they’re more likely to accept the advice.”
The battle over fake news and misinformation is as dynamic as the human experience, which means there will never be a single solution or approach that works 100% of the time. Humans are fallible, which means an infallible AI system for fake news identification is an impossibility. But with enough time, technology, and diligence, we can create AI systems that put a dent in the spread of damaging information, while minimizing collateral damage to the freedom of speech and thought that the Internet was built on in the first place. Considering the real-world harm that misinformation is creating, that possibility must be explored.