October 8, 2021

Big Data Shines a Light on Bad Actors, But Shadows Remain

(Image courtesy ICIJ)

This week’s publication of the Pandora Papers–which the International Consortium of Investigative Journalists (ICIJ) based on a trove of private data leaked from offshore tax havens–showcased the alarming extent of fraud and corruption in the world. While big data tech like graph analytics and machine learning can help to shine a light on bad actors, we’ll always be playing catch up, fraud hunters tell Datanami.

The sheer numbers behind the Pandora Papers, which the ICIJ published on October 3, 2021, are staggering. The ICIJ was provided with 11.9 million documents, including text files, PDFs, images, emails, and spreadsheets, from 14 offshore tax havens, totaling 2.9 TB of data. The documents contained information about 27,000 shell companies created to protect the assets of 29,000 beneficial owners, including 130 billionaires and 330 politicians from 90 countries.

As with earlier ICIJ investigations–including the FinCEN files in 2020, the West Africa Leaks in 2018, and the original Panama Papers in 2016, among others–the journalists turned to big data software to help connect the dots. The tech stack ultimately deployed by the ICIJ included Neo4j’s graph database, graph visualization tools from Linkurious, and a healthy dollop of Python for machine learning, including scikit-learn and Fonduer.

The technology side of the story is certainly an interesting one. The folks at Neo4j did a good job of showing how their graph database was employed to expose the elaborate connections among various entities that bad actors and their agents string together to shield their illegal activities. That the nefarious dealings of people like King Abdullah II of Jordan, Azerbaijan’s ruling Aliyev family, Ukrainian president Volodymyr Zelensky, and former British Prime Minister Tony Blair were finally brought to light is a result of this tough sleuth work.
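To make the idea of "connecting the dots" concrete, here is a minimal Python sketch of the kind of path-finding that a graph database makes cheap. It is not the ICIJ's code and does not use Neo4j's API: a plain dictionary stands in for the database, the `build_graph` and `shortest_link` helpers are hypothetical, and every entity name is invented for illustration.

```python
from collections import deque

# Hypothetical ownership/officer links; every name here is made up.
EDGES = [
    ("Person A", "Shell Co 1"),
    ("Shell Co 1", "Trust X"),
    ("Trust X", "Shell Co 2"),
    ("Shell Co 2", "Villa Holdings"),
    ("Person B", "Shell Co 3"),
]

def build_graph(edges):
    """Build an undirected adjacency map from (source, target) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def shortest_link(graph, start, goal):
    """Breadth-first search; returns the shortest chain of entities, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in sorted(graph[node]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

graph = build_graph(EDGES)
chain = shortest_link(graph, "Person A", "Villa Holdings")
print(" -> ".join(chain))
```

In this toy data, the person and the asset are separated by three intermediaries, which is the sort of layering that is tedious to trace in spreadsheets but a single traversal in a graph.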

Fraud Is Booming

But don’t count former FBI agent Clark Frogley as one of the folks who are startled by the extent of the fraud.

To read the Pandora Papers, click here (Image courtesy ICIJ)

“It doesn’t surprise me,” said Frogley, who has spent 30 years fighting financial crime. “I don’t know how long it will be before we get the next one. Just like with the Panama Papers and Paradise Papers and other things that happened, there’s so much of this stuff going on in the world today.”

Frogley currently is the head of financial crime solutions at Quantexa, a UK-based company that helps banks, telecommunication providers, and government agencies to uncover risk that may be hidden in their data. Specifically, Quantexa employs technology like graph analytics to cut through the clutter that white-collar criminals purposefully create to enable their fraud and money-laundering schemes.

In other words, Quantexa does the type of work that the ICIJ did with the Pandora Papers, but it does it on behalf of companies 365 days per year. Unfortunately, fraud is a growth industry, and no amount of technology is likely to stop it.

“We saw a huge spike in fraud during the pandemic,” said Frogley, who previously fought financial crimes while employed by IBM, Ernst & Young, AIG, Deutsche Bank, Goldman Sachs, and other companies. “The amount of COVID-related fraud is just skyrocketing. It’s absolutely crazy.”

The ability to spot fraud in real time is critical to having any chance to stop it, according to Frogley. Machine learning technology, in particular, enables banks and others with significant risk exposure to quickly create new models to target the fraudsters and their schemes. In some cases, a new fraud model that once took months to create can be rolled out in just minutes, which is having a noticeable impact on payment fraud, he said.

Unfortunately, even with the advantage afforded by machine learning, it’s a constant game of cat and mouse, and the mouse appears to be winning.

“From a technology perspective and an interdiction perspective, it’s challenging,” Frogley said. “It’s difficult to keep up with the changes. By the time you figure something out, they moved on two or three times to a whole new scam and we’re always playing catch up.”

Bad Actors in Plain Sight

Spotting the fraudulent schemes can be tough, especially when they are hidden among trillions of legitimate transactions. A related technique is to ensure you have a firm grasp on the actual identities of the folks you are dealing with.

Fraud has skyrocketed during the COVID-19 pandemic (beeboys/Shutterstock)

That’s the approach enabled by Tresata, which develops technology based on network theory to help financial institutions know the company their clients and prospective clients keep.

“The known bad actors are actually not hidden in plain sight. They’re completely open,” said Abhishek Mehta, the CEO and founder of Tresata. “If you go and look, there are about 14,000 of them. At their peak there were 17,000 and luckily for humanity, some of them do succumb to natural circumstances and, humans not being immortal, have gone and left the planet. But more take their place.”

Mehta is referring to the list maintained by the Department of the Treasury’s Office of Foreign Assets Control (OFAC), which includes the names of specially designated nationals (or SDNs) whose assets have been blocked by the U.S. Government and with whom it is illegal to do business. The OFAC list is one source of data for Tresata’s database, as is the information that the ICIJ turns up in its investigations, such as the Pandora Papers. Tresata collects it all, and then uses AI technology based on network theory to find the common denominators that link these bad actors to others.

Last month, Tresata announced that it was making its database of bad actors available for the public to search, free of charge, as a service. The appropriately named Bad Actors Discovery as a Service, or BADaaS, will help people avoid dealing with corrupt individuals, Mehta said.

“You look at some of these celebrities that are gathering the common mass media’s attention on Pandora Papers–if you had five minutes, if you were to go into BADaaS and look for Julio Iglesias or look for Shakira or look for Tony Blair–they’re already in our database,” Mehta said. “The solution to our problem as humanity isn’t more leaks…The issue isn’t the data. The issue is the ability to integrate that data at absolute scale, and then use AI to find the hidden tracks that people are using.”

Tresata’s BADaaS program is free; you can sign up here (Image courtesy Tresata)

Linked Data

Tresata’s software enables users to search for connections using names as well as other important pieces of data, including telephone numbers, Social Security numbers, and even IP addresses. It goes a step beyond that, however, and searches for behavioral data. It’s all based on the idea of connectivism, Mehta said.

The result is a system that’s able to spot links in records, even if the links are not based on explicit attributes. This extra AI-driven capability is critical because criminals have learned to bypass rules-based systems that are designed to signal on explicit attributes.
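For readers who want a feel for what linking on explicit attributes looks like, the following Python sketch groups records that share a phone number or an IP address. It is only the rules-style baseline that the article says criminals have learned to sidestep, not Tresata's method, and it does not attempt the behavioral matching described above. The records, field names, and the `link_records` helper are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical records; all values are invented for illustration.
RECORDS = [
    {"id": "r1", "name": "J. Doe", "phone": "555-0101", "ip": "203.0.113.7"},
    {"id": "r2", "name": "John Doe", "phone": "555-0101", "ip": "198.51.100.2"},
    {"id": "r3", "name": "Acme Holdings", "phone": "555-0199", "ip": "198.51.100.2"},
    {"id": "r4", "name": "Jane Roe", "phone": "555-0142", "ip": "192.0.2.44"},
]

def link_records(records, keys=("phone", "ip")):
    """Group records that share any value for the given attribute keys."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Walk up to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_seen = {}  # (key, value) -> id of the first record carrying it
    for r in records:
        for k in keys:
            v = r.get(k)
            if v is None:
                continue
            if (k, v) in first_seen:
                union(first_seen[(k, v)], r["id"])
            else:
                first_seen[(k, v)] = r["id"]

    clusters = defaultdict(list)
    for r in records:
        clusters[find(r["id"])].append(r["id"])
    return sorted(sorted(c) for c in clusters.values())

clusters = link_records(RECORDS)
print(clusters)
```

Note that r1 and r3 share no attribute directly; they end up in the same cluster only because r2 bridges them (same phone as r1, same IP as r3). Changing a single explicit attribute is enough to break such a chain, which is why matching on explicit fields alone is easy to evade.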

“The challenge becomes, with actors, they are smart enough to know what rules are looking for connections, and they change the rules of the game,” Mehta said. “So what we have built is an artificial intelligence system that’s not rules-based, and says as long as I find any common behavior, expose it.”

That’s not to say that every link that Tresata’s software delivers uncovers a bad actor. There are many legitimate reasons for people to form limited liability corporations, Mehta said. The hope is that, by putting more linked data in the hands of the citizenry, the type of sleuthing that investigative journalists do can be done on a much wider scale.

“We’re not saying you are guilty,” Mehta said. “It is simply a new tool in our armory, whether you’re a regulator, whether you are a bank or a realty firm…Good people are being exploited by bad people, and we want to ultimately stop that.”

Havens of Criminal Conduct

Fraudsters know they can hide in the trees, but big data helps spot them when they slip up and make a mistake, according to Quantexa’s Frogley.

“So many people [are] trying to hide various bits of information that at some point, that information always seems to make its way out,” he said. “Usually people mess up. Nobody’s perfect. So we just wait for some of those mess ups to bring them down.”

It’s the little things that trip them up, Frogley says. Clues lie in the metadata, including IP addresses that give away location. However, while big data tech is making a dent in fraud, ultimately the cards are stacked against the people who would stop it, he said.

Tax havens will continue to provide cover for criminals (Zephyr_p/Shutterstock)

“There are lots of tax havens, lots of ways around disclosing who are the ultimate beneficial owners,” Frogley said. “And as long as that exists, then you’re always going to have situations like this where people will take advantage, try to hide their activity, and most of the time for nefarious reason.”

Many of the folks who were exposed in the Pandora Papers hail from countries with less-than-stellar laws when it comes to transparency. In fact, in most cases, while the money-hiding schemes the ICIJ uncovered are unsavory, they are perfectly legal.

“Unfortunately, a lot of the people that are in the various governments around the world are also guilty of some of the secrecy that is coming to light,” Frogley said. “You’ve got political corruption and other things that just exist everywhere. That makes changing the laws and really making progress that much slower.”
