How Advanced Analytics Is Helping to Flush Out Insurance Fraud
Advanced analytics and machine learning modeling require lots of data to be trained, and in the insurance business, there’s no better place to get data than Verisk Analytics. Here’s how the Jersey City, New Jersey-based company is using big data tools and techniques to detect fraudulent claims and connect the dots on fraud networks.
Founded as a governmental entity in the early 1970s, Verisk Analytics today is a privately held clearinghouse for data and analytics services for organizations in the insurance, financial services, energy, and government businesses. Most (if not all) of the insurance companies in the United States share their anonymized claims data with Verisk, which then aggregates and enriches it and sells it back to insurance companies in the form of an analytic service.
Having an expansive view of the insurance activity of individual policyholders benefits the entire industry, says Anthony Fiorino, vice president and chief data officer at Verisk.
“By giving them claims history across all the different carriers – not just themselves – the number they plug into their underwriting models is more accurate because they have a broad view, not just their own perspective,” Fiorino tells Datanami. “We’ve been trusted in the insurance industry for 50-something years to be that person in the middle.”
A Hurricane of Fraud
Insurance fraud (excluding medical fraud) is a $40-billion business in the United States, according to the latest data from the FBI. Large storms that get national media attention also tend to attract the attention of fraudsters. For example, the FBI estimates that fraudsters collected $6 billion of the $80 billion that the Federal Government provided for relief from Hurricane Katrina.
As hurricane season approaches along the Atlantic Coast, Verisk is busy bolstering its clients’ defenses against the possibility of fraud. One of the ways it does this is by assessing the risk that its clients’ policies pose along possible storm routes. The company uses weather models to determine the claims exposure that insurance companies have along the likely routes of hurricanes.
This enables insurance companies to prepare for the worst, Fiorino says. “Let’s say I’m policy heavy in Houston, Texas and I know this hurricane may be going in this direction,” he says. “I don’t want to have too much risk in one particular area, so I may have to buy additional reinsurance to cover myself.”
When hurricanes and other severe weather events do happen, Verisk tracks the claim activity over both time and geographical metrics, using TIBCO’s Spotfire analytics tool. It also uses deep learning to determine whether individual claims are likely to be fraudulent.
“Let’s say a hurricane is hitting Charleston, South Carolina. We can run this report and you’ll see the claims start to light up,” he says. “As I model this over time, you see the red areas start to inflate, and as you track the path of the storm, you see these claims coming in.”
Most insurance claims are filed within a couple weeks of the event, so claims filed outside of that period draw a red flag. “We’ll say, hey it’s been a month and a half, why are we still getting a claim for this area?” Fiorino says. “Or this claim came in outside the reported hail area. Why do they have a hail claim? We’ll use it to identify fraud as well.”
TIBCO’s Spotfire powers the visualizations that analysts use to identify anomalous claims that could be fraudulent. “If I see on the map one claim laying outside the circle I draw for an area I think that hail was present, that’s what I want to focus on,” Fiorino says. “So you can drill into that data and look for those suspect cases.”
This approach lets Verisk and its clients concentrate their manpower on the outlier cases that are more likely to be connected to a fraudulent claim. In the past, when this level of detail was not available, insurance companies would have protected themselves by setting thresholds and then manually inspecting the claim. “But you would miss certain things,” Fiorino says.
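The two red flags Fiorino describes, a claim filed well after the event and a claim located outside the reported storm area, could be expressed as simple rules. The following is a minimal Python sketch, not Verisk’s implementation: the `Claim` fields, the 14-day window, and the circular event area are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt


@dataclass
class Claim:
    # Hypothetical claim record; field names are illustrative only.
    claim_id: str
    filed_on: date
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def red_flags(claim, event_date, center, radius_km, max_lag_days=14):
    """Return the rule names a claim trips; an empty list means no flag."""
    flags = []
    # Assumed window: most claims arrive within about two weeks of the event.
    if (claim.filed_on - event_date).days > max_lag_days:
        flags.append("late_filing")
    # Assumed geometry: the affected area is a circle drawn around a center point.
    if haversine_km(claim.lat, claim.lon, center[0], center[1]) > radius_km:
        flags.append("outside_event_area")
    return flags
```

A flagged claim is only a candidate for review by an analyst; neither rule proves fraud on its own.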
While most insurance companies still like to have human eyes in the process, some are beginning to automate processes, using both traditional rules-based approaches and artificial intelligence. One area where Verisk is getting traction with AI is in claims processing.
As part of the claims process, policyholders must submit an image of the claim. Verisk has developed a service that utilizes deep neural networks to identify doctored images on behalf of its clients. The service analyzes various aspects of the image, including the metadata, to detect whether the image has been manipulated in some way. Some fraudsters may try to grab a photo off the Net, but Verisk will find it with a simple Google Image search.
“The more data that comes in, the better trained those models are, and we augment them all the time,” Fiorino says. “The other thing is just getting more and more advanced on image forensics and identifying doctored images and things that just don’t make sense. A lot of our stuff comes in through pictures, so that’s where there’s a lot of potential for fraud.”
Verisk uses a range of tools to analyze data for signs of fraud, including Apache Spark, R, Python, and various ETL tools and AI frameworks. It uses Syncsort’s Trillium and Melissa Data to verify data, and stores data in Oracle, MongoDB, and SQL Server databases, a Hadoop data lake, a Snowflake data warehouse, and IBM mainframes.
One of the more powerful ways that Verisk can track down fraudsters is by tracing the personal connections among the folks who file claims. It uses Neo4j’s graph database to find friends, family members, and other acquaintances who may be involved in fraud.
“[The fraudsters] always seem to be one step ahead, and that’s where AI picks up a lot of these trends and picks up a lot of the connections between people who don’t seem to be connected, but are,” Fiorino says. “Somebody’s brother-in-law files a claim instead of the person who filed the claim last year for a hurricane. But it’s at the same address. So it picks up on those things.”
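The brother-in-law example comes down to linking claimants through something they share, such as an address. As a rough illustration only, the sketch below groups claimants by shared address using a union-find structure in plain Python. It does not use Neo4j or reflect Verisk’s data model, and the `(claimant, address)` input format is an assumption.

```python
from collections import defaultdict


def linked_claimants(claims):
    """Group claimants connected, directly or transitively, by a shared address.

    `claims` is an iterable of (claimant, address) pairs (hypothetical format).
    Returns a list of sorted name lists, one per group with two or more people.
    """
    parent = {}

    def find(node):
        # Register unseen nodes, then walk to the root with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        parent[find(a)] = find(b)

    # People and addresses are both graph nodes; each claim is an edge between them.
    for claimant, address in claims:
        union(("person", claimant), ("address", address))

    groups = defaultdict(set)
    for node in list(parent):
        if node[0] == "person":
            groups[find(node)].add(node[1])
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

A graph database makes the same idea practical at scale, with many more link types (phone numbers, vehicles, repair shops) than the single address link used here.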
As the insurance industry adopts automation, fraud rates could increase. That’s because fraudsters will detect the thresholds that insurance companies set for so-called straight-through processing (STP). When the maximum threshold for STP becomes known, insurance companies will often see a flurry of activity coming in just below that threshold.
That’s one of the reasons why companies like Verisk are adopting advanced analytics and AI as automation spreads: to flush out these fraud networks before they even have a chance to file a claim.
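The flurry of claims just below a known STP threshold is something an insurer could watch for with a very simple measure. The sketch below is an assumption-laden illustration, not a method described by Verisk: the function name, the 10 percent band, and the sample amounts are all hypothetical.

```python
def share_just_below(amounts, threshold, band=0.10):
    """Fraction of STP-eligible claims that sit in the top `band` of the threshold.

    `amounts` are claim amounts; a claim is treated as STP-eligible when its
    amount is at or below `threshold` (an assumed rule for this sketch).
    """
    eligible = [a for a in amounts if a <= threshold]
    if not eligible:
        return 0.0
    lower = threshold * (1 - band)
    near = [a for a in eligible if a >= lower]
    return len(near) / len(eligible)


# Hypothetical amounts with a $1,000 threshold: three of four eligible claims
# fall between $900 and $1,000.
print(share_just_below([200, 950, 980, 990, 1500], threshold=1000))
```

A share that climbs well above its historical level after a threshold becomes known would be one signal worth investigating, alongside the network analysis described above.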
“Identifying the fraud after the fact still takes a lot of man hours and a lot of work,” Fiorino says. “But identifying it in the underwriting stage and connecting these networks is the big thing. These guys know how to game the system. So identifying these fraud networks is where the biggest financial savings is.”