How Auto Insurers Detect and Use Your Driving ‘Fingerprint’
You may not know it, but the way you drive is unique, sort of like a fingerprint. How fast you drive, how tightly you turn, and how long you idle in the driveway before hitting the road all help to distinguish you from others on the road. Thanks to advanced telematics and machine learning systems, automotive insurance companies are finding creative ways of putting this “driving fingerprint” to use in the real world.
The impact of big data analytics on the auto insurance industry is growing. While some companies use the technology to identify segments of the marketplace to target with ads for car insurance, others are looking to pull data directly out of the car to help identify dangerous drivers, who will be charged higher premiums.
One of the companies working behind the scenes to help insurers make use of new automotive data feeds is Intelligent Mechatronic Systems (IMS). Every day, the Waterloo, Ontario-based company collects upwards of 6 billion data points from hundreds of thousands of automobiles in the U.S. and Canada for usage-based insurance programs operated by its partners in the insurance industry. That makes the company the largest provider of usage-based car insurance data on the continent.
The driving data, which IMS stores on Hadoop and Cassandra clusters, is collected on behalf of its partners in one of three ways. First, it can collect the data from vehicle telematics systems, if available. Second, it can work with insurers to procure “black box” hardware devices, like Progressive’s “Snapshot,” that plug into the car’s OBD-II port. Third, the company has created a smartphone app that collects data using sensors in the phone itself.
The smartphone app is less expensive than the other methods, and offers access to sensors such as accelerometers, which are useful for measuring forward and lateral acceleration. Accelerometers aren’t always available in cars, and that also makes the smartphone approach attractive.
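As a rough illustration of what “forward and lateral acceleration” means in practice, the sketch below resolves a horizontal acceleration reading into those two components using the vehicle's compass heading. It is a minimal example, not IMS's method: the function name `split_acceleration` is hypothetical, and it assumes the phone's raw readings have already been rotated into an east/north frame, which is a nontrivial step on a real device.

```python
import math

def split_acceleration(a_east, a_north, heading_deg):
    """Resolve horizontal acceleration into (forward, lateral) components.

    Assumptions (not from the article): a_east and a_north are in m/s^2 in a
    world frame, and heading_deg is a compass heading (0 = north, 90 = east).
    Positive lateral means acceleration toward the driver's right.
    """
    h = math.radians(heading_deg)
    forward_east, forward_north = math.sin(h), math.cos(h)   # unit vector ahead
    right_east, right_north = math.cos(h), -math.sin(h)      # unit vector to the right
    forward = a_east * forward_east + a_north * forward_north
    lateral = a_east * right_east + a_north * right_north
    return forward, lateral

# A car heading east that is slowing down at 3 m/s^2: almost all of the
# reading is (negative) forward acceleration, with next to no lateral part.
braking_forward, braking_lateral = split_acceleration(-3.0, 0.0, 90.0)

# A car heading north that is being pushed east at 2 m/s^2, as in a right turn.
turn_forward, turn_lateral = split_acceleration(2.0, 0.0, 0.0)
```

Hard braking shows up as a large negative forward value and tight cornering as a large lateral value, which is why these two components are useful inputs for driver profiling.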
According to Gary Richardson, the UK head of data engineering for KPMG and an expert in the use of big data in the insurance business, insurers in Europe and North America are increasingly turning to the smartphone-based data collection approach.
“We have quite a few companies in the UK who started off putting little black boxes into cars, and are using driver behavior data to reduce the premium they pay,” Richardson tells Datanami. “But with the new wave of companies, it’s less about the black box and more about having a driver app in your Android or iOS phone, because it’s quite expensive to install the box in the car. With the rise of smartphones and [accelerometers], they’ve moved to those.”
Who’s Got the Wheel?
However, the quality of the data from smartphone apps is not always as consistent as what comes out of car-bound sensors. Plus, there are other problems one has to deal with, such as determining whether a smartphone owner is a driver, a passenger, or something else entirely.
Failing to identify the actual driver of a car can have real consequences for the policyholder. Richardson relates a story from a friend in the UK, who caused a minor kerfuffle upon dropping his daughter off at the airport.
“The next month, when she got her statement, her insurance premium was up because of some erratic driving,” Richardson says. “She said, ‘You owe me 20 pounds, because when you dropped me off and drove my car home, you drove like an idiot.’”
This is the sort of data challenge that IMS VP of Innovation Ben Miners is tasked with solving. Last year, Miners, who holds a Ph.D., started looking around for a new machine learning library to help solve the “who’s driving?” question, as well as the “mode of operation” question (i.e., are you travelling in a car, a bus, or a train?).
Up to that point, Miners’ development team had primarily used scikit-learn to build the machine learning models to determine the driver of the vehicle. But after working with scikit-learn in a sandboxed environment, Miners was eager to connect his developers with something a little more feature-rich and ready for enterprise deployments in a Java environment.
That’s when Miners learned about Weka, a well-regarded (if little known) library of machine learning algorithms maintained by the University of Waikato in New Zealand, and distributed by Pentaho (now a Hitachi Data Systems company). While Miners also considered the MLlib machine learning library in Apache Spark for this project, he thought Weka’s library was better suited for this purpose.
In late 2015, IMS started using Weka’s random forest algorithm to solve these two problems. Once the algorithm has learned how a particular driver drives (the “driver fingerprint”), it can pick out those signals in the wild.
“There are certain habits that individuals have whenever they drive,” Miners says. “These are things like how much time it takes to turn the ignition on, or back out of the driveway. Or how tightly do you take corners. How do you slow down, not just the rate of deceleration, but the profile of that braking as well. Do you slow down from 70 to 50 really quickly and then gracefully glide to a halt? Do you do it all in one sudden stop, or is it one linear deceleration? We’re looking at factors that are outside of risk factors, but are unique to individual drivers.”
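To make the idea of a braking “profile” concrete, here is a minimal sketch that summarizes a single braking event from timestamped speed samples. The feature names (`mean_decel`, `peak_decel`, `front_loading`) and the sample data are assumptions made for illustration; the article does not say which features IMS actually computes.

```python
def speed_at(samples, t):
    """Linearly interpolate the speed at time t from (time_s, speed) samples."""
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("t is outside the sampled interval")

def braking_profile(samples):
    """Summarize one braking event given (time_s, speed) samples in time order.

    front_loading is the share of the total speed drop that happens in the
    first half of the event: 0.5 for a linear stop, above 0.5 when the driver
    brakes hard early and then glides to a halt. (Hypothetical feature.)
    """
    (t_start, v_start), (t_end, v_end) = samples[0], samples[-1]
    duration = t_end - t_start
    total_drop = v_start - v_end
    # Deceleration over each sampling interval, in speed units per second.
    decels = [(v0 - v1) / (t1 - t0) for (t0, v0), (t1, v1) in zip(samples, samples[1:])]
    v_mid = speed_at(samples, t_start + duration / 2)
    return {
        "mean_decel": total_drop / duration,
        "peak_decel": max(decels),
        "front_loading": (v_start - v_mid) / total_drop,
    }

# Two made-up stops from 70 to 0 over six seconds.
linear_stop = braking_profile([(0, 70), (3, 35), (6, 0)])
early_braker = braking_profile([(0, 70), (1, 50), (4, 20), (6, 0)])
```

Both stops have the same average deceleration, but the second one sheds most of its speed early, so its `front_loading` and `peak_decel` values are higher. Differences of that kind are what separate one driver's habits from another's, independent of how risky the driving is.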
According to Miners, after a two-week training period, IMS can tell a driver from a passenger with 95% accuracy using mobile phone data alone, based purely on the driver’s habits. “It’s extremely accurate,” he says. The mode-of-transportation question, however, cannot be answered with mobile data alone, and requires geospatial data as an additional input.
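The general shape of a random forest classifier can be sketched in a few lines. The example below is a deliberately stripped-down relative of the algorithm, not Weka's implementation: each “tree” is a one-split decision stump trained on a bootstrap sample and a single randomly chosen feature, and the forest predicts by majority vote. The three features and all of the trip data are invented for illustration.

```python
import random
from collections import Counter

def train_stump(rows, labels, feature):
    """Find the single threshold on one feature that misclassifies the fewest rows."""
    values = sorted(set(row[feature] for row in rows))
    # Candidate thresholds sit halfway between neighbouring observed values.
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or [values[0]]
    best = None
    for t in thresholds:
        below = Counter(l for row, l in zip(rows, labels) if row[feature] <= t)
        above = Counter(l for row, l in zip(rows, labels) if row[feature] > t)
        label_below = below.most_common(1)[0][0]
        label_above = above.most_common(1)[0][0] if above else label_below
        errors = sum(1 for row, l in zip(rows, labels)
                     if (label_below if row[feature] <= t else label_above) != l)
        if best is None or errors < best[0]:
            best = (errors, t, label_below, label_above)
    _, t, label_below, label_above = best
    return feature, t, label_below, label_above

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each using one randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]   # sample with replacement
        feature = rng.randrange(len(rows[0]))            # random feature choice
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], feature))
    return forest

def predict(forest, row):
    """Majority vote across all stumps."""
    votes = Counter(lb if row[f] <= t else la for f, t, lb, la in forest)
    return votes.most_common(1)[0][0]

# Invented trips: [cornering lateral g, braking front-loading, seconds idling before departure]
owner_trips = [[0.20 + 0.01 * i, 0.50 + 0.005 * i, 8.0 + 0.2 * i] for i in range(10)]
other_trips = [[0.40 + 0.01 * i, 0.75 + 0.005 * i, 3.0 + 0.1 * i] for i in range(10)]
forest = train_forest(owner_trips + other_trips, ["owner"] * 10 + ["other"] * 10)

gentle_trip = predict(forest, [0.22, 0.52, 9.0])
hurried_trip = predict(forest, [0.45, 0.78, 3.5])
```

A production model would use full decision trees, many more features, and far messier data than this cleanly separated toy set, which is why a figure like 95% accuracy after two weeks of training is notable.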
Driving in Random Forests
Once the driver has been identified, all the other measurements kick into gear, and these can either increase or reduce how much you pay for insurance. IMS is unique in that it provides feedback to drivers to let them know how they’re doing.
“After each trip, they get information about how aggressively each individual has driven, how smoothly they drove on the last trip and how they ranked, and how they performed in terms of braking, acceleration and a few other factors,” Miners says. “Those scores are coupled with actual feedback, or reminders to slow down to a more constant speed and predict the road ahead, to have a smoother drive.”
(While IMS has built a demonstration car that provides real-time feedback via audio messages, the company prefers to send reports after the trips are over, via the Web app and email alerts, for safety reasons.)
One of IMS’ next projects involves crash detection. Weka’s machine learning figures to play heavily in this project as well, owing to its “short learning curve” and the “breadth of algorithms available out of the box,” Miners says.
KPMG’s Richardson has seen insurance companies getting more innovative in how they use vehicle information, above and beyond detecting aggressive driving. One of these involves post-crash reconstruction.
“This involves the ability to understand the conditions, both in terms of what the road conditions were like, as well as understanding where you were on the road when you had the accident,” he says. “By stitching those together, they have a better understanding when it comes to claims payout and whose fault it was.”
The car data is also proving useful in detecting dangerous stretches of road. “In the UK there are specific areas of roads which are accident black spots, and it’s interesting using data from OpenStreetMap, which has the road furniture,” (an English-ism for traffic signs and lights) “to see how they distract drivers. Using this kind of data helps to inform local authorities in terms of designing road layout. It’s an interesting side consequence.”
Big data analytics is reshaping the super-competitive car insurance industry in multiple ways. While this story is focused on in-car data, there is also a lot of work being done in micro-segmenting the population, which can help insurers reach new customers in new markets with whom they previously would not have done business for lack of detailed risk analysis.
“It enables you to see patterns in market that historically you couldn’t because you’re using very blunt instruments to measure the risk,” Richardson says.