Kinetica Gets Fuzzy with In-GPU Algorithms
Fuzzy Logix made its name by providing analytics that ran inside MPP databases, giving customers a 10x to 100x performance boost compared to standard relational databases. Now that it has ported its library of algorithms to run atop Kinetica’s GPU-based database, customers can expect to see performance gains of up to 500x.
Fuzzy Logix has done the hard work of parallelizing a good chunk of its library of 650 or so algorithms to run inside Kinetica’s database, which sits atop a combination of GPUs and CPUs, Fuzzy Logix CEO and co-founder Partha Sen told Datanami last week in a phone interview ahead of this week’s NVidia GPU Technology Conference.
“There’s a substantial amount of effort that’s actually required,” Sen said. “That’s in a way good because that opens the door for us. I don’t think that customers are going to actually be able to do this [themselves].”
Fuzzy Logix has been experimenting with GPU technology for years, and found that it could get good results by using NVidia’s CUDA tools to run the algorithms in a parallelized fashion atop GPUs. But the resulting integration was hamstrung by one key element that was lacking: a database. Most of the early GPU work ran atop flat files. This simply wouldn’t do for a company that calls itself “the in-database analytics company.”
“We needed a database that was accelerated on GPUs,” says Aashu Virmani, chief marketing officer of Fuzzy Logix. “Kinetica provides that framework for us.”
Piggybacking atop Kinetica’s GPU database gives Fuzzy Logix a head start in solving a new class of data problems. Despite the gains from massively parallel processing (MPP) databases and flat-file approaches in Hadoop, there is a class of problems that continues to defy practical solutions, Virmani says. Examples of these problems include the retailer looking to model consumer behavior every night, the bank that needs to analyze risk in its balance sheet, and the hospital hoping to predict the onset of chronic diseases.
One of Fuzzy Logix’s customers in the telecommunications space could have used GPUs for its micro-segmentation work. “You can only predict so accurately when you’re looking at a granularity that captures 100,000 customers,” Virmani says. “If you could go down to 1,000-person granularity, you could obviously be much more predictive. And that’s where the CPU-GPU combination helps.”
The GPU solution could also come in handy for another Fuzzy Logix customer, a supermarket chain in the UK that needs to optimize the distribution of fresh food to its stores. “They have 40,000 SKUs [stock keeping units] but they only have 5,000 fresh foods or perishables, and they wanted to find out how much of this perishable food to supply to a given store on a given day based on the weather,” Virmani says. “That just is a very hard problem to crack, when you have 3,000 stores, 5,000 SKUs and the weather changes every four hours.”
Faced with technological limitations, including the pace of SQL execution within column-oriented relational databases running atop standard CPUs, it took the company four to five days to optimize the food distribution, which made its utility questionable. However, with GPUs dramatically shrinking the model runs, the problem suddenly becomes possible to solve in a timeframe that matters. “The beauty of this is we are beginning to solve problems that previously were intractable,” Virmani says.
The two companies plan to go to market together through Kinetica’s growing sales and distribution organization. The analytics will be labeled “powered by Fuzzy Logix” if customers choose the company’s library.
Not all algorithms will see a big speed-up when running under a GPU database, but some will see a tremendous performance boost, Virmani says. “There are certain techniques, like sparse matrix multiplication, that are incredibly distributable,” he says. “If there are 4,000 cores on a [GPU] chip, you can almost farm out each column of the matrix and that multiplication happens on each core of the GPU.”
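The column-by-column idea Virmani describes can be illustrated with a minimal Python sketch. This is not Fuzzy Logix's or Kinetica's code: the function names `sparse_times_column` and `multiply_by_columns` are hypothetical, the tiny matrices are made up for illustration, and a thread pool merely stands in for GPU cores to show that each output column is an independent unit of work. Python threads will not deliver GPU-style speedups.

```python
from concurrent.futures import ThreadPoolExecutor


def sparse_times_column(sparse_rows, column):
    """Multiply a sparse matrix by one dense column.

    sparse_rows is a list of dicts, one per row, mapping
    column index -> nonzero value (an assumed storage layout).
    """
    return [
        sum((value * column[k] for k, value in row.items()), 0.0)
        for row in sparse_rows
    ]


def multiply_by_columns(sparse_rows, dense_columns, workers=4):
    """Compute A x B one column of B at a time.

    Each column is an independent task, so the work can be
    farmed out to separate workers (threads here, as a
    stand-in for GPU cores).
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(
            pool.map(
                lambda col: sparse_times_column(sparse_rows, col),
                dense_columns,
            )
        )


# A is a 3x3 matrix that is mostly zeros; only nonzeros are stored.
A = [{0: 2.0}, {1: 3.0, 2: 1.0}, {}]
# B is given as a list of its columns.
B_columns = [[1.0, 0.0, 4.0], [0.0, 5.0, 6.0]]

result_columns = multiply_by_columns(A, B_columns)
print(result_columns)
```

Because no column's result depends on any other column's, the same decomposition scales to as many workers as there are columns, which is the property that makes this kind of operation a good fit for a chip with thousands of cores.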
Fuzzy Logix has about 40 customers, most of which are Fortune 500 firms. The Charlotte, North Carolina-based company was founded in 2007 by Sen and COO Mike Upchurch, both of whom were exposed to big data problems and quantitative data mining methods while working at Bank of America’s investment arm.
Kinetica, meanwhile, was founded a little ways north, at the United States Army Intelligence and Security Command (INSCOM) at Fort Belvoir in Virginia to assist with the military’s tracking of terrorist targets. The company, which was founded by CEO Amit Vij and CTO Nima Negahban, changed its name from GIS Federal last year when it sought to sell its GPU database to the commercial market.