DeepliteRT Enables Fast and Efficient AI Modeling on Edge Devices
Enterprises today manage increasingly complex and expansive datasets that require substantial computing resources. Deep neural networks have become an indispensable tool for turning that data into accurate, usable insights, but training and deploying these AI models still demands considerable time and energy.
One of the companies looking to help organizations leverage these technologies is Deeplite, a Montreal-based developer of AI optimization software for deep neural networks.
The company has just released Deeplite Runtime (DeepliteRT), which enables AI models that process complex data, such as video analytics or facial recognition, to run on smaller devices at the edge.
Since edge devices typically have limited processing power and energy budgets, running such models on them is a challenge. The company asserts that DeepliteRT solves this problem, calling it “an innovative way to run ultra-compact quantized models on commodity Arm processors, while at the same time maintaining model accuracy” in a company press release.
The Arm processors mentioned are the Cortex-A series CPUs, a low-power family found in small equipment like security cameras, drones, and mobile devices. Deeplite’s compatibility with Arm CPUs dispenses with the need for expensive or custom GPU-based hardware. DeepliteRT provides a 2-bit quantization runtime for Arm Cortex-A CPUs to speed up computer vision and video analytics ML inference.
Quantization is a method of executing some of a model’s operations on tensors using integers rather than floating-point values. According to MathWorks, quantization reduces memory and power expenditure on smaller devices and is “an iterative process to achieve acceptable accuracy of the (deep neural) network.”
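To illustrate the idea, the sketch below shows a generic uniform affine quantization of floating-point weights to low-bit integers, here with the 2-bit width that DeepliteRT targets. This is a minimal, hypothetical example of the general technique; DeepliteRT's actual quantization scheme and runtime kernels are not described in the article, and the function names are illustrative.

```python
import numpy as np

def quantize(weights, num_bits=2):
    """Map float weights onto integers in [0, 2**num_bits - 1]
    using a uniform affine (scale + zero-point) scheme."""
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / (qmax - qmin)          # step size between integer levels
    zero_point = qmin - int(np.round(w_min / scale)) # integer that represents 0.0
    q = np.clip(np.round(weights / scale) + zero_point, qmin, qmax).astype(np.int32)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the quantized integers."""
    return scale * (q.astype(np.float32) - zero_point)

# With 2 bits, every weight collapses onto one of only four integer levels,
# which is why the memory and bandwidth savings on edge CPUs are so large.
w = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
q, scale, zp = quantize(w, num_bits=2)
w_approx = dequantize(q, scale, zp)
```

The accuracy loss this introduces is why, as MathWorks notes above, quantization is applied iteratively until the network's accuracy is acceptable.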
DeepliteRT expands the capabilities of Deeplite’s existing inference optimization solutions, most notably Deeplite Neutrino, an intelligent optimization engine for deep neural networks. According to the company, Neutrino takes edge device limitations into account while accepting “large, initial DNN models that have been trained for a specific use case” as input, producing condensed yet accurate models in a shorter timeframe. The company claims Neutrino can produce models that run up to 10 times faster while being 100 times smaller and using 20 times less power.
In an increasingly mobile world where essential tasks are often carried out by edge devices, speed and performance gains are a win for the organizations that can break free from computing restraints using artificial intelligence solutions like DeepliteRT.
“To make AI more accessible and human-centered, it needs to function closer to where people live and work and do so across a wide array of hardware without compromising performance. Looking beyond mere performance, organizations are also seeking to bring AI to new areas that previously could not be reached, and much of this is in edge devices,” said Bradley Shimmin, Chief Analyst for AI Platforms, Analytics, and Data Management at Omdia. “By making AI smaller and faster, Deeplite is helping to bring AI to new edge applications and new products, services and places where it can benefit more people and organizations.”