October 7, 2020

AI Startup Uses FPGAs to Speed Training, Inference


The latest AI startup emerging from stealth mode claims to be the first to integrate model training and inference for deep learning at the network edge, replacing GPUs with FPGA accelerators.

Deep AI Technologies said Wednesday (Oct. 7) its edge approach to AI training and inference runs on commodity FPGAs rather than costly, power-hungry GPUs, yielding what the startup claims is a 10-fold improvement in cost-performance.

The edge framework is also promoted as faster and more secure than cloud-based model development.

Along with leveraging FPGA accelerator cards from Xilinx and server vendors, the Israeli startup’s proprietary technology uses an 8-bit fixed-point data type with a high sparsity ratio for network weights during model training. That approach contrasts with GPU-based training, which typically uses 32-bit floating-point values with no sparsity.
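The article does not spell out the number format, but a symmetric fixed-point scheme is the usual approach. A minimal NumPy sketch, assuming a hypothetical format with 6 fraction bits (the function names and bit split are illustrative assumptions, not Deep AI’s actual scheme):

```python
import numpy as np

def quantize_int8(w, frac_bits=6):
    # Round each weight to the nearest multiple of 2**-frac_bits and
    # clip to the signed 8-bit range [-128, 127].
    scale = 2.0 ** frac_bits
    return np.clip(np.round(w * scale), -128, 127).astype(np.int8)

def dequantize_int8(q, frac_bits=6):
    # Map the integer codes back to approximate float32 values.
    return q.astype(np.float32) / (2.0 ** frac_bits)

w = (0.5 * np.random.randn(4, 4)).astype(np.float32)
q = quantize_int8(w)        # int8 codes: 4x less memory than float32
w_hat = dequantize_int8(q)  # approximate weights used in compute
print("max quantization error:", np.abs(w - w_hat).max())
```

Beyond the memory savings, 8-bit fixed-point arithmetic maps naturally onto FPGA logic, which is one reason low-precision formats are attractive on that hardware.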

The goal of sparsity is to remove as many unneeded parameters as possible, shortening the time needed to reach a desired level of accuracy.
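The article does not say which pruning criterion Deep AI uses; magnitude pruning is the most common way to impose a high sparsity ratio. A sketch under that assumption (the 90% target and helper name are illustrative):

```python
import numpy as np

def magnitude_prune(w, sparsity=0.9):
    # Zero the smallest-magnitude weights so that roughly `sparsity`
    # of the entries become exactly zero.
    k = int(w.size * sparsity)  # number of weights to remove
    if k == 0:
        return w.copy()
    flat = np.abs(w).ravel()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    return np.where(np.abs(w) <= threshold, 0.0, w)

w = np.random.randn(256, 256).astype(np.float32)
w_sparse = magnitude_prune(w, sparsity=0.9)
print("achieved sparsity:", (w_sparse == 0).mean())  # ~0.90
```

Zeroed weights contribute nothing to matrix multiplies, so hardware that skips them can cut both compute and memory traffic roughly in proportion to the sparsity ratio.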

Deep AI said it uses new algorithms to compensate for the lower precision of the 8-bit fixed-point data type with high sparsity, thereby maintaining accuracy while accelerating the model-training process.

The FPGA hardware is designed to be transparent to data scientists: the framework supports Keras, PyTorch, TensorFlow and other deep learning frameworks, the company added.
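The article gives no API details, but “transparent” presumably means a model built in a stock framework needs only a final hand-off step. A sketch under that assumption, with a hypothetical `deepai.compile_for_fpga` call standing in for whatever the real interface is:

```python
import tensorflow as tf

# The model is defined and trained with the standard Keras API;
# nothing here is FPGA-specific.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Hypothetical hand-off: compile the trained graph to fixed-point
# kernels for a Xilinx Alveo card. `deepai.compile_for_fpga` is a
# placeholder name -- the article does not document the real API.
# fpga_model = deepai.compile_for_fpga(model, target="alveo-u250")
```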

The startup’s edge training and inference approach runs on Xilinx Alveo accelerator cards along with PCIe add-in cards from server vendors. Along with Xilinx, Deep AI said it is working with Dell Technologies (NYSE: DELL) and datacenter infrastructure vendor One Convergence.

The edge trainer is available now for on-premises deployments running on Xilinx FPGA cards. A cloud-based version running on Xilinx FPGA-as-a-service instances will be released during the first quarter of 2021.

The three-year-old company makes the case that the data fed into cloud platforms to update training models and answer inference queries is generated mostly at the edge. Moving all those data sets to and from the cloud results in “unsustainable network bandwidth, high cost and slow responsiveness” while compromising data privacy and security and reducing “device autonomy and application reliability.”

The framework is promoted as a “holistic” deep learning approach for training and inference at the network edge, from 5G cell sites to mobile devices. The result, the startup claimed, is “real-time retraining of [models] in parallel to online inference on the same device.”

Moshe Mashali, Deep AI’s CTO and co-founder, was scheduled to detail the edge training and inference framework during this week’s Linley Group fall processor conference.

“Deep AI has demonstrated impressive capability to address the challenges of fixed-point training for deep learning models,” said Ramine Roane, a marketing vice president at Xilinx.

An engineer from Intel Corp. (NASDAQ: INTC), which owns Xilinx’s FPGA rival Altera, was also scheduled to speak at the processor forum on Intel’s FPGA architecture “optimized for AI acceleration.”

Recent items:

How Sparse Data Can Drive Information Density

When Dense Matrix Representations Beat Sparse
