September 17, 2021

SambaNova Brings Custom Silicon To Bear on High-End AI Workloads

With its own custom silicon for AI workloads and a $5 billion valuation, it seems likely you’ll be hearing more about the Silicon Valley startup SambaNova Systems and its complete AI stack in the years to come.

SambaNova Systems was founded in 2017 by an all-star cast of processor experts, including Rodrigo Liang, who led the development of 12 generations of SPARC processors at Sun Microsystems and Oracle; Stanford University professor Kunle Olukotun, who’s been called the “father of the multi-core processor”; and Chris Ré, a Stanford associate professor who was awarded the MacArthur Fellowship.

At SambaNova, these chip heavyweights developed their own custom silicon. According to SambaNova Vice President of Product Marshall Choy, existing processors just don’t cut it for modern AI workloads.

“We prototyped this stuff on CPUs, GPUs, FPGAs–you name it–and it quickly became clear that with AI being more probabilistic and less deterministic than transactional processing, all these other traditional processor architectures just weren’t right,” Choy said. “There’s too much overhead for loads and stores and stuff like that, and not enough flexibility and configurability of the silicon. And so we thought, ‘Oh [shoot], we gotta build another chip!’”

But don’t make the mistake of thinking that SambaNova is just another chip company. While it did develop its Reconfigurable Dataflow Unit (RDU) with a 7nm process, and contract with TSMC to manufacture it, the company doesn’t actually sell the chip. Instead, the company built a complete machine learning stack around this processor.

SambaNova Systems co-founders (l to r): Chief Technologist Kunle Olukotun; CEO Rodrigo Liang; and Chris Ré, head of engineering

The company sells this combined hardware and software stack in one of two ways: as pre-assembled racks that companies can roll into their data centers, which it calls DataScale; or as a software-as-a-service (SaaS) offering called Dataflow-as-a-Service (DaaS), where all customers do is call the stack via APIs. (Customers can also have the hardware behind the DaaS offering installed on-prem behind their firewall and managed by SambaNova, providing a blended approach.)

What sets SambaNova apart from other vendors chasing AI opportunities is its capability to deliver accuracy and performance at scale for computer vision, NLP, and machine learning projects, according to Choy.

For example, in computer vision, its DataScale and DaaS offerings are able to train and infer on very high-resolution images, including those 4K and above. By comparison, most other commercially available solutions require the image to be downscaled or chopped up into multiple images before it will fit into memory, Choy said.

“We can train a model with what we call the true resolution of the image,” he said. “So without down-sampling it, without tiling it, all the way up to 60k by 40k images generated by a satellite and anything below that.”

While customers can make their AI work by downscaling images, they potentially lose valuable accuracy, Choy said. Tiling an image also means hand labeling many more images before feeding them into the model, he said. And it runs the risk of splitting an important detail across two tiles, so the model could miss the cancerous tumor or manufacturing defect it was designed to detect.
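To put rough numbers on that tradeoff, here is a back-of-the-envelope sketch in Python. The three-channel, one-byte-per-value pixel format and the 1,024-pixel tile size are assumptions chosen for illustration, not figures from SambaNova; only the 60k-by-40k image size comes from Choy's example.

```python
import math

def raw_bytes(width, height, channels=3, bytes_per_value=1):
    """Size of one uncompressed image in bytes (pixel format is an assumption)."""
    return width * height * channels * bytes_per_value

def tile_count(width, height, tile):
    """Number of square tiles of side `tile` needed to cover the whole image."""
    return math.ceil(width / tile) * math.ceil(height / tile)

def straddles_tile_boundary(box, tile):
    """True if a feature's bounding box spans more than one tile.

    `box` is (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
    """
    x0, y0, x1, y1 = box
    return (x0 // tile != (x1 - 1) // tile) or (y0 // tile != (y1 - 1) // tile)

WIDTH, HEIGHT = 60_000, 40_000   # satellite image size quoted in the article
TILE = 1_024                     # hypothetical tile size

size = raw_bytes(WIDTH, HEIGHT)
tiles = tile_count(WIDTH, HEIGHT, TILE)

print(f"raw image: {size / 1e9:.1f} GB")
print(f"tiles to label at {TILE}px: {tiles}")
print("feature split across tiles:",
      straddles_tile_boundary((1000, 500, 1050, 540), TILE))
```

Under these assumptions, a single image is about 7.2 GB before any activations or gradients are counted, and tiling it produces 2,360 separate pieces. A 50-pixel-wide feature that starts at x=1000 lands on both sides of the first tile boundary, which is the kind of split Choy is describing.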

With 1.5TB of memory per RDU, SambaNova is able to bring large amounts of memory to bear on AI problems (Source: SambaNova Hot Chips presentation)

“That’s a core advantage of something like this,” Choy said of SambaNova’s approach. “You basically get out of memory errors with other platforms. So it’s literally enabling people to do things that they cannot do today and deliver results that were unattainable prior.”

Among the handful of customers that SambaNova can disclose are a pair of national laboratories. Lawrence Livermore National Lab is using a DataScale cluster for a pair of workloads: a modeling and simulation workload for physics research, and another for anti-viral research for COVID-19. The system is paired with LLNL’s Corona supercomputer.

“We’re offloading certain parts of the larger mod-sim workload onto a machine learning framework,” Choy said. “We’re doing large outer loops of training with many, many dozens of inner loops of inferencing, and then feeding the results back to the main simulation, which is then speeding up the overall simulation by about 5x, according to the customer.”
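The loop structure Choy describes can be sketched with a toy example. This is not LLNL's code or SambaNova's software: the "expensive physics" function, the straight-line surrogate model, and all loop counts are hypothetical stand-ins, picked only to show an outer training loop wrapped around many inner inference steps that feed back into the simulation state.

```python
def expensive_physics(x):
    """Stand-in for a costly simulation kernel (hypothetical)."""
    return 0.5 * x + 1.0

def fit_line(samples):
    """Least-squares fit of y = a*x + b to (x, y) pairs: the 'training' step."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b

def run(outer_loops=3, inner_loops=24, dt=0.1):
    """Advance a toy simulation, calling the expensive kernel only to retrain."""
    state = 0.0
    expensive_calls = 0
    surrogate_calls = 0
    for _ in range(outer_loops):
        # Outer loop: sample the expensive kernel near the current state, retrain.
        xs = [state + offset for offset in (-1.0, 0.0, 1.0)]
        samples = [(x, expensive_physics(x)) for x in xs]
        expensive_calls += len(samples)
        a, b = fit_line(samples)
        # Inner loops: cheap inference, with results fed back into the state.
        for _ in range(inner_loops):
            state += dt * (a * state + b)
            surrogate_calls += 1
    return state, expensive_calls, surrogate_calls

final_state, expensive_calls, surrogate_calls = run()
print(f"expensive kernel calls: {expensive_calls}, surrogate calls: {surrogate_calls}")
```

In this sketch the simulation takes 72 steps but touches the expensive kernel only nine times. Any real speedup depends on how costly the replaced kernel is and how accurate the trained model stays between retraining rounds, neither of which this toy attempts to model.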

Argonne National Lab also has a DataScale deployment in its AI testbed.

Other current customers include unnamed banks, which are using SambaNova offerings for anomaly detection and fraud detection, as well as to speed up claims processing. SambaNova also has customers in the high-speed trading arena, but Choy doesn’t know what they’re using it for. “I have no idea what their model is,” he said. “They’ll never tell anybody.”

Organizations with more established data science programs will be more likely to buy the shrink-wrapped DataScale offering, enabling their teams of data scientists to bring their own in-house models developed in Python and PyTorch, and benefit from the increases in performance and accuracy that SambaNova can provide, without the overhead and complexity of assembling, integrating, and maintaining their own infrastructure.

“And then there’s many other people who are purely looking at outcomes,” Choy said. “What do they care if it’s BERT model, an LSTM model, or a GPT model for language processing? They just want to have the best results. And so they’re basically offloading all that work to SambaNova and we’re just providing a results-oriented outcome to consume.”

These types of customers are more likely to buy the DaaS offering, which the company introduced in late 2020.

SambaNova can train and infer on images with up to 50,000 pixels across (Source: SambaNova Hot Chips presentation)

“We had a bunch of other folks that we were talking to say, ‘look, this sounds really great, but…I’m not Google. I don’t have 3,000 data scientists. I don’t have 300 data scientists. I don’t even have 30. I’ve got three [data scientists] and budget plans to expand that team to six people in the next year or so. And so how do I use this?’

“That’s where we said, look, we’re just going to up level the abstraction level of the system beyond the hardware, beyond the models themselves, and just give you API calls,” he continued. “This makes it accessible to people who maybe don’t know much about AI at all.”

To be sure, SambaNova is not a silver bullet for AI. It’s not handling every aspect of the machine learning process. It’s up to customers to bring good, clean data to the party. And as Choy explained, the company isn’t providing MLOps tools or anything like that (although it is looking to participate in that growing ecosystem).

But if your data is in fairly reasonable shape, the company can help you automate decisions with it using AI.

“I’ve got a bounty of PhDs who are keeping up with and driving the latest trends in these areas,” Choy said. “We give you the model. You don’t have to worry about model selection, model tuning, model maintenance, with all the cost and time related to that. We just [run] your custom data sets.”

In April, the Palo Alto, Calif., company announced the closing of a Series D round in the amount of $676 million at a valuation of $5.1 billion. The round was led by SoftBank, with participation from new investors Temasek and the Government of Singapore Investment Corp. (GIC), along with existing investors BlackRock, Intel Capital, GV (formerly Google Ventures), Walden International, and WRVI.

While building your own chip is a capital-intensive business, the more than $1 billion in total investments ($1.1 billion to be exact) shows that venture capitalists have a lot of faith in SambaNova’s approach. With AI expected to generate trillions of dollars in new value in the years to come, it may not be a bad investment.
