June 14, 2023

OctoML Introduces New Compute Service to Unlock Generative AI Innovation

SEATTLE, June 14, 2023 — OctoML today announced OctoAI, the industry’s first self-optimizing compute service for AI. The new platform offers developers fully managed cloud infrastructure designed to abstract away the complexity of building and scaling AI applications. OctoAI provides the freedom to run, tune and scale the models you choose, including off-the-shelf, open-source software (OSS) and custom models. With OctoAI, developers now have easy access to cost-efficient and scalable accelerated computing, so they can focus on building high-performance cloud-based AI applications and delivering great user experiences for their customers.

To help developers quickly build on the latest and greatest models, OctoAI is also introducing a library of the world’s fastest and most affordable generative AI models, powered by the platform’s model acceleration capabilities. OSS foundation model templates available at launch include Stable Diffusion 2.1, Dolly v2, Llama 65B, Whisper, Flan UL, and Vicuna.

“AI is no longer a novelty, it’s real business. But efficient compute is critical to making it viable,” said Luis Ceze, CEO, OctoML. “Every company is scrambling to build AI-powered solutions, yet the process of taking a model from development to production is incredibly complex and often requires costly, specialized talent and infrastructure. OctoAI makes models work for businesses, not the other way around. We abstract away all the complexity so developers can focus on building great applications, instead of worrying about managing infrastructure.”

Ceze added, “Our early OctoAI customers are using generative AI models like Stable Diffusion, FILM, and Flan UL to build a huge variety of applications. But they all share two things in common: first, customization is fundamental to delivering unique experiences for their customers, which is how they differentiate. Second, they require the ability to scale their services quickly, leveraging flexible hardware options from NVIDIA GPUs to specialized AI silicon like AWS Inferentia2.”

Features and benefits of OctoAI include:

Ease of use: Choose from a library of ready-to-use templates for popular open-source models to simplify deployment. Select and customize (fine-tune) models to meet specific requirements. Easily integrate with app and model development workflows.
Efficiency: Run, tune and scale off-the-shelf, open-source software (OSS) and custom models. Automated hardware selection lets you decide on price-performance tradeoffs.
Freedom: Upgrade to new models as they emerge. Bring your own custom models. No lock-in to any model or service.

OctoML is hosting a virtual event to unveil its new service today, Wednesday, June 14, at 10 a.m. PT. To register, please visit this page.

About OctoML

OctoML is on a mission to make AI more accessible and sustainable so it can be used to improve lives. Our platform, OctoAI, is a self-optimizing compute service that delivers the most efficient AI infrastructure for open-source and custom models. With OctoAI, developers get a managed infrastructure that rivals closed model-API vendors, delivers optimal price/performance, and grants the freedom to customize, integrate, and adopt future generative AI models.

Source: OctoML