November 29, 2023

OctoML Launches OctoAI Text Gen Solution

SEATTLE, Nov. 29, 2023 — OctoML announced the launch of its OctoAI Text Gen Solution to empower application builders to run and scale applications on their choice of Llama 2 Chat, Code Llama Instruct and Mistral Instruct models—all on one unified API endpoint.

The new release offers the fastest fleet of accelerated open source LLMs, including numerous configurations of Llama 2 and Mistral, as well as the option to bring your own fine-tuned Llama 2 models. OctoAI’s Text Gen Solution, together with the OctoAI Image Gen Solution, now offers a flexible “model-cocktail” alternative to monolithic multi-modal models, enabling developers to build highly composable multi-modal applications.

“There’s no one-size-fits-all approach to building text generation applications,” said Luis Ceze, CEO of OctoML. “And not every use case calls for an inefficient, costly mega-model. There are many instances where a smaller, fine-tuned model can get the job done with less overhead. OctoAI Text Gen gives app builders the flexibility to mix their own model cocktail using OSS models, or run their own model variant if that’s the best fit.”

With the OctoAI Text Gen Solution, developers can now easily run inferences against multiple OSS model families, sizes, and variants—all through one scalable, production-grade API endpoint. This allows models to be swapped with minimal code changes, a seamless approach that has resonated with early adopters given today’s focus on evaluating and combining multiple OSS models. In addition, OctoAI’s enterprise tier lets customers work with the team on contractual latency SLAs and on private network connectivity to their environments.
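
Because the endpoint is unified across model families, switching from one model to another is essentially a one-string change in the request. The sketch below is a minimal illustration of that pattern, not OctoML’s official client code: the endpoint URL, the model identifiers, the OCTOAI_TOKEN environment variable, and the exact request/response schema are assumptions for illustration and may differ from OctoAI’s documentation.

```python
# Minimal sketch of calling a unified text-gen chat endpoint and swapping models.
# Assumptions (not confirmed by this announcement): the endpoint URL, the model
# identifiers, and an OCTOAI_TOKEN environment variable holding an API key.
import os
import requests

ENDPOINT = "https://text.octoai.run/v1/chat/completions"  # assumed URL


def chat(model: str, prompt: str) -> str:
    """Send one chat-completion request and return the generated text."""
    response = requests.post(
        ENDPOINT,
        headers={"Authorization": f"Bearer {os.environ['OCTOAI_TOKEN']}"},
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 256,
        },
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]


# Swapping model families or sizes is a one-line change to the model string.
print(chat("llama-2-13b-chat", "Summarize what a unified LLM endpoint is."))
print(chat("mistral-7b-instruct", "Summarize what a unified LLM endpoint is."))
```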

Benefits and features:

  • Unparalleled Speed and Cost Efficiency: Early results show speeds of up to 169 tokens per second on the popular Code Llama 34B model, with no quantization and before applying optimizations like batching—all at the best per-token prices available today.
  • Broadest Optionality with OSS LLMs: The most comprehensive set of production-ready LLMs, including your choice of Llama 2, Code Llama, and Mistral variants—all delivered on one unified API endpoint.
  • Robust Delivery and Proven Scalability: More than a billion customer inferences served, with individual customers running more than one million inferences per day, and 10X usage surges handled reliably with proven performance.
  • Flexible “model-cocktail” approach to multi-modal needs: The Text Gen Solution complements OctoAI’s recently launched Image Gen Solution and all the models available in the OctoAI compute service, empowering customers to easily build multi-modal applications using their preferred mix of OSS models, as demonstrated in the OctoStudio demo application walkthrough (see the sketch after this list).
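
As a rough illustration of the “model-cocktail” pattern referenced above, the sketch below chains a text-generation call into an image-generation call, each behind its own endpoint. It is not the OctoStudio demo itself: both endpoint URLs, the model identifier, and the response fields (for example, image_b64) are assumptions made for illustration.

```python
# Illustrative "model cocktail": a text model writes an image prompt, then an
# image-generation endpoint renders it. URLs, model names, and payload fields
# below are assumptions for illustration, not OctoML's documented API.
import base64
import os
import requests

HEADERS = {"Authorization": f"Bearer {os.environ['OCTOAI_TOKEN']}"}

# Step 1: text generation (e.g. a Mistral Instruct variant) writes a scene prompt.
text_resp = requests.post(
    "https://text.octoai.run/v1/chat/completions",  # assumed endpoint
    headers=HEADERS,
    json={
        "model": "mistral-7b-instruct",  # assumed model identifier
        "messages": [
            {"role": "user",
             "content": "Write a one-sentence prompt for a cozy winter cabin scene."}
        ],
    },
    timeout=60,
)
text_resp.raise_for_status()
scene_prompt = text_resp.json()["choices"][0]["message"]["content"]

# Step 2: image generation (e.g. SDXL) renders that description.
image_resp = requests.post(
    "https://image.octoai.run/generate/sdxl",  # assumed endpoint
    headers=HEADERS,
    json={"prompt": scene_prompt, "num_images": 1},
    timeout=120,
)
image_resp.raise_for_status()
image_b64 = image_resp.json()["images"][0]["image_b64"]  # assumed response field
with open("scene.png", "wb") as f:
    f.write(base64.b64decode(image_b64))
```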

“The LLM landscape is changing almost every day, and we need the flexibility to quickly select and test the latest options,” said Matt Shumer, CEO of Hyperwrite. “OctoAI made it easy for us to evaluate a number of fine-tuned model variants for our needs, identify the best one, and move it to production for our application.”

OctoAI Text Gen customers can also bring their own fine-tuned Llama 2 variant or checkpoint and run it at low-latency at massive scale. This BYO model capability allows for a high degree of customization to align with specific requirements of customer projects.

About OctoML

OctoML is on a mission to make AI more accessible and sustainable so it can be used to improve lives. Our platform, OctoAI, delivers generative AI infrastructure for app builders to run, tune, and scale the models that power AI applications. With the fastest foundation models on the market, including Llama 2, WhisperX, and SDXL, end-to-end solutions, and world-class ML systems under the hood, developers can focus on building apps that wow their customers without becoming AI infrastructure experts.


Source: OctoML
