June 11, 2022

LXT Report Highlights Importance of High Quality Training Data for AI Initiatives

AI training data vendor LXT has released a new report, “The ROI of High-Quality AI Training Data.” The report examines how much value companies are getting from their AI initiatives.

“With the massive amounts of investment in AI made by enterprises in recent years–and the acceleration of digital transformation across organizations to address shifts in user behavior–our goal was to understand the impact of these trends in terms of AI maturity, along with the drivers for AI and the success factors for AI initiatives,” the company said.

This new report is a follow-up to another LXT report, “The Path to AI Maturity,” released in February of this year. The new report features highlights of the previous report for context, including the fact that 40% of surveyed enterprises have moved beyond the experimental phase of their AI initiatives into a deployment phase where they are beginning to see demonstrable ROI.

This chart shows Gartner’s AI maturity model, which LXT asked companies to use when rating where they are in their AI journey. Source: LXT

A key finding of “The ROI of High-Quality AI Training Data” is that companies in the most mature stages of AI implementation named high-quality training data as the biggest factor in the success of their AI projects. LXT uses Gartner’s AI maturity model, in which Systemic and Transformational are the highest levels.

The survey revealed that many organizations are allocating a large percentage of their AI budgets to training data. The report states that four in ten companies dedicate 70% or more of their AI budgets to training data, including data collection, data processing, and data annotation. On average, training data accounts for 59% of AI budgets.

Additionally, two-thirds of businesses expect their training data needs to increase over the next five years. As more AI models are deployed, training data will be needed to update them periodically.

Why is this investment increasing? The report looks at how ROI is measured based on four key factors listed by respondents: operational efficiency (65%), cost reduction (64%), error rate reduction (59%), and improved reputation (55%). The report notes that most respondents chose multiple factors when answering this survey question.

Operational efficiency matters because, without reliable, high-quality training data, AI models do not move into production as quickly. Costs fall when companies can move quickly to deployment while avoiding delays or rework. Furthermore, using higher-quality training data for machine learning models can lead to increased accuracy and fewer errors. More accurate AI models bring increased customer satisfaction, which can be a boon to a company’s reputation in the marketplace.

This chart shows where ROI can be found for companies across the AI maturity spectrum. Source: LXT

Another finding is that companies just starting AI initiatives tend to focus on cost and error rate reduction because they need to prove to senior executives that AI projects are worthwhile. Companies with more mature AI strategies tend to focus on operational efficiency and business transformation as their sources of ROI.

Finally, the report notes that 99% of respondents said they use third-party vendors for training data. The key reasons for doing so include trust, reach, longer strategic partnerships, and speed. Experienced partners build trust by delivering a training data pipeline that allows projects to be completed on time and within budget.

Third-party data partners can also offer extended reach, about which the report says: “They rely on their third-party data partners to provide the proper scope of training data for their AI models in terms of language coverage and in-country experience in a wide range of locales. This point is important, as it shows that businesses are aware of the fact that training data is not something they can readily engineer on their own.”

The report says that companies also see third-party vendors as key collaborators and not just suppliers of training data, especially for larger, long-term projects. Building strategic partnerships can increase the chance of reliable and speedy deployment of AI projects. Speed was noted as another deciding factor, particularly for organizations with mature AI programs.

Companies reported many factors involved in choosing third party vendors. Source: LXT

The report concludes by reiterating that its findings show high-quality training data is important for companies at all stages of AI maturity. Investing in it can deliver ROI in several forms, including operational efficiency, lower costs, reduced error rates, and an improved organizational reputation.

The survey was commissioned by LXT and included responses from 200 senior decision-makers within U.S. organizations. To view the report in its entirety, visit this link.
