

Amazon Web Services used a big data conference in the backyard of some of its largest government customers to showcase its AI and machine learning tools that are helping to funnel ever-larger volumes of data into its storage and computing infrastructure.
Making a pitch for better data management tools like metadata systems, AWS executives addressing the conference in Tysons Corner, Va., said the public cloud giant aims to go beyond democratizing big data to “demystify” AI and machine learning.
The combination of organized data and analytics will accelerate the building and deployment of machine learning models, many of which currently never make it to production. Those that are deployed often require up to 18 months to roll out, noted Ben Snively, a solution architect at AWS (NASDAQ: AMZN).
Open source tools for model development often advance a generation or two in the time it takes many enterprises to develop, train and launch a machine learning model, he added.
Snively asserted that the combination of big data and analytics along with AI and machine learning creates a “flywheel effect” in which organized, accessible data leads to faster insights, better products and—completing the cycle—more data.
(Hence, the cloud vendor forecasts as much as 180 zettabytes of widely varied and fast-moving data by 2025.)
As AWS seeks to demystify machine automation technologies and move beyond the current technology “hype phase,” its executives note that deploying machine learning models and, eventually, full-blown platforms remains hard. Among the reasons is “dirty” data that must be cleansed to make it accessible. The company estimates that 80 percent of data lakes currently lack metadata management systems that help determine data sources, formats and other attributes needed to wrangle big data.
That makes the heavy investments in data lakes “inefficient,” stressed Alan Halamachi, a senior manager for AWS solution architectures. “If data is not in a format where it can be widely consumed and accessible,” Halamachi said, machine learning developers will find themselves in “data jail.”
Once big data is wrangled and secured (“Hackers would like nothing more than to engineer a single breach with access to all of it,” Halamachi said), it can be combined with analytics on the inference side to accelerate training of machine learning models, Snively said.
Noting that most machine learning models built by enterprises never make it to production, the AWS engineers pitched several new tools, including the company's SageMaker machine and deep learning stack introduced in November. Snively described SageMaker as a tool for taking the “muck” out of developing machine learning models and said it is also designed to free data scientists from IT chores like standing up a server for model development.
The cloud giant is seeing more experimentation among its customers as they seek to connect big data with machine learning development. “Voice [recognition] systems are here to stay,” Snively asserted, and developers are investigating “new ways of interacting with those systems.”
“It’s really about demystifying AI and machine learning” and getting beyond the “magic box” phase, he added.
Recent items:
AWS Takes the ‘Muck’ Out of ML with Sagemaker
How to Make Deep Learning Easy