AWS Looks to ‘Demystify’ Machine Learning
Amazon Web Services used a big data conference in the backyard of some of its largest government customers to showcase its AI and machine learning tools that are helping to funnel ever-larger volumes of data into its storage and computing infrastructure.
Making a pitch for better data management tools like metadata systems, AWS executives addressing the conference in Tysons Corner, Va., said the public cloud giant aims to go beyond democratizing big data to “demystify” AI and machine learning.
The combination of organized data and analytics will accelerate the building and deployment of machine learning models, many of which currently never make it to production. Those that are deployed often require up to 18 months to roll out, noted Ben Snively, a solution architect at AWS (NASDAQ: AMZN).
Open source tools for model development often advance a generation or two in the time it takes many enterprises to develop, train and launch a machine learning model, he added.
Snively asserted that the combination of big data and analytics along with AI and machine learning creates a “flywheel effect” in which organized, accessible data leads to faster insights, better products and—completing the cycle—more data.
(Hence, the cloud vendor forecasts as much as 180 zettabytes of widely varied and fast-moving data by 2025.)
As the company seeks to demystify machine automation technologies and move beyond the current technology “hype phase,” AWS executives note that deploying machine learning models and, eventually, full-blown platforms remains hard. Among the reasons is “dirty” data that must be cleansed to foster access. The company estimates that 80 percent of data lakes currently lack metadata management systems that help determine data sources, formats and other attributes needed to wrangle big data.
That makes the heavy investments in data lakes “inefficient,” stressed Alan Halamachi, a senior manager for AWS solution architectures. “If data is not in a format where it can be widely consumed and accessible,” Halamachi said, machine learning developers will find themselves in “data jail.”
Once big data is wrangled and secured (“Hackers would like nothing more than to engineer a single breach with access to all of it,” Halamachi said), it can be combined with analytics on the inference side to accelerate training of machine learning models, Snively said.
Noting that most machine learning models built by enterprises never make it to production, the AWS engineers pitched several new tools, including the SageMaker machine and deep learning stack introduced in November. Snively described SageMaker as a tool for taking the “muck” out of developing machine learning models and said it is also designed to free data scientists from IT chores like standing up a server for model development.
The cloud giant is seeing more experimentation among its customers as they seek to connect big data with machine learning development. “Voice [recognition] systems are here to stay,” Snively asserted, and developers are investigating “new ways of interacting with those systems.”
“It’s really about demystifying AI and machine learning” and getting beyond the “magic box” phase, he added.