AWS Looks to ‘Demystify’ Machine Learning
Amazon Web Services used a big data conference in the backyard of some of its largest government customers to showcase its AI and machine learning tools that are helping to funnel ever-larger volumes of data into its storage and computing infrastructure.
Making a pitch for better data management tools like metadata systems, AWS executives addressing a big data conference in Tysons Corner, Va., said the public cloud giant aims to go beyond democratizing big data to “demystify” AI and machine learning.
The combination of organized data and analytics will accelerate the building and deployment of machine learning models, many of which currently never make it to production. Those that are deployed often require up to 18 months to roll out, noted Ben Snively, a solution architect at AWS (NASDAQ: AMZN).
Open source tools for model development often advance a generation or two in the time it takes many enterprises to develop, train and launch a machine learning model, he added.
Snively asserted that the combination of big data and analytics along with AI and machine learning creates a “flywheel effect” in which organized, accessible data leads to faster insights, better products and—completing the cycle—more data.
(Hence, the cloud vendor forecasts as much as 180 zettabytes of widely varied and fast-moving data by 2025.)
As AWS seeks to demystify machine automation technologies and move beyond the current technology “hype phase,” its executives note that deployment of machine learning models and, eventually, full-blown platforms, remains hard. Among the reasons is “dirty” data that must be cleansed to foster access. The company estimates that 80 percent of data lakes currently lack metadata management systems that help determine data sources, formats and other attributes needed to wrangle big data.
That makes the heavy investments in data lakes “inefficient,” stressed Alan Halamachi, a senior manager for AWS solution architectures. “If data is not in a format where it can be widely consumed and accessible,” Halamachi said, machine learning developers will find themselves in “data jail.”
Once big data is wrangled and secured (“Hackers would like nothing more than to engineer a single breach with access to all of it,” Halamachi said), it can be combined with analytics on the inference side to accelerate training of machine learning models, Snively said.
Noting that most machine learning models built by enterprises never make it to production, the AWS engineers pitched several new tools, including the company’s SageMaker machine and deep learning stack introduced in November. SageMaker, described as a tool for taking the “muck” out of developing machine learning models, is also designed to free data scientists from IT chores like standing up a server for model development, Snively said.
The cloud giant is seeing more experimentation among its customers as they seek to connect big data with machine learning development. “Voice [recognition] systems are here to stay,” Snively asserted, and developers are investigating “new ways of interacting with those systems.”
“It’s really about demystifying AI and machine learning” and getting beyond the “magic box” phase, he added.