June 7, 2016

IBM Seeks Data Science Unity with New Spark-Based ‘Experience’

Alex Woodie

(ProStockStudio/Shutterstock)

IBM today launched what it’s calling the first enterprise application for data science collaboration. Called the Data Science Experience, the free, cloud-based offering is aimed at enabling data scientists to perform tasks like prepping data and building machine learning models in an open and shared environment.

IBM likens the Data Science Experience, which was developed on Apache Spark, to an integrated development environment (IDE) where data scientists have a place to do their work. Data scientists can work with the software, which is largely based on the open source Jupyter notebook, using a variety of languages, including R, Python, and Scala. It also exposes Spark’s machine learning library, called MLlib, as well as IBM’s SystemML (which IBM open sourced last year) and certain SPSS algorithms.

The offering is intended to bring data scientists together, says IBM’s vice president of product development for analytics, Rob Thomas.

“The biggest problem that we see in organizations today is data science is a very fragmented profession,” Thomas says. “It’s very much an individual sport, where they have one language they like, one tool, they work on their own and you hope they get to meaningful insight. But there’s not a lot of collaboration.”

IBM intends to jump-start that collaborative process with Data Science Experience. “DSE is about bringing your expertise, whatever it is, bringing your tool, whatever it is–whether you like to work in R or Python or Scala or SPSS or anything–and we’ll give you an open environment built on open source where you can collaborate and share those models,” Thomas says. “Basically it’s how you learn, how you make, and how you collaborate around data science all in one environment.”

In addition to hooking into popular data science languages like R, Python, and Scala, IBM is enabling Data Science Experience users to use tools from its partners, such as H2O and RStudio.

Data Science Experience provides functionality to help users ingest and prep data, as well as train and evaluate machine learning models. It also offers access to 250 curated data sets. The product does all this in a collaborative way that fosters openness among a group of data scientists, Thomas says.

“This is about the open application of analytics and data science and trying to get away from the closed wall idea,” he says. “It’s about bringing data science to the masses and enabling machine learning, given that most organizations are struggling with how to get off the ground with it.”

For example, consider the case of a retailer that has hired two data scientists, each of whom works in a different language and with different data types. “The Scala person builds some models for customer data, and then they find something interesting in the customer data,” Thomas says. “They publish that and say ‘Look what I found here. This is what I built. This is what the model says, here’s the regression.’

“The R guy says ‘That’s interesting. I was working off this product data and I hadn’t seen that. What if we put those data sets together? Does that lead to a different set of outcomes?’” Thomas says. “So they might be trading data sets or trading actual models. They might be sharing the output of the models. But the point is we just enabled a discussion that never happens because today they sit in different parts of the building or different parts of the world, so that’s enabling a discussion that doesn’t happen naturally.”

IBM is running Data Science Experience in the cloud. Customers must upload their data to IBM’s cloud to make it work. It’s currently giving the software away for free, but in the future, IBM may choose to charge for certain features, such as running the machine learning models in real time to score incoming data.

“We’ll see the power of it. It will get bigger over time,” Thomas says. “Where this is going is toward a cloud-based platform of composable data services which can drive all the analytics for an enterprise–whether the data is in Hadoop or a columnar DB or other NoSQL DB–it doesn’t matter.”
