More Tips For Navigating Big Data
As big users of big data try to bridge the widening analytical skills gap via everything from training programs and online courses to graduate school centers of excellence, a range of strategies are emerging to help master big data analytics.
The latest is a tutorial covering the extremely promising job prospects in data analytics, which computer languages and skills are most in demand and, lastly, a tip sheet for mastering big data analytics. The guide was released this week by NGDATA, a vendor whose data platform is designed to sit between data insights and “customer interaction programs.” Among other things, the tool helps users gain and retain customers.
Among the nuggets in the NGDATA guide is the reality that big data skills are useful, even essential, in a growing number of corporate and educational settings. “We are now in an era where gaining access to data is not the problem; the challenge lies in determining which data are significant and why,” noted the editors of the IT journal Ariadne.
Most data mining tools target structured data, they added, an approach that will no longer work in the era of social media. “Today we have so much data that come in an unstructured or semi-structured form that may nonetheless be of value in understanding more about our learners,” the editors wrote.
The emergence of machine learning as a tool for wrangling unstructured data is also highlighted in the guide. Machine learning with big data will duplicate human extrapolation from past experiences to deal with unfamiliar situations “at massive scales,” Peter Levine of tech advisor Andreessen Horowitz noted in a recent blog post on predictive analytics and the role of Hadoop and Spark.
Other experts noted that emerging machine learning platforms and services would help extend big data capabilities, but data scientists and a cadre of trained data professionals remain critical for helping companies “utilize data and data-mining software to translate raw numbers into actionable insights.”
Among the skills stressed in the guidebook are open source languages like R along with Python. “Learn Python, but don’t stop there,” urges a data specialist writing on the web site KDNuggets.com. “There are good reasons why Python is being adopted so widely by computer scientists, and why it’s a data analysis tool of choice for so many, the main one being the ease of learning and using Python,” advised Martijn Theuwissen of DataCamp, which offers online data science courses.
Another lesson is finding ways to cut through the technology hype, or as this web site has noted, find ways of “avoiding shiny objects.”
Collecting and storing data is of course no longer a barrier. Hence the guidebook cautions against collecting data for the sake of collecting. The hard part, a data expert in the financial sector noted, is developing an accompanying strategic plan to leverage big data.