Talend Clarifies Oil and Gas Data Management
Open source integration software company Talend has spent around six years pushing its vision of simple, accessible, affordable data integration via an open source model. The company began with data integration and has more recently expanded to include data quality, MDM and ESB products.
Talend, which claims 400 employees globally, is standing up to the “big guys” in both integration (SnapLogic, TIBCO, IBM) and data management (IBM again, Oracle, Red Hat, Informatica, etc.), with key customer acquisitions including Citi, UPS, AOL and other household-name companies.
This week the company added a customer from the oil and gas software industry: ENSYTE, which has tapped Talend for its Enterprise Data Quality product. According to Talend, ENSYTE will use the product to migrate multiple data sources into the repository of its solutions.
ENSYTE developed its natural gas software solution, GASTAR, specifically to address the needs of the natural gas market. The product manages transactions along the entire natural gas supply chain, giving clients the power to leverage data across the organization into one streamlined business process. While there are several more details about this particular customer here, we wanted to check in with Talend and get a grasp on some of the data management and integration challenges this industry faces.
Ciaran Dynes, Senior Director of Product Marketing at Talend, provided some insight…
Can you give us a no-nonsense view of what Talend delivers to the data-intensive field of oil and gas exploration, distribution and delivery? How big is this market segment to Talend’s business?
Because of the unique measuring equipment it uses, the oil and gas industry needs to handle many atypical data sources that come in a variety of highly custom formats. Another added complexity is the processing of geocoded information. Talend offers a completely unified integration stack that not only covers all aspects of data management (integration, transformation, quality/cleansing, and even master data management for overall consistency) but also offers a high degree of customizability, allowing clients to build connectors to custom data sources with minimal effort.
What are some of the “big data” challenges of the oil and gas industry and how does Talend address these independently and through partnerships like the one with ENSYTE today?
During both exploration and production, massive amounts of data are collected by specialized equipment. A lot of this data (sensor logs, geocoded data, etc.) is semi-structured and is not well suited to being processed by traditional relational databases. Big Data technologies such as Hadoop bring very high value for this processing. In addition, the time value of this data is very high: the faster it can be processed, the higher the return on investment. There is thus a compelling argument for real-time processing.
Talend automates the management of this Big Data by abstracting the MapReduce operations of Hadoop through a graphical user interface that makes it both fast and simple to design data transformation, cleansing and analytics routines.
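To make the MapReduce operations mentioned above concrete, here is a minimal sketch in plain Python of the map, shuffle and reduce steps applied to semi-structured sensor log lines. It is not Talend-generated code and does not use Hadoop; the log format (`well_id|timestamp|pressure_psi`), the sample values and all function names are assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical sensor log lines in an assumed "well_id|timestamp|pressure_psi" format.
LOG_LINES = [
    "W1|2012-05-01T00:00|1500.0",
    "W2|2012-05-01T00:00|1320.0",
    "W1|2012-05-01T00:05|1510.0",
    "malformed line",
    "W2|2012-05-01T00:05|1340.0",
]


def map_phase(line):
    """Map step: emit (well_id, pressure) pairs, skipping lines that do not parse."""
    parts = line.split("|")
    if len(parts) != 3:
        return []
    try:
        return [(parts[0], float(parts[2]))]
    except ValueError:
        return []


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into their mean."""
    return key, sum(values) / len(values)


def run_job(lines):
    """Run map, shuffle and reduce in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


MEAN_PRESSURE = run_job(LOG_LINES)
print(MEAN_PRESSURE)
```

On a real cluster the map and reduce steps would run in parallel across many machines; a graphical design tool of the kind described here would hide that plumbing from the user.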
What are some of the challenges for this industry as it seeks to integrate across different data management and analysis platforms? Where does Talend come into play?
One of the keys is to maintain consistency of the data. Talend’s data management solutions offer a platform that centralizes all data management operations, consolidates metadata, and ensures data consistency.
Can you describe a few key challenges when it comes to integrating legacy data (in context of oil and gas if possible) with new platforms and more importantly, more diverse types of data?
The 3 biggest challenges are:
- Access to the data – find the proper connectors to reliably automate the extraction of all the data, however exotic the platforms on which it resides. Talend is fully extensible, and its open source community has developed many extensions through an open and well-documented API. Adding connectors is easy.
- Cleanse the data – obtain data that is consistent, exhaustive, deduplicated, possibly removing outliers. Ensure consistency across applications and databases.
- Handle the diversity and complexity of the data: structured, semi-structured, geocoded, etc. and integrate all sources seamlessly.
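The cleansing step in the list above (deduplication and outlier removal) can be sketched in a few lines of Python. This is an illustrative example only, not how Talend's data quality product works internally: the record fields (`well_id`, `timestamp`, `pressure_psi`), the sample values, and the choice of a median-absolute-deviation test with a threshold of 5 are all assumptions.

```python
from statistics import median


def deduplicate(records):
    """Keep the first record seen for each (well_id, timestamp) key."""
    seen = set()
    unique = []
    for rec in records:
        key = (rec["well_id"], rec["timestamp"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def remove_outliers(records, field, threshold=5.0):
    """Drop records whose value lies more than `threshold` MADs from the median."""
    if not records:
        return []
    values = [rec[field] for rec in records]
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No measurable spread, so nothing can be flagged as an outlier.
        return list(records)
    return [rec for rec in records if abs(rec[field] - med) / mad <= threshold]


# Hypothetical readings: one exact duplicate and one implausible spike.
RAW = [
    {"well_id": "W1", "timestamp": "t0", "pressure_psi": 1500.0},
    {"well_id": "W1", "timestamp": "t0", "pressure_psi": 1500.0},
    {"well_id": "W1", "timestamp": "t1", "pressure_psi": 1510.0},
    {"well_id": "W1", "timestamp": "t2", "pressure_psi": 1495.0},
    {"well_id": "W1", "timestamp": "t3", "pressure_psi": 1505.0},
    {"well_id": "W1", "timestamp": "t4", "pressure_psi": 99999.0},
]

CLEAN = remove_outliers(deduplicate(RAW), "pressure_psi")
print(len(RAW), "->", len(CLEAN))
```

A median-based test is used here because a single extreme reading would distort a mean and standard deviation; whether an outlier should be dropped, flagged or corrected depends on the application.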