August 18, 2020

Planning an ETL Proof of Concept? Here Is What You Need to Consider

David Lipowitz


Picture this: You are trying to track sentiment toward your product using first-party customer data, social data, and social listening data to determine the success of a new feature. A daily report on this can help your product team make better prioritization decisions, or help your marketing team craft better messaging around a feature to spur product adoption. In your current reporting structure, producing that report takes hours of manual work. To expedite the process, you can create a proof of concept (POC) of an extract, transform, and load (ETL) workflow using cloud-based tools that consolidate and transform your data sources for near-instant reporting and analytics.
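To make the scenario concrete, here is a minimal sketch of what such a workflow could look like at POC scale. It uses only the Python standard library, with an in-memory SQLite database standing in for a cloud data warehouse. The sample records, field names, and the rescaling of star ratings are all assumptions made for illustration; they do not come from any particular product or data source.

```python
import sqlite3

# Hypothetical sample records standing in for the three sources.
# Each source uses its own field names, as real sources usually do.
FIRST_PARTY = [{"day": "2020-08-17", "rating": 4}, {"day": "2020-08-17", "rating": 2}]
SOCIAL = [{"posted": "2020-08-17", "sentiment": 0.5}]
LISTENING = [{"date": "2020-08-18", "score": -0.25}]


def extract():
    """Yield (source_name, raw_record) pairs from every source."""
    for record in FIRST_PARTY:
        yield "first_party", record
    for record in SOCIAL:
        yield "social", record
    for record in LISTENING:
        yield "listening", record


def transform(source, record):
    """Map each source's shape onto a common (day, source, score) row."""
    if source == "first_party":
        # Assumption: a 1-5 star rating is rescaled linearly to [-1, 1].
        return (record["day"], source, (record["rating"] - 3) / 2)
    if source == "social":
        return (record["posted"], source, record["sentiment"])
    if source == "listening":
        return (record["date"], source, record["score"])
    raise ValueError(f"unknown source: {source}")


def load(conn, rows):
    """Write the transformed rows into a single reporting table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sentiment (day TEXT, source TEXT, score REAL)"
    )
    conn.executemany("INSERT INTO sentiment VALUES (?, ?, ?)", rows)
    conn.commit()


def daily_report(conn):
    """Return (day, average score, record count) for each day."""
    query = (
        "SELECT day, AVG(score), COUNT(*) FROM sentiment "
        "GROUP BY day ORDER BY day"
    )
    return conn.execute(query).fetchall()


def run_pipeline():
    conn = sqlite3.connect(":memory:")
    load(conn, [transform(source, record) for source, record in extract()])
    return daily_report(conn)


if __name__ == "__main__":
    for day, average, count in run_pipeline():
        print(day, round(average, 3), count)
```

A real POC would swap the hard-coded lists for connectors to the actual sources and the SQLite table for the target warehouse, but the shape of the test stays the same: can three differently structured sources be consolidated into one table that answers the daily question?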

POCs are frameworks of tests that help a user determine whether a product will function as envisioned, and whether it will provide the long-term value that merits a full investment in the technology and resources. These tests should not attempt to build an entire solution; they should confirm or deny whether an entire solution should be implemented. A POC is a micro-scale implementation that demonstrates the larger project can be done.

The POC should take minimal time. If it works, it creates a starting point for the development project and helps lock in stakeholder buy-in. For data architects and engineers specifically, an ETL proof of concept can demonstrate the scalable, flexible value that cloud-native data integration brings to an organization.

What does a successful ETL proof of concept look like? It’s fast. It’s relatively painless. It shows tangible value that merits a larger implementation. And with the right strategy, it can be done at low cost, or even no cost, to your organization.

Before you launch a POC, you need to identify your use case and give yourself a few guardrails to ensure you don’t burn cycles without proving value. There are several areas of focus where you will need to identify and align stakeholders, technologies, and processes. Before you create a formal project plan, here are some things you need to research and consider for your POC.

Resources

Fill out your POC team with a cross-section of users who will use or benefit from the solution. Identify their part in the process and what their evaluation criteria will be. They should adopt an optimistic approach toward a successful outcome and not be constrained by a reluctance to change.


An in-house POC team should include at least three personas:

  • Decision Maker: The person who will ultimately make the yes-or-no decision on the technology and project once the POC is complete;
  • Technical Lead: The person who validates the integration and implementation of the product and leads the development of the POC;
  • Business User: The person who can appropriately quantify the business benefits that feed into the success criteria.

Your vendor POC team should include an account executive, who works to align expectations for a successful adoption of the product, and a technical resource. The technical resource is the counterpart to your in-house technical lead, backing the project with product support and expertise.

Technical Architecture

When considering a product for the project, you need to ensure that it integrates with your target architecture and addresses any security concerns. For example, do you need to keep sensitive data within your private cloud, or is that less of a concern? Ultimately there will be a set of technical requirements that a product will need to adhere to if it is to meet an organization’s technical governance.

Costs, Installation, and Maintainability

Gain a firm grasp of the product’s cost model. Are there additional costs as you scale, or can you scale the solution without financial penalty? Do you have control of the costs? Does the cost model offer the flexibility your enterprise needs to build the environments your implementation requires?

Use the proof of concept to understand the options and speed of installation. But remember, you are not building a production system.


Maintenance can account for as much as 80% of the total cost of a solution. As data changes and business logic evolves, does the solution remain easy to maintain? Can you quantify any improvements in the amount of maintenance effort required, and compare them to existing processes?

Evaluation Framework

You need to ask yourself: why are we doing this and what are we trying to achieve? Ultimately a proof of concept should give you proof of value. Identifying where this value lies will provide a framework for how you want to define your success criteria. Cost savings are driven by:

  • Increased productivity: Does it speed up the development process? Is the interface easy to use and navigate?
  • Resource availability: Do you need specialists or is it simple and intuitive to use? Does it simplify the process? Does it support an industry standard language, like SQL?
  • Scalability: Do you need to have specialized configuration knowledge to scale the solution for future use? Are there cost penalties or gates that depend on usage? Does it leverage the benefits of the cloud computing approach to simple scaling and consumption-based pricing?
  • Maintainability: Can you develop data flows in a visual way? Are there built-in retry mechanisms, monitoring, and alerting? Do you have control over data and alerting processes? Do you need specialized skills to maintain the product, or can it be done through a standard interface?
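One way to turn the four drivers above into explicit success criteria is a simple weighted scorecard. The sketch below is only an illustration: the weights, the 1-to-5 scoring scale, and the example numbers are assumptions that your decision maker, technical lead, and business user would agree on before the POC starts.

```python
# The four cost-saving drivers from the evaluation framework.
CRITERIA = ("productivity", "resource_availability", "scalability", "maintainability")


def weighted_score(weights, scores):
    """Return the weighted average of per-criterion scores.

    weights: relative importance of each criterion (any positive scale).
    scores:  the POC team's rating of the product on each criterion.
    """
    missing = [c for c in CRITERIA if c not in weights or c not in scores]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    total_weight = sum(weights[c] for c in CRITERIA)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[c] * scores[c] for c in CRITERIA) / total_weight


if __name__ == "__main__":
    # Hypothetical numbers: maintainability matters most to this team.
    weights = {"productivity": 2, "resource_availability": 1,
               "scalability": 1, "maintainability": 4}
    scores = {"productivity": 4, "resource_availability": 3,
              "scalability": 5, "maintainability": 2}
    print(weighted_score(weights, scores))
```

Fixing the weights before the POC begins keeps the evaluation honest: a product that scores well only on the criteria your team cares least about will not clear the bar.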

Use Cases

First, identify the pain points in the current process. Find a small number of typical use cases and keep them simple. Then, choose a wider variety of use cases, including those you’ve identified as particularly painful. When you are ready, evaluate the technology’s ability to scale to larger and more complex projects. This will provide a good indication of how the product performs.

Don’t try to replicate legacy implementations. Take advantage of the features and functionality, including best practices, as directed by the vendor of the product you’re evaluating. This may not be an apples-to-apples comparison, and that’s a good thing. You have the opportunity to think about your existing challenges in a new way, and to redesign solutions for them accordingly.

Vendor Relationship

Evaluate vendor support. Are they willing to help you succeed? Do they value your feedback? Are they customer-obsessed? Once you find answers to your questions in each area of consideration, you should be in a good spot to begin executing your proof of concept.

Overall, you’ll achieve the most benefit by keeping your POC simple and taking a measured and methodical approach. Avoid doing too much, too soon. This will help you and your team identify the cost savings, performance benefits, and pros and cons of the solution at a comfortable pace.

About the author: David Lipowitz is a Solution Architect Team Lead at Matillion with 20 years of software and database engineering experience. Over his career, David has focused on creating organizational success through the judicious use of database technology, developing a deep understanding of software development, database architecture, and agile methodologies. 
