June 4, 2021

Meet Sean Knapp, a 2021 Datanami Person to Watch

Getting data to the right place at the right time has never been more important than it is now. But for many organizations, the data movement task largely remains a manual affair. Sean Knapp founded his company because he knew that automating data pipelines was key to unleashing the power of data.

We recently caught up with Knapp, the company's CEO and a 2021 Datanami Person to Watch.

Datanami: The shortage of data scientists is often cited as a barrier to success in data science and machine learning, but you see data engineering as the real problem. Can you elaborate on your thoughts on the matter?

Sean Knapp: Data science has exploded across departments and job functions in almost every industry, which has definitely created a shortage of data scientists. However, the data scientists that companies do have are often held back by their increasing need for streamlined and scalable access to data, a function typically handled by data engineers. Data engineering is responsible for data pipelines that collect, unify, enrich, and refine data into usable building blocks for analytics.

Unfortunately, there simply aren’t enough data engineers to meet demand. For the companies that do have data engineering talent, these professionals have to devote the majority of their time to maintaining brittle systems and servicing the needs of other teams. When they are finally free to build new data pipelines, prototyping and productionizing the most basic projects takes months, if not longer.

The problem is an inability to scale – not of bytes or records, but of builders and their velocity. A company’s ability to operationally scale data initiatives requires a faster, more reliable, and automated way for businesses to democratize data access across the enterprise, allowing data teams to drive innovation and deliver insights faster. Only then will businesses be able to turn those investments into business success.

Datanami: Apache Spark is at the heart of your offering. Considering all the misplaced hoopla over Hadoop, do you feel confident the same will not occur with Spark?

Knapp: Apache Spark is quite a remarkable technology, and while it is certainly showing its ability to stand the test of time, we do believe that there is no one-size-fits-all answer when it comes to data products and the architectures that power them. Companies are currently grappling with their approach to the data lake versus data warehouse versus data lakehouse, just as they have with batch versus streaming versus micro-batch.

Ultimately, what users want is the benefits of these various approaches, and the flexibility to move between them as their business needs require, without the need to re-architect their entire data strategy. We have invested heavily to give our customers this flexibility, whether it is across clouds or across data silos. Our flex-code data connectors give customers the ability to connect to, and even transition between, data systems with ease. Keep an eye out for us to continue this trend in 2021, with the ability to soon leverage far more underlying platforms for processing data than ever before.

Datanami: What is a common mistake that people make regarding their data, and what sorts of new powers can be unlocked if they’re addressed?

Knapp: A common mistake for many companies today is how their data teams are structured. When it comes to staffing, management is responsible for setting their data teams up for success; however, far too often, management may not have the insight or expertise to hire the right team with the right range of skills, which can lead to many challenges down the road. Commonly, management may have only prioritized hiring data scientists, meaning there is no data engineering or operations talent to support the data science initiatives. This unbalanced ratio of data engineers to data consumers can cripple the productivity of data teams, leading to significant delays in analytics timelines. Another scenario is that management may hire the wrong people, due to not fully grasping the tasks at hand. Data engineering is still an emerging field, which can often lead to missteps in the hiring process. Management may hire individuals into the role of “data engineer,” but far too often, these professionals may just be software engineers or database administrators. To avoid this, management must closely evaluate what personas they have to adequately determine what skills they need on their data team and be open to troubleshooting and course-correcting along the way.

Another common pitfall for data teams is the threat of what I call “accidental ransomware.” Many data engineers – especially early in their career – are solely interested in building their own data systems and platforms from the ground up, relying on open-source technologies to cobble together a proprietary system that will get the job done. The problem with this scenario is that if the data engineer who built it decides to leave the company, it’s extremely unlikely that anyone else in the business will be able to maintain – or frankly, even use – this system. It can reach the point where managers of data teams feel they are being held hostage by these platforms, hence the term accidental ransomware. Thankfully, many of the data professionals who started their careers building these systems over a decade ago – at the peak of open source – have now experienced first-hand just how daunting, time-intensive, and costly the building process can be. This has led many data teams to instead opt for buying solutions to maximize value for their business and avoid any potential accidental ransomware.

Datanami: Outside of the professional sphere, what can you share about yourself that your colleagues might be surprised to learn – any unique hobbies or stories?

Knapp: We have a pretty tightly knit team, so I’m not sure this would come as much of a surprise to my colleagues, but I absolutely love running. My parents were both runners, and even when we were little kids they would take my twin brother and me with them to the local track, and let us play in the long jump pit (aka a sandbox) while they ran. We both competed in cross country and track and field all the way into college and to this day run with each other as often as we can.

One of the absolute best things about running, however, is that it is an incredible way to see a new town, city, or even countryside. I used to travel a lot for work and would be in a different country almost every month with very little free time to see the sights. Tossing on my running shoes for a 6:00 a.m. run around the Imperial Palace in Tokyo, the Opera House in Sydney, or Hyde Park in London was a fantastic way of taking in the sights before a long day of meetings. I would even take a unique route back from Tokyo which had back-to-back red-eyes with a 10-hour layover in Honolulu. I’d use hotel points to get a cheap room in Waikiki for my bags, lace up my running shoes, and run to the top of Diamond Head and back before getting a huge post-run meal, showering, and heading back to the airport.

Every once in a while, however, business travel takes you somewhere quite unique and this led to my absolute favorite run of all time. I had just wrapped up a conference in Monaco – which is an experience in and of itself – and a teammate and I had a day before we flew out. We decided to rent a tiny car and picked a random town on the map way up in the hills called Sospel. Once we arrived, we set a goal: run to Italy. And, to make it more fun still: no roads. And over the course of a long run, we found our way to Italy on tiny trails, train tracks, tunnels, and even bridges (for the trains). It was an unforgettable experience.

To this day, I take my running shoes with me everywhere I travel as there is always some road, trail, or train tracks waiting to be explored.

Knapp is one of 12 Datanami People to Watch for 2021. Interviews with the other honorees are also available.