Starburst Data Modernizes the Processes Behind Data Analytics
Starburst Data gives data analysts the ability to work with and examine various data sets, regardless of their location, without compromising on performance.
We spoke with the company’s Co-Founder and CEO Justin Borgman to learn more about Presto and who some of their customers are. We also touched upon the history of the company and what backgrounds the team has in the Boston tech space.
Colin Barry [CB]: Before we start talking about your current venture, let’s talk about your background. You and the rest of your team have eclectic careers in the tech sector in Boston. Could you share a bit about your and your team’s backgrounds?
Justin Borgman [JB]: I started my career as a software engineer before going to business school at Yale University in 2009. While there, I met Professor Daniel Abadi and a Ph.D. candidate, Kamil Bajda-Pawlikowski, and I got excited about their pioneering research around transforming Hadoop into a true SQL data warehouse. We created Hadapt to turn that vision into reality and I became CEO. We raised $17 million from Bessemer, Norwest, and Accomplice (previously named Atlas Venture), built the business and then sold it to Teradata in 2014. At Teradata, I was vice president and general manager of a business unit focused on open source software, which is where many of the Starburst founders initially became involved with the Presto project.
The Starburst Data team includes some of the best engineers I have had the privilege of working with in my career. Each has deep experience working with data warehousing and analytics, and many of them started the journey with me at Hadapt. Additional team members came from other leading companies based in Boston like Vertica, Netezza, and Ab Initio.
CB: How and why did Starburst Data come together?
JB: Starburst was formed by the leading committers to the open source Presto project with the mission of bringing the power of this platform to the masses. The technology was designed from day one to be a high-performance query engine capable of scaling to the largest datasets in the world (exabyte scale).
Presto separates storage from compute, which means you can perform SQL analytics across data stored anywhere – no data loading is required.
CB: As someone who isn’t the most familiar with how to use SQL, would you be able to explain to me how the Presto platform works? How long was the development process on this?
JB: SQL stands for Structured Query Language, which essentially means that it is a particular way of forming a question that database software can answer. This language has been around for decades, and it’s the language spoken by virtually all of today’s most popular data and business intelligence (BI) tools like Tableau, Looker, MicroStrategy and Power BI.
Data sources change, but analytics stay the same. In other words, the data source “du jour” may have changed from Oracle and Teradata in the 80s and 90s, to Netezza and Vertica in the 2000s, to Hadoop in the 2010s, and now it is focused on cloud object storage like AWS S3. Yet, no matter where the data lives today, you still want to access it via SQL, because that’s the lingua franca of analytics.
Presto lets you do exactly that, and there’s no need to move data anywhere. Leave it where it is or move it at your leisure; it doesn’t matter. Regardless of where the data is stored, you can get fast results to your queries.
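To make the idea of SQL as "a particular way of forming a question" concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in database, not Presto itself, and the `orders` table and its rows are hypothetical sample data invented for illustration.

```python
import sqlite3

# Stand-in database: SQLite in memory, NOT Presto.
# The table name, columns, and rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 30.0), ("bob", 12.5), ("alice", 20.0)],
)

# The question, phrased in SQL:
# "How much has each customer spent in total?"
question = """
    SELECT customer, SUM(amount) AS total_spent
    FROM orders
    GROUP BY customer
    ORDER BY customer
"""

# The database software answers with one row per customer.
answer = conn.execute(question).fetchall()
print(answer)  # [('alice', 50.0), ('bob', 12.5)]
```

The query text is the part a BI tool would generate on the analyst's behalf; the engine that answers it can change without the question changing.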
CB: What are some of the major problems your company is looking to tackle within your space?
JB: At Starburst, we think of our market as a spectrum of companies all trying to speed up their business while managing costs. On one end, you have the cloud-native companies, which are startups or early pioneers in moving data and applications to the cloud, and usually have no legacy technology that’s managed on premises. These companies want a cloud data warehouse that quickly accesses data in cloud object storage (e.g., Amazon S3, Azure Blob Storage, Google Cloud Storage). Presto is the only open source option that can do this.
On the other end of the spectrum are mature, complex enterprises like major banks, retailers, and health insurance companies. They already own “one of each” with respect to the various databases throughout history that I mentioned earlier. These customers need an abstraction layer that lets them perform analytics holistically across those various data silos. Since Presto itself is a query engine and not a database, it’s perfect for this task. Presto treats every data source as an equal citizen and lets you join tables across different systems. So you might have web clickstream data in one source and billing data in another source, and you want to understand how a customer’s use of your website translates to revenue. Traditional solutions require you to move all of this data around and get it into one place to find the answer you’re looking for. This takes weeks or months and occupies many hours of engineering time. This is not the case with Presto. You just query it where it lives.
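The clickstream-and-billing scenario described above can be sketched in a few lines. This is only an analogy under stated assumptions: two separate in-memory SQLite databases, attached under the hypothetical names `web` and `billing`, stand in for two independent source systems, and all table names and rows are invented for illustration. It is not Presto's API, but it shows the shape of a single query that joins tables living in different places.

```python
import sqlite3

# Analogy only: SQLite stands in for a query engine, and two attached
# in-memory databases stand in for two independent data sources.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS web")      # hypothetical clickstream source
conn.execute("ATTACH DATABASE ':memory:' AS billing")  # hypothetical billing source

# Hypothetical sample data in each "source".
conn.execute("CREATE TABLE web.clicks (customer_id INTEGER, page TEXT)")
conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO web.clicks VALUES (?, ?)",
    [(1, "/home"), (1, "/pricing"), (2, "/home")],
)
conn.executemany(
    "INSERT INTO billing.invoices VALUES (?, ?)",
    [(1, 100.0), (2, 40.0)],
)

# One query relates website usage to revenue across both sources.
# Each side is aggregated first so that invoice amounts are not
# duplicated by the join.
cross_source_query = """
    SELECT c.customer_id, c.page_views, i.revenue
    FROM (SELECT customer_id, COUNT(*) AS page_views
          FROM web.clicks GROUP BY customer_id) AS c
    JOIN (SELECT customer_id, SUM(amount) AS revenue
          FROM billing.invoices GROUP BY customer_id) AS i
      ON i.customer_id = c.customer_id
    ORDER BY c.customer_id
"""
usage_vs_revenue = conn.execute(cross_source_query).fetchall()
print(usage_vs_revenue)  # [(1, 2, 100.0), (2, 1, 40.0)]
```

Neither table is copied into the other source before the join runs; the query names each source by prefix and leaves the data where it lives, which is the point JB makes about querying data in place.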
CB: Who are some of the typical users of Presto? Are there any use cases that have stood out to you?
JB: Presto saw early adoption with the internet companies and it now powers the analytics for Airbnb, Netflix, Facebook, LinkedIn, Lyft, Twitter and Uber. It has a vibrant community, and the project was designed and developed to be maintained by an independent open source community.
Since we started Starburst, the software’s adoption has expanded to large banks, retailers, and healthcare organizations that want the power of Presto but need additional enterprise features around security, connectivity, and support. Our customer FINRA, the Financial Industry Regulatory Authority, is a great use case example. They had a cloud mandate and migrated everything from on-premises to the cloud, and today they use Starburst to analyze 60 billion stock trades per day.
CB: Starburst is currently working in the non-profit sector with the Presto Software Foundation. Why start a non-profit?
JB: The Presto community has grown tremendously since Facebook first open sourced the software in late 2013. The original creators of Presto, Martin Traverso, Dain Sundstrom and David Phillips, along with engineers from companies including Starburst Data, created the Presto Software Foundation to ensure the project remains independent and community-driven. We have a shared vision and are passionate about this technology. We want to ensure it has staying power for decades to come.
CB: When I hear the name “Starburst” I tend to think of the candy. How did the team come up with the name Starburst Data?
JB: As a lifelong fan of the candy, I don’t hate the association. However, the origin story is that we were trying to think of a name with “star” in it because in a SQL query the asterisk symbol means “anything,” and Presto truly allows you to query “anything.” A good friend of ours had the great idea to combine it with “burst” to reflect the fast performance for which Presto is famous. Putting the two together just felt right. Hopefully, we’re as popular as the pink Starburst candies around Halloween.
CB: Any additional comments you’d like to make?
JB: One thing that’s unique about our story is that we are a profitable, fast-growing software company that has not raised a single dollar in venture capital to date. That’s more rare than a unicorn.