Varada is trying to boost adoption of variants of an open source distributed SQL query engine for big data by releasing a Workload Analyzer it developed under an open source license.
Workload Analyzer for Presto is compatible with both PrestoDB and Trino (formerly known as PrestoSQL). Trino is being advanced by the Presto Software Foundation using a fork of the same open source code base as PrestoDB. Trino was created by former Facebook engineers as part of an effort to make a version of PrestoDB that is governed independently of Facebook. Created in 2019, the Presto Software Foundation includes contributors from Starburst, Arm Treasure Data, Qubole, and Varada, all of which continue to work on the Trino project. Starburst, which employs the original developers of PrestoDB who went on to create Trino, last month raised $100 million in additional funding at a $1.2 billion valuation.
PrestoDB, meanwhile, is now being developed under the auspices of a rival Presto Foundation that is an arm of the Linux Foundation. Also created in 2019, the Presto Foundation’s founders include Facebook, Uber, Twitter, and Alibaba. Efforts to unify the two rival consortiums have thus far proven fruitless.
PrestoDB, originally created in 2012, and subsequently Trino have gained traction in organizations that need to use SQL to query massive amounts of data. Workload Analyzer for Presto from Varada extracts and aggregates query metrics and other data so they can be surfaced in charts that provide greater visibility into the performance of a cluster running either PrestoDB or Trino, Varada VP Ori Reshef said. This makes it easier to identify queries that are consuming large amounts of compute resources, to discover what data is being accessed most frequently, and to work out how to improve overall JOIN performance, regardless of whether an organization is running PrestoDB or Trino.
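To make that kind of analysis concrete, the following is a minimal sketch of aggregating per-query metrics to find the heaviest queries and the most frequently accessed tables. It is not Varada's implementation: the `QueryRecord` fields, the function names, and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryRecord:
    """Hypothetical per-query metrics, as a workload analyzer might collect them."""
    query_id: str
    cpu_seconds: float
    tables: tuple  # names of the tables the query read


def top_cpu_queries(records, n=3):
    """Return the n records that consumed the most CPU time, heaviest first."""
    return sorted(records, key=lambda r: r.cpu_seconds, reverse=True)[:n]


def table_access_counts(records):
    """Count how many queries touched each table (once per query)."""
    counts = defaultdict(int)
    for record in records:
        for table in set(record.tables):
            counts[table] += 1
    return dict(counts)


# Made-up sample workload, for illustration only.
SAMPLE = [
    QueryRecord("q1", 120.0, ("orders", "customers")),
    QueryRecord("q2", 15.5, ("customers",)),
    QueryRecord("q3", 300.0, ("orders", "events")),
    QueryRecord("q4", 42.0, ("orders",)),
]

if __name__ == "__main__":
    for record in top_cpu_queries(SAMPLE, n=2):
        print(record.query_id, record.cpu_seconds)
    print(table_access_counts(SAMPLE))
```

In a sketch like this, the per-table counts are the sort of aggregate that could feed a chart of the most frequently accessed data, while the CPU ranking points at the queries worth tuning first.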
In December, Varada made available a data virtualization platform that is designed to be deployed in a virtual private cloud. The Varada Data Platform relies on indexing technology, built on Presto code, that breaks data up into nano blocks optimized for the type of data content and structure. That approach allows end users to query data where it resides without having to move it into a central repository. “Our IP is the indexing technology,” Reshef said.
As part of that platform, Varada makes available an enterprise-grade instance of Trino that ensures high availability. Varada has also pledged to contribute code to both the PrestoDB and Trino projects.
It’s not clear to what degree the divide between proponents of PrestoDB and Trino is having any impact on organizations that need a high-speed query engine to access big data repositories based on Hadoop and various NoSQL platforms. SQL is the lingua franca for querying databases in enterprise IT environments. Most enterprise IT organizations are going to prefer a database platform that is SQL-compatible whenever possible because most of the tools employed by end users depend on SQL interfaces.
In the meantime, there’s nothing to prevent any open source project from being forked. The biggest immediate impact of a fork is on how contributors split their time and resources between the two projects. It’s also not uncommon for a fork to become incompatible with its parent project over time.
The level of discord between the two consortiums may give some organizations trying to decide whether to employ PrestoDB or Trino pause. But for most organizations, the overall benefits of relying on open source software continue to outweigh any unfortunate dissension among contributors.