By Jack Plotkin

A Brief History of Database Virtualization

First, the History 

Relational databases were invented in the 1970s and commercialized in the 1980s. Today, they are at the heart of how both Fortune 500 enterprises and startups store and organize data. Every piece of digital data in your life – from your telephone records and emails to your bank accounts and tax returns to your online purchases and travel bookings – is sitting in one or more relational databases.

As the volume of data and the number of technology companies exploded in the 1990s, large enterprises had to manage hundreds of data silos. To do this effectively, enterprises installed data warehouses that would collect data from across many different departments and applications. In turn, these data warehouses were made possible by the next generation of large-scale databases that could support thousands of transactions and terabytes of data. 

 

Next, the Challenge 

Data warehouses swept the corporate landscape in the 2000s. Over the course of that decade, data warehousing grew into a billion-dollar industry, and its vendors penetrated virtually every major enterprise across the globe.

But technology continued to evolve. With the rise of public and private clouds in the latter half of the 2010s, enterprises discovered that switching from on-premises data warehouses to cloud-native ones could yield significant cost, performance, and modernization benefits. There was just one challenge: they were effectively locked in.

Each major data warehouse vendor relied on its own set of unique, proprietary tools, supported its own set of capabilities, and spoke its own dialect of SQL – the query language used to talk to relational databases. As a result, moving to a new, cloud-native data warehouse would require a multi-year, multi-million-dollar effort to manually rewrite every application and connector. Beyond the cost, there were massive risks. The industry was rife with examples of failed migration projects.

 

Framing the Problem 

Enterprises had invested years and millions of dollars into developing applications that could talk to one specific type of data warehouse. Moving to a new data warehouse would mean reviewing and potentially rewriting every query in every application. In addition, it would mean changing the way these applications processed data because the new data warehouse would have a different set of capabilities. 

In the mid-2010s, a group of renegade database engineers asked a simple question: what if applications could be moved from one data warehouse to another without requiring any changes? Founding a company named Datometry, they set their sights on building a universal translator that would enable any application to talk to any database.  

The mainstream database industry shrugged its shoulders: the problem was vast, complex, and would take decades to solve. Undaunted, the Datometry team set out to solve it in just five years.

 

Thinking Outside the Box 

Like many technological leaps forward, Datometry’s success was based on a chain of separate but interrelated innovations. First, the team recognized that all SQL queries, regardless of dialect, could be decomposed into relational algebra expressions. While this understanding existed in the industry, the team refined and optimized the process. 
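To make that idea concrete, here is a minimal sketch (not Datometry's actual design) of a dialect-independent relational algebra tree that can be rendered back into different SQL dialects. The class names and the quoting rule are hypothetical, chosen purely to illustrate why a single intermediate representation decouples applications from any one SQL dialect.

```python
from dataclasses import dataclass

# Illustrative only: a tiny, dialect-independent relational algebra tree.
@dataclass
class Scan:
    table: str

@dataclass
class Filter:
    child: object
    predicate: str  # kept as plain text for brevity

@dataclass
class Project:
    child: object
    columns: list

def to_sql(node, dialect):
    """Render the same algebra tree as SQL for different target dialects."""
    if isinstance(node, Scan):
        return node.table
    if isinstance(node, Filter):
        return f"{to_sql(node.child, dialect)} WHERE {node.predicate}"
    if isinstance(node, Project):
        # A hypothetical dialect difference: one engine quotes identifiers
        # with double quotes, another with backticks.
        quote = '"' if dialect == "ansi" else "`"
        cols = ", ".join(f"{quote}{c}{quote}" for c in node.columns)
        return f"SELECT {cols} FROM {to_sql(node.child, dialect)}"

plan = Project(Filter(Scan("orders"), "amount > 100"), ["id", "amount"])
print(to_sql(plan, "ansi"))      # SELECT "id", "amount" FROM orders WHERE amount > 100
print(to_sql(plan, "backtick"))  # SELECT `id`, `amount` FROM orders WHERE amount > 100
```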

Second, the team saw that SQL query translation was not enough. They would also have to consider feature mismatches between databases, metadata management, and other structural challenges. In other words, when an application is written to take advantage of a specific capability of a database, what happens when the same application has to talk to a database that lacks this capability? 
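As a simplified illustration (not Datometry's actual output), consider a query written for a warehouse that supports a QUALIFY clause for filtering on window functions. A translation layer targeting an engine that lacks QUALIFY could emit a semantically equivalent form that computes the window function in a subquery and filters on it in the outer query:

```python
# Illustrative sketch only: emulating a source-dialect feature on a
# destination that lacks it. Some warehouses (e.g. Teradata) support a
# QUALIFY clause; engines without it need the query rewritten instead.

SOURCE_QUERY = """
SELECT id, amount
FROM orders
QUALIFY ROW_NUMBER() OVER (PARTITION BY id ORDER BY amount DESC) = 1
"""

# A semantically equivalent form a translation layer could emit for a
# destination without QUALIFY: the window function moves into a subquery,
# and the outer query filters on its result.
EMULATED_QUERY = """
SELECT id, amount
FROM (
    SELECT id, amount,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY amount DESC) AS rn
    FROM orders
) ranked
WHERE rn = 1
"""
```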

Third, the team faced a large scalability and concurrency challenge. Data warehouses were required to support hundreds of applications and thousands of daily transactions. Any translation layer would have to be highly performant, completely reliable, and real-time. 

 

Taking a Leap 

The challenge of thousands of simultaneous transactions predicated on the same underlying syntax had long existed in the world of telecommunications: telephone companies must support thousands of concurrent phone calls in the same geographic area. In the late 1980s, a team of engineers at Ericsson Telecom developed the Erlang programming language to solve this problem.

In a flash of insight, the Datometry team realized that Erlang could provide the performance, concurrency, and fault tolerance required for real-time query translation. Notably, Erlang’s design would expedite development and simplify the code by an order of magnitude. At the same time, Erlang was well-suited to support the network protocol and traffic components of communications between applications and databases. 

 

Building for the Future 

Having understood the problem and identified the shortest path to a viable solution, the Datometry team developed a highly decoupled architecture. The idea was a Lego block design where processing could be sent along a broad spectrum of paths depending on the specific characteristics of sources and destinations. 
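Here is a minimal sketch of what such a pluggable pipeline might look like. The stage names and composition are hypothetical, used only to illustrate how source- and destination-specific blocks could be assembled along different processing paths.

```python
from typing import Callable, List

# Each stage takes and returns a request context; stages are independent
# "blocks" that can be composed into different translation paths.
Stage = Callable[[dict], dict]

def parse_source_dialect(ctx: dict) -> dict:
    ctx["tree"] = ("parsed", ctx["sql"])        # placeholder for a real parse
    return ctx

def emulate_missing_features(ctx: dict) -> dict:
    ctx["tree"] = ("emulated", ctx["tree"])     # placeholder feature rewrites
    return ctx

def emit_destination_sql(ctx: dict) -> dict:
    ctx["out_sql"] = f"-- emitted for {ctx['destination']}: {ctx['tree']}"
    return ctx

def build_pipeline(stages: List[Stage]) -> Stage:
    """Compose independent stages into one translation path."""
    def run(ctx: dict) -> dict:
        for stage in stages:
            ctx = stage(ctx)
        return ctx
    return run

# Different source/destination pairs simply assemble different stages.
pipeline = build_pipeline([parse_source_dialect,
                           emulate_missing_features,
                           emit_destination_sql])
result = pipeline({"sql": "SELECT 1", "destination": "cloud_warehouse"})
print(result["out_sql"])
```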

This approach made the solution extensible, allowing new source and destination databases to be supported with ease. Five years of challenging engineering work, patent applications, and sleepless nights later, the Datometry team unveiled Hyper-Q: the world’s first database virtualization engine. Shortly thereafter, the team landed its first commercial client, which was, aptly enough, a major telecommunications company.

 

A Better World 

Today, Datometry is recognized as the pioneer and leader in database virtualization. A number of the world’s largest companies and government organizations, including multinational banks, airlines, retailers, shippers, and technology companies, use Hyper-Q to process millions of mission-critical transactions each month.

Enterprises that have been locked into legacy, on-premises data warehouses for the past decade finally have a viable alternative for modernization. Importantly, this success story would not have been possible – as is so often the case in the history of technological development – without a group of engineers having the courage and conviction to tackle a problem that the rest of the industry considered unsolvable. 

 

About Jack Plotkin

Jack Plotkin is the Head of Product and Engineering at Datometry. Previously, Jack was the founding CTO and head of product at VirtualHealth, where he designed and built the next-generation medical management platform that supports more than 10 million lives across all 50 states. Educated at Harvard, Jack serves as a formal advisor to startups across multiple industries and has delivered solutions to more than 100 of the Fortune 500 companies over the span of his career.