January 18, 2025

Why Are NoSQL Deployments Failing at Scale?

5 min read

Why does technology die? There’s no one answer. Sometimes something better comes along. Other times, the underlying need evolves. Technology that serves the needs of an emerging market may not be enough when the market matures.

That’s what many businesses are discovering about NoSQL. And it’s why so many NoSQL implementations are struggling today.

Not long ago, in the early days of big data, Hadoop was the name on everyone’s lips. Traditional SQL-based data stores were believed to be obsolete, and every venture-funded startup had a NoSQL key-value store under the hood. Tech giants such as Google, Facebook and Yahoo had developed NoSQL technology to manage their rapid growth, so it was only natural for startups to reach for the tools that had led those companies to global success.

But a curious thing happened. The startups that succeeded started tossing their NoSQL databases overboard.

Take the path of HBase, a database that is part of the standard Apache Hadoop package. Modeled on Google’s popular Bigtable, HBase rose in popularity for several years and then steadily declined.

You might expect that a new database arrived in 2017 to supersede HBase (perhaps one that retrieved data faster or could address more information), but this is not the case: HBase still stores and retrieves data with the best of them. The problem is not its raw power; it is the complexity of the problems users are trying to solve.

Early SaaS and big data startups were focused only on the pursuit of customers. They needed an inexpensive way to store and manage large volumes of high-speed data. NoSQL tools like HBase filled that role admirably. But what about asking questions of that data? And how do you keep it consistent?

Eventually, companies built on NoSQL came to the conclusion that they had a serious maintenance problem. They had trouble writing queries. Data became unreliable. New applications were harder and harder to build. NoSQL, which had been so cost-effective initially, began to impose costs as the business got more complicated.

Many of the companies running HBase were no longer startups. They had developed platforms that others used to build businesses. They were hiring data analysts. They were thinking in terms of downtime and SLAs. They didn’t just want to keep data anymore. They were trying to use it.

That was when NoSQL’s limitations became evident — and a real concern.

For HBase, those included:

* Lack of transaction support: Users do not get the ACID guarantees typical of a modern relational database, so data can become corrupt or logically inconsistent.
* Lack of secondary indexes: Finding data by anything other than its key means a brute-force scan. That is not a problem when you only have relatively small amounts of data, but the more data you have and the more data quality decays, the harder it is to find a problem by brute force.
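To make the secondary-index point concrete, here is a minimal sketch in plain Python. It is a toy model, not the HBase client API: a dictionary stands in for a table keyed by row key, and the names `rows`, `scan_by_country` and `build_country_index` are invented for illustration.

```python
# Toy model only: a dict stands in for a key-value table keyed by row key.
# None of these names come from HBase; they are hypothetical.
from collections import defaultdict

rows = {
    "user#001": {"email": "ann@example.com", "country": "DE"},
    "user#002": {"email": "bob@example.com", "country": "US"},
    "user#003": {"email": "cy@example.com", "country": "DE"},
}


def scan_by_country(table, country):
    """Without a secondary index, every row has to be inspected."""
    examined = 0
    matches = []
    for key, row in table.items():
        examined += 1
        if row.get("country") == country:
            matches.append(key)
    return matches, examined


def build_country_index(table):
    """A secondary index maps an attribute value back to row keys."""
    index = defaultdict(list)
    for key, row in table.items():
        index[row.get("country")].append(key)
    return index


matches, examined = scan_by_country(rows, "US")
index = build_country_index(rows)
```

With three rows the full scan is harmless, but its cost grows with the table, while the index lookup does not. Note the catch: if the application maintains such an index itself, every write has to update both the row and the index, and without transactions a failure between the two writes can leave them out of sync.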

Over time, however, these fundamental problems with running NoSQL at scale became impossible to ignore. Some responded by attempting to find a compromise solution. Newer NoSQL databases tried layering structure on top of a key-value architecture like HBase’s, adding transactions and SQL or SQL-like capabilities.

As MIT’s Michael Stonebraker wrote: “At the end of the 2010s, almost every NoSQL DBMS added a SQL interface.” He adds that many of the other NoSQL systems have also been adding strongly consistent (ACID) transactions; as such, the NoSQL message has evolved from “Do not use SQL – it is too slow!” to “Not only SQL” (i.e., SQL is fine for some things).

NoSQL products became more similar to their RDBMS counterparts over time. But a key difference remained: by definition, NoSQL solutions have only a weak data schema or none at all. That allows for fast storage and retrieval, but analytics and transactions still need structure. If the schema isn’t realized within the database, it must be instantiated in the query. Likewise, when data needs to be sharded onto different servers, that change has to be made within the application.
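As a rough illustration of what moving the schema and the sharding into the application can look like, here is a small Python sketch. Everything in it is an assumption made for the example: the field names, the `REQUIRED_FIELDS` mapping, and the hash-based `shard_for` routing are hypothetical and not taken from any particular NoSQL product.

```python
# Hypothetical sketch: schema checks and shard routing done in application code.
import hashlib

# Assumed record layout for the example; a real application defines its own.
REQUIRED_FIELDS = {"user_id": str, "amount_cents": int}


def validate(record):
    """Enforce in the application what a relational schema would enforce in the database."""
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in record:
            raise ValueError(f"missing field: {name}")
        if not isinstance(record[name], expected_type):
            raise TypeError(f"field {name} must be {expected_type.__name__}")
    return record


def shard_for(key, num_shards):
    """Pick a shard from a stable hash of the key (simple modulo routing)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


record = validate({"user_id": "u-42", "amount_cents": 1999})
shard = shard_for(record["user_id"], num_shards=4)
```

Every service that reads or writes this data has to carry the same validation and routing logic. With modulo routing like this, changing `num_shards` also changes where most keys land, which is one reason resharding becomes an application project rather than a database setting.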

This is a problem because the difficulty of changing an existing database discourages new application development. It also makes innovation harder, and few companies are willing to accept that for long.

One of the early adopters of HBase was Pinterest. A Pinterest Engineering blog post describes running “50 clusters, 9000 AWS EC2 instances and over 6 PBs of data” on HBase. As Pinterest grew, however, it decided that HBase no longer worked in its favor: it offered too few features and cost too much to manage. As other businesses came to the same conclusion, it became harder and harder to find HBase deployments that were more than half-heartedly successful.

That may come as a surprise to some. SQL databases have long been accused of being inherently slower and less efficient than NoSQL. But recent developments in cloud computing and horizontal scale-out have brought distributed SQL closer to raw performance parity with its NoSQL counterparts, along with all the benefits of an RDBMS. Rather than focus on one dimension of database functionality (storage and retrieval), distributed SQL seeks to provide high performance across both transactional and analytical workloads, while supporting applications that use advanced SQL queries.

Ironically, as they migrate from NoSQL to distributed SQL, Pinterest and companies like it are following in Google’s footsteps, just as they did when they first adopted NoSQL. TiDB and other distributed SQL databases follow the model of Google Spanner, the software Google created to solve the problems of Bigtable (the technology that led to HBase).

In a way, the SaaS industry is simply recapitulating the journey that Google and other tech giants have taken over the last two decades: one technology (SQL/RDBMS) was apparently made obsolete by another (NoSQL), which is now being replaced by a more modern version of the technology it displaced.

Who is to say the wheel will not turn again? To quote Stonebraker one last time, “What goes around comes around.” Another wave of developers will claim that SQL and the relational model aren’t enough for emerging application domains. People will propose new query languages and data models to solve these problems. But none, he notes, has ever seriously threatened to displace the SQL-based RDBMS.

If nothing else, it’s an important reminder that over the years the traditional relational database has proven extremely effective at absorbing innovation, from clustering in the cloud to vector search. Database trends come and go, but when the dust settles, SQL always seems to remain standing.
