
It’s been a long time since I wrote anything, so I’ll share what I’ve been thinking about more recently. Just consider it a record of work.

Let’s start with an important question: If we were to redesign a new database today from the ground up, what would the architecture look like?

Before I get into the details, let me share my understanding of what developers today (and in the future) expect from a database.

A modern database architecture would likely aim to simplify and unify the underlying technology, making it easier for developers to build and maintain applications. This could involve using a single database technology that is versatile enough to handle a wide range of data types and workloads, as well as having the ability to scale seamlessly as the needs of the application change. Additionally, the architecture would aim to provide a consistent and intuitive developer experience, allowing developers to focus on building the features and functionality of their applications rather than worrying about the underlying technology.

Mapping these trends and pain points onto the system design:

With these assumptions in mind, TiDB has spent the past year re-architecting itself to run as a cloud service, leveraging cloud infrastructure to improve performance and scalability. The recent release of the TiDB Serverless Tier is an important milestone in this process, representing the first iteration of the new engine. From a technical and engineering perspective, this work involved a significant redesign and restructuring of the underlying technology, along with careful testing and optimization to ensure the database delivers the performance and reliability expected of a modern system.

Here are some takeaways from this journey:

Be a service, not software

When you build a new database today, it is obvious that you should offer a cloud service around it (and soon, a Serverless one). Many people, especially database kernel developers, underestimate the complexity of building a cloud service. Some classic arguments: "Isn't it just automated deployment on the cloud?" or "Basically, it's a Kubernetes Operator?"... Actually, it's not. Building a database as a cloud service involves much more than automating deployment. It requires a deep understanding of the needs of the users and the applications that will use the database, and the ability to design and implement a robust, scalable architecture that can support a wide range of workloads. This cognitive shift, from thinking of the database as software to thinking of it as a service, is essential to delivering a user-friendly experience that meets the needs of modern applications on the cloud. Building a cloud service also demands expertise in areas such as cloud infrastructure (the cloud services like S3 and EBS that serve as building blocks), automation, and DevOps, which can be challenging for traditional database developers who may not have experience in these areas.

Traditionally, software has been developed with the assumption of a relatively predictable and deterministic environment. This is particularly true for single-machine software, where all the resources of the computer are available to the program and the execution environment is well-defined. Even with the rise of distributed systems, many software designs still follow this model, using remote procedure calls (RPCs) to connect multiple computers together while still assuming a relatively predictable and controlled environment. However, the emergence of cloud computing has introduced new challenges and complexities, particularly when it comes to managing and scaling resources:

These changes in assumptions bring about technical changes: I think a database on the cloud should, first of all, be a network of multiple autonomous microservices. The microservices here are not necessarily on different machines. They may be physically on one machine, but they need to be accessible remotely, and they should be stateless (no side effects) to facilitate rapid elastic scaling. Let’s look at some examples:

First, storage-compute separation, which has been talked about so much in recent years 🙂. On the cloud, compute is much more expensive than storage, and if compute and storage are tied together, there is no way to take advantage of storage pricing. Moreover, for some specific requests, the demand for compute may be completely disproportionate to the physical resources of the storage nodes (think of heavy OLAP requests that need reshuffle and distributed aggregation). In addition, for distributed databases, scaling speed is an important part of the user experience. Once storage and compute are separated, scaling can in principle be extremely fast, because scaling out becomes: 1. start new compute nodes; 2. warm up the cache, and the reverse for scaling in.
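To make the scale-out steps above concrete, here is a minimal sketch. `SharedStorage` stands in for S3/EBS; all names and structure are hypothetical illustration, not TiDB's actual implementation.

```python
# Sketch: with shared storage, scale-out is "start a node, warm its cache";
# no data has to be rebalanced between nodes.

class SharedStorage:
    """The single source of truth (think S3): survives compute churn."""
    def __init__(self):
        self.objects = {}

class ComputeNode:
    """Stateless: holds only a cache that can be rebuilt from storage."""
    def __init__(self, storage):
        self.storage = storage
        self.cache = {}

    def warm_up(self, hot_keys):
        # Step 2 of scale-out: prefetch hot data into the local cache.
        for k in hot_keys:
            if k in self.storage.objects:
                self.cache[k] = self.storage.objects[k]

    def read(self, key):
        if key in self.cache:
            return self.cache[key]          # fast path: local cache hit
        value = self.storage.objects[key]   # slow path: fetch from storage
        self.cache[key] = value
        return value

def scale_out(cluster, storage, hot_keys):
    # Step 1: start a new compute node. Step 2: cache warm-up.
    node = ComputeNode(storage)
    node.warm_up(hot_keys)
    cluster.append(node)
    return node
```

Scaling in is the mirror image: a node can be dropped at any time, because the only thing it holds is a cache.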

Second, some internal components of the database can be separated out as services, e.g., DDL becomes DDL-as-a-Service. Traditional database DDL sometimes impacts online performance: when adding an index, for example, data backfill is unavoidable, which causes jitter on the storage nodes serving the OLTP load. If we look closely at DDL, we see that it is a global, episodic, recomputable, offline module; given a shared storage tier (e.g., S3), this kind of module is ideal to strip out into a stateless service that shares data with the OLTP storage engine via S3. The benefits are undeniable.
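A sketch of the DDL-as-a-Service idea, under assumed names: a stateless worker builds a secondary index from a snapshot in shared storage, so the backfill scan never touches the storage nodes serving OLTP traffic. The dict stands in for S3.

```python
# Hypothetical DDL-as-a-Service worker: backfill an index from a snapshot
# in shared storage instead of scanning the live OLTP storage nodes.

def backfill_index(s3, snapshot_key, column):
    """Read a row snapshot from 'S3', build an index, publish it back."""
    rows = s3[snapshot_key]                 # snapshot read, not a live scan
    index = {}
    for row_id, row in rows.items():
        index.setdefault(row[column], []).append(row_id)
    index_key = f"{snapshot_key}.idx.{column}"
    s3[index_key] = index                   # the OLTP engine picks this up
    return index_key
```

Because the worker is recomputable, it needs no local state: if it dies mid-run, another instance simply re-runs from the same snapshot.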

There are many similar examples: logging (low CPU usage but high storage requirements), the LSM-Tree storage engine's compaction process, data compression, meta information, connection pooling, CDC, etc., are all targets that can be, and are well suited to be, separated out. In the new cloud-native version of TiDB, we use Spot Instances for Remote Compaction of the storage engine, and the cost reduction is amazing.
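Why can compaction run on a Spot Instance at all? Because, like DDL, it is a recomputable job over data that already lives in shared storage. A sketch, with dicts standing in for S3 objects:

```python
# Sketch of Remote Compaction as a recomputable job: merge sorted runs
# from shared storage and write the result back. Inputs live in S3 and
# the job is idempotent, so if the Spot Instance running it is reclaimed,
# another instance just re-runs the same job.

def remote_compact(s3, input_keys, output_key):
    merged = {}
    for key in input_keys:      # order inputs oldest-first: newer wins
        merged.update(s3[key])
    s3[output_key] = dict(sorted(merged.items()))  # one big sorted run
    return output_key
```

Idempotence is the whole trick: re-running the job after a reclaim produces exactly the same output object, so the scheduler needs no recovery protocol beyond "retry".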

Another important issue to consider when designing a cloud database is QoS (Quality of Service), which in practice includes:

What cloud services can be relied on

Another important topic: which services on the cloud can you rely on? For a third-party database vendor, the cross-cloud experience is a natural advantage, and depending too deeply and tightly on a specific cloud service will cost you that flexibility. So you need to be very careful when choosing dependencies. Here are some principles.

Here are a few examples. For Cloud-Native TiDB, the following choices are made when selecting dependencies:

Storage: S3 is the key. As mentioned above, every cloud has an object storage service speaking the S3 protocol. The benefit of using tiered storage like an LSM-Tree in the database is the ability to leverage different storage media behind one set of APIs, e.g., local disks for hot data in the upper levels and S3 for data in the lower levels, with the compaction process moving upper-level data down to S3. This is the basis of TiDB's storage-compute separation, and only once the data is in S3 are optimizations such as Remote Compaction unlocked. The problem is that S3's high latency means it cannot sit on the main read/write path (an upper-level cache miss can introduce extremely high long-tail latency), but here I am fairly optimistic:
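The tiered layout described above can be sketched as one API over two media. Names and structure are illustrative only, not the actual engine:

```python
# Sketch: hot upper levels on local disk, the bottom level on S3,
# both behind one get() API. Compaction demotes data downwards.

class TieredStore:
    def __init__(self, s3):
        self.local = {}   # upper levels: fast local disk
        self.s3 = s3      # bottom level: cheap, high-latency object storage

    def get(self, key):
        if key in self.local:
            return self.local[key]   # no network round trip
        return self.s3.get(key)      # the long-tail latency lives here

    def demote(self, key):
        """Compaction moving data down: local disk -> S3."""
        if key in self.local:
            self.s3[key] = self.local.pop(key)
```

The read path makes the latency concern visible: any key that has been demoted out of the local tier pays the S3 round trip, which is exactly why cache misses dominate the tail.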

I made a point in a 2020 talk about building a cloud-native database: how well S3 could be leveraged would be key. I think that point is still valid today.

Compute: containers and Kubernetes, for the same reason as S3. Every cloud has a managed K8s service; just as Linux is the operating system of the machine, K8s is the operating system of the cloud. Even with storage-compute separation done, compute, while easier to manage than in the old days, is still sometimes painful. Take the management of resource pools: Serverless clusters need fast launch (or hibernation wake-up), and creating new nodes from zero is too slow, so you need some reserved resources. Or take using Spot Instances for compaction tasks: if a Spot Instance is reclaimed, can you quickly find another to continue the work? The same story plays out for load balancing and the microservice mesh... Although S3 solves the hardest problem for you, state, the scheduling of these pure compute nodes is still painful. If you choose to build your own wheels, you will most likely end up reinventing a K8s. So why not just embrace it?
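The reserved-resources idea mentioned above can be sketched as a warm pool: keep a few pre-booted nodes so a Serverless cluster acquires one instantly instead of cold-starting from zero. All names here are hypothetical.

```python
# Sketch of a warm pool of pre-booted compute nodes. Acquiring from the
# pool is instant; the slow boot happens ahead of time, off the critical
# path of the user's "launch" click.

from collections import deque

class WarmPool:
    def __init__(self, size, boot_fn):
        self.boot_fn = boot_fn
        self.pool = deque(boot_fn() for _ in range(size))  # reserved nodes

    def acquire(self):
        if self.pool:
            node = self.pool.popleft()    # instant: already booted
        else:
            node = self.boot_fn()         # fallback: slow cold start
        self.pool.append(self.boot_fn())  # refill (in reality, async)
        return node
```

In a real system the refill would run in the background and the pool size would track demand; this is exactly the kind of scheduling logic that K8s primitives already give you.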

On the cloud, there is also a big design question: is the file system a good abstraction? The question is really about which layer of abstraction should shield the cloud infrastructure. Before S3 became popular, the large distributed storage systems, especially Google's (BigTable, Spanner, etc.), all chose a distributed file system as their foundation (I think there is a deep trace of Plan 9 here; after all, many of Google's infrastructure gurus came from Bell Labs 😄). So the question is: if we have S3, do we still need a file system abstraction on top? I have not fully thought this through, but I lean towards yes. The reason, again, is caching: with a file system layer, you can cache by file access heat at that layer, which speeds up warm-up during scale-out. Another benefit is that many UNIX tools can be reused directly on top of a file system, reducing the complexity of operations and maintenance.
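The caching argument can be sketched as a read-through cache over S3 keyed by file path, tracking access heat so a new node knows which files to prefetch during warm-up. Purely illustrative:

```python
# Sketch of a file-system-ish caching layer over S3: read-through by path,
# evict the coldest file when full, and expose the heat ranking so a new
# node can prefetch the hottest files first.

from collections import Counter

class CachingFS:
    def __init__(self, s3, capacity=2):
        self.s3 = s3
        self.cache = {}
        self.heat = Counter()        # per-file access counts
        self.capacity = capacity

    def read(self, path):
        self.heat[path] += 1
        if path not in self.cache:
            if len(self.cache) >= self.capacity:
                coldest = min(self.cache, key=lambda p: self.heat[p])
                del self.cache[coldest]       # evict the coldest file
            self.cache[path] = self.s3[path]  # read-through from S3
        return self.cache[path]

    def hottest(self, n):
        """What a new node should prefetch during warm-up."""
        return [p for p, _ in self.heat.most_common(n)]
```

Because the cache key is a file path rather than a database-internal page ID, generic tools can observe and manipulate the cached files, which is the UNIX-tool reuse point above.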

End-user experience is the direction of optimization

One of the things I mentioned in my keynote at TiDB DevCon this year was: how do databases on the cloud fit into the modern developer experience? This is a very interesting topic, because the database has been around for so many years and is still almost the same; SQL is still the king. On the other hand, the applications and tools developers use now are very different from those of decades ago. As an old programmer from the UNIX era, seeing the dazzling development tools and concepts used by the younger generation of developers, I can only marvel that each generation is better than the last. Although SQL is still the standard for operating on data, can database software do more to integrate into these modern application development experiences? Some database vendors are doing a great job of that, like Supabase. In the context of TiDB Cloud, we have also done some work on this:

Serverless is often thought of as a technical term, but I don't think it is. Serverless is really about defining, from a user-experience perspective, what a better product on the cloud looks like. Why should a user care how many nodes you have? Why do I need to care about your database's internal configuration? Why do I have to wait another half hour after I click launch? ...and so on. Our industry has long taken these things for granted, but when you stop and think about it, they are all quite ridiculous. Imagine buying a car, and the seller first hands you an engine repair guide and tells you to read it before you can get on the road; the car won't run fast until you adjust an engine tuning configuration; and every time you start the car, you have to wait. Isn't that strange? For Serverless products, the biggest implications for the user experience come down to three things.

With these three points, the database can be well embedded into other application development frameworks, which is the basis for building a larger ecosystem.

In addition to Serverless, the modern developer experience (DX) includes many other key elements, such as:

Challenges of the future

What I described above is basically no man's land, and it is hard to foresee all the challenges in advance. Let me conclude with a list of some interesting, though certainly incomplete, challenges that I hope will inspire you:

Well, there will always be problems and challenges. But I would say the process of building this system is also the process of understanding it, just as Richard Feynman said:

"What I cannot create, I do not understand."