DB Scholar

YottaDB is a new cloud native, multi-model, multi-tenant NoSQL database built in house here at Tencent. It offers high performance, low cost, flexible data model and rich features. What’s unique about YottaDB is that it give customers the flexibilities to fine tune the trade offs among consistency, performance, availability and cost. In this opening blog, I will discuss YottaDB’s customer value propositions, scenarios and its key architectures and technologies.

Value Proposition

In the last decade, NoSQL industry has mostly fulfilled its original promises in scalability, performance, and easy to develop applications. Many leading vendors also offered rich set of enterprise features such as auto scale, serverless, secondary index, SQL like query, multi-tenancy, and integration with search, AI, and analytics. YottaDB’s goal is to meet and exceed them. It has some unique value propositions that set it apart from others. They are:

Provide tunable flexibilities among consistency, performance, availability and cost.
Can easily deploy anywhere without locking customers to a specific cloud provider or proprietary ecosystem.
Can support multiple models of data APIs and storage mechanisms through a developer friendly extensibility framework while providing a unified management system.
Can handle both large number of small tenants and large tenants while still maintaining strict tenant resource isolation.
Can truly scale compute and storage independently, and horizontally without limit.
Performance still matters. For customers with stringent performance requirements, YottaDB leverages the latest hardware and software technologies to offer unmatched latency and throughput.

Customer tunable at multiple dimensions

In large distributed systems, technology providers often talk about their products as being either a CP or a AP system under the famous CAP theorem. But in real world, with customers and operations in the mix, we actually are constantly juggling among 4 common dimensions: performance, availability, consistency and cost. Such flexibilities are missing in the industry. For example, a product may tout its high availability by using 5 replicas; but fails to mention a high fixed cost associated with it. We don’t believe all the customers need ultra high availabilities. The cost is can be an important concern. By making replica numbers configurable, and by using a mix of SSD and HDD as storage medium, we meet the lowers availability requirements while significantly reduce the cost in storage. YottaDB allows the customer to do trade off among all 4 dimensions. Availability is expressed as multiple 9s SLA; consistency is expressed as 4 discreet levels: 1) eventual, 2) bounded eventual, 3) session and 4) strong. Performance is defined as 3 sub dimensions: a) 99% latency, b) 1% tail latency and c) 99.9% throughput in request per second. The cost is defined in two categories: throughput pricing and storage pricing. Imagine each of these dimensions is presented as a slider in the management console. A customer can freely explore the trade-off while fixing one or two dimensions. He or she will have the transparency and flexibilities for price to performance ratio. It is also a great product education opportunity to understand these trade-offs.

Deploy anywhere easily

Data is 21st century’s oil. Customers want to have tight controls over their data. To diversify the risks or to meet legal requirements, customers may not want to put all their data into a single public cloud provider. They either leverage distributed cloud (for risk, cost and locality reasons) or deploy directly on premise for complete control.

YottaDB can be deployed anywhere easily. YottaDB was built from scratch with cloud native compatible architecture. The ability to deploy anywhere with cloud native infrastructure is baked into the deign at the very beginning. Unlike many legacy data systems who either have a monolithic service architecture or have to undergo retrofitting to cloud native, YottaDB is 100% micro-service based. At the very top, YottaDB is comprised of data plane and control plane. Each can be in its own cluster or be combined. Each micro-service is encapsulated in a container. Multiple related containers then build into Kubernetes artifact such as POD, service, and deployment. YottaDB provides customers with “K8s Cluster Authoring API” that takes your hardware environment’s parameters and automatically invoke Kubernetes APIs in customer environment; or it can generate entire data plane and control plane’s K8s deployment script/yaml files as a intermediate step.

Multiple API and storage models in one product

YottaDB is multi-modeled in two dimensions: data API and data store. This architecture benefits customers in operational efficiency, cost, and extensibility. We converted different data models and APIs into a single model called “Universal Row” and a single set of operators called “Universal Commands”. The design allows for a single implementation of compute (query) engine. Compute’s interface with storage layer is also abstracted as unified APIs. It gives us the flexibility of substituting different indexing and storage strategies

By having a unified data and resource model, YottaDB’s cluster can scale to hundreds or even thousands of nodes with much higher compute and storage densities. Bigger scale and higher densities translate into lowered total cost of ownership. Multi-model also means we share the implementation and support for the replication, backup, resource governance, internal over-the-wire protocol, distributed workflow, billing, and customer portal. This has huge benefits for development and operational efficiencies. Finally, extensibility is realized by the fact that adding a new data model means only having to write a new API converter without touching other components. This conversion happens at the earliest stage at our gateways. From that point of on, in-memory data structure, over-the-wire protocol, and operations are all speaking the same languages. Compared to the “Polyglot persistence” approach, our approach reduces the risk and complexities by not allowing paralleled, heterogeneous code path to exists throughout the API stacks.

Limitless scalability with multi-tenant isolation

Limitless horizontal scalability is one of the fundamental promises of NoSQL movement. Almost all the NoSQL products in the market have claimed to support such capability. YottaDB takes this promise to a higher level by adding high density multi-tenancy support and the ability to scale compute and storage independently.

YottaDB has the following resources in hierarchical order: account, DB, collection, partition(aka shard), sub-partition. Among them, account, DB and collection are logical containers that have limitless horizontal scalability. The partition and sub-partitions are physical containers. A partition must fit into a single Kubernetes’ POD. A POD maps to a physical disk. It contains vCores that are a subset of a NUMA node. A sub-partition must fit into a single vCore (i.e. a vCore can host multiple sub-partitions). So the size of a customer tenant can scale from a single collection of a few hundred megabytes to a large set of databases that takes up entire cluster’s resources. The key to achieve such high degree of elasticity and scalability here is our resource governor in both control plane and inside the POD itself. Resource governor at the control plane has the visibility of resources at the partition level. It ensures a partition is placed at a POD and a node that have sufficient resource to handle its load. By relying on real time resource usage monitor in each POD, control plane makes real time decisions on splitting, load-balancing, and merging of partitions within the data plane cluster. Once a partition is deployed at the right POD, the resource governor at the POD level will further split the partition into sub-partitions if necessary. These two resource governors ensures small, medium and large partitions can coexists in a single cluster while maintaining a high resource density and effective isolation.

Ultra high performance and dynamic price/perf ratio

YottaDB treats performance as a first class citizen. We believe performance and cost is closely related. Often, solutions today only talk about high benchmark without also discussing about the trade-off on cost, consistency, durability and scalability. YottaDB is unique because it is transparent about the performance and its implication on these others areas. It offers a range of performance and cost combinations to choose from; and customers can adjust them on multiple levels.

First, let’s examine YottaDB’s key architectures that enable it to achieve ultra high performance. YottaDB adopts a share-nothing at vCore level design. When request reaches the DB service instance, it is routed to the corresponding vCore that is assigned to the partition. If partition’s load is too much for a single vCore to handle, we further divide it up into sub-partitions for compute tasks. A single thread, and a set of lock-free task queues separated by priorities are also bound to a vCore. Having one thread per core greatly minimized the overhead of context switches. It also improves the performance through having better CPU cache locality. However, having an one-thread-per-vCore design means we must ensure all the tasks must not block by IO. Here, YottaDB leverages Linux’s IO uring to handle all storage and network IOs asynchronously. POD Resource Governor dynamically adjusts partitions to vCore binding for maximum utilization. As an option, we will also support DPDK as a network plugin. DPDK provides ultra high network performance by completely bypassing the Linux kernel.

Having ultra high performance is just half of the story here. YottaDB also allows customers to adjust price and performance ratio. One can not only adjust number of replicas, but also replica’s storage type such as SSD, HDD or mixing the two. For caching, YottaDB offers the flexibility of storing the hottest data in DRAM while others in persistent memory (which still out performs SSD but is cheaper than DRAM). For ultra low latency, YottaDB offers a cache service container that can be deployed closer to customer’s client, or a direct connection mode with smart client that routes request directly to the backend. These options do incur higher cost and complexities though.

Conclusion

In this post, I presented a high level introduction to Tencent’s newly built, cloud native, distributed NoSQL service. At this point, it is being used at Tencent internally. Tencent handles some of the largest internet traffic in the world through its WeChat platform (with over a billion active users), online gaming, media streaming, email communications etc. It is probably safe to assert that the requirements on performance, scalability, quality and cost are higher serving Tencent internal customers than many public cloud customers. For example, on Chinese new year or the national day, the throughput of a single web API may exceed 1 billion concurrent viewers. Chinese internet service market is also extremely competitive. The balance of competition often hangs by just a few milliseconds in latency. Against this background, YottaDB was created as a unified NoSQL platform with modern architectures to meet these challenges in the years ahead. In future posts, I will dive deeper into how we built the system, the engineering challenges encountered and learnings acquired along the way. It is my hope to engage with database engineering communities to share the knowledge and exchange ideas.

yottadb