Logo

dev-resources.site

for different kinds of informations.

Amazon Aurora DSQL: Which PostgreSQL Service Should I Use on AWS ?

Published at
12/3/2024
Categories
aws
distributed
database
yugabytedb
Author
franckpachot
Author
12 person written this
franckpachot
open
Amazon Aurora DSQL: Which PostgreSQL Service Should I Use on AWS ?

The Amazon Aurora innovations announced at re:Invent 2024 may leave you feeling overwhelmed when choosing the right database. If you're looking for a PostgreSQL-compatible option, you're making a sound choice since PostgreSQL is one of the most popular databases. Building your application with PostgreSQL provides access to a large pool of developers. It allows you to choose among on-premises, self-managed, or cloud-managed services from various vendors.

You might be wondering if there's a one-size-fits-all solution for your application or if you need to explore the various options available. These include RDS PostgreSQL, RDS Aurora (PostgreSQL-compatible), Aurora Serverless v2, Aurora Limitless, and the new Aurora DSQL. Additionally, you can run PostgreSQL as a self-managed option on EC2 or opt for a PostgreSQL fork available in the marketplace, such as YugabyteDB.

The capabilities of these services vary, and each aspect plays a crucial role:

  • The degree of PostgreSQL compatibility determines whether the system suits your existing applications or is only appropriate for new ones. It also impacts how easily your application can run on other platforms if you switch providers.
  • The scope of scalability will determine how well the system can evolve to support multi-AZ or multi-region configurations without sacrificing features, consistency, and performance.
  • The class of resilience will influence your application's uptime throughout the infrastructure's lifecycle: failures, maintenance operations, or upgrades.
  • The level of serverless deployment will impact the overall cost based on your expected workloads and desired performance predictability.

PostgreSQL on EC2

Image description

If you want full compatibility with all features, you must self-manage the open-source PostgreSQL from the community repository. This allows you to do almost anything, as you have superuser access. You can also compile it with different options and install various extensions. However, it is not designed for cloud environments. High availability relies on disaster recovery, and failover typically results in complete downtime, often lasting several minutes. No incremental backups are available, and point-in-time recovery is not straightforward. Additionally, several operations, such as database upgrades, can lead to downtime.

For a managed service, the cloud vendor must restrict certain features to ensure clear segregation of responsibilities, often involving a customized version of PostgreSQL.

RDS PostgreSQL

Image description

The managed service that closely resembles the community version of PostgreSQL is RDS PostgreSQL. It is a fork with minimal changes, allowing you to replicate the same behavior using the same version of community PostgreSQL. However, this means there are no additional features within the database engine itself, only supplementary features around it. For instance, RDS PostgreSQL supports encryption and multi-AZ replication at the storage level.

RDS PostgreSQL competes with various managed PostgreSQL offerings from providers like Aiven, Google Cloud SQL, Heroku, DigitalOcean, EDB, Crunchy, and StackGres.

RDS Aurora PostgreSQL-Compatible

Image description

This service uses a fork of PostgreSQL running on a single-writer instance with disaggregated storage. The shared buffers in the compute instance are ephemeral. The WAL is sent to storage servers distributed in the region's availability zones, and those servers can serve the pages that are not in the compute instance shared buffers. This storage is resilient to failures and provides additional features like snapshots and point-in-time recovery. Additional compute instances are available for failover, reducing downtime during planned maintenance or failures. These instances can also serve stale reads, receiving the WAL from the primary to invalidate their cache and reading pages for the distributed storage.

RDS Aurora with PostgreSQL compatibility is used for applications that need a high PostgreSQL compatibility but with more operational efficiency than the community PostgreSQL.

The competition for RDS Aurora is the similar architecture, Google AlloyDB or Neon.

RDS Aurora PostgreSQL-Compatible Serverless

Image description

Serverless v2 is similar to RDS Aurora but runs the compute instances with dynamic resource allocation. You define a minimum and maximum ACU (Aurora Capacity Unit). The hypervisor dynamically scales CPU and RAM based on demand. This is cost-efficient for databases idle for a long time or with short peaks of high activity, and you favor cost reduction over performance predictability.

The competition for Aurora Serverless is mostly Neon.


The services mentioned above use the same monolithic architecture as PostgreSQL for the compute instance, which provides full PostgreSQL compatibility from the application's point of view but is limited to a single instance. All consistent reads and writes must go through a single instance, with one shared buffer memory segment protected by a single stream of Write-Ahead Logging (WAL). The consistency of those services, with all ACID properties, is guaranteed by running on a single server so that all sessions share the same memory structures and system clock.

Data must be distributed to multiple nodes to scale horizontally. Two techniques are used: database sharding and distributed SQL. A bounded clock from Amazon Time Sync ensures the reliability of these services to get consistent reads and writes with all ACID properties. This clock provides precise time with minimal and guaranteed clock skew. What differentiates the following services is their PostgreSQL compatibility for global transactions (transactions reading or writing to more than one shard)

RDS Aurora PostgreSQL-Compatible Limitless

Image description

When an application is designed to operate on multiple databases using a sharding key, Aurora Limitless simplifies the management of sharding to facilitate operations. It is deployed across multiple compute instances that utilize Aurora Serverless hypervisor. You connect to routers that offer a global view of the database, and queries are distributed to the relevant shards containing data based on your defined sharding key. For single-shard queries, PostgreSQL compatibility is equivalent to that of Aurora. This service is ideal for applications that are already designed for sharding.

The competition for RDS Aurora Limitless is Citus managed services, or Nile.

RDS Aurora DSQL

Image description

AWS has announced a new Aurora service to offer SQL developers a truly serverless experience. This service aims to have all the advantages of DynamoDB (no downtime, no upgrades, on-demand pricing) but with SQL features. It uses the PostgreSQL dialect, which closely adheres to the SQL standard. However, runtime behavior differs, relying on optimistic concurrency control and IAM authentication instead of traditional user-password mechanisms.

DSQL is the recommended service for new applications designed for serverless and running on AWS only: no connection pool, optimistic concurrency control with retry logic.

The competition for DSQL is distributed SQL with a PostgreSQL dialect, like Google Spanner, CockroachDB, or the NoSQL services Microsoft Cosmos and Amazon DynamoDB. It is a trade-off between both approaches.

I do not consider YugabyteDB a competitor because, while it is also a Distributed SQL-managed service, YugabyteDB is a PostgreSQL fork with PostgreSQL compatibility, including concurrency control behavior, is open source and multi-cloud, so those are often different use cases.

YugabyteDB in the marketplace

Image description

YugabyteDB is a fork of PostgreSQL. It is runtime-compatible with PostgreSQL (same behavior) and license-compatible as an open-source database. It is available as a managed service in the AWS marketplace. Compared with the other services, it provides higher PostgreSQL compatibility than the horizontally scalable services (Limitless and DSQL) while distributed to availability zones or regions like Aurora DSQL. It is a good choice for a drop-in replacement of PostgreSQL without changing the application. YugabyteDB is optimized for OLTP systems requiring predictable performance, which requires keeping servers up with a warm cache, so it doesn't compete with Aurora Serverless.

Three distributed databases use the AWS precision time clock bound to provide high performance with strong consistency: Aurora Limitless, Aurora DSQL, and YugabyteDB, and Amazon exposes this to provide the most efficient distributed databases, managed my Amazon or third-parties:
Image description


Here is another great comparison by Rehan Van Der Merwe

Aurora DSQL - A NEW boring(?) AWS Serverless Postgres compatible database

Aurora DSQL is a serverless, distributed database with PostgreSQL compatibility. Serverless and relational databases don't usually mix, but AWS has modernized this traditional classic car(PostgresSQL), with a brand-new interior powered by a Formula 1...

favicon blog.datachef.co
yugabytedb Article's
30 articles in total
Favicon
PostgreSQL plan_cache_mode
Favicon
Index Filtering in PostgreSQL and YugabyteDB (Index Scan instead of Index Only Scan)
Favicon
Unique Index on NULL Values in SQL & NoSQL
Favicon
More details in pg_locks for YugabyteDB
Favicon
Large IntentsDB MemTable with Many Small SST Files
Favicon
Aurora DSQL: How it Compares to YugabyteDB
Favicon
Document data modeling to avoid write skew anomalies
Favicon
When to replace IN() with EXISTS() - correlated and uncorrelated subqueries
Favicon
2024.2: Faster with Shared Memory Between PostgreSQL and TServer Layers
Favicon
Aurora DSQL - Simple Inserts Workload from an AWS CloudShell
Favicon
Amazon Aurora DSQL: Which PostgreSQL Service Should I Use on AWS ?
Favicon
YugabyteDB MVCC and Updates: columns vs. JSON
Favicon
No Gap Ordered Numbering in SQL: A Unique Index to Serialize In Read Committed
Favicon
Starting a YugabyteDB lab cluster with AWS CLI
Favicon
Speeding Up Foreign Key Constraints During Migrations
Favicon
Indexing for a Scalable Serialization Isolation Level
Favicon
The Doctor's On-Call Shift example and a Normalized Relational Schema to Avoid Write Skew
Favicon
You Probably Don't Need Serializable Isolation
Favicon
A brief example of an SQL serializable transaction
Favicon
YugabyteDB as a Graph database with PuppyGraph
Favicon
Native GLIBC instead of Linuxbrew since 2.21
Favicon
pgSphere and Q3C on Distributed SQL
Favicon
IN() Index Scan in PostgreSQL 17 and YugabyteDB LSM Tree
Favicon
Frequent Re-Connections improved by Connection Manager
Favicon
Maintaining Throughput With Less Physical Connections
Favicon
YugabyteDB Connection Manager: a Database Resident Connection Pool with Shared Processes
Favicon
ERROR: index row size 3056 exceeds btree version 4 maximum 2704 for index
Favicon
Write Buffering to Reduce Raft Consensus Latency in YugabyteDB
Favicon
Asynch replication for Disaster Recovery, Read Replicas, and Change Data Capture
Favicon
Fast PITR and MVCC reads with Key-Value LSM Tree

Featured ones: