Choosing the right, real-time, Postgres CDC platform

Published at
12/6/2024
Categories
postgres
database
eventdriven
dataengineering
Author
Eric Goldman

Change Data Capture (CDC) has become a critical component of modern data architectures. Teams use CDC to build event-driven workflows that react to database changes, or to maintain state across services and data stores.

As a critical component of the stack, teams need a CDC solution that's fast and reliable.

In this guide, we'll compare the leading real-time CDC solutions across three key dimensions that matter most: technical capabilities, required expertise, and budget considerations. As we were building Sequin to address many of the challenges teams face with CDC, we couldn’t find a resource that laid out all the options. Since we did the work to try each tool available today, we thought we’d share our findings.

What is real-time CDC?

In real-time CDC, the moment a row changes in your database (insert, update, delete) you want to deliver that change to another system right away. Usually, you want real-time CDC for operational use cases, not analytics. So a cron job won't do.

You may set up a real-time CDC pipeline for a number of reasons. Those reasons fall into two broad categories:

Event-driven workflows: When you insert, update, or delete a row in your database you want to send an event to one or more services or processors.

Replication: The state of your database needs to be synchronized with other systems, be it another database, a cache, a search index, a third-party API, or a materialized view. Unlike analytics (per above), you need the two systems to be in sync quickly so that, for example, a search doesn’t return an unavailable product.
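To make the replication case concrete, here is a minimal Python sketch of applying a stream of changes to a downstream copy. The `ChangeEvent` shape and the `apply_change` helper are assumptions made for illustration; each CDC tool defines its own envelope, and a plain dictionary stands in for a cache or search index.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change envelope; real tools each define their own shape."""
    table: str
    action: str                     # "insert", "update", or "delete"
    key: int                        # primary key of the changed row
    row: Optional[dict[str, Any]]   # new row contents; None for deletes


def apply_change(replica: dict[int, dict[str, Any]], event: ChangeEvent) -> None:
    """Keep an in-memory replica (a stand-in for a cache or index) in sync."""
    if event.action == "delete":
        replica.pop(event.key, None)
    elif event.action in ("insert", "update"):
        replica[event.key] = dict(event.row or {})
    else:
        raise ValueError(f"unknown action: {event.action}")


products: dict[int, dict[str, Any]] = {}
events = [
    ChangeEvent("products", "insert", 1, {"name": "lamp", "in_stock": True}),
    ChangeEvent("products", "update", 1, {"name": "lamp", "in_stock": False}),
    ChangeEvent("products", "insert", 2, {"name": "desk", "in_stock": True}),
    ChangeEvent("products", "delete", 2, None),
]
for event in events:
    apply_change(products, event)

print(products)
```

Note that the replica only stays correct if deletes are captured and changes arrive in order, which is why the capture method and delivery guarantees matter in the comparison that follows.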

For both kinds of use cases, guarantees are critical. In particular, you want the guarantee of exactly-once processing. These systems are far easier to build and maintain if you know that a downstream service will receive a change when it happens, and that if the downstream service fails to process the change, it will be retried until it succeeds.
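As a rough illustration of what that guarantee involves, the Python sketch below pairs retried delivery with a consumer that ignores event IDs it has already applied. This is one common way to get an exactly-once effect, not a description of how any tool in this guide implements it; `deliver_with_retries` and `IdempotentConsumer` are hypothetical names.

```python
from typing import Any, Callable


def deliver_with_retries(
    event_id: str,
    payload: Any,
    handler: Callable[[str, Any], None],
    max_attempts: int = 5,
) -> int:
    """Call handler until it succeeds; return the number of attempts used."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            handler(event_id, payload)
            return attempt
        except Exception:
            if attempt == max_attempts:
                raise  # give up and surface the last error
    raise AssertionError("unreachable")


class IdempotentConsumer:
    """Applies each event ID at most once, even if it is delivered repeatedly."""

    def __init__(self, fail_first: int = 0) -> None:
        self.seen: set[str] = set()
        self.applied: list[Any] = []
        self._failures_left = fail_first  # simulate transient failures

    def handle(self, event_id: str, payload: Any) -> None:
        if self._failures_left > 0:
            self._failures_left -= 1
            raise RuntimeError("simulated transient failure")
        if event_id in self.seen:
            return  # duplicate delivery; already applied
        self.seen.add(event_id)
        self.applied.append(payload)


consumer = IdempotentConsumer(fail_first=2)
attempts = deliver_with_retries("evt-1", {"id": 1, "status": "paid"}, consumer.handle)
# A redelivery of the same event is accepted but has no further effect.
deliver_with_retries("evt-1", {"id": 1, "status": "paid"}, consumer.handle)
print(attempts, len(consumer.applied))
```

Retries alone give at-least-once delivery; the deduplication step is what keeps a repeated delivery from being applied twice.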

The framework

I’ll highlight three characteristics of each platform to help you make the right choice:

Capabilities: I’ll boil this down to three primary considerations that should help you quickly determine if the tool is even close to your needs:

  1. What kinds of changes can the system detect in Postgres?
  2. What kind of deliverability guarantees does the system provide?
  3. What destinations does the system support?

Technical expertise: How easy is the system to set up and maintain? Do you need specific skills to use the tool? How much time will you need to dedicate to it?

Budget constraints: In addition to time, how much will the system cost to operate?

With that, let’s dig in.

Free, open source

There are a handful of open source projects that you can run yourself. You get the benefits of using a widely adopted tool with the ability to tailor it to your needs.

Debezium

TL;DR: Debezium is a widely deployed CDC platform, but it’s notoriously hard to use and requires Kafka expertise.

Capabilities: Debezium captures all inserts, updates, and deletes. Because it uses Kafka, it comes with exactly-once processing guarantees. Debezium's only destination is Kafka, but you can use Kafka Connect to route from Kafka to other tools and services.

  • Logical replication: inserts, updates, and deletes in real-time
  • Exactly-once processing
  • Kafka

Technical expertise: High. Self-hosting Debezium requires deploying and managing the JVM, ZooKeeper, and Kafka. Each requires configuration and ongoing maintenance.
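For a sense of the configuration involved, here is a Python sketch that builds the kind of JSON payload used to register a Debezium Postgres connector with Kafka Connect. The property names follow the Debezium Postgres connector documentation as I understand it (Debezium 2.x naming, such as `topic.prefix`); the helper function, host, credentials, and table names are all placeholders.

```python
import json


def debezium_postgres_connector(name, host, dbname, user, password, tables):
    """Build a Kafka Connect registration payload (all values are placeholders)."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
            "plugin.name": "pgoutput",        # Postgres' built-in logical decoding plugin
            "database.hostname": host,
            "database.port": "5432",
            "database.user": user,
            "database.password": password,
            "database.dbname": dbname,
            "topic.prefix": name,             # prefix for the Kafka topics it writes to
            "table.include.list": ",".join(tables),
        },
    }


payload = debezium_postgres_connector(
    name="inventory",
    host="db.example.internal",
    dbname="shop",
    user="debezium",
    password="change-me",
    tables=["public.products", "public.orders"],
)
payload_json = json.dumps(payload, indent=2)
print(payload_json)
```

A payload like this is typically sent to the Kafka Connect REST API's connectors endpoint. The connector is only one piece: Postgres also needs logical replication enabled, and Kafka and Kafka Connect need their own configuration.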

Budget: High. Debezium is free and open source, but the real cost is in complexity. With all the dependencies and configuration the total cost of ownership is significant.

Note: Debezium launched Debezium Server several years ago. It removes the Kafka dependency and supports more streams and queues. However, it seems pretty basic, has received limited investment, and is hardly being used from what I can tell (only 75 stars on GitHub). It’s worth mentioning, but I don’t see it as a viable offering right now.

Sequin

TL;DR: Sequin is a new Postgres CDC solution that leverages your existing database infrastructure instead of requiring Kafka. It delivers changes directly to popular streams and queues through native connectors.

Capabilities: Sequin captures every database change using Postgres's native replication capabilities, without requiring additional infrastructure like Kafka. The platform provides exactly-once processing guarantees by leveraging Postgres itself for reliable message delivery. Sequin includes native connectors that deliver changes directly to destinations like Kafka, SQS, NATS, RabbitMQ, and Sequin streams, with built-in transformation capabilities.

  • Logical replication: inserts, updates, and deletes in real-time
  • Exactly-once processing guarantees
  • Native connectors for streams and queues like Kafka, NATS, RabbitMQ, and more

Technical expertise: Low. Sequin runs as a single Docker image and comes with a web console, CLI, and config files to simplify setup and management. Sequin offers a fully hosted option as well.

Budget: Low. Sequin is a free, open source platform. It’s more cost-effective than the alternatives because it uses your existing Postgres infrastructure.

Hosted

While there are many hosted, closed source CDC providers, few offer real-time data capture with stream or queue destinations. Here’s the set to consider:

Decodable

TL;DR: Decodable provides a hosted Debezium plus Flink stack - ideal if you're already invested in these technologies but want them managed.

Capabilities: Decodable provides a hosted version of Debezium paired with Apache Flink. It’ll capture every change in your database with strong deliverability guarantees, monitoring, and schema management tools.

  • Logical replication: inserts, updates, and deletes in real-time
  • Exactly-once processing guarantees
  • Kafka and Pulsar

Technical expertise: Medium to low. The hard part is learning how to configure pipelines and apply SQL transforms in the Decodable dashboard. Flink pipelines definitely come with a learning curve - and the slow feedback loop as you wait for changes to apply doesn’t help.

Budget: Medium. Decodable is fairly expensive. One pipeline can have several different “tasks” that can each cost several hundred dollars per month.

Confluent Debezium

TL;DR: A fully hosted version of Debezium with enterprise features. A reasonable choice if you're already using Confluent.

Capabilities: A fully hosted version of Debezium that comes with enterprise security, compliance, monitoring, and scaling tools built in.

  • Logical replication: inserts, updates, and deletes in real-time
  • Exactly-once processing
  • Kafka

Technical expertise: Medium. While you won’t need to directly manage the deployment, JVM, and dependencies, you’ll still be responsible for the complex configuration of Debezium and Kafka topics.

Budget: High, enterprise pricing. You’ll trade engineering time for an expensive enterprise infrastructure product that still requires Kafka expertise.

Confluent Direct JDBC Postgres

TL;DR: If you need Kafka, are already paying for Confluent, and just need to replicate Postgres rows to a topic, this is for you!

Capabilities: Unlike every other solution thus far, this connector uses polling to capture changes. This means you won’t capture deletes (which may be a dealbreaker). It’s easier to set up but less powerful.

  • Polling: inserts and updates only with delay
  • Exactly-once processing
  • Kafka
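To see why polling misses deletes, here is a small Python simulation. It uses an in-memory SQLite table purely as a stand-in for Postgres, an integer `updated_at` column as a simplified clock, and a hypothetical `poll` helper; it illustrates the general timestamp-polling pattern rather than the connector's actual implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
)


def poll(conn, last_seen):
    """Return rows changed since last_seen, plus the new cursor position."""
    rows = conn.execute(
        "SELECT id, name, updated_at FROM products "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    new_cursor = rows[-1][2] if rows else last_seen
    return rows, new_cursor


# Two inserts: both are visible to the first poll.
conn.execute("INSERT INTO products VALUES (1, 'lamp', 1)")
conn.execute("INSERT INTO products VALUES (2, 'desk', 2)")
first_batch, cursor = poll(conn, 0)

# One update and one delete: only the update is visible to the next poll.
conn.execute("UPDATE products SET name = 'floor lamp', updated_at = 3 WHERE id = 1")
conn.execute("DELETE FROM products WHERE id = 2")
second_batch, cursor = poll(conn, cursor)

print(first_batch)
print(second_batch)
```

The deleted row simply stops appearing in query results, so nothing downstream learns that it is gone. The same pattern also collapses several updates between polls into the latest row state.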

Technical expertise: Medium to low. The Postgres side is simpler (though less powerful), but you’re still managing Kafka’s topic configurations, consumer groups, schema evolution, etc.

Budget: Very High. As with other Confluent offerings, you’ll pay a high price for their enterprise tooling. You’ll pay individually for the Postgres connector in addition to your required Kafka instance.

Striim

TL;DR: Striim is known as an enterprise solution with a high cost - but it delivers a reliable CDC solution. It’s intended for very large companies.

Capabilities: Striim provides CDC with transformations, schema management, delivery guarantees, and security. It's designed for large Fortune 1000 companies.

  • Logical replication: inserts, updates, and deletes in real-time
  • Exactly-once processing
  • Multiple destinations

Technical expertise: Medium. You'll need to learn Striim's proprietary TQL (Tungsten Query Language) for data transformations and their StreamApps framework for pipeline configuration. While well-documented, these tools have a steep learning curve and are unique to Striim's ecosystem.

Budget: Very high. Striim requires an all-in contract designed for large enterprises.

Cloud provider tools

The major infrastructure providers offer CDC products that work within their ecosystem. Tools like AWS DMS, GCP Datastream, and Azure Data Factory can be configured to stream changes from Postgres to other infrastructure.

TL;DR: Setting up CDC through your cloud provider can certainly work if you are all-in on one provider and comfortable with their tools.

Capabilities: AWS, GCP, and Azure offer CDC capabilities integrated into their infrastructure. For instance with AWS DMS, you can configure your AWS RDS with CDC to send events to AWS SQS.

  • WAL: inserts, updates, and deletes in real-time
  • Variable by configuration and provider
  • Destinations within the provider

Technical expertise: Medium. Setting up this kind of CDC is all about navigating through your provider’s web console, permissions, tooling, and logging to set up pipelines. You’ll need to be familiar with all the potential settings and ready to guess and check with Claude Sonnet.

Budget: Medium. Often you’ll be paying for extra compute hours and data transfer which can add up quickly and are hard to predict. Additionally, these setups can be brittle and hard to maintain as settings and dependencies are strewn about.

ETL providers (Fivetran, Airbyte, etc.)

If you need real-time CDC, then these platforms are immediately disqualified because they only offer batch ETL. It may take a minute or two for a change in your database to appear in your stream or queue. And even then, depending on your setup, changes may not be delivered atomically.

That said, they are worth mentioning because they do provide CDC to a handful of streams and queues.

TL;DR: If you just need analytics, these are a good option.

Capabilities: ETL tooling provides scheduled, batch delivery of changes in your database to streams like Kafka. It’s primarily intended for non-operational, analytics use cases.

  • Batch CDC on a schedule
  • Variable - often at-most-once guarantees.
  • Multiple destinations

Technical expertise: Low. ETL tools are easy to set up - but not very configurable when used for CDC.

Budget: High. You’ll pay for every row which can get expensive.

Build your own

Of course, you can definitely build your own real-time CDC. We’ve written several pieces on this topic to help you get started.

Creating a custom CDC solution offers the ultimate in flexibility and optimization for specific use cases. You can build exactly what you need, with precisely the delivery guarantees and transformation capabilities you require. And Postgres comes with some helpful capabilities to make this relatively easy.

However, this path demands the highest level of technical expertise. You’ll be in the weeds of the WAL, Postgres, and some sort of approach to buffer and deliver changes. It’s honestly very fun work - but hard to get right. Especially if you require backfills, strict ordering, exactly-once processing, monitoring, and redundancy.
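One example of the detail involved: a replication slot keeps WAL around until your consumer confirms a position (an LSN), so confirming too early can lose changes after a crash. The Python sketch below shows one possible way to track the highest LSN that is safe to confirm when deliveries complete out of order. `parse_lsn` relies on Postgres' textual LSN format (two hexadecimal numbers separated by a slash); `AckTracker` is a hypothetical helper, not part of any library.

```python
from collections import deque


def parse_lsn(text: str) -> int:
    """Convert a textual Postgres LSN such as '0/16B3748' to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


class AckTracker:
    """Tracks the highest LSN below which every change has been delivered."""

    def __init__(self) -> None:
        self._in_flight = deque()   # LSNs in the order they were read from the WAL
        self._acked = set()         # LSNs delivered but not yet contiguous
        self._safe = 0

    def received(self, lsn: int) -> None:
        self._in_flight.append(lsn)

    def acked(self, lsn: int) -> None:
        self._acked.add(lsn)

    def safe_flush_lsn(self) -> int:
        # Advance only across a contiguous prefix of delivered changes.
        while self._in_flight and self._in_flight[0] in self._acked:
            self._safe = self._in_flight.popleft()
            self._acked.discard(self._safe)
        return self._safe


tracker = AckTracker()
a, b, c = (parse_lsn(s) for s in ("0/16B3748", "0/16B3780", "0/16B37B8"))
for lsn in (a, b, c):
    tracker.received(lsn)

tracker.acked(c)                         # delivered out of order
tracker.acked(a)
after_gap = tracker.safe_flush_lsn()     # stops at a, because b is still pending

tracker.acked(b)
after_fill = tracker.safe_flush_lsn()    # now advances through c
print(after_gap == a, after_fill == c)
```

A real implementation would also need to persist this position, handle reconnects, and deal with transactions that span many changes, which is where much of the difficulty lies.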

Conclusion

While CDC is not a new idea, it’s becoming more common as apps become more data intensive. While the options are a little sparse (especially compared to other tooling) - the space is maturing. I’d recommend taking action from here by picking one solution in each category - try an open source solution (like Sequin!), start a free trial of a hosted option, and see if your cloud provider might do the trick. Then you’ll have working knowledge of your option space to make a good decision.
