Logo

dev-resources.site

for different kinds of informations.

A Tale of Two Connection Pools

Published at
2/13/2023
Categories
cockroachdb
connectionpooling
distributedsql
Author
jhatcher9999
Author
12 person written this
jhatcher9999
open
A Tale of Two Connection Pools

It was the best of pools; it was the worst of pools...
Ok, enough with the Dickensian references.

Connection Pools

I have recently had two contrasting experiences with application-side, connection pooling frameworks when connecting to CockroachDB, and I thought I would share some of my findings.

I am a Solutions Engineer at Cockroach Labs, so I often work with customers and prospects on testing out their codebase running against a CockroachDB cluster. CockroachDB is a distributed SQL database (meaning a CockroachDB cluster is made up of many nodes). This distributed DNA comes with obvious advantages like being about to do a rolling restart of the cluster nodes and not having to take the database down. However, it's a good idea to make sure that your application is configured to handle a node restarts.

For you application to gracefully handle node restarts, we typically recommend:

  1. Load Balancer - Point to a load balancer that is round-robin-ing traffic among the CRDB nodes and monitoring for DB node health (and taking dead nodes out of the spray). This can be a simple L4, TCP load balancer. If you're running CockroachDB in the our managed service offering, this load balancer is provided for you.
  2. Connection Pool - Use a connection pool in your code (HikariCP is great for Java). We would recommend a few settings:

  3. Include Retry Logic in your code - We recommend catching and retrying serialization retry (i.e., 40001) errors in code. You can add a similar class of errors for 57* errors and 08* errors (see: https://www.postgresql.org/docs/current/errcodes-appendix.html)

Here's an example of a project I wrote in C# that shows this retry logic in action.

These three elements (LB + CP + Retry Logic) interact with each other, and with all three in place, you will have a fool-proof approach to handling restarts in your code with zero impact.

Here's a diagram to shows how this interaction works.

Image description

How to handle leaking state across session

Connection pools are great in that they save us the latencies associated with opening new connections to our database, and they allow us to multiplex lots of queries across a fewer number of connections to the database.

Now, let's talk about one risk associated with leveraging them.

Each database connection instance has various metadata associated with it. In the CockroachDB world (which is Postgres compatible), one such example of state is "session variables." For instance, there is a session variable called statement_timeout which, by default, has a value of 0 (which means unlimited). Suppose you set this variable to 60s, do something with the connection and then return it to the pool. Then, the next process to retrieve that particular connection from the pool will have the statement_timeout value set to 60s and will have no notice that they're not getting a "clean" session/connection.

To battle this idea of state leakage, some connection pool instances automatically run commands on connections when they're returned to the pool whose purpose is to restore all the state back to defaults.

I was working on a proof-of-concept recently with a company who writes their code in C#/.NET and they were using NPGSQL (which is a Postgres driver library for .NET) to connect to CockroachDB. When you use the connection pool builtin to this project and close a connection, NPGSQL runs several commands against the connection:

  • CLOSE ALL;
  • UNLISTEN *;
  • SELECT pg_advisory_unlock_all();
  • DISCARD SEQUENCES;
  • DISCARD TEMP;

It makes sense that a Postgres-specific driver would run Postgres-specific commands against a data source that is known to be Postgres (or Postgres-compatible).

However, in the Java world, developers have many connection pooling toolsets from which to choose. A popular choice is HikariCP. HikariCP is not specific to a particular DB (like Postgres), but can be used against any JDBC Data Source (Oracle, MySQL, etc.). I have always assumed that Hikari does similar "resetting" logic but I tested that assumption this week and learned that it does not. I suppose this makes sense since it would have to have knowledge about the "right" commands for each DB platform in order to reset state properly.

So, I went about trying to make Hikari do this resetting for me.

Code Testing

First, I setup some simple code to show the problem:



    public void showBroken() throws SQLException  {

        try {
            HikariDataSource ds = new HikariDataSource();
            ds.setJdbcUrl("jdbc:postgresql://localhost:26257/default?sslmode=require");
            ds.setUsername("roach");
            ds.setPassword("roach");
            ds.setMaximumPoolSize(1);

            Connection conn1 = ds.getConnection();
            Statement stmt1 = conn1.createStatement();
            ResultSet rs1 = stmt1.executeQuery("show statement_timeout;");
            System.out.println("value of statement_timeout before messing with it: ");
            while (rs1.next()) {
                System.out.println(rs1.getString(1));
            }
            Statement stmt2 = conn1.createStatement();
            stmt2.execute("set statement_timeout = '60s';");
            ResultSet rs2 = stmt1.executeQuery("show statement_timeout;");
            System.out.println("value of statement_timeout after setting it: ");
            while (rs2.next()) {
                System.out.println(rs2.getString(1));
            }
            conn1.close();

            Connection conn2 = ds.getConnection();
            Statement stmt3 = conn2.createStatement();
            ResultSet rs3 = stmt3.executeQuery("show statement_timeout;");
            System.out.println("value of statement_timeout in new connection without changing anything: ");
            while (rs3.next()) {
                System.out.println(rs3.getString(1));
            }
            conn2.close();

            ds.close();
        }
        catch(Exception ex) {
            System.out.println(ex.toString());
        }


    }


Enter fullscreen mode Exit fullscreen mode

When I run this code, I get this output:



value of statement_timeout before messing with it: 
0
value of statement_timeout after setting it: 
60000
value of statement_timeout in new connection without changing anything: 
60000


Enter fullscreen mode Exit fullscreen mode

As you can see, we're "leaking" the 60s value between sessions. Not good!

I found one suggestion from the author of HikariCP on how to address this, which I implemented and it worked. However, there are additional classes involved, and it feels a little clunky and hard to follow.

So, I reached out to my colleague, Kai Niemi, who is a seriously smart dude, and he gave me two other suggestions which I found to be much cleaner.

Option 1 - Use a proxy

In this example, we use some Java Proxy + Reflection magic to create a HikariCP pool instance but we give ourselves a "hook" into the pooling events where we can insert a call to "DISCARD ALL;"



    public void runTest2() throws SQLException  {

        try {
            HikariDataSource ds = new HikariDataSource() {
                @Override
                public Connection getConnection() throws SQLException {
                    Connection delegate = super.getConnection();
                    return (Connection) Proxy.newProxyInstance(
                            DataSource.class.getClassLoader(),
                            new Class[] {Connection.class},
                            (proxy, method, args) -> {
                                if (method.getName().equals("close")) {
                                    Connection c = (Connection) proxy;
                                    try (Statement s = c.createStatement()) {
                                        s.execute("DISCARD ALL;");
                                    } catch (SQLException e) {
                                        // not much to do, proceed with close
                                    }
                                }
                                return method.invoke(delegate, args);
                            });
                }
            };
            ds.setJdbcUrl("jdbc:postgresql://localhost:26257/default?sslmode=require");
            ds.setUsername("roach");
            ds.setPassword("roach");
            ds.setMaximumPoolSize(1);

            Connection conn1 = ds.getConnection();
            Statement stmt1 = conn1.createStatement();
            ResultSet rs1 = stmt1.executeQuery("show statement_timeout;");
            System.out.println("value of statement_timeout before messing with it: ");
            while (rs1.next()) {
                System.out.println(rs1.getString(1));
            }
            Statement stmt2 = conn1.createStatement();
            stmt2.execute("set statement_timeout = '60s';");
            ResultSet rs2 = stmt1.executeQuery("show statement_timeout;");
            System.out.println("value of statement_timeout after setting it: ");
            while (rs2.next()) {
                System.out.println(rs2.getString(1));
            }
            conn1.close();

            Connection conn2 = ds.getConnection();
            Statement stmt3 = conn2.createStatement();
            ResultSet rs3 = stmt3.executeQuery("show statement_timeout;");
            System.out.println("value of statement_timeout in new connection without changing anything: ");
            while (rs3.next()) {
                System.out.println(rs3.getString(1));
            }
            conn2.close();

            ds.close();
        }
        catch(Exception ex) {
            System.out.println(ex.toString());
        }


    }


Enter fullscreen mode Exit fullscreen mode

The output we get from this code shows that we're not leaking anymore - yay!



value of statement_timeout before messing with it: 
0
value of statement_timeout after setting it: 
60000
value of statement_timeout in new connection without changing anything: 
0


Enter fullscreen mode Exit fullscreen mode

Option 2 - Use a library

In this option, we wrap our HikariCP instance in another object. The nice thing about this option is that it can be used with any connection pooling library (not just HikariCP); but, it does require that you add an additional dependency to your project: https://github.com/jdbc-observations/datasource-proxy



    private void runTest3() {

        try {

            HikariConfig config = new HikariConfig();
            config.setJdbcUrl("jdbc:postgresql://localhost:26257/default?sslmode=require");
            config.setUsername("roach");
            config.setPassword("roach");
            config.setMaximumPoolSize(1);

            DataSource ds = ProxyDataSourceBuilder
                    .create(new HikariDataSource(config))
                    .listener(new JdbcLifecycleEventListenerAdapter() {
                        @Override
                        public void afterGetConnection(MethodExecutionContext executionContext) {
                            Connection c = (Connection) executionContext.getResult();
                            try (Statement s = c.createStatement()) {
                                s.execute("DISCARD ALL;");
                            } catch (SQLException e) {
                                // whatever
                            }
                        }
                    })
                    .build();

            Connection conn1 = ds.getConnection();
            Statement stmt1 = conn1.createStatement();
            ResultSet rs1 = stmt1.executeQuery("show statement_timeout;");
            System.out.println("value of statement_timeout before messing with it: ");
            while (rs1.next()) {
                System.out.println(rs1.getString(1));
            }
            Statement stmt2 = conn1.createStatement();
            stmt2.execute("set statement_timeout = '60s';");
            ResultSet rs2 = stmt1.executeQuery("show statement_timeout;");
            System.out.println("value of statement_timeout after setting it: ");
            while (rs2.next()) {
                System.out.println(rs2.getString(1));
            }
            conn1.close();

            Connection conn2 = ds.getConnection();
            Statement stmt3 = conn2.createStatement();
            ResultSet rs3 = stmt3.executeQuery("show statement_timeout;");
            System.out.println("value of statement_timeout in new connection without changing anything: ");
            while (rs3.next()) {
                System.out.println(rs3.getString(1));
            }
            conn2.close();

        }
        catch(Exception ex) {
            System.out.println(ex.toString());
        }

    }


Enter fullscreen mode Exit fullscreen mode

Again, we see that the output proves that our leak has been plugged.



value of statement_timeout before messing with it: 
0
value of statement_timeout after setting it: 
60000
value of statement_timeout in new connection without changing anything: 
0


Enter fullscreen mode Exit fullscreen mode
cockroachdb Article's
30 articles in total
Favicon
CockroachDB: fast-start configuration on a fresh cluster
Favicon
CockroachDB on OpenShift: Separate your logs from data!
Favicon
Building an API with Go, PostgreSQL, Google Cloud and CockroachDB
Favicon
From CockroachDB to AWS SNS via AWS API Gateway
Favicon
Generate multiple, large, sorted CSV files with pseudo-random data
Favicon
From Scratch to Storage, Note App with AnalogJs, tRPC, Prisma and CockroachDB
Favicon
Building a REST API using NodeJS, ExpressJS, and CockroachDB
Favicon
Migrate Your Schema into CockroachDB Cloud with Prisma Migrate
Favicon
CockroachDB SSO login to the SQL prompt via JWT
Favicon
CockroachDB: Multi-Region OpenShift using Azure Virtual WAN
Favicon
Repaving CockroachDB in AWS EC2
Favicon
How to Build a Server with Hanami and CockroachDB | Ruby
Favicon
Experiment workload performance impact by number of Connections
Favicon
CockroachDB Integration with Superset
Favicon
Online Schema Changes with CRDB
Favicon
CockroachDB: row-level TTL to simulate Redis
Favicon
Jim's Guide to CockroachDB Naming Standards
Favicon
Memory Management in CockroachDB
Favicon
Efficiently deleting data
Favicon
Kafka 2 CockroachDB via JDBC Sink Connector Blueprint
Favicon
Running Multi-region CockroachDB on k8s -- the internals
Favicon
Running CockroachDB on k8s - with tweaks for Production
Favicon
Spark Update Optimizations
Favicon
A Tale of Two Connection Pools
Favicon
Repaving CockroachDB cluster node VMs the easy way
Favicon
CockroachDB: trace logging with Datadog
Favicon
Ingesting data from Kafka to CockroachDB via Kafka Connect
Favicon
Display CockroachDB metrics in Splunk Dashboards
Favicon
Build a CockroachDB Control Plane using Ansible Tower
Favicon
CockroachDB vs PostgreSQL

Featured ones: