Logo

dev-resources.site

for different kinds of informations.

Shrink UUIDs with PostgreSQL or Ruby

Published at
2/23/2024
Categories
uuid
postgres
ruby
primarykey
Author
Sven Schwyn
Categories
4 categories in total
uuid
open
postgres
open
ruby
open
primarykey
open
Shrink UUIDs with PostgreSQL or Ruby

For a pet project of mine, I'm experimenting with alternatives to the classic SERIAL primary keys which are simple, local and performant. However, you don't want to expose them in URLs because this does away with one layer of protection against forced browsing.

Weak serial booking codes were also at the root of the white hat hack of airline passenger details back in 2016.

Let me share my findings with you, but note: This is not a one-size-fits-all thing, you have to pick your primary keys according to your specific needs.

UUIDv7

An interesting alternative to SERIAL are UUIDv7. They are less performant and more memory hungry given the length, but unless you're dealing with a really huge number of records, the advantages might outweigh these downsides: They are time ordered and therefore INDEX-friendly (unlike fully random UUIDv4), yet their entropy is way large enough for external use and to safely expose them in URLs. (Even more so if some rate limiting is in place to slow down brute force attacks.)

However, UUIDs are the IBAN of IDs, a whopping 36 characters in length and therefore not exactly a healthy diet for your URLs. Say, you have unsafe restful routes with URLs looking as follows:

# Unnested route
https://example.org/users/533

# Nested route
https://example.org/users/533/transactions/743

Now make these routes safe by using UUIDv7 as primary key:

# Unnested route
https://example.org/users/26d45f0c-1fa4-4875-aac7-01a5ec424aac

# Nested route
https://example.org/users/26d45f0c-1fa4-4875-aac7-01a5ec424aac/transactions/b436b1f2-711c-4a50-bb44-4e41acb7e5d6

Boom! Even unnested routes are no longer handy. Is it really necessary for those UUIDs to take up so much space? Not really.

Shrink them UUIDs

It's quite easy to reduce the length of UUIDs by almost 40% from 36 to 22 characters without loosing any data or entropy. Here's how to do it in Ruby:

require 'base64'

def uuid_encode(uuid)
  [uuid.gsub('-', '')].
    pack('H*').
    bytes.
    pack('C*').
    then { Base64.urlsafe_encode64(_1) }.
    slice(0, 22)
end

def uuid_decode(encoded_uuid)
  Base64.urlsafe_decode64(encoded_uuid).
    bytes.
    pack('C*').
    unpack1('H*').
    unpack('a8a4a4a4a12').join('-')
end

(I'm looking forward to Ruby 3.4 which will allow to rewrite the above as then { Base64.urlsafe_encode64(it) }, neat!)

Let's quickly do some roundtrips to check whether this really works:

require 'securerandom'

1000000.times do
  uuid = SecureRandom.uuid
  fail 'mismatch' unless uuid == uuid_decode(uuid_encode(uuid))
end

The resulting URLs are quite a bit shorter now:

# Unnested route
https://example.org/users/JtRfDB-kSHWqxwGl7EJKrA

# Nested route
https://example.org/users/JtRfDB-kSHWqxwGl7EJKrA/transactions/tDax8nEcSlC7RE5BrLfl1g

They still don't win a beauty contest, but if that is what you're after, you'd better use slugs where it's safely possible.

PostgreSQL – take over

You might not want to encode and decode UUIDs on the application layer but do this on the database itself:

CREATE OR REPLACE FUNCTION uuid_encode(uuid uuid) RETURNS text AS $$
  SELECT translate(
    encode(
      decode(
        replace(
          uuid::text,
          '-', ''
        ),
        'hex'
      ),
      'base64'
    ),
    '+/=', '-_'
  );
$$ LANGUAGE sql;

CREATE OR REPLACE FUNCTION uuid_decode(encoded_uuid text) RETURNS uuid AS $$
  SELECT regexp_replace(
    encode(
      decode(
        translate(
          encoded_uuid,
          '-_', '+/'
        ) || '==',
        'base64'
      ),
      'hex'
    ),
    '(.{8})(.{4})(.{4})(.{4})(.{12})', '\1-\2-\3-\4-\5'
  )::uuid
$$ LANGUAGE sql;

Again, let's check whether this works:

DO
$do$
DECLARE
  uuid uuid;
BEGIN
 FOR i IN 1..1000000 LOOP
   uuid := gen_random_uuid();
   IF uuid_decode(uuid_encode(uuid)) != uuid THEN
     RAISE EXCEPTION 'mismatch';
   END IF;
 END LOOP;
END
$do$;

Unfortunately, as of PostgreSQL 16, UUIDv7 are not yet supported out of the box. For the time being, use an extension such as pg_uuidv7 or pg_idkit to generate UUIDv7 e.g. as default primary key when you CREATE new records.

Update December 2024

Still unfortunately, UUIDv7 didn't quite make it to PostgreSQL 17 neither but is en route to be merged before PostgreSQL 18 due in fall 2025. Meanwhile, you can either compile the pg_uuidv7 extension yourself or use one of my packages:

More chit chat on Mastodon

(Photo by Magda Ehlers)

Featured ones: