Reading Parallel Plans Correctly

Published: 12/23/2024
Author: saby_explain
Categories: postgres, sql, postgressql, query

Historically, the PostgreSQL server model has been a set of independent processes with partially shared memory. Each process serves exactly one client connection and handles one query at a time, so there is no multithreading.
That’s why, within each individual process, you won’t find any of the usual quirks of parallel code execution: locks, race conditions, and so on. Developing the DBMS itself stays a pleasant and simple task.
But this very simplicity brings a significant limitation. Since there is only one worker thread per process, it can use no more than one CPU core to execute a query, which means the query speed depends directly on the clock frequency and architecture of an individual core.
In today's world where multicore and multiprocessor systems have stolen the limelight, such behavior is an unacceptable luxury and waste of resources. For this reason, starting with PostgreSQL 9.6, a number of operations in a query can be parallelized, i.e. performed by several processes simultaneously.
[Image: kinds of parallelism in PostgreSQL]
A number of parallelism cases are described in the article “Parallelism in PostgreSQL” by Ibrar Ahmed, which is where this image comes from.
If you're using one of the latest versions of PostgreSQL, there is a good chance you'll see Parallel… in the plan when dealing with Seq Scan, Join, Aggregate, Append, etc.
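Whether you actually get those parallel nodes depends on a handful of planner settings. Here is a minimal sketch for checking them; the GUC names are standard PostgreSQL settings rather than anything specific to this article, and the value set at the end simply matches the Workers Planned: 2 seen in the plans below:

-- how many extra worker processes a single Gather / Gather Merge may use
-- (0 disables parallel query execution entirely)
SHOW max_parallel_workers_per_gather;

-- cluster-wide ceilings on parallel and background workers
-- (max_parallel_workers exists since PostgreSQL 10)
SHOW max_parallel_workers;
SHOW max_worker_processes;

-- two workers per gather node, matching "Workers Planned: 2" in the plans below
SET max_parallel_workers_per_gather = 2;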
Parallelism brings certain complications, but don't worry — we've already studied these peculiarities and have you covered.

As we keep sharing features of our PostgreSQL query analysis service Saby Explain, below we describe our experience of dealing with these peculiarities.
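The examples below revolve around a simple synthetic table. Judging by the node details in the plans (table tst, filter (i % 3) = 0, roughly a million rows scanned), the setup presumably looked something like this sketch; the table name and filter are taken from the plans, the rest is a reconstruction:

-- test table with a single integer column i, about a million rows (size inferred from the plans)
CREATE TABLE tst AS
SELECT i FROM generate_series(1, 1000000) i;

-- a simplified stand-in for the query whose plans are discussed below
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM tst WHERE i % 3 = 0;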

Time oddities

Let’s have a look at a plan from PostgreSQL 9.6:
[Image: PostgreSQL 9.6 query plan with Gather and Parallel Seq Scan]
Parallel Seq Scan alone took 153.621 ms within its subtree, while Gather together with all its subnodes took only 104.867 ms.
Why do we get less time for a parent with its children?
Let's have a closer look at the Gather node:

Gather (actual time=0.969..104.867 rows=333333 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=4425

Workers Launched: 2 shows that, further down the tree, two additional processes were started alongside the main one, so a total of 3 processes were working within the Gather subtree.
Now let's see what’s happening in Parallel Seq Scan:

Parallel Seq Scan on tst (actual time=0.024..51.207 rows=111111 loops=3)
  Filter: ((i % 3) = 0)
  Rows Removed by Filter: 222222
  Buffers: shared hit=4425

loops=3 indicates that this node was executed in all 3 processes. The actual time shown is the average per loop: each loop took 51.207 ms, so the server spent 51.207 x 3 = 153.621 milliseconds of CPU time to process this node in total. This is exactly the number we need to get a real picture of what the server was busy with.
Note that to get the real (wall-clock) execution time, the total has to be divided by the number of processes working on the node, that is the leader plus the launched workers: [actual time] x [loops] / ([Workers Launched] + 1).
In our example every process executed just one loop on this node, so 153.621 / 3 = 51.207 ms, and it no longer looks strange that Gather in the parent process reported less time.
The main takeaway is that you actually need the total execution time of the node (all processes summed up) to find what was keeping your server busy and identify the query part for further optimization.
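If you would rather see the per-process numbers directly than derive them by hand, you can request the plan with the VERBOSE option; in reasonably recent versions EXPLAIN then prints per-worker statistics for parallel nodes. A sketch, with the exact output depending on your version:

EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT * FROM tst WHERE i % 3 = 0;

-- parallel nodes get extra per-worker lines in the output, e.g.
--   Worker 0: actual time=... rows=... loops=1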
From this perspective, explain.depesz.com, for example, which shows the “averaged real” per-loop time, looks less useful for debugging purposes:
[Image: the same plan as rendered by explain.depesz.com]
Disagree? Feel free to comment below.

Gather Merge loses it all

Let’s run the same query with PostgreSQL 10:
[Image: PostgreSQL 10 query plan with Gather Merge]
Note that we now have the Gather Merge node instead of Gather in the plan. Here is what the manual says about it:

When the node at the top of the parallel portion of the plan is Gather Merge rather than Gather, it indicates that each process executing the parallel portion of the plan is producing tuples in sorted order, and that the leader is performing an order-preserving merge. In contrast, Gather reads tuples from the workers in whatever order is convenient, destroying any sort order that may have existed.
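Judging by the Limit of 10000 rows on top of Gather Merge and the Sort with Sort Key: i in the fuller listing further below, the query here presumably includes an ORDER BY and a LIMIT. A sketch reconstructed from the plan rather than taken from the article:

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM tst
WHERE i % 3 = 0
ORDER BY i
LIMIT 10000;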

But it seems something is rotten in the state of Denmark:

Limit (actual time=110.740..113.138 rows=10000 loops=1)
  Buffers: shared hit=888 read=801, temp read=18 written=218
  I/O Timings: read=9.709
  ->  Gather Merge (actual time=110.739..117.654 rows=10000 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=2943 read=1578, temp read=24 written=571
        I/O Timings: read=17.156

When the Buffers and I/O Timings attributes were passed up the tree, part of the data got lost: the Limit node reports only shared hit=888 read=801, while the Gather Merge right below it already accounts for shared hit=2943 read=1578. Roughly 2/3 of the statistics, the share accumulated by the worker processes, never makes it above Gather Merge.
Unfortunately, there is no way to recover this information from the plan itself. That’s why, once the children’s numbers are subtracted to get a node’s own contribution, the top node ends up with negative values. And if we look at how this plan evolved by PostgreSQL 12, nothing changes, except that some per-worker stats are added on the Sort node:

Limit (actual time=77.063..80.480 rows=10000 loops=1)
  Buffers: shared hit=1764, temp read=223 written=355
  ->  Gather Merge (actual time=77.060..81.892 rows=10000 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=4519, temp read=575 written=856
        ->  Sort (actual time=72.630..73.252 rows=4278 loops=3)
              Sort Key: i
              Sort Method: external merge  Disk: 1832kB
              Worker 0:  Sort Method: external merge  Disk: 1512kB
              Worker 1:  Sort Method: external merge  Disk: 1248kB
              Buffers: shared hit=4519, temp read=575 written=856
              ->  Parallel Seq Scan on tst (actual time=0.014..44.970 rows=111111 loops=3)
                    Filter: ((i % 3) = 0)
                    Rows Removed by Filter: 222222
                    Buffers: shared hit=4425
Planning Time: 0.142 ms
Execution Time: 83.884 ms
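For completeness: the same per-node numbers can also be pulled out in machine-readable form. EXPLAIN accepts FORMAT JSON, and that JSON tree, with per-node fields such as "Actual Loops" and "Shared Hit Blocks", is what plan-analysis tools typically parse. A sketch:

EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT * FROM tst
WHERE i % 3 = 0
ORDER BY i
LIMIT 10000;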

There is one more takeaway I’d like to leave you with: do not blindly rely on the data you see above the Gather Merge node.
Get Saby Explain
