Logo

dev-resources.site

for different kinds of informations.

Best Practices for GanosBase Real-Time Heatmap Aggregation Query

Published at
10/12/2024
Categories
database
cloudcomputing
polardb
postgressql
Author
Lara Lee
Best Practices for GanosBase Real-Time Heatmap Aggregation Query

This article introduces GanosBase's real-time heatmap aggregation query and dynamic heatmap tile output capabilities, built on Alibaba Cloud's PolarDB for PostgreSQL.

About GanosBase

Best Practices for GanosBase Real-Time Heatmap Aggregation Query
About GanosBase
GanosBase is a next-generation cloud-native spatial intelligence engine jointly developed by Alibaba Cloud Database Product Division and the Feitian Database and Storage Laboratory. It integrates spatiotemporal data processing capabilities into core products such as the cloud-native relational database PolarDB for PostgreSQL, cloud-native multi-model database Lindorm, cloud-native data warehouse AnalyticDB for PostgreSQL, and cloud database ApsaraDB RDS for PostgreSQL. GanosBase currently features ten core engines: Geometry, Raster, Trajectory, Surface Mesh, Volume Mesh, 3D Scene, Point Cloud, Networking, Geometry Grid, and Fast Display, providing databases with integrated capabilities for storage, query, analysis, and service of new physical world multi-modal and polymorphic data.
This document introduces GanosBase's real-time heatmap aggregation query and dynamic heatmap tile output capabilities, built on Alibaba Cloud's PolarDB for PostgreSQL.

About Heatmap Tiles

What are Heatmap Tiles

Heatmap Tiles (HMT) are based on GanosBase's pioneering real-time heatmap aggregation query technology for large-scale vector/trajectory data. They are used to immediately return query results to the client, changing the traditional method of heatmap statistical analysis that required pre-coding for aggregation and pre-slicing for display. HMT can aggregate and render data on a scale of millions, tens of millions, and even hundreds of millions in seconds. HMT supports various commonly used aggregation functions and algebraic expressions, allowing customers to choose indicators of interest for their business and dynamically compute and render at different levels as the map zooms in and out. This greatly enhances business efficiency and offers more possibilities for customer products. At this year's Yunqi Conference, GanosBase released this capability and demonstrated a case of real-time query aggregation of large-scale transportation trajectories using HMT, helping customers to fully online data products that previously required offline preprocessing. This feature has received significant industry recognition and approval.

Use Cases for Heatmap Tiles

The key feature of Heatmap Tiles is real-time aggregation and rendering of spatial data, primarily applied in business scenarios with massive amounts of vector data requiring real-time statistical analysis, such as:

1. Transportation: Aggregate real-time heatmaps across entire regions based on the historical trajectory lines of transportation tools (cars, ships, etc.) and filter them by conditions such as time (winter, summer), start/end points, and type (freight, passenger) to generate corresponding heatmaps in real-time.

2. Urban Management: Based on building footprint data, aggregate single indicators such as building density, average building height, and total building area across entire regions and combine them with land parcel information to calculate composite indicators like floor area ratio.

3. Shared Mobility: Aggregate heatmaps of equipment docking areas across entire regions based on the trajectory points of shared mobility devices, and analyze shared mobility equipment's dispatch and operations strategies based on events (unlocking/locking, boarding/alighting, accidents, damage), flow directions, etc.

Technical Advantages of Heatmap Tiles

Compared to the pre-coding and aggregation method based on H3 or S2 grids, HMT Heatmap Tiles offer the following advantages:

1. Extremely efficient, no pre-coding required, and no additional storage costs.
HMT's aggregation technology differs from grid aggregation methods like H3/S2 in technical characteristics and application scenarios. Grid aggregation methods are often used when grid coding is needed as a retrieval condition because they require a predetermined precision level. The vector data is then coded at that precision level and aggregated based on the coding. However, HMT does not require pre-coding of data and can aggregate based on the current viewport range. As the viewport zooms in and out, aggregation is performed in real time. The entire process is equally efficient for various types of geometries, capable of aggregating and rendering data on a scale of hundreds of millions in seconds.

2. Convenient and easy to use, aggregation results can be directly visualized.
HMT offers the capability to quickly tile the aggregation results, allowing direct visualization integration with front-end rendering engines to ensure what you see is what you get. HMT also provides a series of statistical functions to help users quickly and automatically generate the best rendering color tables, ensuring optimal front-end performance.

Tested in real-world scenarios by multiple customers, HMT's aggregation efficiency is extremely high, typically achieving full map aggregation of hundreds of millions of data points in seconds:

Image description

Note: The above data represents full aggregation efficiency at a global display scale; efficiency increases as the map zooms in.

Functionality of Heatmap Tiles

Heatmap Tiles include a series of SQL functions for generating and calculating heatmap tiles, including:
â—Ź ST_AsHMT: Converts a set of geometric or trajectory objects into heatmap matrix tiles according to the specified range and resolution.
â—Ź ST_HMTAsArray: Converts heatmap tiles into an array matrix representation for easy viewing.
â—Ź ST_HMTStats: Calculates statistical information for heatmap tiles.
â—Ź ST_HMTAsRaster: Converts heatmap tiles into Raster objects for viewing and computation.

Best Practices for Heatmap Tiles

Operational Steps

  1. Import geometric or trajectory data into the database, preferably using the FDW method. Ensure that all objects share the same spatial reference system (this can be confirmed with the ST_Srid function).

  2. Create spatial indexes for the geometry or trajectory columns:
    CREATE INDEX index_name ON table_name USING GIST(column_name)

  3. Query heatmap tiles based on the spatial range:
    Aggregate the number of objects within the grid:

SELECT ST_AsHMT(column_name, --geometry type
ST_MakeEnvelope(0, 0, 10, 10, 4326), -- Extent
512, -- Width
512 -- height
)
FROM table_name
WHERE column_name && ST_MakeEnvelope(0, 0, 10, 10, 4326);

You can also aggregate the values within a grid by summing the values in the value column:

SELECT ST_AsHMT(column_name, --geometry type
ST_MakeEnvelope(0, 0, 10, 10, 4326), -- Extent
512, -- Width
512, -- height
value -- value column
)
FROM table_name
WHERE column_name && ST_MakeEnvelope(0, 0, 10, 10, 4326);

The ST_MakeEnvelope function can use the ST_TileEnvelope function to obtain the tile range. Additional filtering conditions can also be added:
SELECT ST_AsHMT(column_name, --geometry type
ST_MakeEnvelope(0, 0, 10, 10, 4326), -- Extent
512, -- Width
512, -- height
value -- value column
)
FROM table_name
WHERE column_name && ST_MakeEnvelope(0, 0, 10, 10, 4326);
AND name like 'xxxx%' AND value > 100;

Usage Tips

â—Ź Improve performance with parallel processing when dealing with large datasets, as shown in the example below with a parallelism level of 16:
SET max_worker_processes = 300; -- Maximum number of background processes
set max_parallel_workers = 260; -- maximum number of workers
set max_parallel_workers_per_gather = 16; --maximum number of workers that can be started by a single Gather or Gather Merge node
alter table table_name set (parallel_workers=16);
set force_parallel_mode = on;

In practice, you can adjust settings based on the viewport range, such as using 16 parallel processes at higher levels and no parallelism at lower levels.

If sufficient CPU resources are available, to ensure that each query can utilize parallel processing, set max_worker_processes and max_parallel_workers to the product of parallelism and concurrency. Refer to the official PostgreSQL documentation.

â—Ź Tile Size
In most cases, use 512x512 tiles and resample to 256x256 to avoid aliasing issues. In special cases, such as when data volume is extremely large and each tile calculation is time-consuming, using larger tiles (1024x1024) can reduce the number of tile fetches and improve performance.

â—Ź Use the Operator for Spatial Filtering
Since ST_AsHMT computation is much faster than ST_Intersects, use && instead of ST_Intersects for index filtering.
SELECT ST_AsHMT(column_name, --geometry type
ST_MakeEnvelope(0, 0, 10, 10, 4326), -- Extent
512, -- Width
512, -- height
value -- value column
)
FROM table_name
WHERE column_name && ST_MakeEnvelope(0, 0, 10, 10, 4326);

â—Ź Perform Spatial Reference Conversion on Query Range
When the query range and the geometry object's spatial range are inconsistent, perform spatial reference conversion on the query range before querying. Otherwise, automatic conversion may result in lower performance. After obtaining the tile, convert the image to the specified spatial reference for display.
SELECT ST_AsHMT(column_name, -- srid = 4326
ST_Transform(ST_TileEnvelope(6, 48, 32), 4326), -- Extent
512, -- Width
512, -- height
value -- value column
)
FROM table_name
WHERE column_name && ST_Transform(ST_TileEnvelope(6, 48, 32), 4326));

â—Ź Perform VACUUM FULL and CLUSTER Operations on Spatial Tables
VACUUM operations reclaim free space and reduce disk file size, reducing IO during queries. CLUSTER operations align data organization with indexes, storing adjacent spatial data in adjacent data pages, and reducing disk access by the database.

VACUUM full table_name;
Cluster table_name using index_name;

Publishing Services for Viewing

We will use Node.js to write a simple application demonstrating the actual use case of heatmap tiles.
File Structure

└── hmt_server
├── app.js
├── hmt. proto
├── index.html
└── package.json

Where hmt. proto is the proto file introduced in the ST_AsHMT section above, and the contents of the other files will be provided below.

Backend Code

{
"name": "hmt_server",
"version": "1.0.0",
"main": "app.js",
"license": "ISC",
"dependencies": {
"chroma-js": "^2.4.2",
"express": "^4.18.2",
"lru-cache": "^10.1.0",
"pg": "^8.11.3",
"protobufjs": "^7.2.5",
"sharp": "^0.32.6"
}
}

const express = require('express');
const { Pool } = require('pg');
const chroma = require('chroma-js');
const sharp = require("sharp");
const protobuf = require('protobufjs');
const { LRUCache } = require('lru-cache');

// Set up database connection
const CONNECTION = {
  user: 'YOUR_USER',
  password: 'YOUR_PWD',
  host: 'YOUR_HOST',
  database: 'YOUR_DB',
  port: YOUR_PORT
};

// Target table name
const TABLE_NAME = 'YOUR_TABLE';

// Target geometry column name
const GEOMETRY_COLUMN = 'YOUR_GEOM_COLUMN';

// Set no data value
const NO_DATA_VALUE = 0;

// Target geometry column spatial reference
const SRID = 4326

// Set color map
const COLOR_MAP = [
  ['#536edb', 1],
  ['#5d96a5', 3],
  ['#68be70', 5],
  ['#91d54d', 7],
  ['#cddf37', 9],
  ['#fede28', 11],
  ['#fda938', 13],
  ['#fb7447', 15],
  ['#f75a40', 17],
  ['#f24734', 19],
  ['#e9352a', 21],
  ['#da2723', 23],
  ['#cb181d', 25]
];

// Create a database connection pool, default is 10 connections
const pool = new Pool(CONNECTION);

// Configure color conversion
const [colors, domains] = COLOR_MAP.reduce(([c, d], [colors, domains]) =>
  [[...c, colors], [...d, domains]], [[], []]);
const colorMap = chroma.scale(colors).domain(domains).mode('rgb')

// Load protobuf
const hmtDecoder = protobuf.loadSync('./hmt.proto').lookupType('HMT');

// Create a 1x1 transparent PNG to return as an empty tile
const emptyPng = Buffer.from('iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAACXBIWXMAAA7EAAAOxAGVKw4bAAAADUlEQVQImWP4//8/AwAI/AL+hc2rNAAAAABJRU5ErkJggg==', 'base64');

// For small-scale tiles (z<5), where updates are relatively minor, set a 24-hour cache expiration
const globalCache = new LRUCache({ max: 1000, ttl: 1000 * 3600 * 24 });

// For larger-scale tiles (z>=5), set a 12-hour cache expiration, adjustable based on actual conditions
const localCache = new LRUCache({ max: 2000, ttl: 1000 * 3600 * 12 });

// Register Express routes
express()
  // Serve the HTML page
  .get("/", (_, res) => res.sendFile('index.html', { root: __dirname }))
  // Serve heatmap tile service
  .get('/hmt/:z/:x/:y', async ({ params: { z, x, y } }, res) => {
    const cache = z < 5 ? globalCache : localCache;
    const key = `${z},${x},${y}`
    if (!cache.has(key)) {
      // Set parallelism and call ST_AsHMT function to request a 256x256 heatmap tile for the area
      const parallel = z <= 5 ? 10 : 5;
      const sql = `
  set max_parallel_workers = ${parallel};
  set max_parallel_workers_per_gather = ${parallel};
  WITH _PARAMS(_BORDER) as (VALUES(ST_Transform(ST_TileEnvelope(${key}),${SRID})))
  SELECT ST_AsHMT(${GEOMETRY_COLUMN},_BORDER,256,256) tile
  FROM ${TABLE_NAME},_PARAMS
  WHERE _BORDER && ${GEOMETRY_COLUMN};`
      // Skip the set statement and get the result of the ST_AsHMT function
      const { rows: [{ tile }] } = (await pool.query(sql))[2];

      // If there is no data in the area, return an empty tile
      if (!tile) cache.set(key, emptyPng);
      else {
        // Parse protobuf result
        const { type, doubleValues, intValues } = hmtDecoder.decode(tile);
        const { values } = type == 1 ? doubleValues : intValues;

        // Convert values to corresponding colors and remove no data values
        const pixels = values.reduce((_pixels, value) => {
          _pixels.push(...colorMap(value).rgb());
          _pixels.push(value <= NO_DATA_VALUE ? 0 : 255);
          return _pixels;
        }, [])

        // Render as PNG tile
        const rawConfig = { raw: { width: 256, height: 256, channels: 4 } };
        const renderedPng = await sharp(Uint8Array.from(pixels), rawConfig)
          .png().toBuffer();
        cache.set(key, renderedPng);
      }
    }
    const tile = cache.get(key)
    res.set("Content-Type", "image/png").send(tile);
  })
  // Listen on port 5500
  .listen(5500, () => console.log('HMT server started.'));

Frontend Code
We use Mapbox as the front-end map SDK. You can apply for a token here. Since the heatmap tiles are finally rendered in PNG format, they are compatible with most other map SDKs.

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>HMT Viewer</title>

  <meta name="viewport" content="initial-scale=1,maximum-scale=1,user-scalable=no">
  <link href="https://api.mapbox.com/mapbox-gl-js/v2.14.1/mapbox-gl.css" rel="stylesheet">
  <script src="https://api.mapbox.com/mapbox-gl-js/v2.14.1/mapbox-gl.js"></script>

</head>

<body>
  <div id="map" style="position: absolute;left:0; top: 0; bottom: 

0; width: 100%;"></div>

  <script>
    let CENTER = [YOUR_LONGITUDE, YOUR_LATITUDE]
    mapboxgl.accessToken = YOUR_MAPBOX_TOKEN;
    const map = new mapboxgl.Map({
      container: 'map',
      style: "mapbox://styles/mapbox/navigation-night-v1",
      center: CENTER,
      zoom: 5
    })
    map.on("load", () => {
      map.addSource('hmt_source', {
        type: 'raster',
        minzoom: 3,
        tiles: [`${window.location.href}hmt/{z}/{x}/{y}`],
        tileSize: 256,
      });
      map.addLayer({
        id: 'hmt',
        type: 'raster',
        source: 'hmt_source',
      });
    });
  </script>

</body>

</html>

Installation and Deployment

# Navigate to the hmt_server directory
cd ./hmt_server
# Install dependencies
npm i
# Run the heatmap tile service
node .
# You can then open your browser and log in to http://localhost:5500/ to see the effect

Preview

Real-time Aggregation of Ship Trajectory Lines

Image description

Image description

  • 31 million trajectory points, 450,000 trajectory lines real-time aggregation

Summary

Currently, GanosBase has evolved to version 6.0, supporting thousands of application scenarios across dozens of industries. Stability, cost-effectiveness, performance, and ease of use have always been GanosBase's long-term goals. HMT Heatmap Tiles represent GanosBase's core competitive edge in the efficient aggregation and visualization of large-scale spatial data. It provides customers with truly efficient and easy-to-use solutions for large-scale data analysis and mining, and we welcome all users to experience it.

Next Step

Start to build your heatmap application with PolarDB for PostgreSQL always a free program. Or you can follow our best practice to enable one in just a few clicks.

Originally published at https://www.alibabacloud.com.

Featured ones: