Logo

dev-resources.site

for different kinds of informations.

How to reduce Snowflake costs: A five-point checklist

Published at
5/23/2023
Categories
snowflake
datawarehouse
cloudcosts
Author
acossta
Categories
3 categories in total
snowflake
open
datawarehouse
open
cloudcosts
open
Author
7 person written this
acossta
open
How to reduce Snowflake costs: A five-point checklist

Facing high Snowflake costs?

Check out our practitioner's five-point checklist to reduce Snowflake costs and optimize your data warehouse usage.

Snowflake is an incredibly powerful and scalable data warehouse, but without proper management, your costs can skyrocket and get out of control. In this checklist, we will explore five key strategies for optimizing Snowflake costs, ensuring that your data warehouse remains efficient and cost-effective.

1. Set the correct warehouse auto-suspend

What is the Auto-Suspend feature?

Auto-Suspend is a Snowflake feature that allows your warehouse to automatically suspend itself after a specified period of inactivity. This helps to reduce costs by preventing unnecessary credit usage when the warehouse is not in use.

The default value and why it's inefficient

The default auto-suspend value is 10 minutes, but this is often too long for many workloads. For example, if a query lasts 20 seconds, the warehouse will still be charged for 10 minutes of usage, leading to unnecessary costs.

Example query for querying and modifying auto-suspend

To query the auto-suspend settings for all warehouses, use the following query:

SHOW WAREHOUSES;
Enter fullscreen mode Exit fullscreen mode

The auto_suspend column specifies how long a running warehouse can remain inactive, in seconds, before automatically suspending and stopping credit usage. A null value means that the warehouse will never automatically suspend. You’ll want to avoid having any warehouse with a null auto-suspend. Typically, 60 , meaning an auto-suspend of a minute is a good place to start.

To modify the auto-suspend setting for a specific warehouse to 1 minute (60 seconds), you can run the following:

ALTER WAREHOUSE "<warehouse_name>"
  SET AUTO_SUSPEND = 60;
Enter fullscreen mode Exit fullscreen mode

2. Right-size your warehouses: monitor remote disk spillage

What is remote disk spillage, and why it causes costs to increase

Remote disk spillage occurs when a virtual warehouse runs out of memory and begins spilling intermediate results to remote storage. This can lead to increased query times and higher Snowflake costs due to increased I/O operations and additional credit usage.

Identify the warehouses that have remote disk spillable

To monitor remote disk spillage, you can use Snowflake's QUERY_HISTORY function. Here's an example query to identify the warehouses with remote disk spillage over the last 30 days:

SELECT  
    WAREHOUSE_NAME,
    SUM(BYTES_SPILLED_TO_REMOTE_STORAGE) as TOTAL_BYTES_SPILLED_TO_REMOTE_STORAGE
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE START_TIME >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY WAREHOUSE_NAME;
Enter fullscreen mode Exit fullscreen mode

Fixing remote disk spillage by increasing warehouse size

To fix remote disk spillage, you can increase the size of your warehouse. By doing so, you will allocate more memory to the warehouse and reduce the need for remote storage, ultimately lowering costs. With a larger warehouse, your queries will complete more than twice as fast. Make sure to monitor and adjust warehouse sizes as needed to prevent spillage.

3. Use a serving layer for high-concurrency sub-second queries

Sometimes, you require high-concurrency sub-second queries. These are critical for customer-facing dashboards, data APIs, and usage metering, where fast response times are essential.

How a serving layer on top of Snowflake works

A serving layer on top of Snowflake syncs data to high-speed storage, optimizes it, and serves it via an API without consuming Snowflake credits. This reduces Snowflake costs while providing high-performance analytics to your end-user applications.

It is a better alternative to traditional embedded analytics because it reduces the load on Snowflake. Furthermore, by providing an API, it offers greater flexibility and control.

Propel's data API platform provides engineering teams with a unified platform for delivering high-performance customer-facing analytics. Its serving layer solution offers an analytics backend with a GraphQL API and React UI component library, requiring no infrastructure scaling or management. To learn more about Propel, read the docs.

4. Identify and fix inefficient queries

Inefficient queries are the source of a lot of waste. In this section, we show you how to identify them and share some tips to optimize them.

How to identify inefficient queries

Inefficient queries can significantly contribute to higher Snowflake costs. To identify them, you can monitor the QUERY_HISTORY view and look for queries with long execution times or high resource consumption.

You can use the following query to identify your slowest queries in the last 30 days in Snowflake:

SELECT  
    QUERY_TEXT,       
    SUM(TOTAL_ELAPSED_TIME) AS TOTAL_ELAPSED_TIME,
    SUM(BYTES_SCANNED) AS BYTES_SCANNED
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE START_TIME >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY QUERY_TEXT
ORDER BY TOTAL_ELAPSED_TIME DESC
Enter fullscreen mode Exit fullscreen mode

Tips to improve query efficiency

To make queries more efficient, consider the following tips:

  1. Use selective filtering and avoid SELECT *.
  2. Optimize JOIN operations.
  3. Make sure tables have an ORDER BY.
  4. Limit the number of rows returned using LIMIT.
  5. Use materialized views for pre-computed results.
  6. Leverage clustering keys to improve query performance.

5. Set up Resource Monitors

What are Resource Monitors?

Resource Monitors in Snowflake allow you to track and manage your credit usage, helping to prevent unexpected costs. You can set thresholds that, when exceeded, trigger actions such as suspending a warehouse or sending notifications.

Example: Creating a Resource Monitor

To create a Resource Monitor that suspends a warehouse when credit usage exceeds a specified amount, you can use the following SQL command:

CREATE RESOURCE MONITOR "monitor_name"
  WITH CREDIT_QUOTA = <quota>
  TRIGGERS
    ON 100 PERCENT DO SUSPEND;
Enter fullscreen mode Exit fullscreen mode

Conclusion

Optimizing Snowflake costs is crucial for maintaining an efficient and cost-effective data warehouse. By following this five-point checklist, you can optimize warehouse usage by preventing remote disk spillage, adjusting auto-suspend settings, leveraging a serving layer for high-concurrency queries, and monitoring resource consumption. So, start implementing these strategies today and make the most of your Snowflake investment.

Further reading

If you don’t have a Propel account yet, you can try Propel for free and start building data apps.

datawarehouse Article's
30 articles in total
Favicon
Uses of Snowflake Schema
Favicon
Snowflake vs. Databricks vs. AWS Redshift
Favicon
Understanding Data Schemas
Favicon
Mastering Scalable Data Warehousing on AWS: From S3 to Semantic Layers with AtScale
Favicon
High-Effective Business-Approach Data Layers in Warehousing
Favicon
Building a Scalable Data Platform: Addressing Uncertainty in Data Requirements with AWS
Favicon
Celebrating My Achievement: Snowflake Badge 1 Completion 🎉
Favicon
Best Practices for Migrating Your Data to the Cloud
Favicon
Essential Best Practices for Data Warehousing
Favicon
The Untold Truth: Data Quality Issues in Your Data Warehouse Nobody Will Tell You About
Favicon
Best Practices for Implement Data Lake in Data Management
Favicon
10 Reasons to Make Apache Iceberg and Dremio Part of your Data Lakehouse Strategy
Favicon
Embracing the Future of Database Management: A Deep Dive into Amazon Aurora Limitless Database
Favicon
Unlocking Business Potential with Data Warehouse Services: A Comprehensive Overview
Favicon
A major culprit in the slow running and collapse of a database
Favicon
Breaking Free from Proprietary Clouds (Snowflake, RedShift, BigQuery): Top Open Source Alternatives to OLAP Databases
Favicon
🚀 Exciting Developments in Enterprise Data Warehouses! 🌐
Favicon
Data Warehouse Concepts, focusing on the Kimball vs. Inmon methodologies
Favicon
Data Modeling
Favicon
CDP vs Data Warehouse
Favicon
A Comprehensive Guide to AWS DynamoDB vs. Redshift for Databases and Data Warehouses
Favicon
Snowflake: Revolutionizing data warehousing
Favicon
Powering Rapid Data Applications Using Your Data Warehouse With VulcanSQL
Favicon
Prescrição SQL: A Linguagem SQL Ajudando na Gestão Hospitalar
Favicon
ByteDance Open Sources Its Cloud Native Data Warehouse : ByConity
Favicon
How to reduce Snowflake costs: A five-point checklist
Favicon
DataWarehouse and BigQuery
Favicon
AWS DMS and Prefect: The Key to Building a Robust Data Warehouse
Favicon
Unleash the Power of Chaos Genius to Reduce Data Warehouse Costs and Boost Data ROI
Favicon
SelectDB is originated from Apache Doris so when processing, we SHARE THE SAME SPEED!!!

Featured ones: