Logo

dev-resources.site

for different kinds of informations.

Securing Amazon Redshift - Best Practices for Access Control

Published at
1/12/2025
Categories
aws
redshift
security
beginners
Author
escosiakyle
Categories
4 categories in total
aws
open
redshift
open
security
open
beginners
open
Author
11 person written this
escosiakyle
open
Securing Amazon Redshift - Best Practices for Access Control

Introduction

Not long ago, I had the chance to conduct a knowledge transfer session focused on access management in Amazon Redshift for our partner client. As I started researching the topic, I realized something surprising - while there’s plenty of material explaining Redshift’s features, finding clear best practices or structured approaches for managing access was a challenge. Most of the information available felt basic and lacked the depth I was looking for. Yes, I also watched re:Invent videos.

This realization motivated me to dive deeper and build a better understanding of how to effectively secure Redshift. In this blog, I want to share what I’ve learned - not just the technical details, but practical strategies and insights I’ve gained from working hands-on with Redshift access management. Whether you’re setting up a new cluster or refining an existing one, I hope this guide gives you the tools and confidence to tackle access management with clarity and purpose.

Understanding Redshift Access Management

Of course, before we'll be able to apply these settings in Redshift, we need to understand how its security framework works first.

Redshift Built-In Security and Compliance

Let’s review the core components of Redshift security.

For this blog, I’ll skip VPC isolation since, if you’ve been using AWS for a while, you’re likely familiar with VPC-supported services.

1. Authentication

Redshift supports multiple authentication methods to verify user identities:

2. Authorization

Once users are authenticated, Redshift manages what they can access through robust authorization features:

  • Default Permissions - By default, only the owner of a database object (schema, tables, views) can modify or delete it, ensuring secure defaults. THIS CONCEPT IS IMPORTANT.
  • Users and Groups - Permissions can be granted to individual users or groups, allowing for more efficient access control. Groups are always advisable so that you won't have the pain of managing many users.
  • Role-Based Access Control (RBAC) - Roles simplify managing permissions by grouping privileges and assigning them to users.

A Review on Authentication vs. Authorization:
Authentication verifies who a user is, ensuring only valid individuals or systems can connect to your Redshift cluster through methods like credentials, MFA, or federated SSO. Authorization, on the other hand, determines what authenticated users can access and do within the system, such as querying specific tables, viewing certain rows, or accessing sensitive columns.

3. Data Encryption

Security isn’t just about controlling access; it’s also about protecting the data itself:

  • Data at Rest - Encrypted using AWS Key Management Service (KMS) or custom-managed keys.
  • Data in Transit - Protected with SSL to ensure secure communication.
  • Load Data Encryption - Ensures sensitive data remains protected even during data loading.

4. Advanced Access Management Features

  • Column-Level Security (CLS) - Restricts access to sensitive columns (e.g., SSNs, mobile number, credit cards) while allowing access to non-sensitive data.
  • Row-Level Security (RLS) - Ensures users see only the rows they are authorized to access, based on attributes or roles.
  • Dynamic Data Masking - Masks sensitive information for users without full permissions, providing an extra layer of data protection.

5. Compliance

Third-party auditors assess the security and compliance of Amazon Redshift as part of multiple AWS compliance programs. These include SOC, PCI, FedRAMP, HIPAA, and others.

Here's a link for the list: Compliance validation for Amazon Redshift

Understanding these foundational components is the first step toward mastering Redshift access management as each of these works together.

Default Permissions

I wanted to expand more about the Default Permissions in Redshift. In my experience, overlooking how these default settings interact with specific roles or use cases can lead to prolonged back-and-forth conversations between users and administrators, especially when working across time zones. This is why it’s essential to understand how these permissions work and how to adjust them as needed.

New users are assigned a set of default permissions that determine their initial access rights. These permissions provide a basic level of access to the database, allowing users to perform common actions while restricting access to sensitive areas.

The Principle of Object Ownership

  • When a user creates an object (e.g., a table, view, or schema) in Redshift, they automatically become its owner.
    • Key Rule - Only the owner has permission to modify or drop the object unless they explicitly grant those permissions to others.
  • This principle ensures that no unauthorized user can tamper with critical objects, even if they have general access to the schema.

GRANT and ALTER DEFAULT PERMISSIONS

The GRANT statement provides access to existing objects in Redshift, such as tables and views, for users, groups, or roles.

The ALTER DEFAULT PRIVILEGES statement is used to manage permissions for future objects that will be created in a schema. This is especially useful in collaborative environments where you want new objects to automatically inherit specific access rules.

Regranting Permissions After Object Modification - If a table or view is recreated (e.g., dropped and re-created), all previously granted permissions are lost and must be reapplied. Use ALTER DEFAULT PERMISSIONS statement to solve this or execute GRANT statement again.

Advanced Access Management Features

One of the requirements of a modern data platform is the ability to provide granular access control. Amazon Redshift delivers this through features such as Column-Level Security (CLS), Row-Level Security (RLS), and Dynamic Data Masking, ensuring users see only what they’re authorized to see. Let's go through them one-by-one.

1. Column-Level Security (CLS)

Column-Level Security restricts access to sensitive columns within a table while allowing users to interact with non-sensitive data. This prevents unnecessary exposure of information such as personal identifiers or payment details.

GRANT SELECT (customer_id, name, address) ON customers TO
GROUP customer_service_representatives;
Enter fullscreen mode Exit fullscreen mode

Creating a view is also an option and can sometimes be a straightforward and effective way to control column-level access.

CREATE VIEW customer_info AS
SELECT customer_id, name, address
FROM customers;
Enter fullscreen mode Exit fullscreen mode

You can also change this into a MATERIALIZED VIEW.

2. Row-Level Security (RLS)

RLS restricts access to specific rows within a table based on user roles or attributes, ensuring that users see only the data relevant to them.

This feature is essential for scenarios where data segregation is required, such as multi-tenant environments or organizations with hierarchical roles (e.g., regional managers, department leads).

CREATE RLS POLICY region_a_policy
USING (region = 'Region A');

ATTACH RLS POLICY region_a_policy ON sales_data TO ROLE
sales_manager;

ALTER TABLE sales_data ROW LEVEL SECURITY ON;
Enter fullscreen mode Exit fullscreen mode
  • The CREATE RLS POLICY statement defines the filtering condition (region = 'Region A').
  • The policy is attached to the sales_data table and applies only to users assigned the regional_manager_a role.
  • Row-level security is enabled for the table with ALTER TABLE.

Again, you can always use a VIEW for this.

row-level-security

An example of a row-level security policy

3. Dynamic Data Masking

Dynamic Data Masking (DDM) is a powerful feature in Amazon Redshift that protects sensitive data by replacing it with masked values when accessed by users without full permissions. Unlike traditional encryption, which hides data entirely, masking obfuscates sensitive fields while maintaining their usability for tasks like reporting and analysis.

How it works:

Data Obfuscation - Sensitive information is replaced with masked values when accessed by users without full permissions.

Conditional Masking - Masking can be applied based on user roles or attributes, ensuring the right users see appropriate levels of data.

The masking is applied at query runtime, ensuring that the underlying data remains unchanged.

dynamic-data-masking-redshift

Dynamic Data Masking in Redshift

How to implement:

1. Create a MASKING POLICY

Define how sensitive data should be masked.

Example: Masking a credit card number to show only the last four digits:

CREATE MASKING POLICY mask_credit_card_full
WITH (credit_card VARCHAR(256))
USING ('000000XXXX0000'::TEXT);
Enter fullscreen mode Exit fullscreen mode

2. Attach the MASKING POLICY

Apply the masking policy to the desired column and restrict access based on ROLES. You can also attach this to USERS.

Example: Attaching the policy to the credit_card column:

ATTACH MASKING POLICY mask_credit_card
ON credit_cards(credit_card)
TO ROLE customer_support_role;
Enter fullscreen mode Exit fullscreen mode

Step 3: Test the Policy

When users with the customer_support_role query the credit_card column, they'll see masked values:

Result:
****-****-****-1234
Enter fullscreen mode Exit fullscreen mode

Though, users with elevated permissions (e.g., administrators) will see the full credit card number.

Advanced: Custom Masking Logic

In cases where more complex masking logic is needed, you can define custom functions using Python as a language.

CREATE FUNCTION REDACT_CREDIT_CARD(credit_card TEXT)
RETURNS TEXT IMMUTABLE
AS $$
    import re
    regexp = re.compile("^([0-9]{6})[0-9]{5,6}([0-9]{4})")
    match = regexp.search(credit_card)
    if match:
        first = match.group(1)
        last = match.group(2)
    else:
        first = "000000"
        last = "0000"
    return f"{first}XXXXX{last}"
$$ LANGUAGE plpythonu;

CREATE MASKING POLICY custom_mask_credit_card
WITH (credit_card VARCHAR(256))
USING (REDACT_CREDIT_CARD(credit_card));

ATTACH MASKING POLICY custom_mask_credit_card
ON credit_cards(credit_card)
TO ROLE data_analyst_role;
Enter fullscreen mode Exit fullscreen mode

dynamic-data-masking-policies-redshift

Example of Masking Policies

Best Practices for Redshift Access Management

While Amazon Redshift provides many options for different use cases, it can sometimes feel overwhelming to decide which one to use. The key is to simplify your approach by focusing on your specific needs and objectives, rather than trying to use every available feature. Remember your KISS principle, people :) - keep it simple, stupid and straightforward.

In this section, let’s use the example of a Sales Report System where different teams, like sales analysts and regional managers, interact with data in various ways.

1. Role-Based Access Control (RBAC)

Instead of assigning permissions to individual users, define roles that are related to their job responsibilities.

CREATE ROLE sales_read_only;
GRANT SELECT ON ALL TABLES IN SCHEMA sales TO ROLE sales_read_only;
GRANT ROLE sales_read_only TO USER sales_analyst1;
Enter fullscreen mode Exit fullscreen mode

RBAC can only be assigned to USERS not GROUPS.

A USER can have multiple ROLES

2. Least Privilege Principle

If you’ve been in the cloud domain for some time now, you’ve likely heard this phrase countless times - and for good reason. The Least Privilege Principle means granting users only the permissions they need to perform their specific tasks. This approach minimizes risks by limiting access to resources, ensuring users can do their job without accidentally compromising security.

For instance, a Sales Analyst needs access to view the monthly_sales table but shouldn’t be able to modify it.

3. Use Groups for Database-Level Security

Group users into logical categories to manage permissions more efficiently. If you've worked with IAM in AWS, you'd be familiar with this concept.

4. Secure Authentication

Strengthen how users authenticate to your Redshift cluster to prevent unauthorized access. A secure authentication process is the first line of defense against security breaches.

Store and manage database credentials securely using AWS Secrets Manager.

5. Implement Advanced Access Management Features

Leverage Redshift’s built-in capabilities for fine-grained access control.

  • Column-Level Security (CLS) - Restrict access to sensitive fields

  • Row-Level Security (RLS) - Ensure users see only the rows they are authorized to access.

Both of these features are very useful when creating reports, though a key consideration is to whether to apply this within Redshift side vs. BI tools like Power BI, Tableau, QuickSight, and etc. Each approach has its trade-offs and should align with your overall data governance strategy.

  • Dynamic Data Masking (DDM) - Mask sensitive fields like credit card numbers for users without full permissions:

6. Use available Access Monitoring with Admin Scripts and Tools

Managing and monitoring access in Amazon Redshift becomes much easier when you have the right tools and scripts in place. To ensure a secure and well-governed environment, it’s a game-changer to use tools like Redshift Utils, which provide a collection of administrative views and scripts for monitoring and managing your Redshift clusters effectively.

Final Thoughts

Managing access in Amazon Redshift is about keeping your data safe while making sure users can do their jobs effectively. Simple practices like creating roles for specific tasks, giving only the necessary permissions, and using features like column and row restrictions have worked well for us and for our partners.

This approach has been effective in our case, but I’m always eager to learn from others. If you have similar practices, ideas, or constructive feedback, I’d love to hear them!


This blog is authored solely by me and reflects my personal opinions and experiences, not those of my employer. All references to products, including names, logos, and trademarks, belong to their respective owners and are used for identification purposes only.

redshift Article's
30 articles in total
Favicon
Securing Amazon Redshift - Best Practices for Access Control
Favicon
Migrate from Native Google to AWS Redshift: Benefits and Best Practices
Favicon
A Comprehensive Guide to Establishing a Successful Connection to Amazon Redshift Using the ODBC Driver
Favicon
Cloud Data Warehouse Comparison: Who’s the Real MVP?
Favicon
Move Data from DynamoDB to Redshift Using Estuary
Favicon
Amazon Redshift Workload Management (WLM): A Step-by-Step Guide
Favicon
Building a Scalable Data Platform: Addressing Uncertainty in Data Requirements with AWS
Favicon
Step by Step process to setup Redshift datashare across Redshift clusters
Favicon
Amazon Redshift guia de estudio (Comunidad AWS ML Latam)
Favicon
Data Governance on AWS using DataZone
Favicon
How to Migrate Amazon Redshift to a Different Account and Region: Step-by-Step Guide
Favicon
Channel Your Inner Scrooge with Redshift Reserved Instances: Slash Your Cloud Bill Like a Boss
Favicon
Streamline SSO Access to AWS Redshift Query Editor with Okta and Terraform
Favicon
Two ways to manage secrets for AWS Redshift Serverless with AWS Secrets Manager !!
Favicon
Three Ways to Retrieve Row Counts in Redshift Tables and Views
Favicon
Optimising Sentiment Analysis Workflows: AWS Zero-ETL and Amazon Redshift Synergy-Part 1
Favicon
Leveraging AWS Redshift for Your Organization's Needs
Favicon
From Relational to Analytical: The Power of Redshift Data Warehousing and Analytics
Favicon
Interactions Tracker, Part 3: Why I stopped and Lessons Learned
Favicon
A Comprehensive Guide to AWS DynamoDB vs. Redshift for Databases and Data Warehouses
Favicon
Understanding Redshift
Favicon
Unlearning what you know about relational databases to unlock the power of Redshift
Favicon
4 reasons why your lambda function cannot communicate with RedShift
Favicon
Query Amazon Redshift from YugabyteDB though PostgreSQL Foreign Data Wrapper and VPC peering
Favicon
Run analytical queries on SAP data with Redshift Serverless powered by Amazon AppFlow
Favicon
AWS Redshift: Robust and Scalable Data Warehousing
Favicon
How to build your own data platform. Episode 2: authorization layer. Data Warehouse implementation.
Favicon
Data Analysis with Redshift Serverless and Quicksight - Part 1
Favicon
DBT + REDSHIFT = ❤
Favicon
Analytics on AWS — Amazon Redshift

Featured ones: