Understanding Alter Default Privileges Redshift Simplified 101

Pratik Dwivedi • Last Modified: December 29th, 2022


Amazon Redshift is a petabyte-scale Cloud-based Data Warehouse service. It is optimized for datasets ranging from a few hundred gigabytes to a petabyte and lets you effectively analyze all your data through its seamless integration with Business Intelligence tools.

In this article, you will be introduced to the concept of the ALTER DEFAULT PRIVILEGES Redshift command. You will also be introduced to Amazon Redshift, its key features, privileges in Redshift, and the different uses of the ALTER DEFAULT PRIVILEGES Redshift command. Read along to learn more.


What is Amazon Redshift?


Amazon Web Services (AWS) is a subsidiary of Amazon that provides a cloud computing platform and APIs to individuals, corporations, and enterprises. AWS offers high computing power, efficient content delivery, database storage with increased flexibility, scalability, and reliability, and relatively inexpensive cloud computing services.


Amazon Redshift, a part of AWS, is a Cloud-based Data Warehouse service designed by Amazon to handle large data and make it easy to discover new insights from them. Its operations enable you to query and combine exabytes of structured and semi-structured data across various Data Warehouses, Operational Databases, and Data Lakes.

Amazon Redshift is built on industry-standard SQL, with functionality to manage large datasets, support high-performance analysis, provide reports, and perform large-scale database migrations. Amazon Redshift also lets you save query results to your S3 Data Lake using open formats like Apache Parquet, so that additional analysis can be run on that data from other Amazon Web Services such as EMR, Athena, and SageMaker.

For further information on Amazon Redshift, you can follow the Official Documentation.

Key Features of Amazon Redshift

The key features of Amazon Redshift are as follows:

1) Massively Parallel Processing (MPP)

Massively Parallel Processing (MPP) is a distributed design approach in which several processors apply a divide-and-conquer strategy to large data jobs. A large processing job is broken down into smaller jobs, which are then distributed among a cluster of Compute Nodes. These Nodes perform their computations in parallel rather than sequentially. As a result, there is a considerable reduction in the amount of time Redshift requires to complete a single, massive job.
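As a hedged illustration of how this distribution is controlled in practice (the table and column names below are hypothetical), the distribution style you declare when creating a table determines how its rows are spread across compute nodes:

CREATE TABLE sales (
    sale_id     BIGINT,
    customer_id BIGINT,
    amount      DECIMAL(12,2)
)
DISTSTYLE KEY
DISTKEY (customer_id);  -- rows with the same customer_id are stored on the same node slice

With DISTSTYLE KEY, rows that share a distribution key value land on the same slice, which keeps joins and aggregations on that key local to a node.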

2) Fault Tolerance

Data Accessibility and Reliability are of paramount importance for any user of a database or a Data Warehouse. Amazon Redshift monitors its Clusters and Nodes around the clock. When any Node or Cluster fails, Amazon Redshift automatically replicates all data to healthy Nodes or Clusters.

3) Redshift ML

Amazon Redshift houses a functionality called Redshift ML that gives data analysts and database developers the ability to create, train and deploy Amazon SageMaker models using SQL seamlessly.

4) Column-Oriented Design

Amazon Redshift is a Column-oriented Data Warehouse. This makes it a simple and cost-effective solution for businesses to analyze all their data using their existing Business Intelligence tools. Amazon Redshift achieves optimum query performance and efficient storage by leveraging Massively Parallel Processing (MPP) and Columnar Data Storage, along with efficient and targeted Data Compression Encoding schemes.
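As a small, hedged sketch (the table and encodings below are illustrative; if you omit ENCODE, Redshift chooses an encoding automatically), a compression encoding can be declared per column when a table is created:

CREATE TABLE page_views (
    view_id   BIGINT        ENCODE az64,  -- compact encoding for numeric and date/time types
    url       VARCHAR(2048) ENCODE lzo,   -- general-purpose compression for long text
    viewed_at TIMESTAMP     ENCODE az64
);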

Simplify Redshift ETL and Analysis with Hevo’s No-code Data Pipeline

A fully managed No-code Data Pipeline platform like Hevo Data helps you integrate and load data from 100+ different sources (including 40+ free sources) to a Data Warehouse such as Amazon Redshift or a Destination of your choice in real-time in an effortless manner. Hevo, with its minimal learning curve, can be set up in just a few minutes, allowing users to load data without compromising performance. Its strong integration with a multitude of sources allows users to bring in data of different kinds smoothly without having to write a single line of code.

Get Started with Hevo for Free

Check out some of the cool features of Hevo:

  • Completely Automated: The Hevo platform can be set up in just a few minutes and requires minimal maintenance.
  • Transformations: Hevo provides preload transformations through Python code. It also allows you to run transformation code for each event in the Data Pipelines you set up. You need to edit the event object’s properties received in the transform method as a parameter to carry out the transformation. Hevo also offers drag and drop transformations like Date and Control Functions, JSON, and Event Manipulation to name a few. These can be configured and tested before putting them to use.
  • Connectors: Hevo supports 100+ integrations to SaaS platforms, files, Databases, analytics, and BI tools. It supports various destinations including Google BigQuery, Amazon Redshift, Snowflake Data Warehouses; Amazon S3 Data Lakes; and MySQL, SQL Server, TokuDB, DynamoDB, PostgreSQL Databases to name a few.  
  • Real-Time Data Transfer: Hevo provides real-time data migration, so you can have analysis-ready data always.
  • 100% Complete & Accurate Data Transfer: Hevo’s robust infrastructure ensures reliable data transfer with zero data loss.
  • Scalable Infrastructure: Hevo has in-built integrations for 100+ sources (including 40+ free sources) that can help you scale your data infrastructure as required.
  • 24/7 Live Support: The Hevo team is available round the clock to extend exceptional support to you through chat, email, and support calls.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Live Monitoring: Hevo allows you to monitor the data flow so you can check where your data is at a particular point in time.
Sign up here for a 14-Day Free Trial!

How does the structure of Redshift affect the privileges & commands used? 

Amazon Redshift organizes data across multiple nodes; the first node to be commissioned acts as the leader node. It allows for Massively Parallel Processing (MPP) using multiple compute nodes: each compute node is assigned a part/chunk of the data to be processed, and all of these nodes perform the same operation(s) on their slice of the data.

Amazon Redshift also allows Columnar data storage, targeted data compression encoding schemes, and fine-grained Performance Tuning.

On top of these architectural facts, many groups of users, each with different privileges, can access Redshift to complete their specific tasks. For example, data creators would create and populate the data tables, whereas data analysts would query and aggregate the data given to them. Moreover, a user could have access to only a subset of the data owing to security, sensitivity, performance, or need-to-know concerns.
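Such groups of users are typically modeled with Redshift users and groups before any privileges are granted. A minimal sketch, with all names and passwords being hypothetical:

CREATE GROUP data_creators;
CREATE GROUP data_analysts;

CREATE USER etl_user     PASSWORD 'Passw0rdExample1' IN GROUP data_creators;
CREATE USER analyst_user PASSWORD 'Passw0rdExample1' IN GROUP data_analysts;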

What are Privileges in Amazon Redshift? 

Privileges in Amazon Redshift are rights to create tables; insert, update, or delete data; grant or revoke rights to and from other users; execute functions or procedures; and drop or modify tables.
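As a hedged example of granting such rights on existing objects (the schema and table names here are hypothetical and reuse the groups sketched above):

GRANT USAGE ON SCHEMA myschema TO GROUP data_analysts;                 -- needed before objects in the schema can be referenced
GRANT SELECT ON ALL TABLES IN SCHEMA myschema TO GROUP data_analysts;
GRANT SELECT, INSERT, UPDATE, DELETE ON myschema.orders TO GROUP data_creators;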

What is the need to discuss ALTER DEFAULT PRIVILEGES Redshift command in particular? 

The reason we discuss the ALTER DEFAULT PRIVILEGES Redshift command in particular is that the default documentation does not discuss some of its intricacies in detail. For example, it's always assumed that the ALTER DEFAULT PRIVILEGES Redshift command is being run for objects owned by the current user, unless specified otherwise.

So, practically, the syntax becomes:

Alter default privileges for CURRENT_USER/Specified_User

What does the ALTER DEFAULT PRIVILEGES Redshift command do? 

The ALTER DEFAULT PRIVILEGES Redshift command defines the default set of privileges that will be applied to objects created in the future by the specified user. It's different from the standalone GRANT command, which is used to define privileges on existing objects. So, the ALTER DEFAULT PRIVILEGES Redshift command defines which privileges will automatically be granted on objects created in the future, whereas the GRANT command defines privileges granted on existing objects only. Therefore, the ALTER DEFAULT PRIVILEGES Redshift command is never used on existing tables or objects.

Typically, you would first use the GRANT command to assign privileges on current objects, and then use the ALTER DEFAULT PRIVILEGES Redshift command to assign privileges to the user for future objects created in the schema. Keep in mind that the permissions you grant apply to existing schemas only.

If you create new schemas in the future, you will have to assign the privileges for those new schemas manually, i.e., you can assign privileges for the future objects created in existing schemas, but not for totally new schemas that may be created in the future.
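A hedged sketch of this two-step pattern, reusing the hypothetical schema and group names from above:

-- Step 1: privileges on tables that already exist in myschema
GRANT SELECT ON ALL TABLES IN SCHEMA myschema TO GROUP data_analysts;

-- Step 2: privileges on tables that will be created in myschema later by the current user
ALTER DEFAULT PRIVILEGES IN SCHEMA myschema
    GRANT SELECT ON TABLES TO GROUP data_analysts;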

What is the Syntax of the ALTER DEFAULT PRIVILEGES Redshift command? 

For users who like to read the manual to get fine-grained control over what they're doing, below is the full command syntax of the ALTER DEFAULT PRIVILEGES Redshift command.

ALTER DEFAULT PRIVILEGES
    [ FOR USER target_user [, ...] ]
    [ IN SCHEMA schema_name [, ...] ]
    grant_or_revoke_clause
where grant_or_revoke_clause is one of the following:

Option 1

GRANT { { SELECT | INSERT | UPDATE | DELETE | DROP | REFERENCES } [,...] | ALL [ PRIVILEGES ] } 
	ON TABLES 
	TO { user_name [ WITH GRANT OPTION ]| GROUP group_name | PUBLIC } [, ...]	 

Option 2

GRANT { EXECUTE | ALL [ PRIVILEGES ] } 
	ON FUNCTIONS 
	TO { user_name [ WITH GRANT OPTION ] | GROUP group_name | PUBLIC } [, ...]

Option 3

GRANT { EXECUTE | ALL [ PRIVILEGES ] } 
	ON PROCEDURES 
	TO { user_name [ WITH GRANT OPTION ] | GROUP group_name | PUBLIC } [, ...]  

Option 4

REVOKE [ GRANT OPTION FOR ] { { SELECT | INSERT | UPDATE | DELETE | REFERENCES } [,...] | ALL [ PRIVILEGES ] } 
	ON TABLES 
	FROM user_name [, ...] [ CASCADE | RESTRICT ]

Option 5

REVOKE  { { SELECT | INSERT | UPDATE | DELETE | REFERENCES } [,...] | ALL [ PRIVILEGES ] } 
	ON TABLES 
	FROM { GROUP group_name | PUBLIC } [, ...] [ CASCADE | RESTRICT ]

Option 6

REVOKE [ GRANT OPTION FOR ] { EXECUTE | ALL [ PRIVILEGES ] } 
	ON FUNCTIONS 
	FROM user_name [, ...] [ CASCADE | RESTRICT ]

Option 7

REVOKE { EXECUTE | ALL [ PRIVILEGES ] } 
	ON FUNCTIONS 
	FROM { GROUP group_name | PUBLIC } [, ...] [ CASCADE | RESTRICT ]       

Option 8

REVOKE [ GRANT OPTION FOR ] { EXECUTE | ALL [ PRIVILEGES ] } 
	ON PROCEDURES 
	FROM user_name [, ...] [ CASCADE | RESTRICT ]            

Option 9

REVOKE { EXECUTE | ALL [ PRIVILEGES ] } 
	ON PROCEDURES 
	FROM { GROUP group_name | PUBLIC } [, ...] [ CASCADE | RESTRICT ]

Typical Usage: ALTER DEFAULT PRIVILEGES Redshift command 

Some of the typical uses of ALTER DEFAULT PRIVILEGES Redshift Command are as follows:

  • ALTER DEFAULT PRIVILEGES FOR USER staging_user IN SCHEMA schema_one GRANT INSERT, SELECT, UPDATE ON TABLES TO rolegroup; 

This grants the INSERT, SELECT, and UPDATE privileges to rolegroup on all tables created in schema_one by staging_user in the future.

  • ALTER DEFAULT PRIVILEGES FOR USER user_creator IN SCHEMA myschema GRANT SELECT ON TABLES TO GROUP my_group; 

Here, all future objects created in myschema by the user named user_creator will have the SELECT privilege granted to all users in the group called my_group. Whereas, if you skip the FOR USER part, you can use the following command:

  • ALTER DEFAULT PRIVILEGES IN SCHEMA myschema GRANT SELECT ON TABLES TO GROUP my_group; 

Here, all future objects created in myschema by the current user will have the SELECT privilege granted to all users in my_group.

The current user is the default: if you don't use FOR USER, then the privileges will be granted ONLY for objects created by the current user in the specified schema.

Now, if you also skip the IN SCHEMA part, you can use the following command:

  • ALTER DEFAULT PRIVILEGES GRANT SELECT ON TABLES TO GROUP my_group; 

Here, if you run this command as yourself (you are the current user, since no FOR USER is specified), you will grant SELECT privileges on all new tables you create to all users in my_group.

  • Also, ALTER DEFAULT PRIVILEGES GRANT SELECT ON TABLES TO PUBLIC

Here, you will grant SELECT privileges on all new tables you create to PUBLIC. PUBLIC here means every authenticated user in your Redshift instance, and no one else.

Corollary: You can’t use WITH GRANT OPTION with a group or with PUBLIC.

So, ALTER DEFAULT PRIVILEGES GRANT SELECT ON TABLES TO GROUP my_group WITH GRANT OPTION is an invalid statement. But you can use it with a specific username. For example:

  • ALTER DEFAULT PRIVILEGES GRANT SELECT ON TABLES TO xyz WITH GRANT OPTION – is valid and will work. 

Similarly, you can grant default privileges for Functions and Procedures, as in the sketch below.
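A hedged sketch, reusing the hypothetical schema, user, and group names from the earlier examples:

ALTER DEFAULT PRIVILEGES IN SCHEMA myschema
    GRANT EXECUTE ON FUNCTIONS TO GROUP my_group;

ALTER DEFAULT PRIVILEGES FOR USER user_creator IN SCHEMA myschema
    GRANT EXECUTE ON PROCEDURES TO GROUP my_group;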

What are some Exclusions to the above rules ?

  • DROP: Only the table owner or a superuser can drop a table. No other user can drop the table, even if they have all privileges on that table/object.
  • ALTER TABLE can be run by those, and only those, who can drop a table. So, only the table owner or a superuser can alter a table. This is because the check that decides whether a user can ALTER a table works in the same way as the check that decides whether a user can DROP it.

According to AWS Redshift documentation, “The right to modify or destroy an object is always the privilege of the owner only.” 
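If another user genuinely needs to alter or drop a table, the practical route is to transfer ownership (or have a superuser do it). A hedged sketch with hypothetical names:

ALTER TABLE myschema.orders OWNER TO etl_user;  -- etl_user becomes the owner and can now ALTER or DROP the table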

The above rules and practices are in place to enable proper data governance. These rules help in assigning responsibility for, and authority over, Objects created in Redshift. Adhering to these rules will minimize surprises, erroneous alterations and undesired changes. 

Amazon Redshift does all this to ensure Data Security and Transparent Governance, and to preserve your data’s semantic usability.

Conclusion

In this article, you have learned about the ALTER DEFAULT PRIVILEGES Redshift command. This article also provided information on Amazon Redshift, its key features, privileges in Redshift, and the typical uses of the ALTER DEFAULT PRIVILEGES Redshift command.

Hevo Data, a No-code Data Pipeline provides you with a consistent and reliable solution to manage data transfer between a variety of sources and a wide variety of Desired Destinations with a few clicks.

Visit our Website to Explore Hevo

Hevo Data with its strong integration with 100+ sources (including 40+ free sources) allows you to not only export data from your desired data sources & load it to the destination of your choice such as Amazon Redshift, but also transform & enrich your data to make it analysis-ready so that you can focus on your key business needs and perform insightful analysis using BI tools. 

Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You may also have a look at our pricing, which will assist you in selecting the best plan for your requirements.

Share your experience of working with the ALTER DEFAULT PRIVILEGES Redshift command in the comment section below! We would love to hear your thoughts.
