Emptying a table is a routine operation in any relational database, and it is done for many reasons. Amazon Redshift is no exception. In Redshift, data can be removed from a table with either the DELETE or the TRUNCATE command. The differences are:
- The DELETE command deletes rows from a table based on specified conditions.
- The TRUNCATE command deletes all the data in the table.
- The TRUNCATE command is a faster alternative to DELETE.
- The TRUNCATE operation cannot be undone (at present).
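The contrast between the two commands can be sketched as follows. This is a minimal illustration only: the customers table, its country column, and the filter value are hypothetical names chosen for the example.

```sql
-- DELETE removes only the rows that match the condition.
-- The country column and the 'NG' value are assumptions for illustration.
DELETE FROM customers
WHERE country = 'NG';

-- TRUNCATE takes no condition: it removes every row in the table.
TRUNCATE TABLE customers;
```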
After working through this article, you will have a brief overview of what Amazon Redshift is, the SQL commands Redshift supports, and how to use the Redshift TRUNCATE command and TRUNC function, along with their syntax and some example queries.
What is Amazon Redshift?
Amazon Redshift is a fully-managed, Petabyte-scale Data Warehouse service on the Cloud that uses SQL to analyze Structured and Semi-Structured Data. It handles analytic workloads on large datasets and provides a level of abstraction so that an analyst sees only the tables and schemas they need to interact with. This type of processing is called Online Analytical Processing (OLAP).
Redshift is organized into Clusters, each made up of one or more Nodes. A cluster can contain multiple databases. For processing, Redshift uses Parallel Processing to improve Data Management and Performance (in terms of execution time). It also supports SQL-based tools for in-house data analytics and applies ML-based optimizations to query performance.
Key Features of Amazon Redshift
Some of the key features of Amazon Redshift are as follows:
- Redshift leverages Parallel Processing and Compression to decrease the Command Execution Time.
- Redshift supports integration with a wide range of useful services, e.g. S3. Data can be copied from S3 to Redshift and vice versa.
- Data access and Reliability are paramount to any Database or Data Warehouse user. Amazon Redshift monitors your Clusters and Nodes 24 hours a day. If a Node or Cluster fails, Amazon Redshift automatically replicates all the data into a fully functional Node or Cluster.
- Amazon Redshift hosts a feature called Redshift ML that enables Data Analysts and database developers to seamlessly build, train, and deploy Amazon SageMaker models using SQL.
For further information on Redshift, see the official Amazon Redshift website.
SQL Commands Supported by Redshift
Redshift is based on PostgreSQL, with some significant differences. It supports traditional SQL commands with additions for in-house manipulations. These commands are grouped into Data Definition Language (DDL), Data Manipulation Language (DML), Data Control Language (DCL), Data Query Language (DQL), and Transaction Control Language (TCL).
- DDL: These commands define or change the structure of database objects such as tables. Examples include:
- CREATE
- ALTER
- DROP
- TRUNCATE
- DML: These commands modify the data stored in the database. Examples include:
- INSERT
- UPDATE
- DELETE
- DCL: These commands control the privileges granted to database users. Examples include:
- GRANT
- REVOKE
- DQL: This category is used to fetch data from the database. It has only one command: SELECT.
- TCL: These commands are used alongside DML commands to handle operations such as committing and rolling back changes in the database. Examples include:
- ROLLBACK
- COMMIT
- SAVEPOINT
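As a rough illustration of these categories, the sketch below runs one or two statements from each group. The demo_sales table, its columns, and the analyst_user account are hypothetical names introduced only for this example; the GRANT assumes that user already exists.

```sql
-- DDL: define a table (demo_sales and its columns are hypothetical).
CREATE TABLE demo_sales (
    salesid    INTEGER,
    commission DECIMAL(8,2)
);

-- DML: add and modify rows.
INSERT INTO demo_sales VALUES (784, 111.15);
UPDATE demo_sales SET commission = 120.00 WHERE salesid = 784;

-- DQL: read the data back.
SELECT salesid, commission FROM demo_sales;

-- DCL: let another user read the table
-- (analyst_user is a hypothetical, pre-existing user).
GRANT SELECT ON TABLE demo_sales TO analyst_user;

-- TCL: group changes into a transaction and discard them.
BEGIN;
DELETE FROM demo_sales WHERE salesid = 784;
ROLLBACK;  -- the DELETE is undone, so the row is still present
```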
TRUNCATE Command Syntax
The TRUNCATE command can be written in two different ways:
- TRUNCATE table_name
- TRUNCATE TABLE table_name
The TABLE keyword is optional. The table_name parameter specifies the table you want to truncate. Note that you can only truncate a table if you are the owner of the table or a superuser.
Unlike the DELETE command, truncating a table does not require you to VACUUM it afterwards. A vacuum process is used to recover space occupied by deleted data. When rows are removed with the DELETE command, they are only marked for deletion and continue to occupy space (such rows are usually called ‘dead tuples’). The VACUUM process reclaims that space so it can be reused for incoming data.
During the vacuum process, a user with the required privileges can still access tables, run queries, and perform write operations. However, if a Data Manipulation Language (DML) command runs concurrently with a vacuum process, execution can slow down. Examples of such DML commands include UPDATE, INSERT, and MERGE.
Likewise, running UPDATE and DELETE operations during a vacuum can reduce system performance.
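The difference in follow-up work can be sketched like this. It is a minimal example, assuming a hypothetical customers table with a hypothetical country column; DELETE ONLY is one of the documented VACUUM modes and is used here only to show the space-reclaiming step.

```sql
-- Remove some rows; they are marked for deletion but still occupy space.
-- The country column and the 'NG' value are assumptions for illustration.
DELETE FROM customers
WHERE country = 'NG';

-- Reclaim the space left by the deleted rows without re-sorting the table.
-- VACUUM cannot be run inside a transaction block.
VACUUM DELETE ONLY customers;

-- By contrast, a truncated table needs no follow-up VACUUM.
TRUNCATE TABLE customers;
```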
Furthermore, TRUNCATE is a Data Definition Language (DDL) command, and it cannot be undone with the ROLLBACK command once it has run: it automatically commits the transaction in which it is executed.
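A short sketch of that commit behavior follows. The users and customers tables and the user_id column are hypothetical; the point is only that a ROLLBACK issued after TRUNCATE has nothing left to undo.

```sql
BEGIN;

-- A change made earlier in the same transaction
-- (users and user_id are hypothetical names).
DELETE FROM users WHERE user_id = 42;

-- TRUNCATE commits the transaction it runs in, including the DELETE above.
TRUNCATE TABLE customers;

-- Nothing is left to undo: customers stays empty and the DELETE stays applied.
ROLLBACK;
```

Because the commit also covers earlier statements in the same transaction, it is usually safer to run TRUNCATE on its own rather than in the middle of other pending changes.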
Examples for Redshift TRUNC
1) The following query uses the TRUNC function to truncate the commission paid for a given sales transaction to a whole number:
select commission, trunc(commission)
from sales where salesid=784;
Output:
commission | trunc
-----------+-------
111.15 | 111
(1 row)
2) To truncate the same commission value to the first decimal place, you can use the following query:
select commission, trunc(commission,1)
from sales where salesid=784;
Output:
commission | trunc
-----------+-------
111.15 | 111.1
(1 row)
3) A negative number can be passed as the second parameter to truncate digits to the left of the decimal point; here 111.15 is truncated to 110:
select commission, trunc(commission,-1)
from sales where salesid=784;
Output:
commission | trunc
-----------+-------
111.15 | 110
(1 row)
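The three numeric cases above can also be tried without the sales table by passing the value as a literal. This is a small sketch; the column aliases are made up for readability, and the results noted in the comments are the ones shown in the outputs above.

```sql
-- Same value as in the examples above, passed as a literal instead of a column.
SELECT
    TRUNC(111.15)     AS whole_number,  -- 111
    TRUNC(111.15, 1)  AS one_decimal,   -- 111.1
    TRUNC(111.15, -1) AS tens;          -- 110
```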
4) The TRUNC function can be used to retrieve the date component of the SYSDATE function result (which returns a timestamp):
select sysdate;
Output:
timestamp
----------------------------
2011-07-21 10:32:38.248109
(1 row)
select trunc(sysdate);
Output:
trunc
------------
2011-07-21
(1 row)
5) The TRUNC function can be used on a TIMESTAMP column. In this case, the return type is date.
select trunc(starttime) from event
order by eventid limit 1;
Output:
trunc
------------
2008-01-25
(1 row)
How To Use the Redshift TRUNC Commands
In this section, you’ll learn about two common use cases of the Redshift TRUNCATE command:
- Truncate a Single Table
- Truncate Multiple Tables
1) Truncate a Single Table with the Table Keyword
TRUNCATE TABLE CUSTOMERS;
SELECT * FROM CUSTOMERS;
The first statement empties the CUSTOMERS table and commits the change to the database. The SELECT that follows, like any later query against this table, returns an empty result.
Truncate a Single Table without the Table Keyword
TRUNCATE CUSTOMERS;
SELECT * FROM CUSTOMERS;
This TRUNCATE statement has the same effect as the previous one, because the TABLE keyword is optional.
Note: In Redshift, foreign key constraints are informational only and are not enforced. TRUNCATE therefore empties only the table you name, even when other tables reference it through a Foreign Key. The rows in those referencing tables are left in place and may end up pointing to data that no longer exists.
2) Truncate Multiple Tables
The TRUNCATE command in Redshift cannot empty multiple tables in a single statement. To truncate multiple tables, run the command once per table, each as a separate statement.
TRUNCATE table CUSTOMERS;
TRUNCATE table USERS;
Endnote on TRUNCATE
You’ve seen the effect of the TRUNCATE operation. One of the key things to check before running it is which tables hold foreign key references to the table you are about to empty.
A TRUNCATE operation run without that check can leave related tables inconsistent with one another.
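One way to run such a check is to count the rows in a referencing table that still point at the table you plan to empty. The sketch below assumes a hypothetical orders table whose customer_id column refers to customers.customer_id; substitute the names used in your own schema.

```sql
-- How many rows in the referencing table point at rows in customers?
-- orders and customer_id are hypothetical names used for illustration.
SELECT COUNT(*) AS referencing_rows
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id;

-- Only if the result is acceptable (for example, 0), go ahead:
TRUNCATE TABLE customers;
```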
Redshift Trunc Date Function & Example
This function truncates a timestamp and returns a date.
1) Syntax
TRUNC(timestamp)
2) Arguments
timestamp: A timestamp column or an expression that implicitly converts to a timestamp.
3) Return Type
The return type of the Redshift Trunc Function is Date.
4) Example
- The following example returns the date component of the SYSDATE function output (which returns a timestamp).
select trunc(sysdate);
Output:
trunc
------------
2011-07-21
(1 row)
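For a result that does not depend on the current time, the same call can be made on a fixed timestamp. The sketch below reuses the timestamp value from the earlier SYSDATE output and, as a point of comparison, also calls DATE_TRUNC, a separate Redshift function that keeps the timestamp type. The column aliases are made up for the example.

```sql
-- TRUNC on a timestamp returns a DATE.
SELECT TRUNC('2011-07-21 10:32:38.248109'::timestamp) AS trunc_result;

-- DATE_TRUNC is a different function: it keeps the TIMESTAMP type
-- and resets everything below the requested date part.
SELECT DATE_TRUNC('day', '2011-07-21 10:32:38.248109'::timestamp) AS date_trunc_result;
```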
Conclusion
The Redshift TRUNCATE command is used to delete all data records in a table. Unlike its alternative, the DELETE command, TRUNCATE is faster in terms of execution time, and its effect cannot be undone with the ROLLBACK command. The separate TRUNC function truncates numbers and timestamps.
With your Amazon Redshift Data Warehouse live and running, you’ll need to extract data from multiple platforms to carry out your analysis. However, integrating and analyzing data from a diverse set of data sources can be challenging, and this is where Hevo Data comes into the picture.
Hevo Data is a No-code Data Pipeline that offers 100+ pre-built integrations to choose from. Hevo can help you integrate data from numerous sources and load it into a Data Warehouse/Database like Amazon Redshift, so you can analyze real-time data with a BI tool and create your Dashboards. It aims to make data migration hassle-free.
Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.
Share your experience of learning about Redshift Trunc Commands. Tell us in the comments below!
Teniola Fatunmbi excels in freelance writing within the data industry, skillfully delivering informative and engaging content related to data science by integrating problem-solving ability.