The Snowflake COPY command is used to load data from staged files on internal or external locations to an existing table or vice versa.

In this article, you will understand the need for Snowflake COPY data from one table to another, the syntax to use, the required parameters, and also a few examples.

Introduction to Snowflake

Snowflake Logo
Image Source

Snowflake runs entirely on cloud infrastructure and provides a fully managed data warehouse system. Snowflake runs and manages its infrastructure on a public cloud and cannot be installed or hosted on a private cloud.

Snowflake has a hybrid architecture of traditional shared-disk database architectures and shared-nothing database architectures. 

Key features of Snowflake

  • Integrated Platform: Snowflake is a Software-as-a-Service model that can be accessed online via paid subscriptions.
  • Central Repository: Snowflake uses a central repository to store data that is accessible to all compute nodes in the data warehouse.
  • Performance: Snowflake uses MPP (Massive Parallel Processing) technique to process the queries in its compute clusters. Each node of the cluster stores the data locally to provide excellent data management.
  • No Hardware: With Snowflake, there is no need to set up, configure, install, and manage hardware.
  • Scalable: Snowflake has extraordinary scaling capabilities. You can easily scale up or down depending on the volume of data.
  • Fault-Tolerant: Snowflake provides exceptional fault-tolerant capabilities to recover the accidentally dropped Snowflake objects.
  • Standard and Extended SQL Support: Snowflake has the best support for ANSI SQL along with advanced SQL functionalities like Merge, Lateral view, statistical functions, and many more.
  • Improved Data-Driven Decision Making: Snowflake allows you to break down data silos and provide access to actionable insights across the organization. This is an essential first step to improve partner relationships, optimize pricing, reduce operational costs, drive sales effectiveness, and much more. 
  • Improved User Experiences and Product Offerings: With Snowflake in place, you can better understand user behavior and product usage. You can also leverage the full breadth of data to deliver customer success, vastly improve product offerings, and encourage data science innovation.  
  • Customized Data Exchange: Snowflake allows you to build your data exchange, securely sharing live, governed data. It also provides an incentive to build better data relationships across your business units and with your partners and customers. It does this by achieving a 360-degree view of your customer, which provides insight into key customer attributes like interests, employment, and many more. 
  • Robust Security: You can adopt a secure data lake as a single place for all compliance and cybersecurity data. Snowflake Data Lakes guarantee a fast incident response. This allows you to understand the complete picture of an incident by clubbing high-volume log data in a single location, and efficiently analyzing years of log data in seconds. You can now join semi-structured logs and structured enterprise data in one data lake. Snowflake lets you put your foot in the door without any indexing and easily manipulate and transform data once it is in Snowflake.  

In this blog post, you’ll learn all about the Snowflake COPY command and its usage in real scenarios. You will also look at Hevo Data – an alternate and easy way to extract data.

Snowflake Architecture
Image Source

What is Snowflake COPY?

The Snowflake COPY command allows you to load data from staged files on internal/external locations to an existing table or vice versa.

Snowflake offers two types of COPY commands:

  1. COPY INTO <location>: This will copy the data from an existing table to locations that can be:
    • An internal stage table. 
    • An external stage table pointing to an external site, i.e., Amazon S3, Google Cloud Storage, or Microsoft Azure.
    • An external location like Amazon cloud, GCS, or Microsoft Azure.
  1. COPY INTO <table>: This command will copy the data from staged files to the existing table. The staged files can be present at:
    • The internal stage table.
    • The external stage table points to an external location like a public cloud provided by Amazon, Google, or Microsoft. (or)
    • An external location like Amazon cloud, GCS, or Microsoft Azure.

The Need for Snowflake COPY

In Analytics, there is often a requirement to copy data from one table to another table/location to share, collaborate, and perform data analytics within the team.

The Snowflake COPY command can also be used to create a clone of the table or to copy the data for failover recoveries. 

There might be several other use cases to use the COPY command, let’s briefly discuss the usage of the COPY command with examples.

COPY Command Usage

You can use the COPY command by using the following syntax:

COPY INTO [<namespace>.]<table_name>
FROM { internalStage | externalStage | externalLocation }
[ FILES = ( '<file_name>' [ , '<file_name>' ] [ , ... ] ) ]
[ PATTERN = '<regex_pattern>' ]
[ FILE_FORMAT = ( { FORMAT_NAME = '[<namespace>.]<file_format_name>' |
TYPE = { CSV | JSON | AVRO | ORC | PARQUET | XML } [ formatTypeOptions ] } ) ]
[ copyOptions ]
[ VALIDATION_MODE = RETURN_<n>_ROWS | RETURN_ERRORS | RETURN_ALL_ERRORS ]

Required Parameters  

[<namespace>].<table_name>: Specifies the name of the target table into which data is loaded.

Namespace is the combination of <database_name>.<schema_name> or <schema_name>. If the schema and database have already been selected, then this parameter is optional.

FROM: This specifies the internal or external location that contains the data file to load into the table.

Optional Parameters

FILES: This specifies the list of file names to load into the table. The file must be staged in either Snowflake’s internal or external location. 

PATTERN: A regular expression pattern to match the specific file names.

FILE_FORMAT: Specifies the data file format (CSV, JSON, AVRO, ORC, PARQUET, etc.).

VALIDATION_MODE: This instructs the command to validate the data files instead of loading them into target tables and allows you to perform the dry run to ensure the fail-safe delivery of data. 

Examples

Let’s consider a few examples to demonstrate the Snowflake COPY command.

Loading files using Basic Command

Stages

copy into table1 from '@mystage/path 1/file 1.csv';
copy into table2 from '@%mytable/path 1/file 1.csv';
copy into table3 from '@~/path 1/file 1.csv';

S3 bucket

copy into table4 from 's3://mybucket 1/prefix 1/file 1.csv';

Azure container

copy into table5 from 'azure://myaccount.blob.core.windows.net/mycontainer/encrypted_files/file 1.csv';

Loading files from Named External Stage

copy into mycsvtable
from @my_ext_stage/tutorials/dataloading/contacts1.csv;

Loading files from Amazon S3

copy into mytable
from s3://mybucket/data/files
storage_integration = myint
encryption=(master_key = 'eSxX0jzYfIamtnBKOEOwq80Au6NbSgPH5r4BDDwOaO8=')
file_format = (format_name = ‘csv’);

Loading files Using Patterns

copy into mytable
file_format = (type = 'CSV')
pattern='.*/.*/.*[.]csv[.]gz';

Conclusion

Snowflake is a powerful, fully managed platform used for data warehousing and analytics.

In this blog post, you have learned about the Snowflake COPY command in great detail, its use, along with some examples.

Sign Up for a 14-day free trial and experience the feature-rich Hevo suite firsthand.

Vishal Agrawal
Technical Content Writer, Hevo Data

Vishal Agarwal is a Data Engineer with 10+ years of experience in the data field. He has designed scalable and efficient data solutions, and his expertise lies in AWS, Azure, Spark, GCP, SQL, Python, and other related technologies. By combining his passion for writing and the knowledge he has acquired over the years, he wishes to help data practitioners solve the day-to-day challenges they face in data engineering. In his article, Vishal applies his analytical thinking and problem-solving approaches to untangle the intricacies of data integration and analysis.