An organization’s ability to make data-driven decisions is backed by timely and accurate access to information. But as a business scales, so does its data and the challenges of storing, accessing, and analyzing it. Google Cloud Storage is a service for storing objects of any format from text, video, audio, etc., in Google cloud.
The objects are stored in buckets associated with a particular project, and the projects are grouped under an organization account where they can store all the data securely and reliably. For better cloud-based data storage and analytics service, businesses can connect Google Cloud Storage to Snowflake. It is a cloud data platform that offers Warehouse-as-a-Service.
In this blog, you’ll discuss the features of Google Cloud Storage and Snowflake, along with methods to connect Google Cloud Storage to Snowflake.
Prerequisites
Understanding of cloud computing
What is Google Cloud Storage?
- Google Cloud Storage (GCS) is a fully managed service that makes storing and accessing files in the cloud easier. It offers unified, scalable, and durable object storage.
- GCS is one of the largest public cloud storage. It has several storage classes so enterprises and developers can determine the right pricing that matches their data storage requirements.
- Google cloud storage is different from Google One, cloud storage for consumers, which comes with Google Drive, Google Photos, and Gmail.
What is Snowflake?
- Snowflake is an all-in-one cloud data warehouse with the primary goal of providing complete data management for data scientists and analysts.
- It can store data in SQL and NoSQL databases and a data analytics engine that helps in data exploration and analysis.
- It is a unified platform that allows you to access your cloud data, data lake, and in-memory data in a single place.
- It also has a fully integrated analytical engine for fast data analysis. Snowflake’s flexible architecture enables you to build and scale analytical applications in the cloud without worrying about infrastructure or data management.
- Snowflake has been designed with ease of administration in mind, and it has a simple user interface that allows quick access to your data.
Why integrate Google Cloud Storage to Snowflake?
Manually storing your company’s critical data on Google Cloud Storage is unsustainable in a fast-paced e-commerce environment. If your company uses Google Cloud Storage to manage a lot of data, integrating that data with Snowflake can give you more flexibility and useful insights.
This will enable you to have a strong data backup solution for the upcoming analytical workload. Additionally, moving your data to Snowflake will enable faster analysis by allowing you to compare this Google Cloud Storage data with other tools and applications.
How to Connect Google Cloud Storage to Snowflake?
Method 1: Using Hevo to Set Up Google Cloud Storage to Snowflake
Using Hevo Google Cloud Storage to Snowflake Migration can be done in the following 2 steps:
Step 1: Configure GCS as your Source.
Step 2: Configure Snowflake as the destination.
Method 2: Using CREATE STORAGE INTEGRATION Command to Move Data from Google Cloud Storage to Snowflake
This method creates a new storage integration in the Snowflake account. A storage integration is an object that stores an access management (IAM) entity for external cloud storage.
Load Data from Google Cloud Storage to Snowflake
Load Data from Google Cloud Storage to Redshift
Follow these steps to configure a Snowflake Storage Integration:
Step 1: Create a Cloud Storage Integration in Snowflake
CREATE STORAGE INTEGRATION <integration_name>
TYPE = EXTERNAL_STAGE
STORAGE_PROVIDER = GCS
ENABLED = TRUE
STORAGE_ALLOWED_LOCATIONS = ('gcs://<bucket>/<path>/', 'gcs://<bucket>/<path>/')
[ STORAGE_BLOCKED_LOCATIONS = ('gcs://<bucket>/<path>/', 'gcs://<bucket>/<path>/') ]
Step 2: Retrieve the Cloud Storage Account for your Snowflake Account
DESC STORAGE INTEGRATION <integration_name>;
Step 3: Grant the Service Account Permissions to Access Bucket Objects
3.1 Creating a Custom IAM Role
- Log into Google Cloud Platform. You should have the role of project editor to access GCP for this step.
- Choose IAM & admin > Roles.
- Click on create role.
- Enter the name and description for the custom role.
- Click on Add Permissions.
- Add the following permission for data loading, data loading with purge, data loading and unloading, and data unloading.
- storage.buckets.get
- storage.objects.create
- storage.objects.delete
- storage.objects.list
- Click on Create.
3.2 Assign the Custom Role to the Cloud Storage Service Account
- Choose Storage > Browser in GCP.
- Select a bucket.
- Click SHOW INFO PANEL.
- Go to Describe Integration and search for the service account name from Step 2.
- Select Storage > Custom > Role from Select a Role dropdown options.
- Click the Add button.
3.3 Grant the Cloud Storage Service Account Permissions on the Cloud Key Management Service Cryptographic Keys
- Go to Security > Cryptographic keys in GCP.
- Select the key that is assigned to your GCS bucket.
- Click on the SHOW INFO PANEL.
- Go to Describe Integration and search for the service account name from Step 2.
- Select the Cloud KMS CrytoKey Encryptor/Decryptor role.
- Click on the Add button.
Step 4: Create an External Stage
Create an external stage in Google Cloud Storage using the CREATE STAGE command.
Integrate Google Cloud Storage with Snowflake in minutes!
No credit card required
Method 3: Using COPY INTO Command to Move Data from Google Cloud Storage to Snowflake
If you use Cloud Storage buckets for storing and managing your data files in your Google Cloud Storage account, you can use your existing buckets for bulk loading into Snowflake. You will use the COPY INTO <table> command to load data from buckets to tables.
- Step 1: Stage the data files in a Cloud Storage bucket.
- Step 2: Use the COPY INTO <table> command to load the staged files’ content to a Snowflake database table.
Limitations of Connecting Google Cloud Storage to Snowflake
The two methods mentioned above of connecting Google cloud storage to Snowflake have limitations users cannot access real-time data or automatically refresh their data. However, You can use a no/low code platform like Hevo for real-time access to Google Cloud Storage data in Snowflake. With a platform like Hevo, users can schedule refreshes and monitor their data in real time in Snowflake.
Learn More About:
Conclusion
In this blog, you learn about the features of Google Cloud Storage and Snowflake. You also saw two methods of connecting Google Cloud Storage to Snowflake. Connecting Google Cloud Storage to Snowflake will help businesses better store, manage, and analyze their data to derive meaningful insights from various data sources.
Osheen is a seasoned technical writer with over a decade of experience in the data industry. She specializes in writing about B2B, technology, finance, and SaaS domains. Her passion for simplifying intricate technical concepts has established her as a respected expert in the field, making her an invaluable resource for those looking to deepen their understanding of data science.