In today’s data-driven world, smooth data flow and fast analysis are the engines of success. Enter Confluent Cloud, a fully managed service that makes running Apache Kafka far easier and boosts your data streaming capabilities. What if you want to dig deeper into that streaming data? That is where Snowflake works its magic: a robust, efficient data warehouse engineered for rapid, scalable analytics.
Imagine streaming your Kafka data into Snowflake, where all of those powerful analytic capabilities come into play. Connecting Confluent Cloud to Snowflake centralizes your data and lays a solid foundation for business intelligence.
In this blog, I will take you through the steps of setting up a connection between Confluent Cloud and Snowflake, unlocking your data’s full potential. Let’s take your data strategy to the next level!
What is Confluent Cloud?
Developed by the co-creators of Apache Kafka, Confluent Cloud is a fully managed Apache Kafka service that simplifies tasks such as transferring records between Kafka topics, managing cluster resources, and more. Because these tasks otherwise demand many commands and a lot of time to execute, Confluent Cloud was built to deliver Apache Kafka’s capabilities in an automated way.
With Confluent Cloud, organizations can drive business value from their data without worrying about the mechanism behind processes like data transportation and data integration between different services. Check out the integration between Confluent Cloud and BigQuery.
Key Features of Confluent Cloud
- Cluster Linking: Cluster linking in Confluent Cloud is a fully managed service for seamlessly moving data from one Confluent cluster to another. It creates perfect copies of Kafka topics and programmatically keeps data in sync across clusters.
- Self-balancing: A Confluent Cloud cluster can run several brokers, host many Kafka topics, and handle millions of messages per hour. Occasionally, brokers are lost due to network or hardware issues; when that happens, the platform automatically reassigns partitions across the remaining brokers to rebalance the workload.
- Confluent Control Center: Confluent provides a GUI-based application to manage and monitor Kafka, called Confluent Control Center. It helps businesses manage Kafka Connect and easily create, edit, and manage connections to other systems. With Confluent Control Center, companies can monitor data streams from producers to consumers, confirm that every message is delivered, and measure how long delivery takes.
What is Snowflake?
Founded in 2012, Snowflake is a fully managed data warehouse that can be hosted on major cloud platforms such as Amazon Web Services, Microsoft Azure, and Google Cloud. It is a single platform that supports a wide range of workloads, including data engineering, data warehousing, and data lakes.
Snowflake’s flexible pricing model lets businesses pay for compute and storage independently. Companies can also choose between on-demand storage, billed monthly for what they use, and pre-purchased capacity paid for upfront.
See how Snowflake acts as a data lake and compare it with the data warehouse.
Key Features of Snowflake
- Connectors and Drivers: Snowflake offers a wide range of connectors and drivers, including connectors for Python and Spark and drivers for Go, .NET, JDBC, Node.js, ODBC, PHP, and more (see the short Python sketch after this list).
- Result Caching: Snowflake can cache query results at several levels. Cached results persist for 24 hours after a query is executed, so rerunning the same query returns results almost instantly.
- Data Sharing: Snowflake’s data sharing feature lets businesses share database objects from one Snowflake account to another without duplicating data, saving storage and compute while making shared data quickly accessible.
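To illustrate the connectors and drivers feature, here is a minimal sketch using the snowflake-connector-python package; the account, user, password, and warehouse values are placeholders you would replace with your own.
# Minimal sketch: connect to Snowflake with the Python connector and run a query.
# Requires: pip install snowflake-connector-python
import snowflake.connector

conn = snowflake.connector.connect(
    account="<account-identifier>",  # placeholder
    user="<username>",               # placeholder
    password="<password>",           # placeholder
    warehouse="<warehouse-name>",    # placeholder
)
cur = conn.cursor()
cur.execute("SELECT CURRENT_VERSION()")
print(cur.fetchone())
conn.close()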
Method 1: Using Hevo to Set Up Confluent Cloud to Snowflake
Using Hevo, you can connect Confluent Cloud to Snowflake in the following two steps.
Step 1: Configure Confluent Cloud as the Source
- Step 1.1: Select “Confluent Cloud” on the Select Source Type page.
- Step 1.2: Enter the required information on the Configure your Confluent Cloud Source page.
- Step 1.3: Click TEST & CONTINUE.
- Step 1.4: Configure data ingestion and then set up the destination.
Step 2: Set up Snowflake as a destination in Hevo
- Step 2.1: Select Snowflake from the Add Destination page.
- Step 2.2: Set the following parameters on the Configure your Snowflake Destination page.
- Step 2.3: Click Test Connection to test connectivity with the Snowflake warehouse.
- Step 2.4: Once the test is successful, click SAVE DESTINATION.
Method 2: Using Custom Code to Move Data from Confluent Cloud to Snowflake
Prerequisites
- An active Confluent Cloud cluster with the Kafka topics you want to sink.
- A Snowflake account with privileges to create users, roles, and grants on the target database and schema.
- Confluent CLI version 2 installed (needed only for the CLI flow in Step 3).
Follow the steps below to connect Confluent Cloud to Snowflake.
Step 1: Generate the Snowflake Key Pair
Before the connector can connect Confluent Cloud to Snowflake, you must generate a key pair. Snowflake key-pair authentication requires an RSA key of at least 2048 bits. Add the public key to the Snowflake user account and the private key to the connector configuration.
Step 1.1: Create the Key Pair
Complete the below steps to generate the key pair.
- Generate the private key using the below command.
openssl genrsa -out snowflake_key.pem 2048
- Generate the matching public key from the private key using the below command.
openssl rsa -in snowflake_key.pem -pubout -out snowflake_key.pub
- View the list of generated Snowflake key files using the below command.
ls -l snowflake_key*
Output:
-rw-r--r-- 1 1679 Jun 8 17:04 snowflake_key.pem
-rw-r--r-- 1 451 Jun 8 17:05 snowflake_key.pub
- You can view the public key file by running the below command.
cat snowflake_key.pub
Output:
-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA2zIuUb62JmrUAMoME+SX
vsz9KUCp/cC+Y+kTGfYB3jRDQ06O0UT+yUKMO/KWuc0dUxZ8s9koW5l/n+TBfxIQ
... omitted
1tD+Ktd/CTXPoVEI2tgCC9Avf/6/9HU3IpV0gL8SZ8U0N5ot4Uw+CSYB3JjMagEG
bBWZ8Qc26pFk7Fd17+ykH6rEdLeQ9OElc0ZruVwSsa4AxaZOT+rqCCP7FQPzKTtA
JQIDAQAB
-----END PUBLIC KEY-----
- Copy the key so you can add it to the new user in the Snowflake account. Copy only the part between -----BEGIN PUBLIC KEY----- and -----END PUBLIC KEY-----, which the below command extracts as a single line.
grep -v "BEGIN PUBLIC" snowflake_key.pub | grep -v "END PUBLIC"|tr -d '\r\n'
Step 1.2: Create the User and Add the Public Key
Open your Snowflake account and follow the steps below to create a user and attach the public key.
- Go to the Worksheets panel and then switch to the SECURITYADMIN role.
- Run the below query in Worksheets to create a user and add the public key copied earlier.
CREATE USER confluent RSA_PUBLIC_KEY='<public-key>';
- Make sure to add the public key as a single continuous line in the statement.
- Next, set the correct privileges for the user you just added. For example, to send Apache Kafka records to a PRODUCTION database using the PUBLIC schema, run the queries below to grant the necessary privileges.
// Use a role that can create and manage roles and privileges:
use role securityadmin;
// Create a Snowflake role with the privileges to work with the connector
create role kafka_connector_role;
// Grant privileges on the database:
grant usage on database PRODUCTION to role kafka_connector_role;
// Grant privileges on the schema:
grant usage on schema PRODUCTION.PUBLIC to role kafka_connector_role;
grant create table on schema PRODUCTION.PUBLIC to role kafka_connector_role;
grant create stage on schema PRODUCTION.PUBLIC to role kafka_connector_role;
grant create pipe on schema PRODUCTION.PUBLIC to role kafka_connector_role;
// Grant the custom role to an existing user:
grant role kafka_connector_role to user confluent;
// Make the new role the default role:
alter user confluent set default_role=kafka_connector_role;
- Finally, the private key must be added to the connector configuration. Extract it now and keep it somewhere safe until you set up the connector.
- Get the list of generated Snowflake key files with the below command.
ls -l snowflake_key*
-rw-r--r-- 1 1679 Jun 8 17:04 snowflake_key.pem
-rw-r--r-- 1 451 Jun 8 17:05 snowflake_key.pub
- View the private key file using the below command.
cat snowflake_key.pem
Output:
-----BEGIN RSA PRIVATE KEY-----
MIIEpQIBAAKCAQEA2zIuUb62JmrUAMoME+SXvsz9KUCp/cC+Y+kTGfYB3jRDQ06O
0UT+yUKMO/KWuc0dUxZ8s9koW5l/n+TBfxIQx+24C2+l9t3TxxaLdf/YCgQwKNR9
dO9/c+SkX8NfcwUynGEo3wpmdb4hp0X9TfWKX9vG//zK2tndmMUrFY5OcGSSVJYJ
Wv3gk04sVxhINo5knpgZoUVztxcRLm/vNvIX1tD+Ktd/CTXPoVEI2tgCC9Avf/6/
9HU3IpV0gL8SZ8U0N5ot4Uw+CSYB3JjMagEGbBWZ8Qc26pFk7Fd17+ykH6rEdLeQ
... omitted
UfrYj7+p03yVflrsB+nyuPETnRJx41b01GrwJk+75v5EIg8U71PQDWfy1qOrUk/d
9u25iaVRzi6DFM0ppE76Lh72SKy+m0iEZIXWbV9q6vf46Oz1PrtffAzyi4pyJbe/
ypQ53f0CgYEA7rE6Dh0tG7EnYfFYrnHLXFC2aVtnkfCMIZX/VIZPX82VGB1mV43G
qTDQ/ax1tit6RHDBk7VU4Xn545Tgj1z6agYPvHtkhxYTq50xVBXr/xwlMnzUZ9s3
VjGpMYQANm2seleV6/si54mT4TkUyB7jMgWdFsewtwF60quvxmiA9RU=
-----END RSA PRIVATE KEY-----
- Copy the key and add it to the connector configuration. Copy only the part of the key between -----BEGIN RSA PRIVATE KEY----- and -----END RSA PRIVATE KEY-----; the below command extracts it as a single line. (A quick way to verify the key pair follows after this step.)
grep -v "BEGIN RSA PRIVATE KEY" snowflake_key.pem | grep -v "END RSA PRIVATE KEY" | tr -d '\r\n'
Step 2: Using the Confluent Cloud Console
- Launch your Confluent Cloud cluster (see the Quick Start for Confluent Cloud if you still need to set one up).
- In the left navigation menu, click Data Integration and then Connectors. If you already have connectors in your cluster, click Add connector.
- Select your connector by clicking the Snowflake Sink connector card.
- If you have populated Kafka topics, select the topics you want to sink from the topics list, or create a new topic by clicking Add new topic.
- After the connector runs, verify that the messages are loaded into the Snowflake database table. If your topics are still empty, you can publish a few test records first, as shown in the sketch below.
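If you want test data flowing through the connector, here is a minimal sketch that publishes a few JSON records using the confluent-kafka Python client. The bootstrap server, API key, API secret, and topic name are placeholders for your own cluster values.
# Minimal sketch: produce a few JSON test records to a Confluent Cloud topic.
# Requires: pip install confluent-kafka
import json
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "<bootstrap-server>",  # placeholder
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<kafka-api-key>",         # placeholder
    "sasl.password": "<kafka-api-secret>",      # placeholder
})

for i in range(5):
    record = {"id": i, "status": "test"}
    producer.produce("<topic1>", value=json.dumps(record))

producer.flush()  # block until all messages are delivered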
Step 3: Using the Confluent Command Line Interface
Follow the below steps to use the Snowflake Sink connector with the Confluent Command Line Interface (CLI) to connect Confluent Cloud to Snowflake. It is assumed that you have installed Confluent CLI version 2.
- Enter the below command to list the available connectors.
confluent connect plugin list
- Use the below command to specify the required connector configuration properties.
confluent connect plugin describe <connector-catalog-name>
For example,
confluent connect plugin describe SnowflakeSink
Output:
Following are the required configs:
connector.class: SnowflakeSink
name
kafka.auth.mode
kafka.api.key
kafka.api.secret
input.data.format
snowflake.url.name
snowflake.user.name
snowflake.private.key
snowflake.schema.name
tasks.max
topics
- Create a JSON connector configuration file containing the connector’s configuration properties. The below example shows the required properties.
{
"connector.class": "SnowflakeSink",
"name": "<connector-name>",
"kafka.auth.mode": "KAFKA_API_KEY",
"kafka.api.key": "<my-kafka-api-key>",
"kafka.api.secret": "<my-kafka-api-secret>",
"topics": "<topic1>, <topic2>",
"input.data.format": "JSON",
"snowflake.url.name": "https://wm83168.us-central1.gcp.snowflakecomputing.com:443",
"snowflake.user.name": "<login-username>",
"snowflake.private.key": "<private-key>",
"snowflake.database.name": "<database-name>",
"snowflake.schema.name": "<schema-name>",
"tasks.max": "1"
}
- Use the below command to load the configuration and start the connector.
confluent connect create --config <file-name>.json
For example,
confluent connect create --config snowflake-sink.json
Output:
Created connector confluent-snowflake lcc-ix4dl
- Check the connector status using the below command.
confluent connect list
Output: a table listing your connectors with their ID, name, status, and type. The newly created confluent-snowflake connector (ID lcc-ix4dl) should appear with a Running status once provisioning completes.
- After the connector is running, check that the records are populated in your Snowflake database, for example with the quick query sketch below.
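For a quick check from code, you can query the destination table with the Python connector. This is a minimal sketch reusing the key pair created earlier; it assumes the Snowflake Sink connector has written records into a table with RECORD_METADATA and RECORD_CONTENT columns (the connector’s usual layout), that you know the destination table name, and that the role in use has access to a warehouse. All bracketed values are placeholders.
# Minimal sketch: confirm that Kafka records have landed in the Snowflake table.
# Requires: pip install snowflake-connector-python cryptography
import snowflake.connector
from cryptography.hazmat.primitives import serialization

with open("snowflake_key.pem", "rb") as f:
    key = serialization.load_pem_private_key(f.read(), password=None)

conn = snowflake.connector.connect(
    account="<account-identifier>",  # placeholder
    user="confluent",
    private_key=key.private_bytes(
        serialization.Encoding.DER,
        serialization.PrivateFormat.PKCS8,
        serialization.NoEncryption(),
    ),
    warehouse="<warehouse-name>",    # placeholder; the role needs USAGE on it
    database="PRODUCTION",
    schema="PUBLIC",
)

cur = conn.cursor()
# Replace <table-name> with the table the connector created for your topic.
cur.execute("SELECT RECORD_METADATA, RECORD_CONTENT FROM <table-name> LIMIT 10")
for row in cur.fetchall():
    print(row)
conn.close()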
Why Connect Confluent Cloud to Snowflake?
- By connecting Confluent Cloud to Snowflake, customers can ingest real-time data with event streaming and then transform, process, and analyze it in an intuitive cloud data platform.
- Confluent offers a truly cloud-native experience, completing Kafka with a comprehensive set of enterprise-grade features that unleash developer productivity, support operating effectively at scale, and satisfy architectural requirements before going into production. Combined with the Snowflake Data Cloud, businesses of all sizes can tap the full potential of their data.
- Thousands of companies use Snowflake to gain insights from all of their business users’ data and advance their businesses beyond what was previously possible.
- Snowflake gives organizations a single, integrated platform: a data warehouse built for the cloud; instant, secure, and governed access to their entire network of data; and a core architecture that supports a variety of data workloads, including modern data applications.
- Besides producing a modern and straightforward architecture, this Confluent Cloud to Snowflake integration drastically lowers infrastructure management effort and cost.
Conclusion
This article walks through connecting Confluent Cloud to Snowflake. Confluent Cloud helps businesses stream real-time data using the capabilities of Apache Kafka. Businesses can integrate that real-time data into a centralized repository like Snowflake for further analysis. Snowflake can then be connected to powerful BI tools to draw meaningful insights from real-time data and help organizations make better business decisions.
Hevo offers a No-code Data Pipeline that can automate your data transfer process, hence allowing you to focus on other aspects of your business like Analytics, Marketing, Customer Management, etc.
This platform allows you to transfer data from 150+ sources (including 40+ free sources), such as Confluent Cloud, to cloud-based data warehouses like Snowflake, Google BigQuery, etc. It will provide you with a hassle-free experience and make your work life much easier.
Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite firsthand. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.
Frequently Asked Questions
1. Can we connect Kafka to Snowflake?
Yes, Kafka can be connected to Snowflake using various methods to stream and process data.
2. Can Kafka for Snowflake subscribe to only one topic?
No. The Snowflake Kafka Connector is not limited to a single topic; it can subscribe to one or multiple topics.
3. Which connection methods can be used to connect to Snowflake?
a) Snowflake Kafka Connector
b) StreamSets Data Collector
c) Apache NiFi
Manjiri is a proficient technical writer and a data science enthusiast. She holds an M.Tech degree and leverages the knowledge acquired through that to write insightful content on AI, ML, and data engineering concepts. She enjoys breaking down the complex topics of data integration and other challenges in data engineering to help data professionals solve their everyday problems.