Confluent Cloud helps companies simplify their use of Apache Kafka through a user-friendly interface. It can also move data out of Kafka topics and save it to a centralized repository like Snowflake for in-depth data analysis.
Snowflake can then be connected to powerful BI tools like Tableau, Google Data Studio, and Power BI to gain meaningful insights into Confluent data, helping businesses make effective data-driven decisions. Companies can connect Confluent Cloud data with Snowflake using the Confluent Snowflake Sink Connector.
In this article, you will learn how to connect Confluent Cloud to Snowflake using the Confluent Snowflake Sink Connector.
What is Confluent Cloud?
Developed by the co-creators of Apache Kafka, Confluent Cloud is a fully managed Apache Kafka service with a user-friendly interface that simplifies processes such as transferring records from Kafka topics, managing cluster resources, and more. Because these processes otherwise require many commands and considerable time to execute, Confluent Cloud was built to deliver Apache Kafka's facilities in an automated way.
With Confluent Cloud, organizations can drive business value from their data without worrying about the mechanics behind processes like data transport and data integration between different services.
Key Features of Confluent Cloud
- Cluster Linking: Cluster linking in Confluent Cloud is a fully managed service for seamlessly moving data from one Confluent cluster to another. It creates perfect copies of Kafka topics and programmatically keeps data in sync across clusters.
- Self-balancing: The Confluent Cloud platform can run several brokers, manage many Kafka topics, and handle millions of messages per hour. If a broker is lost, for example due to network issues, partitions are automatically reassigned across the remaining brokers to rebalance the workload.
- Confluent Control Center: Confluent Cloud consists of a GUI-based application to manage and monitor Kafka, called Confluent Control Center. It helps businesses to manage Kafka Connect and easily create, edit, and manage connections to other systems. With Confluent Control Center, companies can monitor data streams from producers and consumers, ensure every message is delivered, and measure how long it takes to deliver messages.
What is Snowflake?
Founded in 2012, Snowflake is a fully managed data warehouse that can be hosted on major cloud platforms such as Amazon Web Services, Microsoft Azure, and Google Cloud. It is a single platform that offers a wide range of services such as data engineering, data warehousing, and data lakes.
Snowflake's flexible pricing model lets businesses pay independently for compute and storage. It also allows companies to leverage on-demand storage, paying month to month for what they use.
Key Features of Snowflake
- Connectors and Drivers: Snowflake offers a wide range of connectors and drivers, including Python and Spark connectors and drivers such as Go Snowflake, .NET, JDBC, Node.js, ODBC, and PHP.
- Result Caching: Snowflake has a unique feature that can cache results at different levels. Cached results persist for 24 hours after a query is executed, so if the same query is run again within that window, the results are delivered almost instantly.
- Data Sharing: The data sharing feature in Snowflake allows businesses to share database objects from one account to another without duplicating the data. This saves storage with less computation and results in faster data access.
Why Connect Confluent Cloud to Snowflake?
By connecting Confluent Cloud to Snowflake, customers can ingest real-time data with event streaming, then transform, process, and analyze it in an intuitive cloud data platform.
Confluent offers a truly cloud-native experience that unleashes developer productivity, operates effectively at scale, and satisfies your architectural requirements before you go into production. It does this by completing Kafka with a comprehensive set of enterprise-grade features. With Confluent Cloud feeding the Snowflake Cloud Data Platform, businesses of all sizes can access the full potential of their data.
Thousands of companies use Snowflake to advance their businesses beyond what was previously possible by making insights from all of their data available to all of their business users.
Snowflake gives organizations a single, integrated platform: a data warehouse designed for the cloud; instant, secure, and governed access to their entire network of data; and a core architecture that supports a variety of data workloads, including building contemporary data applications.
In addition to producing a modern and straightforward architecture, this integration of Confluent Cloud with Snowflake drastically lowers infrastructure management effort and cost.
If yours is anything like the 1000+ data-driven companies that use Hevo, more than 70% of the business apps you use are SaaS applications. Integrating the data from these sources in a timely way is crucial to fuel analytics and the decisions taken from it. But given how fast API endpoints and the like can change, creating and managing these pipelines can be a soul-sucking exercise.
Hevo’s no-code data pipeline platform lets you connect over 150+ sources in a matter of minutes to deliver data in near real-time to your warehouse. What’s more, the in-built transformation capabilities and the intuitive UI means even non-engineers can set up pipelines and achieve analytics-ready data in minutes.
Take our 14-day free trial to experience a better way to manage data pipelines.
Get started for Free with Hevo!
How to Connect Confluent Cloud to Snowflake?
You can use the Snowflake Sink Connector to connect Confluent Cloud to Snowflake. With Confluent Cloud, businesses can automate clusters and Kafka topics for storing real-time data. The connector can ingest events from Confluent Cloud to Snowflake.
Method 1: Using Hevo to Set Up Confluent Cloud to Snowflake
Hevo provides an Automated No-code Data Pipeline that helps you move your Confluent Cloud data to Snowflake. Hevo is fully managed and completely automates the process of not only loading data from your 150+ data sources (including 40+ free sources) but also enriching the data and transforming it into an analysis-ready form without having to write a single line of code. Its fault-tolerant architecture ensures that the data is handled in a secure, consistent manner with zero data loss.
Using Hevo, you can connect Confluent Cloud to Snowflake in the following 2 steps:
- Step 1: Configure Confluent Cloud as the Source in your Pipeline by following these steps:
- Step 1.1: In the Asset Palette, select PIPELINES.
- Step 1.2: In the Pipelines List View, click + CREATE.
- Step 1.3: Select “Confluent Cloud” on the Select Source Type page.
- Step 1.4: Enter the following information on the Configure your Confluent Cloud Source page:
- Pipeline Name: A distinct, 255-character-maximum name for the Pipeline.
- Bootstrap Server(s): The server(s) used for bootstrapping, taken from Kafka Confluent Cloud.
- API Key: The API key that was retrieved from your Confluent Cloud account.
- API Secret: Your Kafka Confluent Cloud account’s API secret.
- Step 1.5: Simply press TEST & CONTINUE.
- Step 1.6: After that, configure the data ingestion and set up the Destination.
- Step 2: To set up Snowflake as a destination in Hevo, follow these steps:
- Step 2.1: In the Asset Palette, select DESTINATIONS.
- Step 2.2: In the Destinations List View, click + CREATE.
- Step 2.3: Select Snowflake from the Add Destination page.
- Step 2.4: Set the following parameters on the Configure your Snowflake Destination page:
- Destination Name: A unique name for your Destination.
- Snowflake Account URL: This is the account URL that you retrieved.
- Database User: The Hevo user that you created in the database. In the Snowflake database, this user has a non-administrative role.
- Database Password: The password of the user.
- Database Name: The name of the Destination database where data will be loaded.
- Database Schema: The name of the Destination database schema. Default value: public.
- Warehouse: The Snowflake warehouse associated with your database, where SQL queries and DML operations are performed.
- Step 2.5: Click Test Connection to test connectivity with the Snowflake warehouse.
- Step 2.6: Once the test is successful, click SAVE DESTINATION.
Using manual scripts and custom code to move data into the warehouse is cumbersome. Changing API endpoints and limits, ad-hoc data preparation, and inconsistent schema makes maintaining such a system a nightmare. Hevo’s reliable no-code data pipeline platform enables you to set up zero-maintenance data pipelines that just work.
- Wide Range of Connectors: Instantly connect and read data from 150+ sources including SaaS apps and databases, and precisely control pipeline schedules down to the minute.
- In-built Transformations: Format your data on the fly with Hevo’s preload transformations using either the drag-and-drop interface or our nifty python interface. Generate analysis-ready data in your warehouse using Hevo’s Postload Transformation.
- Near Real-Time Replication: Get access to near real-time replication for all database sources with log-based replication. For SaaS applications, near real-time replication is subject to API limits.
- Auto-Schema Management: Correcting improper schema after the data is loaded into your warehouse is challenging. Hevo automatically maps source schema with the destination warehouse so that you don’t face the pain of schema errors.
- Transparent Pricing: Say goodbye to complex and hidden pricing models. Hevo’s Transparent Pricing brings complete visibility to your ELT spending. Choose a plan based on your business needs. Stay in control with spend alerts and configurable credit limits for unforeseen spikes in the data flow.
- 24×7 Customer Support: With Hevo you get more than just a platform, you get a partner for your pipelines. Discover peace with round-the-clock “Live Chat” within the platform. What’s more, you get 24×7 support even during the 14-day free trial.
- Security: Discover peace with end-to-end encryption and compliance with all major security certifications including HIPAA, GDPR, and SOC-2.
Try Hevo Today!
SIGN UP HERE FOR A 14-DAY FREE TRIAL
Method 2: Using Custom Code to Move Data from Confluent Cloud to Snowflake
Follow the steps below to connect Confluent Cloud to Snowflake.
Generate the Snowflake Key Pair
Before the connector can connect Confluent Cloud to Snowflake, you must generate a key pair. Snowflake authentication requires an RSA key pair of at least 2048 bits. Add the public key to the Snowflake user account and the private key to the connector configuration.
Create the Key Pair
Complete the below steps to generate the key pair.
- Generate the private key and derive the public key from it using the below commands.
openssl genrsa -out snowflake_key.pem 2048
openssl rsa -in snowflake_key.pem -pubout -out snowflake_key.pub
- View the list of the generated Snowflake key files using the below command
ls -l snowflake_key*
-rw-r--r-- 1 1679 Jun 8 17:04 snowflake_key.pem
-rw-r--r-- 1 451 Jun 8 17:05 snowflake_key.pub
- You can view the public key file by running the below command.
cat snowflake_key.pub
-----BEGIN PUBLIC KEY-----
...
-----END PUBLIC KEY-----
- Copy the key and add it to the new user in the Snowflake account. Copy only the part between the -----BEGIN PUBLIC KEY----- and -----END PUBLIC KEY----- lines, using the below command.
grep -v "BEGIN PUBLIC" snowflake_key.pub | grep -v "END PUBLIC"|tr -d '\r\n'
Create the User and Add the Public Key
Open your Snowflake project and follow the steps below to create a user account and add the public key to it.
- Go to the Worksheets panel and then switch to the SECURITYADMIN role.
- Run the below query in Worksheets to create a user and add the public key copied earlier.
CREATE USER confluent RSA_PUBLIC_KEY='<public-key>';
- Ensure that you add the public key as a single line in the statement.
- Next, set the correct privileges for the user you have added.
For example, suppose you want to send Apache Kafka records to a PRODUCTION database using the schema PUBLIC. The below queries configure the necessary user privileges for this.
// Use a role that can create and manage roles and privileges:
use role securityadmin;
// Create a Snowflake role with the privileges to work with the connector
create role kafka_connector_role;
// Grant privileges on the database:
grant usage on database PRODUCTION to role kafka_connector_role;
// Grant privileges on the schema:
grant usage on schema PRODUCTION.PUBLIC to role kafka_connector_role;
grant create table on schema PRODUCTION.PUBLIC to role kafka_connector_role;
grant create stage on schema PRODUCTION.PUBLIC to role kafka_connector_role;
grant create pipe on schema PRODUCTION.PUBLIC to role kafka_connector_role;
// Grant the custom role to an existing user:
grant role kafka_connector_role to user confluent;
// Make the new role the default role:
alter user confluent set default_role=kafka_connector_role;
- Add the private key to the connector configuration. Extract the key and keep it in a safe place until you set up the connector.
- Get the list of generated Snowflake key files with the below command.
ls -l snowflake_key*
-rw-r--r-- 1 1679 Jun 8 17:04 snowflake_key.pem
-rw-r--r-- 1 451 Jun 8 17:05 snowflake_key.pub
- View the private key file using the below command.
cat snowflake_key.pem
-----BEGIN RSA PRIVATE KEY-----
...
-----END RSA PRIVATE KEY-----
- Copy the key and add it to the connector configuration. Copy only the part of the key between the -----BEGIN RSA PRIVATE KEY----- and -----END RSA PRIVATE KEY----- lines. Use the below command to do it.
grep -v "BEGIN RSA PRIVATE KEY" snowflake_key.pem | grep -v "END RSA PRIVATE KEY" | tr -d '\r\n'
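Assuming OpenSSL is installed, the key-handling steps above can be sketched end to end as follows (file names match the ones used in this section):

```shell
# 1. Generate a 2048-bit RSA private key and derive its public key.
openssl genrsa -out snowflake_key.pem 2048
openssl rsa -in snowflake_key.pem -pubout -out snowflake_key.pub

# 2. Flatten each key to a single base64 line: the public key is pasted
#    into the CREATE USER statement in Snowflake, and the private key
#    into the connector configuration.
PUBLIC_KEY=$(grep -v "PUBLIC KEY" snowflake_key.pub | tr -d '\r\n')
PRIVATE_KEY=$(grep -v "PRIVATE KEY" snowflake_key.pem | tr -d '\r\n')

# Print only the public key; keep the private key out of logs.
echo "$PUBLIC_KEY"
```

Stripping on the substrings "PUBLIC KEY" and "PRIVATE KEY" removes both the BEGIN and END header lines in one pass, regardless of whether your OpenSSL version emits PKCS#1 ("RSA PRIVATE KEY") or PKCS#8 ("PRIVATE KEY") headers.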
Using the Confluent Cloud Console
- Launch your Confluent Cloud cluster by following the Quick Start for Confluent Cloud.
- In the left navigation menu, click Data integration, then Connectors. If you already have connectors in your cluster, click Add connector.
- Select your connector by clicking on the Snowflake Sink Connector Card.
- If you have populated Kafka topics, select the topics you want to connect from the topics list, or create a new topic by clicking Add new topic.
- After the connector runs, verify that the messages are loaded into the Snowflake database table.
Using the Confluent Command Line Interface
Follow the below steps to use the Snowflake Sink connector with the Confluent Command Line Interface (CLI) to connect Confluent Cloud to Snowflake. It is assumed that you have installed Confluent CLI version 2.
- Enter the below command to list the available connectors.
confluent connect plugin list
- Use the below command to see the required connector configuration properties.
confluent connect plugin describe <connector-catalog-name>
For example:
confluent connect plugin describe SnowflakeSink
- Create the JSON connector configuration file that contains the required configuration properties. For example, topics are specified as follows.
"topics": "<topic1>, <topic2>",
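A complete configuration file typically includes the properties shown below. Treat this as an illustrative sketch: the property names follow Confluent's managed SnowflakeSink connector, but you should verify the exact required properties against the `describe` output above.

```json
{
  "connector.class": "SnowflakeSink",
  "name": "snowflake-sink-connector",
  "kafka.api.key": "<kafka-api-key>",
  "kafka.api.secret": "<kafka-api-secret>",
  "topics": "<topic1>, <topic2>",
  "input.data.format": "JSON",
  "snowflake.url.name": "https://<account>.snowflakecomputing.com",
  "snowflake.user.name": "confluent",
  "snowflake.private.key": "<private-key-as-a-single-line>",
  "snowflake.database.name": "PRODUCTION",
  "snowflake.schema.name": "PUBLIC",
  "tasks.max": "1"
}
```

The `snowflake.private.key` value is the single-line private key you extracted earlier, and `snowflake.user.name` is the user you created with the matching public key.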
- Use the below command to load the configuration and start the connector.
confluent connect create --config <file-name>.json
confluent connect create --config snowflake-sink.json
Created connector confluent-snowflake lcc-ix4dl
- Check the connector status using the below command.
confluent connect list
- After the connector is running, check that the records are populated in your Snowflake database.
This article talked about connecting Confluent Cloud to Snowflake. Confluent Cloud helps businesses stream real-time data using the functionalities of Apache Kafka. Businesses can integrate that real-time data into a centralized repository like Snowflake for further analysis. Snowflake can then be connected with powerful BI tools to gain meaningful insights from real-time data and help organizations make better business decisions.
Visit our Website to Explore Hevo
Hevo offers a No-code Data Pipeline that can automate your data transfer process, hence allowing you to focus on other aspects of your business like Analytics, Marketing, Customer Management, etc.
This platform allows you to transfer data from 150+ sources (including 40+ Free Sources) such as Confluent Cloud and Cloud-based Data Warehouses like Snowflake, Google BigQuery, etc. It will provide you with a hassle-free experience and make your work life much easier.
Want to take Hevo for a spin?
Sign Up for a 14-day free trial and experience the feature-rich Hevo suite firsthand. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.