Most teams start with MySQL for everything, including analytics. At first, it works fine. But performance slows to a crawl as your data grows and your queries become more complex.

Dashboards lag, and heavy reports take forever to run. You notice that your database is doing far more work than it was ever designed for.

The real problem isn’t MySQL itself; it’s that operational databases were never designed to handle heavy analytics at scale. As you add more data and more tools, the strain only increases. Performance tuning becomes a chore, and even basic analyses can start impacting application performance.

That’s why many teams begin to move their data into dedicated analytics platforms like Snowflake and other modern data warehouses. 

Of course, getting data from MySQL into a warehouse isn’t always easy. You need a reliable way to move large volumes of data and keep everything in sync over time. 

Read on as we walk you through three practical methods to get your MySQL data into Snowflake: an automated no-code workflow, Snowflake’s own native connector, and a conventional manual export-and-load process.

What is MySQL?


MySQL is an open-source relational database that has been the foundation of web applications for over two decades. It powers everything from small WordPress sites to massive SaaS platforms, and it remains one of the most widely deployed databases in the world. That is largely because it is fast and reliable, and also deeply integrated into virtually every programming language and framework.

That said, MySQL was designed with transactional (OLTP) workloads in mind. That translates to quick reads and writes against relatively small result sets. MySQL begins to show its limits once you start running heavy analytical queries, joining dozens of tables across millions of rows, or serving concurrent BI dashboards. This is usually the point where a dedicated analytical warehouse like Snowflake is worth considering.

Key features worth noting

  • ACID-compliant InnoDB engine: Guarantees data consistency even under heavy concurrent writes (the reason why MySQL is trusted for financial and session-critical applications)
  • Binary log (binlog) replication: Captures every change at the row level. This is the same mechanism that Change Data Capture (CDC) tools (including the ones discussed later in this guide) rely on to stream incremental updates out of MySQL
  • Broad ecosystem and managed-service availability: MySQL runs natively on AWS (RDS, Aurora), Google Cloud SQL and Azure Database for MySQL. That means your source database is likely already cloud-hosted, which simplifies connectivity when setting up a pipeline to Snowflake

What is Snowflake?


Snowflake is a cloud-native data warehouse built from scratch to separate storage from compute. In practical terms, this means you can scale query processing power up or down independently of how much data you store, and you only pay for what you use. That architecture is completely unlike traditional databases (MySQL included), where storage and compute are tied to the same server.

Snowflake’s ability to handle mixed workloads without breaking a sweat makes it a strong fit for teams migrating off MySQL. You can run complex analytical joins on billions of rows, serve concurrent BI dashboards, and share live datasets with external partners. And you can do all of that from the same platform, without managing indexes or vacuum operations.

Key features worth noting

  • Independent scaling of storage and compute: You can create a dedicated warehouse for a heavy ETL job and shut it down the moment it finishes, without affecting any other workload running on the same data
  • Zero-copy data sharing: Snowflake lets you share live, governed datasets across accounts and organizations without physically copying data. This is great for teams that need to collaborate with external partners or business units
  • Near-zero maintenance: There is no indexing, no vacuuming, and no manual partitioning. Snowflake handles clustering and infrastructure management automatically
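
The independent-scaling point can be illustrated with a short Snowflake SQL sketch. The warehouse name and sizing here are placeholder assumptions, not values from this guide:

```sql
-- Spin up a dedicated warehouse sized for a heavy ETL job
CREATE WAREHOUSE IF NOT EXISTS etl_wh
  WITH WAREHOUSE_SIZE = 'LARGE'
       AUTO_SUSPEND = 60      -- suspend after 60 idle seconds to stop billing
       AUTO_RESUME = TRUE;

-- Point the session at it and run the heavy queries
USE WAREHOUSE etl_wh;
-- (ETL queries go here)

-- Drop it the moment the job finishes; storage and every other
-- warehouse querying the same data are unaffected
DROP WAREHOUSE IF EXISTS etl_wh;
```

Because compute is decoupled from storage, this warehouse never competes with the warehouses serving your dashboards.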

Prerequisites to Integrate MySQL to Snowflake

Before you start any migration, make sure the following are in place. Skipping these steps is the most common reason pipelines fail on the first attempt.

On the MySQL side

  • MySQL 8.0 or later is a must if you plan to use CDC-based replication (Methods 1 and 2). Earlier versions lack the binlog metadata that connectors depend on.
  • Binary logging must be enabled with these settings: binlog_format = ROW, binlog_row_image = FULL, and binlog_row_metadata = FULL
  • A dedicated replication user with REPLICATION SLAVE, REPLICATION CLIENT and SELECT privileges. Avoid using a root account for pipeline access
  • Network access. If MySQL sits behind a Virtual Private Cloud (VPC) or firewall, you will need to whitelist the IP ranges of your ETL tool or Docker agent, or configure an SSH tunnel
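
On MySQL 8.0+, the settings above can be applied with a few statements. This is a sketch: the user name and password are placeholders, and the binary log itself (log_bin) must already be enabled in your server configuration (it is on by default in 8.0):

```sql
-- Confirm binary logging is on (log_bin should be ON)
SHOW VARIABLES LIKE 'log_bin';

-- Apply the CDC-friendly binlog settings, persisted across restarts
SET PERSIST binlog_format = 'ROW';
SET PERSIST binlog_row_image = 'FULL';
SET PERSIST binlog_row_metadata = 'FULL';

-- Dedicated replication user with only the privileges a pipeline needs
-- (REPLICATION SLAVE/CLIENT are global, so they require ON *.*)
CREATE USER 'replicator'@'%' IDENTIFIED BY 'choose_a_strong_password';
GRANT REPLICATION SLAVE, REPLICATION CLIENT, SELECT ON *.* TO 'replicator'@'%';
```

If your security policy allows it, narrow the SELECT grant to just the databases you plan to replicate.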

On the Snowflake side

  • An active Snowflake account with a warehouse, database and schema ready to receive data.
  • A user with appropriate privileges (at minimum: USAGE on the warehouse, CREATE TABLE and CREATE STAGE on the target schema). For the native connector, ACCOUNTADMIN is needed during installation
  • Cloud storage (for Method 3 only). If you’re doing a manual export, you’ll need an S3 bucket, Google Cloud Storage (GCS) bucket or Azure Blob container to stage your CSV files before loading them into Snowflake

Beyond these, we also recommend checking a few real-world requirements up front:

  • Binlog retention: CDC pipelines fail if binlogs expire before they are read. Set binlog_expire_logs_seconds high enough to cover any pipeline downtime
  • Server ID: Needed for replication-based connectors
  • Primary keys: Many CDC tools require tables to have primary keys for proper updates or deletes
  • Time zone consistency: MySQL and Snowflake timezone mismatches can cause timestamp issues
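
These requirements can be verified with a few queries before you wire anything up. A sketch, assuming MySQL 8.0+; adjust the schema filter to your own databases:

```sql
-- Binlog retention in seconds (e.g. 604800 = 7 days is a common safety margin)
SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';

-- Server ID must be non-zero and unique among replicas and connectors
SHOW VARIABLES LIKE 'server_id';

-- List user tables that lack a primary key and may not replicate cleanly
SELECT t.table_schema, t.table_name
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints c
       ON c.table_schema = t.table_schema
      AND c.table_name = t.table_name
      AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ('mysql', 'sys', 'information_schema', 'performance_schema')
  AND c.constraint_name IS NULL;
```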

How to Do a MySQL-Snowflake Migration: Three Methods Broken Down

There are three well-established ways to move data from MySQL to Snowflake. The right choice will depend on whether you need a one-time transfer or ongoing replication and how much infrastructure you’re willing to manage. You should also factor in whether you need transformation abilities along the way.

Here’s an at-a-glance side-by-side comparison before we dive into each method:

|  | Hevo (Automated ETL) | Snowflake Connector / Openflow | Manual Export and Load |
| --- | --- | --- | --- |
| Setup complexity | Low; no-code UI | Medium; Docker agent combined with Snowsight config | High; scripting, staging, SQL involved |
| Ongoing replication | Yes; built-in CDC | Yes; binlog-based CDC | No; one-time unless scripted |
| Schema change handling | Automatic | Limited | Fully manual |
| Transformation support | Built-in transformations | Raw replication only | Custom scripts needed |
| Monitoring and alerts | Built-in dashboard | Event table logs | DIY |
| Best for | Continuous pipelines at any scale | Snowflake-native CDC with some infrastructure comfort | One-time loads or small datasets |

Method 1: Using Hevo (Automated MySQL to Snowflake Migration)

Hevo is a fully managed, no-code data pipeline platform that handles both the initial historical load and ongoing incremental replication from MySQL to Snowflake.

Hevo uses log-based CDC under the hood, automatically detects schema changes, retries failed loads, and gives you a monitoring dashboard out of the box. Hevo is the fastest route if you want a production-grade pipeline without writing or maintaining custom code.

When to use this method

Pick the Hevo route when you need continuous, real-time replication rather than a one-off transfer. It’s a strong fit for teams that don’t have dedicated data engineers to build and babysit custom pipelines, or for organizations that want observability and alerting built into the pipeline from day one.

Step-by-step walkthrough

Step 1: Create a new pipeline

Log in to Hevo, navigate to Pipelines, and click + Create Pipeline. 

Select MySQL as the source and Snowflake as the destination.

Step 2: Choose the pipeline mode

Select BinLog as the ingestion mode. Doing so enables CDC-based replication, which captures every INSERT, UPDATE, and DELETE from MySQL’s binary log. 

Pick Standard Pipeline and click Continue.

Step 3: Configure MySQL as the source

Enter your MySQL host, port, database user, password, and the database names you want to replicate. 

If the database is inside a private network, enable the SSH tunnel option. For encrypted connections, turn on SSL. Click Test Connection. Once done, click Save & Continue to validate.

Step 4: Configure Snowflake as the destination

Provide your Snowflake account identifier, warehouse, database, schema and authentication details. 

Click Test Connection, then Save & Continue.

Step 5: Select tables and go live

Pick the MySQL tables you want to sync. Hevo will run an initial full load and then switch to continuous CDC.

Once enabled, it handles incremental replication, schema evolution, fault-tolerant retries and pipeline health monitoring automatically.

Method 2: Using Snowflake’s Native Connector for MySQL

Snowflake offers its own connector for MySQL that uses CDC to replicate data. Keep in mind that the original “Snowflake Connector for MySQL” is a preview feature that Snowflake has confirmed will not move to general availability. 

The replacement is the Openflow Connector for MySQL, which delivers better performance and more flexible deployment options. Both work on similar principles, so the steps below apply broadly.

This method is a good fit if you want Snowflake-native workflow and your team is comfortable deploying and monitoring a Docker-based agent within your network.

Prerequisites exclusive to this method

  • MySQL 8.0+ with binary logging configured as described in the Prerequisites section above.
  • Docker (or Kubernetes, OpenShift) installed on a machine that can reach both your MySQL server and Snowflake
  • ACCOUNTADMIN role in Snowflake for the initial installation

Step-by-step walkthrough

Step 1: Install the connector from Snowflake Marketplace

In Snowsight, go to Data Products, then Marketplace, search for “Snowflake Connector for MySQL” (or “Openflow Connector for MySQL” if available in your region), and click Get.

Select an installation warehouse and confirm.

Step 2: Create an event table for monitoring

Snowflake uses an event table to log connector activity. Run the following in a worksheet:

CREATE EVENT TABLE IF NOT EXISTS mysql_connector_events;
ALTER ACCOUNT SET EVENT_TABLE = mysql_connector_events;
Step 3: Deploy the agent

Pull the Docker image, create a configuration file with your MySQL and Snowflake credentials, and start the container. 

The agent continuously reads MySQL binlogs and pushes changes into Snowflake.

docker pull snowflakedb/mysql-cdc-agent:latest
docker run -d \
  -v $(pwd)/agent-config.json:/config/agent-config.json \
  snowflakedb/mysql-cdc-agent:latest
Step 4: Configure replication in Snowsight

Open the installed connector app and set up a source connection pointing to your MySQL instance. 

Once done, select the tables to replicate, and choose Initial Load + CDC as the replication mode.

Step 5: Start replication and validate

Once you enable the pipeline, the connector snapshots the selected tables and then streams ongoing changes via Snowpipe Streaming. 

Verify your data by querying target tables and checking the event table for health status.

Limitations you need to be aware of

  • The legacy connector is a preview feature with no guaranteed bug-fix support going forward. Snowflake recommends migrating to the Openflow Connector.
  • This method demands Docker infrastructure and agent management within your network.
  • It only supports raw replication; there is no built-in transformation layer. If you need to reshape data before it lands in Snowflake, you’ll need a separate tool or dbt models downstream.
  • Only tables with primary keys can be replicated. Geometry-type columns are not supported.
  • This method is a lot less flexible than managed ETL tools like Hevo for complex workflows.

Method 3: Manual Export and Load from MySQL to Snowflake

The manual approach comes down to three stages:

  • export your MySQL data into flat files (typically CSV)
  • upload those files to a cloud storage location, and then 
  • use Snowflake’s COPY INTO command to load them. 

It’s not exactly complicated but quite labour-intensive, and it’s best reserved for one-time migrations or small datasets where setting up a full workflow would be overkill.

Step 1: Export data from MySQL

You have two common options. The first is a SQL-based export using SELECT INTO OUTFILE:

SELECT *
INTO OUTFILE '/tmp/customers.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM customers;

The second option is mysqldump with the --tab flag, which produces tab-delimited or CSV files:

mysqldump \
  --user=mysql_user --password \
  --tab=/tmp \
  --fields-terminated-by=',' \
  --fields-enclosed-by='"' \
  your_database customers

For large tables, compress the output with gzip before uploading. This significantly reduces transfer time, and Snowflake can decompress gzipped files natively during load.

Step 2: Upload files to cloud storage

Snowflake loads data from staged locations, so you need to push your files to S3, GCS, or Azure Blob Storage. 

For example, using the AWS CLI:

aws s3 cp /tmp/customers.csv.gz s3://my-bucket/mysql-export/
Step 3: Create a stage in Snowflake 

Point a Snowflake external stage at your cloud storage location:

CREATE OR REPLACE STAGE mysql_stage
  URL='s3://my-bucket/mysql-export/'
  CREDENTIALS=(AWS_KEY_ID='xxx' AWS_SECRET_KEY='yyy')
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);
Step 4: Create target tables in Snowflake

Define your Snowflake tables with compatible data types. 

Pay close attention to differences between MySQL and Snowflake — DATETIME maps to TIMESTAMP_NTZ, TINYINT(1) used as a boolean maps to BOOLEAN, and JSON columns should use Snowflake’s VARIANT type.

CREATE OR REPLACE TABLE customers (
  id INTEGER,
  name STRING,
  email STRING,
  created_at TIMESTAMP_NTZ
);
Step 5: Load data into Snowflake and validate 

Use COPY INTO to ingest the staged files:

COPY INTO customers
FROM @mysql_stage/customers.csv.gz
FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);

After loading, run validation checks. 

  • Compare row counts between MySQL and Snowflake
  • Spot-check NULL handling
  • Verify timestamp and timezone correctness, and
  • Look out for character encoding issues (especially if your MySQL database stores UTF-8 multi-byte characters).
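
A minimal validation pass might look like the queries below, using the customers table from the earlier steps. Run each query on both systems and compare the results:

```sql
-- 1. Row counts: run in MySQL, then in Snowflake; the numbers should match
SELECT COUNT(*) AS total_rows FROM customers;

-- 2. NULL handling: compare NULL counts per nullable column on both sides
SELECT COUNT(*) AS null_emails FROM customers WHERE email IS NULL;

-- 3. Timestamp sanity: compare min/max values to catch timezone shifts
SELECT MIN(created_at) AS first_row, MAX(created_at) AS last_row FROM customers;
```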

Limitations of the manual method

  • There is no built-in CDC or incremental updates. You have to re-export and reload manually if your source data changes
  • Schema changes require you to manually update your Snowflake DDL and re-run the pipeline.
  • This method is error-prone at scale; large exports can fail without warning, and there’s no built-in retry or monitoring.
  • It demands significant operational effort if you need to run it regularly. What starts as a quick one-time load often turns into a fragile cron job that nobody wants to maintain.

Why Move MySQL Data to Snowflake?

If you’ve read this far, you probably already have a reason. But if you’re still contemplating the move, here are some of the most common drivers:

  1. Scalability without compromise: MySQL scales vertically; you add more CPU and RAM to the same server. Snowflake scales horizontally and elastically. You can create additional compute clusters in seconds and shut them down without affecting your production queries.
  2. Purpose-built analytics: Snowflake is designed from the ground up for analytical workloads like large aggregations and semi-structured data parsing, and it handles these without any performance impact on your source system.
  3. Cost efficiency at scale: With Snowflake’s pay-per-second billing and separate storage/compute pricing, you stop paying the moment a query finishes. If you have bursty or unpredictable workloads, this model is much cheaper than keeping a beefy MySQL read replica running around the clock.
  4. Easier collaboration and data sharing: Snowflake’s secure data sharing lets you expose live datasets to other teams or external partners without copying data. This kind of collaboration is difficult to replicate in MySQL without additional infrastructure.
  5. Reduced operational overhead: Snowflake handles provisioning, patching, backups and performance tuning automatically. 

The same approaches apply to managed instances: you can also move data from GCP MySQL to Snowflake within minutes.

Conclusion

Moving data from MySQL to Snowflake isn’t hard in theory, but it’s the maintenance that gets you. After too many broken scripts and failed workflows, you eventually realize that your team is spending more time babysitting data movement than actually using the data.

That’s the problem Hevo is built to eliminate.

Hevo gives you a fully managed, no-code data pipeline that replicates MySQL data into Snowflake in minutes. It uses log-based CDC via MySQL binlogs to stream updates in near real time, so your warehouse stays fresh without putting extra load on your production database.

But the real value shows up after the pipeline is running.

Hevo automatically detects and adapts to schema changes, so new columns or tables don’t break your pipeline. It auto-scales as data volume grows and provides live monitoring and alerts so you always know what’s happening with your data.

You also get built-in transformation capabilities, integrations with 150+ data sources, enterprise-grade security and full visibility into pipeline jobs, without writing or maintaining any extraction code.

Hevo’s free trial lets you set up a production-ready MySQL-to-Snowflake pipeline in under 15 minutes. 

FAQs

How do I transfer data from MySQL to Snowflake?

1. Export your MySQL data as CSV files (using SELECT INTO OUTFILE or mysqldump)
2. Upload them to cloud storage
3. Create a Snowflake stage pointing to that storage, and 
4. Use the COPY INTO command to load the data. 

For ongoing replication, use a CDC-based tool like Hevo or the Snowflake Connector for MySQL instead.

Can I replicate MySQL to Snowflake in real time?

Yes. Both Hevo and Snowflake’s native connectors use binlog-based Change Data Capture to stream inserts, updates and deletes from MySQL to Snowflake in near real time. The manual CSV method does not support real-time replication.

Does Snowflake use MySQL under the hood?

No. Snowflake is a completely independent, cloud-native data warehouse with its own SQL engine and storage layer. It is not built on MySQL, PostgreSQL or any other existing database.

What is the Openflow Connector for MySQL?

Openflow is Snowflake’s next-generation replacement for the original Snowflake Connector for MySQL. It offers improved performance and enhanced customization along with better deployment flexibility. The original connector remains available as a preview feature but will not be promoted to general availability.

How do I handle MySQL-to-Snowflake data type differences?

Some common data type mappings between the two include:
1. MySQL DATETIME → Snowflake TIMESTAMP_NTZ
2. MySQL JSON → Snowflake VARIANT
3. MySQL TINYINT(1) → Snowflake BOOLEAN. 

Automated tools like Hevo handle these mappings for you; with the manual method, you need to define compatible schemas yourself.

                Amit Gupta
                Vice President of Engineering, Hevo Data

                Amit Gupta is the VP of Engineering at Hevo Data and a deeply hands-on leader with over 17 years of experience building products and teams from the ground up. He has led organizations of 60+ engineers and brings strong expertise across backend, frontend, big data, DevOps, and cloud technologies. At Hevo, he focuses on solving complex scalability and system design challenges, ensuring the platform reliably powers data movement at enterprise scale.