Are you looking for a step-by-step approach to set up MySQL BinLog Based Replication? This article gives you a brief overview of MySQL BinLog Based Replication and binary log files. You will also learn how to set up MySQL BinLog Based Replication using the methods discussed below.

Moreover, the limitations of the manual method will also be discussed in further sections. Along the way, we will also take a look at what MySQL BinLog Based Replication is and why you need it.

Methods to Set up MySQL BinLog Based Replication

This article delves into both the Hevo-based and the manual method in depth. You will also see some of the pros and cons of these approaches and will be able to pick the best method for your use case. The two primary methods are summarized below:


Method 1: MySQL BinLog Based Replication Using Hevo

Hevo can help you set up MySQL BinLog Based Replication instantly through a point-and-click interface. It is a fully managed data pipeline platform that supports data integration from 150+ data sources (including 40+ free sources) such as MySQL and MongoDB, and lets you load data directly to a Data Warehouse such as Snowflake, Amazon Redshift, or Google BigQuery, or to the destination of your choice.

Get Started with Hevo for Free

Method 2: Setting up MySQL BinLog Based Replication Manually

This method works based on data update events. These events are captured in the binary log file created on the master server. The critical aspect of this method is that the log files, the positions within them, and the data update events need to be transferred between the master and slave in real time.

Method 1: MySQL BinLog Based Replication Using Hevo


Hevo is a completely self-serve, fully managed, automated data pipeline that can help you implement MySQL BinLog Based Replication without having to write any code. Hevo’s point and click interface ensures the lowest time to production possible.

Sign up here for a 14-Day Free Trial!

Here’s how simple it is to set up MySQL BinLog Based Replication with Hevo:

  • Authenticate and connect your MySQL data source
  • Select the replication mode as “BinLog based replication”
  • Point to the destination data warehouse or database where you want to move data

Learn more about configuring MySQL from our documentation here.

Hevo supports MySQL change data capture out of the box and will ensure that your data is loaded from MySQL to your target database or data warehouse in real-time. 

Here are more reasons to try Hevo:

  • Architecture that offers zero-data loss: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.

Experience the power and simplicity of implementing MySQL Binlog Based Replication by signing up for a 14-day free trial with Hevo.

Method 2: Setting up MySQL BinLog Based Replication Manually

You can follow the steps below to manually set up MySQL BinLog Based Replication:

Step 1: Enable Binary Logging and Configure Master

This step enables binary logging and configures a unique server ID on the master. This is accomplished using the log-bin and server-id options in the [mysqld] section of the configuration file. Each server in the replication topology, whether master or slave, needs to be configured with a unique server ID.

  1. You will need to shut down the MySQL server to edit the my.cnf or my.ini file.
  2. Add the log-bin and server-id options to enable binary logging:
    [mysqld]
    log-bin=mysql-bin
    server-id=1
  3. Once the changes are done to the configuration file, you will need to restart the server.
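
Once the server is back up, you can quickly confirm that binary logging is active. This is only a sanity check; the exact log file names listed will differ on your setup:

mysql> SHOW VARIABLES LIKE 'log_bin';
mysql> SHOW BINARY LOGS;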

Step 2: User Creation for Replication

You will need to define a replication user account on the master server and grant it the REPLICATION SLAVE privilege.

  1. To create a new user account, use the CREATE USER command:
    mysql> CREATE USER 'username'@'%.domain.com' IDENTIFIED BY 'password';
  2. Grant the REPLICATION SLAVE privilege to the new or existing account:
    mysql> GRANT REPLICATION SLAVE ON *.* TO 'username'@'%.domain.com';
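
As a quick verification, you can review the privileges assigned to the replication account; the account name here follows the example above.

mysql> SHOW GRANTS FOR 'username'@'%.domain.com';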

Step 3: Getting Binary Log Coordinates from the Replication Master

Before starting the replication process, you need to know the master’s current coordinates within its binary log in order to configure the slave.

To find the master’s binary log coordinates, follow the steps below:

  1. Start a session on the master using the command-line client and execute the FLUSH TABLES WITH READ LOCK statement:
    mysql> FLUSH TABLES WITH READ LOCK;

    This will flush all tables and block write statements.

  2. Execute the SHOW MASTER STATUS statement in another session to obtain the current binary log file name and position:
    mysql> SHOW MASTER STATUS;

These are the replication coordinates, in the form of the binary log file name and position, as shown in the illustrative output below. You will need these values when configuring the slaves.
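
Illustrative output of SHOW MASTER STATUS (the file name and position will differ on your server):

mysql> SHOW MASTER STATUS;
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000003 |      154 |              |                  |
+------------------+----------+--------------+------------------+

Keep the read lock in place until you have taken the data snapshot in the next step, then release it with UNLOCK TABLES.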

Step 4: Creating a Data Snapshot

Using mysqldump, you can create a dump of all the tables and databases on the master server and import it on the slave. This needs to be performed before the replication task is started.

In this command, the data is dumped into the file repl_dbdump.db with the --master-data option:

shell> mysqldump --all-databases --master-data > repl_dbdump.db
  1. If you want to exclude any tables from the dump, use the --ignore-table option.
  2. If you want to dump only specific databases, use the --databases option (see the example below).
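
For example, a dump restricted to two databases while skipping one table might look like the following; the database and table names here are purely illustrative.

shell> mysqldump --databases sales inventory --master-data --ignore-table=sales.audit_log > repl_dbdump.db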

Step 5: Replication Slave Configuration

Like the master, each slave also needs to be set up with a unique server ID so that it does not conflict. Follow the same process to edit the configuration file as described in Step 1, this time setting only the server ID.

For example:

[mysqld]
server-id=2

Once you are done making the changes to the configuration file, restart the slave. Binary logging does not need to be enabled on the slave, but if you do enable it, the slave can also act as a master to other slaves, as sketched below.
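
If you want the slave to also serve as a master for other slaves, a configuration along these lines enables binary logging and logging of replicated updates on it; the log file name is illustrative.

[mysqld]
server-id=2
log-bin=mysql-bin
log_slave_updates=ON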

Step 6: Master Configuration on the Slave

To configure the slave to communicate with the master for replication, set it up with the required connection information by executing the following statement:

mysql> CHANGE MASTER TO
    -> MASTER_HOST='master_host_name',
    -> MASTER_USER='replication_user_name',
    -> MASTER_PASSWORD='replication_password',
    -> MASTER_LOG_FILE='recorded_log_file_name',
    -> MASTER_LOG_POS=recorded_log_position;
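
Filled in with the coordinates obtained in Step 3, the statement might look like this; the host, credentials, and coordinates below are illustrative.

mysql> CHANGE MASTER TO
    -> MASTER_HOST='192.168.1.10',
    -> MASTER_USER='username',
    -> MASTER_PASSWORD='password',
    -> MASTER_LOG_FILE='mysql-bin.000003',
    -> MASTER_LOG_POS=154;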

Which of the next two steps you need depends on whether you have an existing data snapshot to import.

Step 7: Replication Setup between a New Master and Slaves

If you have no snapshot of an existing database to import, you are setting up replication with a new master that has no data yet.

Start the slave and execute the CHANGE MASTER TO statement (as shown in Step 6) to set up replication between the new master and the slave.

Next, perform the above step on each slave.
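
A minimal sketch of starting replication on each slave; on MySQL 8.0.22 and later, START REPLICA is the preferred equivalent.

mysql> START SLAVE;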

Step 8: Replication Setup with Existing Data

When setting up replication with existing data, the snapshot needs to be sent from the master to the slave. This has to be performed before the replication process is started.

Assuming that you have used mysqldump to create the snapshot (as in Step 4):

  1. Start the slave using the --skip-slave-start option so that replication does not start yet.
  2. Import the dump file:
    shell> mysql < repl_dbdump.db

Once replication is started, the slave connects to the master and replicates data updates.
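
After the import, start replication and confirm that both replication threads are healthy; in the status output, Slave_IO_Running and Slave_SQL_Running should both read Yes (SHOW REPLICA STATUS and the Replica_* fields are the newer equivalents).

mysql> START SLAVE;
mysql> SHOW SLAVE STATUS\G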

The data available on the slaves can be used continuously for analytics. With this approach, the binary log needs to be observed for data change events, and only those events are replicated based on their position in the log.

You can also set up a scheduled job that sends a notification each time the data is updated, ensuring fresh data is available for analytics.

Challenges of Building a Custom Setup

There are some limitations to building a custom setup for MySQL BinLog replication:

  • Manually configuring replication and writing custom scripts is effort-intensive. As explained above, this approach requires many tedious steps to work hand in hand.
  • The process described above is time-consuming. If you are working on projects with tight deadlines, then building the entire setup manually is not advised.
  • Since each step is manually coded, the system is quite brittle. You will need to continuously monitor the infrastructure and ensure that the data replicated is consistent and accurate.
  • In case you need to apply any data enrichment or transformation before replicating the data to your destination, then you will need to build additional code to achieve this. If you are replicating data from MySQL to a data warehouse like Redshift, BigQuery, or Snowflake, for data analytics purposes, this can be a huge bottleneck.

An alternative to going through all these hardships is to use a cloud-based automated data pipeline tool like Hevo, which can execute MySQL BinLog Based Replication in a matter of a few clicks.

The next method applies to replication across MySQL servers.

Method 3: Binary Log File Position-Based Replication Configuration

In the binary log file position approach, the MySQL instance acting as the source (where database changes occur) writes updates and changes as “events” to the binary log. The information in the binary log is recorded in different logging formats depending on the database changes being made. Replicas are set up to read the binary log from the source and to execute the events on the replica’s local database.

A copy of the entire contents of the binary log is sent to each replica. The replica is responsible for deciding which statements from the binary log to execute. Unless you specify otherwise, all events in the source’s binary log are executed on the replica. If necessary, you can configure the replica to process only events that apply to specific databases or tables.

Keep in mind that it is not possible to configure the source to log only particular events. Each replica records the name of the binary log file and the position within it that it has read and processed from the source. This means multiple replicas can be connected to the source, each executing different parts of the same binary log. It is also possible to connect and disconnect individual replicas without affecting the source’s operation, because the replicas themselves control the process. Since each replica records its current position within the binary log, it can disconnect and later reconnect and resume processing from where it left off.

The source and each replica must be configured with a distinct ID using the server_id system variable. In addition, each replica must be configured with the source’s host name, the binary log file name, and the position within that file. You can control these details from within a MySQL session using a CHANGE REPLICATION SOURCE TO statement (or CHANGE MASTER TO before MySQL 8.0.23) on the replica. The details are stored in the replica’s connection metadata repository.
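
On MySQL 8.0.23 and later, the statement from Step 6 uses the newer replication terminology; the values below are illustrative.

mysql> CHANGE REPLICATION SOURCE TO
    -> SOURCE_HOST='192.168.1.10',
    -> SOURCE_USER='username',
    -> SOURCE_PASSWORD='password',
    -> SOURCE_LOG_FILE='mysql-bin.000003',
    -> SOURCE_LOG_POS=154;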

The binary log contains “events” that describe changes to the database, such as table creation operations or changes to table data. Unless row-based logging is used, it also contains events for statements that could potentially have made changes (for example, a DELETE that matched no rows). The binary log additionally records how long each statement that updated data took to execute.

Before we wrap up, let’s take a look at some major purposes of the binary log.

Major Purposes of the Binary Log

  • Replication: The binary log on a replication source server provides a record of the data changes to be sent to replicas. The source sends the information in its binary log to its replicas, which replay those events so that the same data changes made on the source are made on the replicas.
  • Data Recovery: Once a backup has been restored, the events recorded in the binary log after the backup was taken can be re-executed, bringing the databases up to date from the point of the backup. This is how the binary log supports point-in-time recovery, as sketched below.
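
A minimal sketch of point-in-time recovery with the mysqlbinlog utility, replaying events recorded after a backup up to a chosen moment; the log file name and stop time are illustrative.

shell> mysqlbinlog --stop-datetime="2024-06-01 10:00:00" mysql-bin.000003 | mysql -u root -p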

The binary logging format determines how events are recorded in the binary log. Row-based logging, statement-based logging, and mixed logging are the three formats supported. When setting up BinLog-based replication, keep in mind the MySQL version you are using, as the defaults and supported options vary across versions.

The binlog_format server system variable controls the logging format of the MySQL server. You can set this variable with global or session scope.
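
For example, the following switches the server to row-based logging, either at runtime with global scope or persistently in the configuration file; the choice of ROW here is illustrative.

mysql> SET GLOBAL binlog_format = 'ROW';

[mysqld]
binlog_format=ROW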

Conclusion

This article gives you comprehensive guidance for MySQL replication setup through three methods. You can perform MySQL master-slave replication using the manual method or you can use Hevo to make your task simpler. 

Hevo is the only real-time ELT No-code Data pipeline platform that cost-effectively automates data pipelines that are flexible to your needs.

Visit our Website to Explore Hevo

Additionally, Hevo helps you clean, transform and enrich data both before and after replicating the database, ensuring that you have analysis-ready data in your warehouse at any point.

Businesses can use automated platforms like Hevo Data to set this integration and handle the ETL process. It helps you directly transfer data from a source of your choice to a Data Warehouse, Business Intelligence tools, or any other desired destination in a fully automated and secure manner without having to write any code and will provide you a hassle-free experience.

Sign Up for the 14-day free trial today to give Hevo a try.

Share your experience of setting up MySQL BinLog Based Replication in the comments section below!

Aniket Renuse
Freelance Technical Content Writer, Hevo Data

Aniket is an experienced freelance writer, skilled in creating engaging content focused on the data industry, covering areas such as data analytics, machine learning, AI, big data, and business intelligence.
