Today’s data-driven businesses rely on MySQL to manage their Databases. Owing to its Cloud-based architecture, MySQL makes for an ideal DBMS (Database Management System) that provides high processing power and reliability for companies. Moreover, Applications Developers fancy MySQL’s Query Processing Engine as it is easily compatible with the popular Operating Systems and Programming Languages. However, Database Administrators sometimes are left frustrated by the limited Query Modification options of MySQL. Moreover, this platform does not provide a direct way to implement Load Sharding for the applications working with large datasets. This is where ProxySQL comes in!

ProxySQL understands SQL traffic and therefore it is capable of extensive routing. This coupled with Load Balancing will allow applications working with MySQL Database to get faster query processing and stronger resilience to database crashes. Using it with MySQL allows companies to freely manage the Database Queries because of its Query mirroring & Query Rewriting features. This way Database Admins can mirror their traffic and manipulate queries according to their requirements. Furthermore, it allows even writing queries on the fly and any structural changes in the underlying database will not affect the running applications in any way.

This article will introduce you to MySQL and ProxySQL along with their key features. It will also discuss the importance of ProxySQL for MySQL users. The article will then guide you through a step-by-step process for implementing ProxySQL to your MySQL databases. Read along to understand more about the benefits of this tool!

Prerequisites

  • One Ubuntu 16.04 server including a non-root user and protected by a firewall. This server will act as your ProxySQL instance.
  • 3 MySQL servers are configured in a way to form a multi-primary replication group.

What is MySQL?

ProxySQL: MySQL Logo

MySQL is a robust system that leverages Cloud Technology to manage your databases. This DBMS launched in 1995 uses SQL (Structured Query Language) for record manipulation and other data-management operations. This open-source tool is easily accessible to everyone who wishes to render its services. However, there are some advanced features that MySQL offers to meet your specific business requirements. You can utilize these features by spending a monthly usage fee. Due to its rich toolset and in-built features, businesses today heavily rely upon the MySQL platform to have scalable and secure data management for their ever-increasing data. 

Key Features of MySQL

The following key features make MySQL a popular DBMS choice: 

  • High performance: With MySQL, you can experience high-speed query processing while using a very simple user interface. Moreover, it is capable of serving numerous users simultaneously because of its rapid and global accessibility. 
  • Compatibility: MySQL ensures secure data transactions for your database and operates at low latency. You can also add multiple web development tools on the MySQL platform to boost optimize your data usage.  
  • Scalability: MySQL allows you to upscale or downscale your data needs at any time. Furthermore, it adjusts seamlessly to the common operating structures like Linux, OS X, Windows, etc.

What is ProxySQL?

ProxySQL: ProxySQL Logo

ProxySQL was launched in 2015, to solve critical problems faced by a Database Administrator while working on a large, sharded MySQL environment. It is an open-source proxy server for MySQL and serves as a bridge between a MySQL server and the applications that run on its databases. This tool specializes in balancing the traffic load among a pool of multiple database servers. This way, it enhances the query processing speed and scalability. 

ProxySQL allows for high data availability by automatically failing over to a standby in case one or more database servers fail. It also enables you to rewrite “wrong queries” in a live database environment. Moreover, it allows you to perform changes in the underlying database without affecting the application in production. 

Key Features of ProxySQL

ProxySQL can enhance your MySQL experience with its following features:

  • Application Layer Proxy: It is capable of adapting to MySQL protocols and provides advanced connection handling features.
  • Database Firewall: Its firewall monitors the incoming & outgoing traffic and ensures that your files are safe from malicious content.
  • Sharding & Rewriting:  It allows you to shard your Database load across servers. It also lets you rewrite transformation commands on the fly.
Supercharge Your MySQL Migration with Hevo

Migrating your data from MySQL doesn’t have to be complex. Relax and go for a seamless migration using Hevo’s no-code platform. With Hevo, you can:

  1. Effortlessly extract data from MySQL and other 150+ connectors
  2. Tailor your data with features like drag-and-drop and custom Python scripts.
  3. Achieve lightning-fast data loading, making your data analysis ready.

Try to see why customers like Slice and Harmoney have upgraded to a powerful data and analytics stack by incorporating Hevo! 

Get Started with Hevo for Free

What is ProxySQL Overhead?

ProxySQL offers a lot of functionality to your MySQL infrastructures such as caching and connection multiplexing — but it’s not free! Ultimately your database has to go through additional processing processes, which adds some expense. This is where overhead comes in and you should know how to calculate it.

When it comes to ProxySQL two types of overhead exist:

  1. Network ProxySQL Overhead
  2. Processing ProxySQL Overhead

Now let’s talk about the two ProxySQL Overhead in a brief.

Network ProxySQL Overhead

Network Overhead is ProxySQL’s location-dependent. In the example given below ProxySQL is located on a separate host.

When compared to directly querying ProxySQL Servers, the application will incur additional network delay for every query. This delay can range from a few milliseconds if ProxySQL is installed on the same local network to considerably more if you chose inadequate ProxySQL locations.

I’ve observed scenarios when ProxySQL was installed in distinct regions from MySQL and Application, resulting in tens of milliseconds of latency (and more than 100 percent of overhead for many queries).

Processing ProxySQL Overhead

The worst-case scenario, in Processing Overhead, is when thousands of expressions based on query rules can add very high overhead. The problem arises when every request that ProxySQL receives undertakes additional processing. With enough CPU power available, the key cost drivers for such processing will be the size of the query, its result set size, and your ProxySQL settings. The more query rules you have, and the more intricate they are, the more processing overhead you should expect.

How to Measure & Reduce the ProxySQL Overhead?

Measuring network overhead can be tricky because it’s environment-specific; hence, in this section, we would rather look at processing overhead in cases we are running ProxySQL.

MySQLDump

Let’s begin with the most illogical test: running MySQLDump on the huge table (around 2GB) and timing how long it takes. This test demonstrates how costly result processing is in ProxySQL because query routing effort is insignificant in this scenario.

ProxySQL | MySQLDump
Image credits: percona

ProxySQL allows us to achieve 20% faster response times (through considering results processing by MySQLDump actual query execution time difference is likely higher).

Another way to look at it is that we have 4.75sec additional to process 10mil rows, which means the ProxySQL overhead is 475ns for around 200-byte row, which is really rather decent.

64 Concurrent Connections Workload

For this workload, we will be using the server with 28 Cores and 56 logical CPU threads and I have to raise mysql-threads to 32 to make sure ProxySQL is not keeping itself on a diet.

ProxySQL | 64 Concurrent Connections Workload
Image credits: percona

There is a lot of interesting data here. First, we can see disabling Prepared Statements gives us a 15% slowdown with a direct MySQL connection and about 13.5% when going through ProxySQL, which makes sense as the Processing overhead on the ProxySQL side should not increase as much when Prepared Statements are disabled.

The performance between direct connection and going through ProxySQL is significant, though going directly is almost 80% faster when Prepared Statements are in use and over 75% when Prepared Statements are disabled. 

If you think about these numbers — considering sysbench itself is taking some resources, for trivial primary key lookup queries the number of resources ProxySQL requires is comparable to those needed by MySQL Server itself to serve the query.

Single Connection Workload

Let’s have a look at the speed of the same simple point lookup query, but this time with only one thread. We additionally schedule Sysbench, and ProxySQL to various CPU cores to ensure that there is no substantial conflict for CPU resources and that we can examine efficiency. All connections in this test are made via UNIX Socket, so we’re looking at best-case scenarios, and Prepared Statements are enabled.

ProxySQL | Single Connection Workload
Image credits: percona

The direct connection outperforms ProxySQL by around 55% in terms of throughput.

Another method is to calculate how long it takes to serve the query directly and with ProxySQL in the middle — it is 46 microseconds with MySQL Directly and 71 microseconds with ProxySQL, implying that ProxySQL adds roughly 25 microseconds.

While 25 microseconds is a considerable fraction of overall query execution in this single-host context and for basic queries, it may be far less relevant for more complex queries and network-based deployments.

Unix Socket vs TCP/IP

It is logical to suppose that the same would apply to ProxySQL deployment, except that with ProxySQL, we have two connections to manage: the connection between ProxySQL and MySQL Server and the connection between ProxySQL and Application. In both circumstances, we may utilize Unix Socket in our single host test. If you use ProxySQL as an application sidecar or on MySQL Server, you will be able to use a Unix socket for at least one of these connections.

The letters “U” and “T” represent connection types — “UU” indicates that Unix Socket was used for both connections, while “TT” indicates that TCP/IP was used in both places.

ProxySQL | Unix Socket vs TCP/IP
Image credits: percona

The results are rather predictable — for the best performance, use Unix Socket, but even utilizing Socket for one of the connection types increases performance.

Using TCP/IP instead of Unix Socket for both connection types lowers performance by more than 20%.

If we use the same arithmetic to calculate how much latency TCP/IP adds, we get 20 microseconds, which means ProxySQL over TCP/IP adds almost double the processing lag compared to ProxySQL via Unix Socket.

Importance of ProxySQL for Load Balancing

ProxySQL can enhance the Load Balancing process among your MySQL Databases due to the following reasons:

  • By leveraging Connection Multiplexing, you can drastically reduce connection usage. Moreover, you can implement Load Balancing by serving multiple frontend connections dynamically via a single backend connection.
  • It allows you to generate complex Query Rules. These rules enhance your database query processing by providing you with, distributed reads across replicas. It also offers to rewrite queries with highly granular criteria on the fly and provides high ProxySQL performance.
  • Since it has a small footprint, you can easily deploy it in any form. For instance, you can use it in a decentralized way for your application servers or in a centralized layer within the servers.
  • It’s zero vendors allow you to lock in your applications while retaining full MySQL compliance. This way it ensures a clean and simple integration into your environment for Load Balancing.

Concepts & Architecture of ProxySQL

The ProxySQL Architecture consists of the following major components:

  • Query Rules Engine: It integrates well with MySQL and operates according to its protocol. The Query Rules Engine monitors the incoming traffic and dynamically decides whether to cache a query, block, re-write, reroute or mirror it onto a target of host group.
  • User Authentication: The required user credentials for the supporting databases are hashed and safely stored in the proxy.
  • Hostgroup Manager: Its role is to manage the server groups, track their state and regulate the outgoing traffic.
  • Connection Pool: This ensures the connections to the backend databases are working. This pool is established with the backends, and the connections are shared by all applications.

Steps to Set Up Load Balancing with ProxySQL

You can manually set up the Load Balancing process for your MySQL Databases using ProxySQL with the following steps:

Step 1: Install ProxySQL

Download the latest Ubuntu compatible version of ProxySQL from here and store it in the /tmp directory.

cd /tmp
curl -OL https://github.com/sysown/proxysql/releases/download/v1.4.4/proxysql_1.4.4-ubuntu16_amd64.deb

Complete the installation by using the .deb file as shown below:

sudo dpkg -i proxysql_*
rm proxysql_*

Now, Update your repository to ensure you’re working with the latest pre-bundled version, and then use sudo apt command to install the mysql-client package as follows:

sudo apt-get update
sudo apt-get install mysql-client

Now, manually start the service with the following command:

sudo systemctl start proxysql

ProxySQL will now run with its default configuration.

Step 2: Create the Administrator Password

Once initialized, ProxySQL stores its configuration in a database which you can secure using a strong password.

First, go to the administration interface where you have to enter the password which by default is admin.

mysql -u admin -p -h 127.0.0.1 -P 6032 --prompt='ProxySQLAdmin> '

You will see a welcome note, indicating that you have successfully logged in, Now use the following code to change your password by replacing the word password with something strong:

UPDATE global_variables SET variable_value='admin:password' WHERE variable_name='admin-admin_credentials';

To put your changes into effect copy memory settings and save them to disk as follows:

LOAD ADMIN VARIABLES TO RUNTIME;
SAVE ADMIN VARIABLES TO DISK;

Now, keep this tab running, and meanwhile, it’s time to set up the MySQL part of this integration.

Step 3: Configure MySQL Monitoring

To monitor the MySQL nodes, the ProxySQL first needs to communicate with them. To facilitate this communication, it must connect to each server via a dedicated user. So, you have to configure users for MySQL nodes and install SQL functions for further querying.

Since MySQL group replication is running in the background, you need to perform the following step for only a single member of their group. In a new terminal, with the following code use a MySQL node and log into a server: 

ssh sammy@your_mysql_server_ip_1

Use the given code and download the SQL file required for ProxySQL group replication:

curl -OL https://gist.github.com/lefred/77ddbde301c72535381ae7af9f968322/raw/5e40b03333a3c148b78aa348fd2cd5b5dbb36e4d/addition_to_sys.sql

Now execute the following command and if no output is generated, it means that the group replication is in place:

mysql -u root -p < addition_to_sys.sql

Now create a Monitor (user) and replace the given password with something strong:

CREATE USER 'monitor'@'%' IDENTIFIED BY 'monitorpassword';

Grant the required privileges to the new Monitor:

GRANT SELECT on sys.* to 'monitor'@'%';

Apply the changes to all nodes:

FLUSH PRIVILEGES;
Integrate MySQL to Amazon S3
Integrate MySQL to Azure Synapse Analytics
Integrate MySQL to MS SQL Server

Step 4: Configure ProxySQL Monitoring

You must update ProxySQL using the user information to allow it to access the MySQL nodes. Go to the MySQL terminal tab and use the following command:

UPDATE global_variables SET variable_value='monitor' WHERE variable_name='mysql-monitor_username';

Use the following command to configure the:

LOAD MYSQL VARIABLES TO RUNTIME;
SAVE MYSQL VARIABLES TO DISK;

The monitor (user) is fully configured and now you have to inform the ProxySQL about the nodes.

Step 5: Connect MySQL Nodes to the ProxySQL Server Pool

ProxySQL’s group replication divides all nodes into 4 logical states, Writer, Backup Writer, Reader, Offline. You need to manually assign corresponding numerical group ids to them. Here, you can use:

  • 1 for the offline host group.
  • 2 for the writer host group.
  • 3 for the reader host group.
  • 4 for the backup writer host group.
INSERT INTO mysql_group_replication_hostgroups (writer_hostgroup, backup_writer_hostgroup, reader_hostgroup, offline_hostgroup, active, max_writers, writer_is_also_reader, max_transactions_behind) VALUES (2, 4, 3, 1, 1, 3, 1, 100);

Now you can add MySQL servers to ProxySQL. Insert the IP address and host group for each server in the mysql_servers table using the following command:

INSERT INTO mysql_servers(hostgroup_id, hostname, port) VALUES (2, '203.0.113.1', 3306);
INSERT INTO mysql_servers(hostgroup_id, hostname, port) VALUES (2, '203.0.113.2', 3306);
INSERT INTO mysql_servers(hostgroup_id, hostname, port) VALUES (2, '203.0.113.3', 3306);

And again, run the following command to transfer the details to the right servers:

LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;

To check the effect of the above commands, input the following:

SELECT hostgroup_id, hostname, status FROM runtime_mysql_servers;

You will get the following output:


+--------------+-------------+--------+
| hostgroup_id | hostname    | status |
+--------------+-------------+--------+
| 2            | 203.0.113.1 | ONLINE |
| 2            | 203.0.113.2 | ONLINE |
| 2            | 203.0.113.3 | ONLINE |
| 3            | 203.0.113.1 | ONLINE |
| 3            | 203.0.113.2 | ONLINE |
| 3            | 203.0.113.3 | ONLINE |
+--------------+-------------+--------+
6 rows in set (0.01 sec)

Now, before you can use the ONLINE nodes, you must configure user credentials so that each node can access the MySQL database.

Step 6: Create Users in MySQL

You must first create a user account with the ProxySQL credentials, and grant the necessary privileges to the user:

Note: As in Step 3, you need to perform the following steps must only on a single member. Moreover, gran it the required privileges using the following:

CREATE USER 'playgrounduser'@'%' IDENTIFIED BY 'playgroundpassword';
GRANT ALL PRIVILEGES on playground.* to 'playgrounduser'@'%';

Finally, apply the following changes and exit the terminal:

FLUSH PRIVILEGES;
EXIT;

To verify your work, open the terminal again and use:

imysql -u playgrounduser -p

Now enter:

SHOW TABLES FROM playground;

You must an output similar to as shown below:


+----------------------+
| Tables_in_playground |
+----------------------+
| equipment            |
+----------------------+
1 row in set (0.00 sec)

Use the following command to exit the terminal:

EXIT;

In the next step, you have to create a corresponding user in the ProxySQL server.

Step 7: Create the ProxySQL User

In the ProxySQL terminal write:

INSERT INTO mysql_users(username, password, default_hostgroup) VALUES ('playgrounduser', 'playgroundpassword', 2);

Migrate the configuration using the following:

LOAD MYSQL USERS TO RUNTIME;
SAVE MYSQL USERS TO DISK;

Now open a new terminal and verify that you can use these credentials to connect to the database nodes and write:

ssh sammy@your_proxysql_server_ip

ProxySQL uses port 6033 to listen for incoming client connections. You can try connecting to the real database using port 6033. (You’ll have to use your password for this. The password, in this blog, was playgroundpassword .

mysql -u playgrounduser -p -h 127.0.0.1 -P 6033 --prompt='ProxySQLClient> '

If these commands open up a welcome prompt, then ProxySQL has accepted your credentials.

Step 8: Verify the ProxySQL Configuration

Thi final step is to ensure that the playground database is allowing both read and write statements from ProxySQL.

To verify that you can read data from the playground database, execute the following:

SELECT * FROM playground.equipment;

Output similar to the following will verify that it has successfully read data from the MySQL database:

+----+--------+-------+--------+
| id | type   | quant | color  |
+----+--------+-------+--------+
|  3 | slide  |     2 | blue   |
| 10 | swing  |    10 | yellow |
| 17 | seesaw |     3 | green  |
+----+--------+-------+--------+
3 rows in set (0.00 sec)

Now to verify the write command,  try inserting new data into the above table:

INSERT INTO playground.equipment (type, quant, color) VALUES ("drill", 5, "red");

Now running the SELECT command:

SELECT * FROM playground.equipment;

The output should be similar to the following:

+----+--------+-------+--------+
| id | type   | quant | color  |
+----+--------+-------+--------+
|  3 | slide  |     2 | blue   |
| 10 | swing  |    10 | yellow |
| 17 | seesaw |     3 | green  |
| 24 | drill  |     5 | red    |
+----+--------+-------+--------+
4 rows in set (0.00 sec)

That’s it! Now your ProxySQL setup is complete. Now try it yourself and perform ProxySQL performance tuning for your MySQL databases.

Solve your data replication problems with Hevo’s reliable, no-code, automated pipelines with 150+ connectors.
Get your free trial right away!

Benefits of Load Balancing with ProxySQL

ProxySQL: Load Balancing

Load Balancing has a direct impact on the productivity of an application. Developers prefer to use MySQL Proxy for Load Balancing of Databases to experience the following benefits:

  • It intelligently monitors its MySQL traffic and splits reads from writes accordingly. This is essential to operate a Master-Slave replication setup where the writes must be sent to the master nodes only.
  • It studies the underlying database topology. Therefore, based on whether the specific instances are up or down, it routes requests only to the healthy databases. Furthermore, it safeguards applications from any changes in the complex database topology.
  • It offers query workload analytics to speed up the analysis and performance of Queries. It also provides powerful control mechanisms to administrators. Using this feature, the administrators can cache queries at the proxy layer, re-route queries, or re-write a wrongly written query seamlessly.
  • Its Load Balancing supports the administrator in maintaining high efficiency in terms of operations performed in the overall infrastructure

Conclusion

The article introduced you to MySQL & ProxySQL with their unique features. It then explained the architecture of ProxySQL and listed the importance of using this tool with MySQL. The article also discussed the steps using which you can set up Load balancing for MySQL using this tool. Furthermore, it elaborated on the benefits of using this efficient tool.

Now, to run SQL queries or perform Data Analytics on your MySQL data, you first need to export this data to a Data Warehouse. This will require you to custom code complex scripts to develop the ETL processes. Hevo Data can automate your data transfer process, hence allowing you to focus on other aspects of your business like Analytics, Customer Management, etc. This platform allows you to transfer data from 100+ multiple sources like MySQL to Cloud-based Data Warehouses like Amazon Redshift, Snowflake, Google BigQuery, etc. It will provide you with a hassle-free experience and make your work life much easier.

Getting Started with Hevo – An Overview

Sign up for a 14-day free trial and experience the feature-rich Hevo suite first hand.

Share your understanding of ProxySQL in the comments below!

FAQs

1. What is ProxySQL used for?

ProxySQL is a high-performance database proxy designed to improve MySQL scalability, query routing, load balancing, and failover handling.

2. What is a proxy in SQL?

A proxy in SQL acts as an intermediary between applications and databases, managing traffic, enhancing performance, and ensuring availability.

3. What is the difference between HAProxy and ProxySQL?

HAProxy is a general-purpose load balancer for HTTP/TCP traffic, while ProxySQL is specialized for MySQL and MariaDB, providing advanced query-level routing and caching.

Abhinav Chola
Research Analyst, Hevo Data

Abhinav Chola, a data science enthusiast, is dedicated to empowering data practitioners. After completing his Master’s degree in Computer Science from NITJ, he joined Hevo as a Research Analyst and works towards solving real-world challenges in data integration and infrastructure. His research skills and ability to explain complex technical concepts allow him to analyze complex data sets, identify trends, and translate his insights into clear and engaging articles.