Google Analytics to MySQL – Easy Steps to Load Data

on Data Integration • January 30th, 2020 • Write for Hevo

Google Analytics is the service offered by Google to get complete information about your website and its users. It allows the site owners to measure the performance of their marketing, content, and products. It not only provides unique insights but also helps users deploy machine learning capabilities to make the most of their data. In spite of all the unique analysis services provided by Google, it is sometimes required to get the raw clickstream data from your website into the on-premise databases. This helps in creating deeper analysis results by combining the clickstream data with the organization’s own customer data and product data. In this blog post, we will learn how to move data from google analytics to MySQL – one of the most popular relational databases in use today.

There are two broad approaches one could use to move data from Google Analytics to MySQL

Two approaches to move data from Google Analytics to MySQL

There are two broad approaches one could use to move data from Google Analytics to MySQL

  1. Build a Custom Code:
    You would need to build custom scripts to extract data from Google Analytics, transform it into an analysis-ready format and finally load this to MySQL Database. Once the infrastructure is ready, you would also need to invest in engineering resources to monitor and maintain it.
  2. Buy a Ready-to-Use Solution:
    Using a fully managed platform like Hevo Data, you can move data from Google Analytics with minimal setup and configuration time. Hevo will ensure that the data is reliably loaded on to MySQL in a matter of minutes.

This post talks about building custom code to move data from Google Analytics to MySQL. Towards the end, the post also highlights some of the challenges and limitations of this approach. This will help you pick the right approach for your use case.

Using Custom Code to Move Data from Google Analytics to MySQL:

Getting data from Google Analytics

Google Analytics makes the click event data available through its Reporting API V4. Reporting API provides two sets of Rest API to addresses two specific use cases.

  1. Get aggregated analytics information on user behaviour on your site on the basis of available dimensions – Google calls these metrics and dimensions. Metrics are aggregated information that you can capture and dimensions are the terms on which metrics are aggregated. For example, the number of users will be a metric and time will be a dimension.
  2. Get activities of a specific user = For this, you need to know the user id or client id. An obvious question then is how do you know the user id or client id. You will need to modify some bits in the client-side google analytics function that you are going to use and capture the client id. Google does not specifically tell you how to do this, but there is ample documentation on the internet about it. Please consult the laws and restrictions in your local country before attempting this, since the legality of this will depend on the privacy laws of the country. You will also need to go to the Google Analytics dashboard and register client id as a new dimension.

Google Analytics APIs use oAuth 2.0 as the authentication protocol. Before accessing the APIs, the user first needs to create a service account in the Google Analytics dashboard and generate authentication tokens. Let us review how this can be done.

  1. Go to google service accounts page and select a project. If you have not already created a project, create a project.
  2. Click on Create Service Account.
  3. You can ignore the permissions for now.
  4. On the ‘Grant users access to this service account’ section, click create key
  5. Select JSON as the format for your key.
  6. Click create a key and you will be prompted with a dialogue to save the key in your local computer. Save the key.

We will be using the information from this step when we actually access the API.

Accessing Google Reporting API V4

Google provides easy to use libraries in Python, Java, and PHP to access its reporting APIs. It is best to use these APIs to download the data since it would be a tedious process to access these APIs using command-line tools like CURL. Here we will use the python library to access the APIs. The following steps detail the procedure and code snippets to load data from Google Analytics to MySQL.

  • Use the following command to install the python GA library to your environment.
sudo pip install --upgrade google-api-python-client

This assumes the python programming environment is already installed and works fine.

We will now start writing the script for downloading the data as a CSV file.

  • Import the required libraries.
from apiclient.discovery import build from oauth2client.service_account import ServiceAccountCredentials
  • Initialize the required variables.
SCOPES = [‘https://www.googleapis.com/auth/analytics.readonly’] KEY_FILE_LOCATION = ‘<REPLACE_WITH_JSON_FILE>’ VIEW_ID = ‘<REPLACE_WITH_VIEW_ID>’

The above variables are required for OAuth authentication. Replace the key file location and view id with what we obtained in the first service creation step. View ids are the views from which you will be collecting the data. To get the view id of a particular view that you have already configured, go to the admin section, click on the view that you need and go to view settings.

  • Build the required objects.
credentials = ServiceAccountCredentials.from_json_keyfile_name(KEY_FILE_LOCATION, SCOPES) #Build the service object. analytics = build(‘analyticsreporting’, ‘v4’, credentials=credentials)
  • Execute the method to get the data. The below query is for getting the number of users aggregated by country from the last 7 days.
response = analytics.reports().batchGet(       body={         ‘reportRequests’: [         {           ‘viewId’: VIEW_ID,           ‘dateRanges’: [{‘startDate’: ‘7daysAgo’, ‘endDate’: ‘today’}],           ‘metrics’: [{‘expression’: ‘ga:sessions’}],           ‘dimensions’: [{‘name’: ‘ga:country’}]         }]       }   ).execute()
  • Parse the JSON and write the contents into a CSV file.
import pandas as pd from pandas.io.json  import json_normalize reports = response[‘reports’][0] columnHeader = reports[‘columnHeader’][‘dimensions’]  metricHeader = reports[‘columnHeader’][‘metricHeader’][‘metricHeaderEntries’] columns = columnHeader for metric in metricHeader:   columns.append(metric[‘name’])  data = json_normalize(reports[‘data’][‘rows’])  data_dimensions = pd.DataFrame(data[‘dimensions’].tolist())  data_metrics = pd.DataFrame(data[‘metrics’].tolist())  data_metrics = data_metrics.applymap(lambda x: x[‘values’])  data_metrics = pd.DataFrame(data_metrics[0].tolist())  result = pd.concat([data_dimensions, data_metrics], axis=1, ignore_index=True) result.to_csv(‘reports.csv’) Save the script and execute it. The result will be a CSV file with the following columns Id , ga:country, ga:sessions
  • This file can be directly loaded to a MySQL table using the below command. Please ensure the table is already created.
LOAD DATA INFILE’products.csv’ INTO TABLE customers FIELDS TERMINATED BY ‘,’  ENCLOSED BY ‘“’ LINES TERMINATED BY ‘rn’ ;

That’s it! You now have your google analytics data in your MySQL. Now that we know how to do get the Google Analytics data using custom code, let’s look into the limitations of using this method.

Google Analytics to MySQL: Limitations of Using Custom Code

  1. The method even though elegant, requires you to write a lot of custom code. Google’s output JSON structure is a complex one and you may have to make changes to the above code according to the data you query from the API.
  2. This approach will work for a one-off data load to MySQL, but in most cases, organizations need to do this periodically merging the data point every day with seamless handling of duplicates. This will need you to write a very sophisticated import tool just for Google Analytics.
  3. The above method addresses only one API that is provided by Google. There are many other available APIs from Google which provides different types of data from the Google Analytics engine. An example is a realtime API.  All these APIs come with a different output JSON structure and the developers will need to write separate parsers.

A solution to all the above problems is to use a completely managed Data Integration Platform like Hevo.

Hevo Data – a Simpler Alternative to move data from Google Analytics to MySQL

Hevo Data provides a simple click and execute interface to get all your data from Google Analytics to MySQL in just 2 steps.

  1. Configure and authenticate Google Analytics source
  2. Configure the MySQL database where the data needs to be loaded

Hevo does all the heavy-lifting, masks all ETL complexities, and delivers data in MySQL in a reliable fashion.

Before you attempt building your own code to move data from Google Analytics to MySQL, consider trying Hevo Data for the following reasons:

  1. No-code, easy to use visual interface.
  2. Zero implementation time – With Hevo, you can start moving data in just a few minutes.
  3. Automatic Schema Handling – Hevo can automatically detect any schema changes that occur on Google Analytics and elegantly make required changes on MySQL, without any human intervention.
  4.  100% fault-tolerant – Hevo’s AI-powered algorithms ensure that data is moved from Google Analytics to MySQL in a consistent and reliable manner.

Explore Hevo by signing up for a 14-day free trial here and move data from Google Analytics to MySQL instantly.

No-code Data Pipeline for your Data Warehouse