Do you have enormous amounts of customer and marketing data in HubSpot? Are you looking to bring this data into a data warehouse such as BigQuery so that you can leverage the power of SQL to analyze it better? If yes, then you have landed on the right blog.

This post walks through the process of extracting data from HubSpot and loading it into Google BigQuery, both with HubSpot’s REST API and with Hevo. You will learn multiple methods to connect HubSpot to BigQuery.

Note: Consider factors like required data sources, budget limitations, and your team’s technical expertise when deciding on the best way to set up the HubSpot BigQuery integration. Also, before you connect HubSpot to BigQuery, carefully consider the specific data you want to export and how you plan to use it for analysis.

Methods to Move Data from HubSpot to BigQuery

Method 1: Using Hevo to Move Data from HubSpot to BigQuery


Hevo is fully managed and completely automates the process of not only loading data from your desired source but also enriching the data and transforming it into an analysis-ready form without having to write a single line of code. Its fault-tolerant architecture ensures that the data is handled in a secure, consistent manner with zero data loss.

Get Started with Hevo for Free

Hevo moves your data from HubSpot to BigQuery in two simple steps:

  • Configure Source: Connect Hevo Data with HubSpot by providing a unique name for your Pipeline, along with details about your authorized HubSpot account. You can also choose the historical sync duration for your HubSpot data.
  • Integrate Data: Complete the HubSpot to BigQuery migration by providing a destination name and your BigQuery connection details, such as the GCP project ID and the dataset name.

Check Out What Makes Hevo Amazing:

  • Exceptional Security: A Fault-tolerant Architecture that ensures Zero Data Loss.
  • Built to Scale: Exceptional Horizontal Scalability with Minimal Latency for Modern-data Needs.
  • Built-in Connectors: Support for 150+ Data Sources, including Databases, SaaS Platforms, Files & More. Native Webhooks & REST API Connector available for Custom Sources.
  • Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  • Auto Schema Mapping: Hevo takes away the tedious task of schema management & automatically detects the format of incoming data and replicates it to the destination schema. You can also choose between Full & Incremental Mappings to suit your Data Replication requirements.

To know more about the HubSpot BigQuery integration, refer to the Hevo documentation.


Method 2: Manual Method to Move Data from HubSpot to BigQuery

1. Obtain your HubSpot API key

  • Navigate to the main navigation bar in your HubSpot account and click the Settings icon.
  • In the left sidebar menu, go to Integrations > API key.
  • Click Generate API key.
  • If you had previously generated an API key, click Show to display your API key.

  • Record this key.

2. Export HubSpot Data to JSON

API calls to HubSpot should be made to the https://api.hubapi.com base URL. All responses are returned as JSON.

Get all your HubSpot Contacts:

curl --request GET \
  --url 'https://api.hubapi.com/contacts/v1/lists/all/contacts/all?hapikey=YOUR_HUBSPOT_API_KEY' \
  -o contacts_data.json

Sample response:

{
  "lists": [
    {
      "dynamic": true,
      "metaData": {
        "processing": "DONE",
        "size": 161,
        "error": "",
        "lastProcessingStateChangeAt": 1284705947989,
        "lastSizeChangeAt": 1579479579246
      },
      "name": "test emails",
      "filters": [
        [
          {
            "filterFamily": "PropertyValue",
            "withinTimeMode": "PAST",
            "checkPastVersions": false,
            "type": "string",
            "property": "email",
            "value": "test",
            "operator": "STR_STARTS_WITH"
          }
        ]
      ],
      "portalId": 55748,
      "createdAt": 1396370538484,
      "listId": 1,
      "updatedAt": 1492585836314,
      "listType": "DYNAMIC",
      "internalListId": 330881,
      "deleteable": true
    },
    {
      "dynamic": false,
      "metaData": {
        "processing": "DONE",
        "size": 2,
        "error": "",
        "lastProcessingStateChangeAt": 1175279425836,
        "lastSizeChangeAt": 1196347964521
      },
      "name": "staticTest",
      "filters": [
        
      ],
      "portalId": 77123,
      "createdAt": 1253682414252,
      "listId": 3,
      "updatedAt": 1358942467316,
      "listType": "STATIC",
      "internalListId": 5
    }
  ],
  "offset": 2,
  "has-more": true
}
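Responses from this endpoint are paginated: the has-more flag tells you whether more records remain, and the offset field tells you where the next page starts. Below is a minimal bash sketch of a pagination loop that also flattens each page into newline-delimited JSON, the format BigQuery expects for JSON loads. It assumes jq is installed, and the field names used (vidOffset, vid-offset, contacts) apply to the contacts endpoint and may differ for other endpoints.

#!/bin/bash
# Page through all contacts and write one JSON object per line.
# Assumes jq is installed. The field names (vidOffset, vid-offset,
# .contacts) are specific to the contacts endpoint.
API_KEY="YOUR_HUBSPOT_API_KEY"
URL="https://api.hubapi.com/contacts/v1/lists/all/contacts/all"
offset=0
has_more="true"
> contacts_data.json
while [ "$has_more" = "true" ]; do
  page=$(curl -s "${URL}?hapikey=${API_KEY}&count=100&vidOffset=${offset}")
  # Emit each contact on its own line (NEWLINE_DELIMITED_JSON)
  echo "$page" | jq -c '.contacts[]' >> contacts_data.json
  has_more=$(echo "$page" | jq -r '."has-more"')
  offset=$(echo "$page" | jq -r '."vid-offset"')
done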

Here are some of the other API calls you might use:

  • Analytics API
  • Automation API
  • Companies API
  • CMS API
  • CRM API
  • Events API
  • Email Events API
  • Engagements API
  • Marketing API
  • Products API
  • Social Media API
  • etc.

3. Create a Dataset and a BigQuery table

From your command line (with the Google Cloud SDK installed), run this command:

bq mk hubspot_dataset
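If you want to pin the dataset to a specific location or add a description, bq accepts additional flags. A minimal sketch follows; adjust the location to match where you intend to query the data.

# Create the dataset in a specific location, with a description
bq --location=US mk \
  --dataset \
  --description "Raw HubSpot data" \
  hubspot_dataset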

4. Create an Empty Table in your hubspot_dataset.

bq mk \
--table \
--expiration 86400 \
--description "Contacts table" \
--label organization:development \
hubspot_dataset.contacts_table

The BigQuery API allows you to load data in the following formats: AVRO, CSV, PARQUET, ORC, and NEWLINE_DELIMITED_JSON.

5. Loading Data into Your BigQuery Table from Your Local Machine

The following command loads a newline-delimited JSON file, contacts_data.json, from your local machine into the contacts_table table in the hubspot_dataset. You need to define the contacts schema in a local schema file named contacts_schema.json. Note that the raw API response is a single JSON document, while BigQuery expects one JSON object per line; the pagination script above already flattens the records into this format. Also, since you’re loading files from your local machine, you need to specify the data format explicitly.

bq load \
--source_format=NEWLINE_DELIMITED_JSON \
hubspot_dataset.contacts_table \
./contacts_data.json \
./contacts_schema.json
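For reference, a schema file is a JSON array of column definitions. The sketch below writes a minimal contacts_schema.json; the field names are illustrative and must match the top-level keys in your exported JSON. Note that BigQuery column names cannot contain hyphens, so hyphenated HubSpot keys (such as portal-id) need to be renamed or dropped while flattening.

# Write a minimal, illustrative schema file
cat > contacts_schema.json <<'EOF'
[
  {"name": "vid", "type": "INTEGER", "mode": "NULLABLE"},
  {"name": "addedAt", "type": "INTEGER", "mode": "NULLABLE"}
]
EOF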

Normally, you will want to back up your HubSpot data at least daily, and perhaps multiple times per day. You can append new data to an existing BigQuery table from source files on your local machine, as shown in the next step.

6. Appending to a Table Using a Local File

bq load \
--autodetect \
--noreplace \
--source_format=NEWLINE_DELIMITED_JSON \
hubspot_dataset.contacts_table \
./contacts_data.json

This command loads data from a file named contacts_data.json and appends it to the contacts_table table in hubspot_dataset. With --autodetect, BigQuery infers the schema from the file itself, so no schema file is needed; --noreplace ensures the rows are appended rather than overwriting the table.
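Once the load completes, you can sanity-check the result with a quick row count (standard SQL via the bq CLI):

bq query --use_legacy_sql=false \
  'SELECT COUNT(*) AS row_count FROM hubspot_dataset.contacts_table'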

7. Scheduling Recurring Load Jobs

  • Create your bash shell script.
$ sudo mkdir -p /bin/scripts && sudo touch /bin/scripts/backup.sh

Add the following content to the file and save it.

#!/bin/bash
# Note: cron runs from the user's home directory by default,
# so use an absolute path to contacts_data.json if needed.
bq load \
--autodetect \
--noreplace \
--source_format=NEWLINE_DELIMITED_JSON \
hubspot_dataset.contacts_table \
./contacts_data.json
  • Make the script executable.
$ sudo chmod +x /bin/scripts/backup.sh
  • From your CLI, edit crontab.
$ crontab -e
  • You will be prompted to edit a file of tasks to be run by cron. Add the following line to schedule the job for 6 PM daily.
0 18 * * * /bin/scripts/backup.sh
  • With this entry in place, cron will execute the backup.sh file at 6 PM every day.
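Cron will not notify you when a job fails (see the limitations below), but you can at least keep a record for troubleshooting by redirecting the job’s output to a log file in the crontab entry (the log path here is illustrative):

0 18 * * * /bin/scripts/backup.sh >> /tmp/hubspot_backup.log 2>&1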

Limitations of the Manual Method to Move Data from HubSpot to BigQuery

  1. HubSpot’s APIs are subject to rate limiting. These limits are applied per account for custom integrations using API keys (see the retry sketch after this list).
  2. Wildcards are not supported when you load files using this method. Each file must be loaded individually, which quickly becomes inefficient and resource-intensive.
  3. Cron jobs can’t alert you if your load jobs fail or never start.
  4. Pulling data from HubSpot’s public endpoints using API keys is subject to daily limits.
  5. You need to specify a separate schema in BigQuery for the data extracted from each API endpoint.
  6. This method does not give you access to data in real-time. If your use case demands real-time data, this method is not suitable.
  7. If you need to clean, transform, and enrich your data before loading it into the data warehouse, you will need to write additional code to achieve that.
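For the rate and daily limits in particular, curl’s built-in retry support can smooth over transient failures. A minimal sketch follows; --retry has long covered timeouts and 5xx responses, and recent curl versions also treat HTTP 429 (rate limiting) as retryable.

# Retry up to 5 times, waiting 10 seconds between attempts
curl --retry 5 --retry-delay 10 \
  --url 'https://api.hubapi.com/contacts/v1/lists/all/contacts/all?hapikey=YOUR_HUBSPOT_API_KEY' \
  -o contacts_data.json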
Download the Cheatsheet on How to Set Up High-performance ETL to BigQuery
Learn the best practices and considerations for setting up high-performance ETL to BigQuery

Method 3: Manual CSV Files for HubSpot to BigQuery

One of the easiest ways to transfer your HubSpot data to BigQuery is to download and upload CSV files manually. Using the Import & Export options in your HubSpot account, you can export the following records: Calls, Contacts, Companies, Deals, Payments, Tickets, and Custom objects.

When exporting any records, you can select the file format (XLS, XLSX, CSV) and whether to include only the properties in the view or all properties on the records. Once you have exported your data, you will receive an email with a download link to your export file. Please note that these download links expire after 30 days.

After downloading the file, you can load it into Google BigQuery.
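If you prefer the command line to the BigQuery web console for this step, the downloaded file can be loaded with bq. This is a sketch: the file name is illustrative, and --autodetect infers the schema from the CSV header.

bq load \
  --autodetect \
  --source_format=CSV \
  hubspot_dataset.contacts_table \
  ./hubspot-contacts-export.csv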

Transferring HubSpot data to Google BigQuery via CSV files can be a straightforward process, but it comes with several drawbacks that can affect efficiency, data integrity, and overall workflow. Here are five significant drawbacks of using CSV files for this purpose:

  1. Manual Process: Each time there’s an update in HubSpot data, you must manually download and move the CSV files to BigQuery. This process is not only time-consuming but also prone to human error, which can lead to data inconsistencies.
  2. No Continuous Data Flow: Unlike automated or real-time data transfer methods, using CSV files does not allow for a continuous flow of data. This means that the data in BigQuery may not reflect the most current state of your HubSpot data, leading to potential delays in decision-making.
  3. Data Volume Limitations: CSV files may not be the most efficient way to transfer large volumes of data. As your data grows, the process of exporting, downloading, and uploading large CSV files can become cumbersome and resource-intensive.
  4. Complexity in Handling Data Types: CSV files can sometimes simplify or incorrectly represent complex data types, which can lead to issues when the data is imported into BigQuery. This can affect the accuracy of your data analysis.
  5. Limited Automation: Automating the process of exporting data to CSV and importing it into BigQuery can be challenging and may require custom scripting or third-party tools. This adds an additional layer of complexity and potential points of failure in your data pipeline.

In short, while exporting HubSpot data to CSV files for import into BigQuery can be done, it is not without its challenges. The process can be labor-intensive, error-prone, and may not support the real-time or large-scale data needs of many organizations.

Method 4: Using Google Cloud Integration for HubSpot to BigQuery

Here are the steps to integrate HubSpot to BigQuery using Google Cloud Integration:

  1. Prepare Your Google Cloud Project
    • Ensure your Google Cloud Platform account is billing-enabled.
    • Install and configure the Google Cloud SDK.
    • Enable the BigQuery API for your project (see the gcloud sketch after this list).
  2. Create a HubSpot Connection
    • Navigate to Integration Connectors > Connections in the Google Cloud Console.
    • To open the Create Connection page, click the ‘+ Create new’ button.
    • Select “HubSpot” from the list of available connectors.
    • Choose a location for your connection and fill in the connection details, including the HubSpot API version and authentication type. Use a Private App Access Token stored in Secret Manager for authentication.
  3. Configure Destination in BigQuery
    • Create a new dataset in BigQuery for your HubSpot data.
    • Opt for schema auto-detection or define the schema for your tables.
  4. Utilize the Connection in an Integration
    • After creating the connection, it becomes available in Apigee Integration and Application Integration.
    • Specify data movement from HubSpot to BigQuery in your integration flow.
  5. Monitor Your Integration
    • Use the Google Cloud Console to monitor the performance and usage of your integration.
    • Adjust configurations as needed.
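For step 1, the required APIs can be enabled from the gcloud CLI. A minimal sketch follows; bigquery.googleapis.com is BigQuery’s service name, while connectors.googleapis.com is assumed here to be the Integration Connectors service name, so verify it for your project.

# Enable the BigQuery and Integration Connectors APIs for the active project
gcloud services enable bigquery.googleapis.com connectors.googleapis.com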

Limitations of Using Google Cloud Integration for HubSpot to BigQuery

  1. API Rate Limiting: The HubSpot API has the capacity to handle up to 10 requests per second, with polling occurring at intervals of 5 minutes or more.
  2. Daily Request Cap: There is a limit of 10,000 requests to the HubSpot API within a 24-hour period.
  3. Failed Request Threshold: Failed requests to the HubSpot API may not exceed 5% of total syncs.
  4. Transaction Processing Limit: The BigQuery connector can process a maximum of 8 transactions per second per node.
  5. Connector Configuration: The process of configuring the connector requires careful attention to IAM roles and enabled services, which can be complex and time-consuming.

Need to Move Data from HubSpot to BigQuery

There are several reasons to move data from HubSpot to BigQuery.

  1. Data Backup: You store valuable data about your customers in HubSpot. There is always a possibility of losing access to that data through human error, canceled account subscriptions, and so on. It is therefore essential to back up your CRM data in a cloud data warehouse.
  2. Data Analytics: When you back up your HubSpot data to Google BigQuery tables, you get a highly scalable data store with cutting-edge analytics tools. Using BigQuery, you can run custom SQL queries on your contact and campaign data and generate custom reports and dashboards (see the sample query after this list). You can even combine this data with data from the other sources your business uses to make it even more valuable.
  3. Scalability: BigQuery has the ability to handle vast amounts of data, making it highly scalable. As your business continues to generate more data, it can be easily accommodated on BigQuery. 
  4. Security and Compliance: Built on Google Cloud, BigQuery offers advanced security and compliance features such as access controls, auditing, and data encryption. This ensures that your data is stored securely and complies with regulations.
  5. Cost Savings: With a flexible pricing model, BigQuery can help you achieve significant cost savings. You only pay for the amount of data and processing power you use, which is often less expensive than maintaining your own on-premise data warehouse.
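As an example of the custom SQL mentioned in the second point above, the query below counts new contacts per month. It is a sketch that assumes the contacts_table built in Method 2, with addedAt stored as epoch milliseconds.

# Count new contacts per month (standard SQL)
bq query --use_legacy_sql=false '
SELECT
  DATE_TRUNC(DATE(TIMESTAMP_MILLIS(addedAt)), MONTH) AS month,
  COUNT(*) AS new_contacts
FROM hubspot_dataset.contacts_table
GROUP BY month
ORDER BY month'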

There are several ways to achieve a HubSpot BigQuery integration, including using a pre-built connector like the Google Cloud Integration Connector for HubSpot.

Note that, in HubSpot, you can export almost everything:

  • Sales data
  • Tasks
  • Calls-to-Action (CTA) analytics
  • Contact lists
  • CRM data
  • Marketing emails
  • Customer feedback
  • Form submission data
  • Notes about each contact, calls, meetings
  • Activity data (clicks, opens, views, URL redirects, etc.)

Conclusion

You have learned about four methods you can use to move data from HubSpot to BigQuery. The manual methods require significant configuration and upkeep, so check out Hevo, which does all the hard work for you in a simple, intuitive process, letting you focus on what matters: the analysis of your data. You can also check out the HubSpot to Google Data Studio Integration.

Visit our Website to Explore Hevo

Hevo can help you integrate your data from numerous sources like HubSpot and load it into a destination such as BigQuery to analyze real-time data with a BI tool like Tableau. It helps transfer data from HubSpot to a destination of your choice for free, making your life easier and data migration hassle-free. It is user-friendly, reliable, and secure. Check out the pricing details here.

Sign Up for a 14-day free trial with Hevo to start moving data from HubSpot to BigQuery right now!

Talha
Software Developer, Hevo Data

Talha is a seasoned Software Developer, currently driving advancements in data integration at Hevo Data, where he has been instrumental in shaping a cutting-edge data integration platform for the past four years. With a significant tenure at Flipkart prior to his current role, he brought innovative solutions to the space of data connectivity and software development.

Dimple
Customer Experience Engineer, Hevo Data

Dimple, an experienced Customer Experience Engineer, has four years of industry experience, with the most recent two spent at Hevo, where her contributions have significantly refined customer experiences within the data integration platform.
