Key Takeaway

Loading data from an API to BigQuery requires the right ingestion method based on your technical expertise, data volume, and pipeline frequency.

Three ways to load API data into BigQuery:

  • Hevo Data is best for no-code, automated pipelines with zero infrastructure overhead and real-time sync
  • Custom Python Code with Load Job is best for teams that need full control over batch ingestion with flexible transformation logic
  • BigQuery Storage Write API is best for production-grade pipelines needing exactly-once delivery, lower cost, and support for both streaming and batch ingestion

Key considerations:

  • When in doubt, use Hevo Data to get your API data into BigQuery in minutes without writing a single line of code.
  • Use Application Default Credentials (ADC) as the recommended authentication method for all BigQuery API calls.
  • For simple batch loads, a Load Job works well. For high-frequency production pipelines, the Storage Write API is Google’s recommended modern approach.

Many businesses rely on cloud applications like Salesforce, HubSpot, Mailchimp, and Zendesk for daily operations. But when data lives in silos across these tools, it becomes difficult to measure key metrics, spot trends, or make informed decisions.

APIs solve this by giving you programmatic access to data from any application. And Google BigQuery gives you a scalable, serverless warehouse to centralize, store, and analyze that data at speed. But connecting an API to BigQuery is not always straightforward.

According to Informatica’s 2024 survey of 300 IT and data professionals, building a data pipeline can take up to 12 weeks, with 78% of teams facing challenges with data orchestration and tool complexity. For teams trying to move API data into BigQuery, that delay directly impacts analytics, reporting, and business decisions.

This guide walks you through three proven methods to load data from an API to BigQuery in 2026. Whether you want a fast, no-code setup, full control through custom Python code, or production-grade ingestion with the Storage Write API, you will find the right approach here.

How to Load Data from API to BigQuery: 3 Methods

Before diving into each method, here is a quick comparison of the no-code and custom-code approaches to help you choose the right one for your use case:

Criterion | Hevo Data | Custom Python Code
Best For | Non-technical teams or anyone who wants a fast, automated setup without coding | Technical teams that need full control over data fetching, transformation, and loading
Technical Skill Required | Low | High
Setup Time | Minutes | Hours to days
Real-Time Sync | Yes | No, requires additional scheduling setup
Maintenance | Zero, fully managed | Ongoing, requires manual upkeep
Cost | Subscription-based, predictable pricing | Infrastructure and development costs vary

Method 1: Load API Data to BigQuery with Hevo (No-Code Integration)

Hevo is a no-code data pipeline tool that helps you move data from any REST API to BigQuery without writing a single line of code. It’s perfect if you want something fast, simple, and automated.

Step 1: Configure REST API as a Source

To set up your API as a source in Hevo:

  • Log in to Hevo and go to Pipelines.
  • On the Pipelines List View, click CREATE.
  • On the Select Source Type page, choose REST API.
  • On the Configure Your REST API Source page, provide the following:
    • Pipeline Name: A unique name for your pipeline (max 255 characters).
    • API Endpoint URL: The endpoint from which you want to pull data.
    • Request Method: Typically GET, unless your API requires something else.
    • Authentication Type: Choose from options like API Key, OAuth, or No Auth.
    • Request Headers/Params: Add required headers (e.g., authorization keys).
    • Polling Frequency: Set how often you want Hevo to pull data.
  • If using OAuth 2.0, add https://<your-hevo-region>.hevodata.com/rest-api/oauth as the redirect URL when registering your app in the source application.
  • Click TEST & CONTINUE to validate the setup.

Step 2: Configure Google BigQuery as a Destination

  • Click DESTINATIONS in the Navigation Bar, click + Create Standard Destination, and select Google BigQuery.
  • Enter the following:
    • Destination Name: A name for your BigQuery destination.
    • Project ID: Your Google Cloud project where BigQuery is hosted.
    • Dataset Name: The BigQuery dataset where you want to send the data.
    • Authentication: Upload a service account key (JSON file) for access.
  • Click TEST CONNECTION to check credentials.
  • Click SAVE & CONTINUE.

Step 3: Final Settings (Optional)

  • Use the Schema Mapper to view and customize how fields map from source to destination.
  • Add Transformations to clean, filter, or enrich data before loading it into BigQuery.
  • You can also set up alerts and data freshness checks for monitoring.

That’s all. Hevo takes care of the rest: it automatically fetches, formats, and loads your data into BigQuery reliably and securely.

Want more details on setting up a REST API source? Check out Hevo’s official documentation.

Method 2: Load API Data to BigQuery with Custom Python Code

If you want complete control over how data is fetched, cleaned, and stored, writing custom Python code is the way to go. It requires coding skills but gives you flexibility with scheduling, transformations, and handling non-standard APIs.

Step 1: Extract Data from the API

We’ll use the ExchangeRate API to get real-time currency data.

import requests

# Replace with your actual API key
url = "https://v6.exchangerate-api.com/v6/YOUR-API-KEY/latest/USD"
response = requests.get(url)

# Check if the request was successful
if response.status_code == 200:
    data = response.json()
else:
    raise Exception(f"API request failed with status {response.status_code}: {response.text}")

This sends a GET request to the ExchangeRate API and parses the response only if the request was successful.

Step 2: Parse and Prepare the Data

We’ll flatten the JSON into individual records for BigQuery.

from datetime import datetime
import json

records = []

# Extracting key fields
base_currency = data["base_code"]
timestamp = data["time_last_update_utc"]

# Convert to ISO 8601 format
timestamp = datetime.strptime(timestamp, "%a, %d %b %Y %H:%M:%S %z").isoformat()

# Convert each exchange rate into a row
for target_currency, rate in data["conversion_rates"].items():
    records.append({
        "base_currency": base_currency,
        "target_currency": target_currency,
        "rate": rate,
        "updated_at": timestamp
    })

What it does: Turns nested exchange rate data into flat records. Converts the timestamp into ISO 8601 format so BigQuery can recognize it.
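The flattening step can also be packaged as a small function, which makes it easier to test without calling the API. The sketch below is illustrative only: the name flatten_rates and the sample payload are hypothetical, and the payload contains made-up values that mimic just the three fields used above.

```python
from datetime import datetime


def flatten_rates(payload):
    """Turn one API response into a list of flat row dictionaries."""
    # Parse the timestamp string and convert it to ISO 8601.
    updated_at = datetime.strptime(
        payload["time_last_update_utc"], "%a, %d %b %Y %H:%M:%S %z"
    ).isoformat()
    return [
        {
            "base_currency": payload["base_code"],
            "target_currency": target,
            "rate": rate,
            "updated_at": updated_at,
        }
        for target, rate in payload["conversion_rates"].items()
    ]


# Hypothetical payload with made-up values, shaped like the fields used above.
sample = {
    "base_code": "USD",
    "time_last_update_utc": "Mon, 01 Jan 2024 00:00:01 +0000",
    "conversion_rates": {"USD": 1, "EUR": 0.5},
}
rows = flatten_rates(sample)
```

Keeping the transformation free of network calls means the same function can be reused by both the Load Job and the Storage Write API paths.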

Step 3: Save Data as NDJSON

BigQuery load jobs accept JSON only in newline-delimited form (NDJSON), where each line is one complete JSON record.

with open("currency_data.json", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
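To see what the NDJSON layout looks like, here is a minimal sketch that writes to an in-memory buffer instead of a file and then reads the lines back. The helper name write_ndjson and the two sample records are assumptions made for illustration.

```python
import io
import json


def write_ndjson(records, stream):
    """Write one JSON object per line and return the number of rows written."""
    count = 0
    for record in records:
        stream.write(json.dumps(record) + "\n")
        count += 1
    return count


buffer = io.StringIO()
written = write_ndjson(
    [
        {"target_currency": "EUR", "rate": 0.5},
        {"target_currency": "GBP", "rate": 0.25},
    ],
    buffer,
)

# Reading the lines back confirms that every line is a standalone JSON document.
parsed = [json.loads(line) for line in buffer.getvalue().splitlines()]
```

Because the helper takes any writable stream, the same function works with a file opened through open().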

Step 4: Load Data into BigQuery

Install the BigQuery client if you haven’t:

pip install google-cloud-bigquery

Then, run the following Python script:

from google.cloud import bigquery

# Make sure your Google Cloud credentials are set
# Replace with the path to your downloaded service account key
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/your/credentials.json"

# Initialize BigQuery client
client = bigquery.Client()

# Define your table
table_id = "your-project.your_dataset.currency_rates"  # Make sure the dataset exists

# Configure the load job
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    autodetect=True,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND  # Use WRITE_TRUNCATE to overwrite
)

# Load data into BigQuery
with open("currency_data.json", "rb") as source_file:
    load_job = client.load_table_from_file(source_file, table_id, job_config=job_config)

load_job.result()  # Waits for the job to complete
print(f"Loaded {load_job.output_rows} rows into {table_id}")

Important Notes:

  • The BigQuery dataset must exist beforehand. BigQuery won’t create it automatically.
  • Choose the correct write_disposition:
    • WRITE_APPEND = add to table
    • WRITE_TRUNCATE = replace all rows
    • WRITE_EMPTY = only write if table is empty

Optional Enhancements

Error Handling: Wrap network calls and BigQuery loading in try/except blocks.
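One common way to apply this is to retry transient failures with exponential backoff. The following is a minimal sketch, not a production implementation: call_with_retries is a hypothetical helper, the flaky function stands in for a real network call, and a real pipeline would catch narrower exception types than Exception.

```python
import time


def call_with_retries(func, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); after a failure wait base_delay * 2**attempt and try again."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            # Give up and re-raise once the final attempt has failed.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))


# Demonstration with a stand-in function that fails twice before succeeding.
attempts = {"count": 0}
delays = []


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


# Recording the delays instead of sleeping keeps the demonstration instant.
result = call_with_retries(flaky, sleep=delays.append)
```

The same wrapper could be placed around both the API request and the BigQuery load call.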

Data Validation: Add checks to make sure the API response is complete before loading.
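A completeness check might look like the sketch below. It assumes the three response fields used earlier in this method; the function name validate_payload and the sample payloads are hypothetical.

```python
REQUIRED_KEYS = ("base_code", "time_last_update_utc", "conversion_rates")


def validate_payload(payload):
    """Return a list of problems; an empty list means the payload looks complete."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in payload]
    rates = payload.get("conversion_rates")
    if rates is not None:
        if not isinstance(rates, dict) or not rates:
            problems.append("conversion_rates is empty or not an object")
        else:
            for code, rate in rates.items():
                # bool is a subclass of int, so exclude it explicitly.
                if isinstance(rate, bool) or not isinstance(rate, (int, float)):
                    problems.append(f"non-numeric rate for {code}")
    return problems


# Made-up payloads: one complete, one missing a key and carrying a bad rate.
good = {
    "base_code": "USD",
    "time_last_update_utc": "Mon, 01 Jan 2024 00:00:01 +0000",
    "conversion_rates": {"EUR": 0.5},
}
bad = {"base_code": "USD", "conversion_rates": {"EUR": "n/a"}}
```

Returning a list of problems rather than raising on the first one makes it easier to log everything that is wrong with a response in a single pass.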

Scheduling: Use Cloud Scheduler with Cloud Functions for a fully managed Google-native scheduling solution. Alternatively use cron for local jobs or Airflow for complex pipeline orchestration.

Method 3: Load API Data to BigQuery Using the Storage Write API

The BigQuery Storage Write API is Google’s recommended modern approach for high-throughput ingestion into BigQuery. Unlike the Load Job method, it supports both streaming and batch ingestion in a single API. It also costs less than the legacy streaming API and supports exactly-once delivery semantics through stream offsets. It is best suited for teams building production-grade, high-frequency pipelines that need reliable, scalable data ingestion.

Step 1: Install Required Libraries

  1. Install the BigQuery Storage Write API client library:

bash

pip install google-cloud-bigquery-storage google-cloud-bigquery

  2. Install the Protocol Buffers library required for data serialization:

bash

pip install protobuf

Step 2: Authenticate Using Application Default Credentials (ADC)

Google recommends using Application Default Credentials (ADC) with the Storage Write API rather than a service account key file. ADC automatically detects credentials from your environment.

  1. If running locally, authenticate using the Google Cloud CLI:

bash

gcloud auth application-default login

  2. If running on Google Cloud infrastructure such as Cloud Functions, Cloud Run, or Compute Engine, ADC is automatically configured. No additional setup is required.
  3. Alternatively, if you need to use a service account key file explicitly:

python

import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/your/credentials.json"

Step 3: Extract Data from the API

Fetch data from your API using Python’s requests library. We will use the ExchangeRate API as an example:

python

import requests

url = "https://v6.exchangerate-api.com/v6/YOUR-API-KEY/latest/USD"
response = requests.get(url)

if response.status_code == 200:
    data = response.json()
else:
    raise Exception(f"API request failed with status {response.status_code}: {response.text}")

Step 4: Prepare and Serialize Data as Protocol Buffers

The Storage Write API uses Protocol Buffers for data serialization. Define a schema that matches your BigQuery table:

  1. Create a proto file named currency.proto:

proto

syntax = "proto3";

message CurrencyRate {
    string base_currency = 1;
    string target_currency = 2;
    double rate = 3;
    string updated_at = 4;
}

  2. Compile the proto file:

bash

protoc --python_out=. currency.proto

  3. Serialize your API data into Protocol Buffer format:

python

from currency_pb2 import CurrencyRate
from datetime import datetime
import json

rows = []

base_currency = data["base_code"]
timestamp = datetime.strptime(
    data["time_last_update_utc"], "%a, %d %b %Y %H:%M:%S %z"
).isoformat()

for target_currency, rate in data["conversion_rates"].items():
    row = CurrencyRate(
        base_currency=base_currency,
        target_currency=target_currency,
        rate=rate,
        updated_at=timestamp
    )
    rows.append(row)

Step 5: Write Data to BigQuery Using the Storage Write API

  1. Initialize the BigQuery Write client and define your table:

python

from google.cloud.bigquery_storage_v1 import BigQueryWriteClient
from google.cloud.bigquery_storage_v1.types import WriteStream, AppendRowsRequest
from google.cloud.bigquery_storage_v1 import types as bq_storage_types
from google.protobuf import descriptor_pb2

client = BigQueryWriteClient()

parent = client.table_path(
    "your-project",
    "your_dataset",
    "currency_rates"
)

  2. Create a write stream of type COMMITTED. Rows become available for reading as soon as BigQuery acknowledges each append, and stream offsets are what make exactly-once delivery possible:

python

write_stream = WriteStream()
write_stream.type_ = WriteStream.Type.COMMITTED
write_stream = client.create_write_stream(
    parent=parent,
    write_stream=write_stream
)

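  3. Serialize and append rows to BigQuery. The first request on a connection must also carry the writer schema, which is built from the compiled CurrencyRate message:

python

proto_rows = bq_storage_types.ProtoRows()
for row in rows:
    proto_rows.serialized_rows.append(row.SerializeToString())

# Describe the message layout so BigQuery can decode the serialized rows
proto_descriptor = descriptor_pb2.DescriptorProto()
CurrencyRate.DESCRIPTOR.CopyToProto(proto_descriptor)
proto_schema = bq_storage_types.ProtoSchema()
proto_schema.proto_descriptor = proto_descriptor

proto_data = AppendRowsRequest.ProtoData()
proto_data.writer_schema = proto_schema
proto_data.rows = proto_rows

request = AppendRowsRequest()
request.write_stream = write_stream.name
request.offset = 0  # Explicit offsets let BigQuery reject duplicate appends
request.proto_rows = proto_data

# append_rows returns a stream of responses; iterate over it to send the request
for response in client.append_rows(iter([request])):
    print(f"Appended {len(rows)} rows to BigQuery")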

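  4. Finalize the stream so that no further rows can be appended. With a COMMITTED stream, rows are already committed once each append is acknowledged, so no separate commit call is needed:

python

client.finalize_write_stream(name=write_stream.name)
print("Write stream finalized")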

Step 6: Verify the Data in BigQuery

  1. Open the Google Cloud Console
  2. Navigate to your dataset and table
  3. Click Preview to verify the rows were loaded correctly
  4. Alternatively, run a query to confirm:

sql

SELECT * FROM `your-project.your_dataset.currency_rates` LIMIT 10;

Supercharge Your API to BigQuery Integration with Hevo!

Easily move API data to Google BigQuery using Hevo’s no-code ETL platform. Automate real-time data pipelines, handle schema mapping, and ensure error-free API data transfer without writing a single line of code.

Why Choose Hevo for API to BigQuery?

  • Automated Schema Mapping: Maps API data to BigQuery tables seamlessly.
  • Scalable Integrations: Supports 150+ sources (60+ free) for growing data needs.
  • Reliable 24/7 Support: Expert assistance for smooth pipeline management.

Companies like Voiceflow and ScratchPay rely on Hevo to streamline their API-to-BigQuery workflows and unlock real-time analytics.

Get Started with Hevo for Free!

Why Connect APIs to BigQuery?

Integrating APIs with BigQuery offers several advantages:

  • Centralized Analytics: Consolidate data from multiple APIs for unified reporting.
  • Real-Time Insights: Analyze streaming data for timely decision-making.
  • Scalability: Handle large datasets without worrying about infrastructure.
  • Automation: Minimize manual effort and ensure consistent data pipelines.

Use Cases of API and BigQuery Integration

  • Social Media Analytics: Fetch data from Twitter or Instagram to analyze engagement trends.
  • IoT Data Aggregation: Collect sensor data to monitor equipment performance in real-time.
  • E-Commerce Tracking: Combine data from APIs like Shopify or Stripe for sales and inventory analysis.
  • Advanced Analytics: BigQuery’s data processing capabilities let you run complex queries and analyses on your API data, so you can extract insights that the source application alone cannot provide.
  • Data Consolidation: If you use multiple sources alongside your API, syncing them to BigQuery centralizes your data. This provides a holistic view of your operations, and you can set up a change data capture process to avoid discrepancies in your data.
  • Historical Data Analysis: Many APIs limit how far back you can query. Syncing your data to BigQuery lets you retain and analyze historical trends.
  • Machine Learning: With API data in BigQuery, you can apply machine learning models for predictive analytics, customer segmentation, and more.

Limitations of Writing Custom ETL Scripts

  1. The code above is written for the current source and destination schemas. If either the incoming data or the BigQuery schema changes, the ETL process will break.
  2. If you need to clean the API data, convert time zones, mask personally identifiable information, and so on, the scripts above do not handle it. You will need to build and maintain additional processes, which takes extra effort and money.
  3. You risk losing data whenever part of the system breaks, whether the source or destination is unreachable or the script itself fails. You must invest upfront in systems and processes that capture every failure point and reliably move your data to the destination.
  4. Because Python is an interpreted language, a single script can become a performance bottleneck when extracting large volumes from an API and loading them into BigQuery.
  5. Many APIs require credentials. Passing credentials as plain text in a Python script is a very poor practice, so you will need to take additional steps to keep your pipeline secure.
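As one way to keep credentials out of the script itself, the API key can be read from an environment variable at runtime. This is a minimal sketch under stated assumptions: the variable name EXCHANGERATE_API_KEY and the helper build_rates_url are hypothetical, and a secret manager would be a stronger option for production use.

```python
import os


def build_rates_url(base="USD", env_var="EXCHANGERATE_API_KEY"):
    """Build the request URL using an API key taken from the environment."""
    api_key = os.environ.get(env_var)
    if not api_key:
        # Fail early with a clear message instead of sending an invalid request.
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return f"https://v6.exchangerate-api.com/v6/{api_key}/latest/{base}"
```

With this approach the key never appears in source control, and rotating it only requires changing the environment, not the code.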

Challenges in API-to-BigQuery Integration

While connecting APIs to BigQuery unlocks significant value, it comes with its own set of challenges:

  • API Rate Limits: Excessive requests may lead to throttling or blocked access.
  • Authentication Complexities: Handling OAuth tokens or API keys securely.
  • Data Transformation: Converting API data formats to match BigQuery’s schema.
  • Error Handling: Managing failed API calls or incomplete data uploads.
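For the rate-limit challenge in particular, a simple client-side guard is to space requests a minimum interval apart. The class below is an illustrative sketch, not part of any library: MinIntervalLimiter is a hypothetical name, and the clock and sleep parameters are injectable so the behavior can be checked without real waiting.

```python
import time


class MinIntervalLimiter:
    """Space calls at least `interval` seconds apart."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self.clock = clock
        self.sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until the next call is allowed; return the seconds slept."""
        now = self.clock()
        slept = 0.0
        if self._last_call is not None:
            remaining = self.interval - (now - self._last_call)
            if remaining > 0:
                self.sleep(remaining)
                slept = remaining
        # Record when this call was released so the next one is spaced from it.
        self._last_call = self.clock()
        return slept
```

Calling limiter.wait() immediately before each API request keeps the script under a known requests-per-second ceiling.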

Best Practices for API Data Ingestion

  1. Batch Requests: Avoid rate-limit issues by batching API calls.
  2. Schema Validation: Ensure API data matches your BigQuery table schema to avoid errors.
  3. Monitor Pipelines: Set up alerts for failed or delayed data loads.
  4. Use the Storage Write API for Streaming: For real-time data, stream rows through the BigQuery Storage Write API to keep ingestion latency low.
  5. Partition Tables: Organize data by date or other criteria to reduce query costs.
  6. Avoid Redundant API Calls: Cache responses to minimize unnecessary requests.
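To illustrate the caching practice, here is a small time-based cache that only calls the API again once an entry is older than a chosen time-to-live. The class name TTLCache and its interface are assumptions for this sketch; the clock is injectable so expiry can be simulated.

```python
import time


class TTLCache:
    """Cache the result of a fetch function for `ttl` seconds per key."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries = {}

    def get(self, key, fetch):
        """Return the cached value for key, calling fetch() only when stale."""
        entry = self._entries.get(key)
        now = self.clock()
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        value = fetch()
        self._entries[key] = (now, value)
        return value
```

A pipeline could, for example, key the cache by base currency and pass a function that performs the actual request as fetch.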

Conclusion

This blog walked through the process of loading data from an API to BigQuery, along with the available methods and their shortcomings. Connecting APIs to BigQuery can unlock immense value for your business by enabling real-time analytics and centralized data management.

While manual integration offers flexibility, automated solutions like Hevo simplify the process and save time. By following best practices and optimizing your pipelines, you can ensure a seamless and cost-effective data integration experience. Ready to simplify your API-to-BigQuery integration? Try Hevo for Free.

FAQs

1. How to connect API to BigQuery?

1. Extract data from your application using its API.
2. Transform and prepare the data for loading into BigQuery.
3. Load the data into BigQuery using a Python script.
4. Alternatively, use an automated data pipeline tool to connect your API URL to BigQuery.

2. What is Google BigQuery?

Google BigQuery is a fully managed, serverless cloud data warehouse from Google Cloud Platform. It stores data in a columnar format for faster query processing and supports SQL for analyzing large datasets at scale. BigQuery integrates natively with other Google Cloud services and includes BigQuery ML, which allows users to train and run machine learning models directly using SQL without moving data to a separate platform.

3. What authentication method should I use to connect to BigQuery?

Google recommends using Application Default Credentials (ADC) for authentication. Run gcloud auth application-default login in your terminal to set it up. For server-to-server communication, a service account key file is an alternative but ADC is the preferred approach for most use cases.

4. What is the BigQuery data transfer API?

The BigQuery Data Transfer API offers a wide range of support, allowing you to schedule and manage the automated data transfer to BigQuery from many sources. Whether your data comes from YouTube, Google Analytics, Google Ads, or external cloud storage, the BigQuery Data Transfer API has you covered.

5. How to input data into BigQuery?

You can load data into BigQuery in the following ways:
1. Using the Google Cloud Console to manually upload CSV, JSON, Avro, Parquet, or ORC files.
2. Using the BigQuery CLI.
3. Using client libraries in languages like Python, Java, and Node.js to load data programmatically.
4. Using data pipeline tools like Hevo.

6. What is the fastest way to load data into BigQuery?

The fastest way to load data into BigQuery is to use automated Data Pipeline tools, which connect your source to the destination through simple steps. Hevo is one such tool.

7. Can I use Python to load API data into BigQuery?

Yes, you can fetch data with Python’s requests library, transform it into NDJSON format, and use the Google Cloud BigQuery client to load it into your dataset.

8. What challenges should I expect when integrating APIs with BigQuery?

Common challenges include API rate limits, authentication complexities, schema mismatches, error handling, and ensuring data freshness.

9. What are best practices for API-to-BigQuery data ingestion?

Follow practices like batching requests, validating schema, using BigQuery’s streaming API for real-time data, partitioning tables for cost efficiency, and setting up monitoring/alerts.

Freelance Technical Content Writer, Hevo Data

Lahudas focuses on solving data practitioners' problems through content tailored to the data industry by using his problem-solving ability and passion for learning about data science.