Data analytics involves storing, managing, and processing data from different sources and analyzing it thoroughly to develop solutions for our business problems. While JSON helps to interchange data between different web applications and sources through API connectors, Snowflake assists you in analyzing that data with its intuitive features. Therefore, JSON Snowflake data migration is crucial for analyzing semi-structured datasets.

This article will elucidate four methods to help you integrate data from a JSON document into Snowflake.

Overview of JSON 

JSON is a text-based format widely used to exchange data between different applications. It is mainly built on two data structures: objects and arrays enclosed inside different brackets. JSON’s minimal syntax makes the files smaller and more accessible to transmit. It is versatile, as you can use it with varied programming languages. JSON also supports complex data structures, making it one of the best choices for exchanging data in modern development.

Overview of Snowflake

Snowflake logo

Snowflake is a cloud platform for large-scale data processing. It allows you to create databases within data warehouses. You can host your data warehouse on different platforms, such as AWS, Microsoft Azure, and Google Cloud Platform (GCP). Since Snowflake has separate storage and compute layers, you only have to pay for the services you use, making it a cost-effective option in the long run.

Let’s see how to load JSON file in Snowflake using different methods.

Method 1: JSON Snowflake Data Migration Using Hevo

Hevo is a real-time ELT platform that has made data migration easy with its no-code, cost-effective, and automated data pipelines. It supports 150+ data sources and assists you in transferring your data to a particular destination, such as a database, data warehouse, or cloud platform. Through Hevo, you can set up your data pipeline with minimal effort, making it usable for non-technical users, too.

Benefits of Using Hevo

  • Data Transformation: Hevo provides powerful data transformation through its Python-based drag-and-drop feature, which adjusts the source data before loading it into your destination data warehouse.
  • Automated Schema Mapping: Hevo’s automated schema mapping speeds up data integration. It senses the source schema and automatically replicates it in your destination.
  • Supports Complex Data Structure: Hevo supports different complex structures, including nested JSON arrays and hierarchical data inside the JSON format.
  • Ready-made Integration: Hevo supports 150+ ready-made data integrations across various cloud storage, databases, and SaaS platforms. These integrations allow you to replicate your data from source to destination in minutes.
Get Started with Hevo for Free

Let’s look at the steps for how you can transfer JSON data to Snowflake using Hevo

1. Set JSON as Your Source

To export JSON data to Snowflake, select a parsing strategy while generating the data pipeline in Hevo. The parsing strategy makes reading the data from the JSON document or file easy.

Prerequisites

  • Before you extract the data from the JSON file, you should understand JSONPATH and JSONPATH expressions
  • You should understand the REST API, which will be used as a source to get your JSON data from the JSON file.

*Note: JSON is not a platform. It is a file format. While creating a data pipeline, you can use REST API as your source and save your data in JSON file format.

JSON Snowflake: Set up REST API as Your Source

1. Select REST API as Your Source Type

Let’s see how to set up the REST API as your source. 

Prerequisites
  • You should have a basic knowledge of JSONPATH and JSONPATH expressions. 
  • Your REST API endpoint of the file’s location should be available to extract the data. 
  • To create a data pipeline, you must be a team administrator, collaborator, or pipeline administrator. 
  • You should replace your placeholder name with the Hevo region when registering on Hevo’s platform.

2. Configure REST API for Your Source  

  • Click on PIPELINES in the Navigation Bar. 
  • Select +CREATE from Pipeline View Lists. 
  • Select REST API as a source on the Select Source page. 
  • To configure REST API as your source, specify the following- 
  1. Insert a unique name for your data pipeline, not exceeding 255 characters.
  2. Set up your source as REST API. 
  3. Specify the data path or root for your source. 
  4. Select a pagination method to read your API response. 
  5. Click on TEST & CONTINUE. 

3. Construct a Connection for Your REST API Source Endpoint

To connect to your REST API endpoint, specify a method and URL for your REST API.  

  • Specify Method: You can select the GET method, which requests data from your API endpoint, or the POST method, which sends data to your API source.
  • Specify URL: You must provide the complete URL for your REST API source, for example – https://www.example.com/page.html

If you select the POST method, you must choose a REQUEST BODY for your connection containing the data to be sent to your API source. The data can be in JSON or Data Form format.

JSON Snowflake: Selecting Request Body

4. Setting up a Data Replication Path 

You must direct Hevo to the element in your API source from which you want to extract the data. 

  • Specify Data Root

To specify the source’s data, you can either enter your data root or select from the suggestion that Hevo offers related to the URL you entered for your data source. If you choose the first method, you must check the validity of the data.

4. Setting up a Pagination Method

Pagination is a method for synchronizing data and increasing its readability by dividing it into smaller chunks. To determine the expected result, you must check the API response for the pagination method you selected. To learn how to choose the correct pagination method, read Set up pagination.

To better understand how to configure REST API as your source, refer to the Hevo documentation: Set REST API as Source.

2. Set Snowflake as Your Destination

Connecting your Snowflake instance to Hevo, you can use a private link, which enables an exclusive connection with your cloud provider or a public network using your Snowflake URL. 

Let’s see what steps you must take to configure your Snowflake account as your destination. 

Prerequisites
  • You must have an active Snowflake account. 
  • To create a new role in Hevo, the user must have an ACCOUNTADMIN or SYSADMIN role in their Snowflake account. 
  • You must have assigned user permissions on the Hevo account.
  • Your Hevo account must be assigned with USAGE and SCHEMA permissions. 

Step 1. Create and Configure Snowflake as Your Destination 

  • Log in to your Snowflake account. 
  • To create a new worksheet, you need to click the + icon in the top right corner. 
  • Hevo needs some basic permission to load your data to your destination. Insert the following script in the worksheet you created to grant permission to Hevo to create a new role in your Snowflake account. 
-- create variables for user / password / role / warehouse / database (needs to be uppercase for objects)

set role_name = 'HEVO'; -- Replace "HEVO" with your role name

set user_name = 'HARRY_POTTER'; -- Replace "HARRY_POTTER" with your username

set user_password = 'Gryffindor'; -- Replace "Gryffindor" with the user password

set warehouse_name = 'HOGWARTS'; -- Replace "HOGWARTS" with the name of your warehouse

set database_name = 'RON'; -- Replace "RON" with the name of your database

set schemaName = 'DARK_ARTS'; -- Replace "DARK_ARTS" with the database schema name

set db_schema = concat($database_name, '.', $schemaName);

begin;

-- change role to securityadmin for user / role steps

use role securityadmin;

-- create role for HEVO

create

role if not exists identifier($role_name);

grant role identifier($role_name) to role SYSADMIN;

-- create a user for HEVO

create user if not exists identifier($user_name)

password = $user_password

default_role = $role_name

default_warehouse = $warehouse_name;

-- Grant access to the user

grant role identifier($role_name) to user identifier($user_name);

-- change role to sysadmin for warehouse / database steps

use role sysadmin;

-- create a warehouse for HEVO, if it does not exist

create

warehouse if not exists identifier($warehouse_name)

warehouse_size = xsmall

warehouse_type = standard

auto_suspend = 60

auto_resume = true

initially_suspended = true;

-- create database for HEVO

create database if not exists identifier($database_name);

-- grant HEVO role access to warehouse

grant USAGE

on warehouse identifier($warehouse_name)

to role identifier($role_name);

-- grant HEVO access to current schemas

use role accountadmin;

grant CREATE SCHEMA, MONITOR, USAGE, MODIFY

on database identifier($database_name)

to role identifier($role_name);

-- grant Hevo access to future schemas

use role accountadmin;

grant SELECT on future tables in database identifier($database_name) to role identifier($role_name);

grant MONITOR, USAGE, MODIFY on future schemas in database identifier($database_name) to role identifier($role_name);

use role accountadmin;

CREATE SCHEMA IF not exists identifier($db_schema);

GRANT USAGE, MONITOR, CREATE TABLE, CREATE EXTERNAL TABLE, MODIFY ON SCHEMA identifier($db_schema) TO ROLE identifier($role_name);

commit;
  • To create a data warehouse, replace the script’s sample values in lines 2-7 with your own to connect the warehouse with Hevo. You can create a new warehouse, role, or database name or use a pre-existing one to load your data.
  • To select the script in the worksheet area, use the CMD + A (Mac) or CTRL + A (Windows) command.
  • To run the script, use the command – CMD+return (Mac) or CTRL + Enter (Windows).
  • Once this script runs successfully, you can use the credentials you created in lines 2-7 to connect Hevo to your Snowflake account.

Step 2. Get Your Snowflake Account URL

  • You can host your Snowflake warehouse on the following cloud providers:
  1. Amazon Web Services (AWS)
  2. Google Cloud Platform (GCP)
  3. Mircosoft Azure (Azure) 
  • The account name, cloud service provider, and region must be visible in your Snowflake account. 

For example, https://westeros.us-east-2.aws.snowflakecomputing.com. Westeros is your account name, us-east-2 is the region, and AWS is the service provider.

  • Login to your Snowflake Instance. 
  • Click on the Admin Tab and select Accounts.
  • Move your mouse to the LOCATOR field corresponding to the account for which you need to obtain the URL, and click on the LINK icon to copy the URL.
JSON Snowflake: Select the Account

Step 3. Configure Snowflake as Your Destination

  • Click on DESTINATIONS in the Navigation Bar. 
  • Click on the button +CREATE in Page View Lists. 
  • Select Snowflake as your destination on the add destination page. 
  • Configure Snowflake as your destination by specifying the following:
JSON Snowflake: Configure Snowflake as your Destination
  • Destination Name: Give a unique name to your destination. 
  • Snowflake Account URL: The URL that you copied in step 3. 
  • Database User: Your Hevo user account has a non-administrative role. 
  • Database Password: Password for your database user. 
  • Database Name: Name of the database where your data will get stored. 
  • Database Schema: Name of your database schema. 
  • Warehouse: The data warehouse which is connected to your Snowflake database. 
  • Advanced Settings: 
    •  Populate Loaded Timestamp: To append the hevo_loaded_at column to your destination, you must enable this option to indicate the time when the event was loaded in the destination. 
    • Create Transient Tables: Transient tables possess the same features as permanent tables but don’t support a fail-safe period, making them cost-friendly.
  • Click on TEST CONNECTION to know you have successfully entered all the mandatory information. 
  • Click on SAVE & CONTINUE. This button is enabled after you have successfully entered all the mandatory information. 

To configure Snowflake as your destination, refer to this Hevo documentation: Configure Snowflake your Destination.

SIGN UP HERE FOR A 14-DAY FREE TRIAL

Method 2: JSON Snowflake Data Loading Using PUT Command

In this method, you will learn how to load data from Snowflake using different SQL commands.

Prerequisites

  • Download a Snowflake provider for your JSON file. 
  • Create a virtual warehouse, a database, and a table. 

Step 1: Create a File Format Object 

You need to create a file format with a type source JSON. Remember that the default format file type is CSV, so specify the file type while executing the command.

CREATE OR REPLACE FILE FORMAT sf_tut_json_format

  TYPE = JSON;

Step 2: Create a Stage Object

Create an internal stage object using the command below:

CREATE OR REPLACE TEMPORARY STAGE sf_tut_stage

 FILE_FORMAT = sf_tut_json_format;

Step 3: Stage Your Data File

Use the PUT command to upload your JSON file from your local system to the named stage. 

  • For LINUX or macOS

PUT file:///tmp/load/sales.json @sf_tut_stage AUTO_COMPRESS=TRUE; 

  • For Windows

PUT file://C:\temp\load\sales.json @sf_tut_stage AUTO_COMPRESS=TRUE;

Step 4: Copy Data from the File to the Target Table 

You need to copy the data in the table you created in your database earlier. Let’s say the table name is home_sales. You need to load your JSON file (sales.json.gz ) data to home_sales.

COPY INTO home_sales(city, state, zip, sale_date, price)
FROM (SELECT SUBSTR($1:location.state_city,4),
SUBSTR($1:location.state_city,1,2),
$1:location.zip,
to_timestamp_ntz($1:sale_date),
$1:price
FROM @sf_tut_stage/sales.json.gz t)
ON_ERROR = 'continue';

Step 5: Remove the Successfully Copied File from the Internal Stage

After successfully copying your data from the internal stage to the target table, you must remove the data file from the internal stage object using the REMOVE command.

Step 5: Clean Up 

Return your system to the previous state to clean up the DROP <object> command.

DROP DATABASE IF EXISTS mydatabase;
DROP WAREHOUSE IF EXISTS mywarehouse;

Limitations for Migrating JSON Snowflake Data With PUT Command

  • You have to manually remove the data files from the internal stage, which makes the process lengthy and time-consuming. 
  • If you forget to clean up the extra files every time, you might work with duplicate datasets, causing errors in analysis.

Method 3: Copying Data into the Target Table Using S3 Bucket

This method will teach you how to parse JSON in Snowflake using an S3 Bucket. 

Prerequisites

  • You must have a Snowflake account configured with Amazon AWS, granting users the necessary privileges to create a data warehouse, database, and table.
  • You must have SnowSQL Client installed.

Step 1: Copy Data to the Target Table

Execute the COPY INTO <table> command to copy the data to the target table named RAW_SOURCE. 

COPY INTO raw_source
FROM @my_stage/server/2.6/2016/07/15/15
FILE_FORMAT = (TYPE = JSON);

If you want the entire data copied, you can use this command: SELECT * FROM raw_source;

Step 2: Query the Copied Data

You can query the data using a SELECT statement.

  1. Retrieve device_type with quotes.

You must use src: devic_type to specify a column and the name of the JSON element.

SELECT src:device_type
FROM raw_source;

Result for the following query:

+-----------------+
| SRC:DEVICE_TYPE |
|-----------------|
| "server"        |
+-----------------+
  1. Retrieve device_type value with quotes.

Here, you must use a string data type to typecast the data.

SELECT src:device_type::string AS device_type
FROM raw_source;

Result of the following query:

+-------------+
| DEVICE_TYPE |
|-------------|
| server      |
+-------------+

Step 3: Using Flatten Function

The flatten function derives a lateral view for an object array or variant.

1. Flatten the array object into a Variant Column

You can flatten event objects into the array and convert them into a separate row. The output consists of a VALUE column that stores the data for individual events.  

  • Query the data for each event.
SELECT src:device_type::string,
    src:version::String,
    VALUE
FROM
    raw_source,
    LATERAL FLATTEN( INPUT => SRC:events );
  • To insert the result of the above query, use CREATE TABLE.
CREATE OR REPLACE TABLE flattened_source AS
  SELECT
    src:device_type::string AS device_type,
    src:version::string     AS version,
    VALUE                   AS src
  FROM
    raw_source,
    LATERAL FLATTEN( INPUT => SRC:events );

2. Flatten object keys into separate columns.

You can create a table where each object key is stored in a separate column. You need to type the value of the data using (::); if you don’t, the column assumes it to be a variable type holding any value. 

  • Using CREATE TABLE to create a table named ‘events.’ 
create or replace table events as
select
src:device_type::string as device_type
, src:version::string as version
, value:f::number as f
, value:rv::variant as rv
, value:t::number as t
, value:v.ACHZ::number as achz
, value:v.ACV::number as acv
, value:v.DCA::number as dca
, value:v.DCV::number as dcv
, value:v.ENJR::number as enjr
, value:v.ERRS::number as errs
, value:v.MXEC::number as mxec
, value:v.TMPI::number as tmpi
, value:vd::number as vd
, value:z::number as z
from
raw_source
, lateral flatten ( input => SRC:events );

Step 4: Update Data into the RAW_SOURCE Table from S3 Bucket

  • First, you must identify and add the key constraints to your table. Here, we assume devic_type and value:rv are the primary keys to DEVICTYPE and RV in the events table.

ALTER TABLE events ADD CONSTRAINT pk_DeviceType PRIMARY KEY (device_type, rv);

  • Insert a JSON record into the RAW_SOURCE table. 
insert into raw_source
  select
  PARSE_JSON ('{
    "device_type": "cell_phone",
    "events": [
      {
        "f": 79,
        "rv": "786954.67,492.68,3577.48,40.11,343.00,345.8,0.22,8765.22",
        "t": 5769784730576,
        "v": {
          "ACHZ": 75846,
          "ACV": 098355,
          "DCA": 789,
          "DCV": 62287,
          "ENJR": 2234,
          "ERRS": 578,
          "MXEC": 999,
          "TMPI": 9
        },
        "vd": 54,
        "z": 1437644222811
      }
    ],
    "version": 3.2
  }');
  • Insert the record into the events table using primary keys from the RAW_SOURCE table. The select command allows you to query the following data.
insert into events
select
      src:device_type::string
    , src:version::string
    , value:f::number
    , value:rv::variant
    , value:t::number
    , value:v.ACHZ::number
    , value:v.ACV::number
    , value:v.DCA::number
    , value:v.DCV::number
    , value:v.ENJR::number
    , value:v.ERRS::number
    , value:v.MXEC::number
    , value:v.TMPI::number
    , value:vd::number
    , value:z::number
    from
      raw_source
    , lateral flatten( input => src:events )
    where not exists
    (select 'x'
      from events
      where events.device_type = src:device_type
      and events.rv = value:rv);
  • You must compare JSON keys in all corresponding columns in the RAW_SOURCE table with columns in the events table if no primary keys are present. For more information, refer to the following documentation on update data.

Limitations of JSON Snowflake Data Replication Using S3 Bucket

  • You have to manually update all the data from the S3 table to Snowflake data using large lines of code, which is time-consuming.
  • After updating all the data in your Snowflake Database Table, you need to clean up all the data files you inserted at the internal stage. This can make the process long and tiresome.

Method 4: JSON Snowflake Data Integration Using Insert Statement

Once you understand Snowflake’s different functionalities, you can learn how to query JSON data in Snowflake. 

Prerequisites

  • Before learning to parse JSON in Snowflake, you should understand JSON objects and arrays. 
  • You must understand the Snowflake interface. 
  • You must understand a database and how to create a table inside a database.

To Enter a Single Record of JSON Data Inside a Table

Follow the steps below to insert JSON data into a relational table as a single record.

Step 1: Open your Snowflake Snow Site Web UI to Create a Table

  •  First, you must create a table named json_table with a single column called json_col, which is of the data type “variant.” 
  • Use the following code to create a table – 
JSON Snowflake: Creating a JSON Table

Step 2: Insert a Record in the Table

  • You need to use a SELECT statement and parse_json to insert a record in the table. If you don’t use any statements, an error will appear. If you don’t use any statements, an error will appear.
  • Use the following command to insert the record:

insert into json_tbl(json_col) 

select parse_json (‘{“firstname” : “John” , “empid” : 1001}’);

  • Here’s the result of the following command:
JSON Snowflake

To Enter Multiple Records in the Relational Table 

If you want to insert more than one record from your JSON file, you need to execute the query in the image below:

JSON Snowflake

Result of the following command:

JSON Snowflake

To Enter a JSON String into a Snowflake Database Table 

Use the query below to insert a complex JSON string into a database table. The query consists of a JSON string with a root element employee consisting of nested objects, such as address and phone number.

JSON Snowflake

This is how the result looks inside the database:

JSON Snowflake

To Insert and Access One Single Record From the JSON Table You Created

Step 1: Insert a Single Employee Record from the Json Table Created in the above Step

First, you must define the details of a single employee from the table inside the root element employee and then execute the query using a parse_json function. 

JSON Snowflake

The query result is below. Looking at the right side of the output, you can see Snowflake understands the type of data entered and provides the output accordingly.

JSON Snowflake

Step 2: Access One Single Element from the Object Employee

  • If you want to access a single element inside the employee object, such as the employee’s name or age, you need to insert a ‘:’ column in the code.
  • The syntax of the query should be like this: column name: key value 
  • For example – emp_json: name

Let’s execute the following Select query. While writing the code, you should remember that JSON is case-sensitive. If you are not careful, you might not get the expected result.

JSON Snowflake

Result of the following query- 

JSON Snowflake

Limitations of JSON Snowflake Data Replication Using Insert Statement

  • You can not directly insert data from a JSON file into your Snowflake database just by using a simple insert() function. If you do, it will show an error. 
  • To access the data from a JSON table with variant data types in a structured form, you must manually typecast each column, which is a long and tiresome process.

Use Cases of JSON Snowflake Data Migration

  • Snowflake supports semi-structured data formats such as JSON, making it easy to load and analyze data without complex transformations. 
  • With Snowflake’s built-in query extensions and JSON functions, you can easily manipulate the JSON data to generate desired results.
  • Snowflake has an auto-scaling feature that ensures your overall data performance remains consistent. This feature allows you to allocate additional resources for computing your data and helps you prioritize your work needs.

Learn more about: How to Parse JSON in PostgreSQL

Conclusion

JSON Snowflake data migration allows you to optimize your data management capabilities. Snowflake can store semi-structured JSON data efficiently, and its cloud architecture provides flexibility while you analyze your data. 

Although you can migrate your JSON data to Snowflake in many ways, if you don’t know how to write code for data transformation, you can turn to Hevo. With its cost-effective zero-code pipeline, Hevo quickly automates the data migration process from your source to the destination.

Get Started with Hevo for Free

Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite firsthand.

Share your experience of JSON Snowflake integration in the comments section below!

FAQs (Frequently Asked Questions) 

Q. How can you convert your JSON data into a relational table? 

You can convert your JSON data into a relational table using Hevo’s no-code data pipeline, which helps you load large datasets directly into your destination database. Other methods include loading your data into the Snowflake database or using an insert statement.

Q. How can you upload a JSON file into a Snowflake table?

You can upload your JSON file into a Snowflake table using the PUT command and SELECT statement. After that, you can use SQL queries to upload data to a table or directly load your JSON file in Snowflake using an S3 Bucket.

Saloni Agarwal
Technical Content Writer, Hevo Data

With a strong background in market research for data science and cybersecurity products, Saloni is an expert at crafting informative articles on key topics within the data science domain, such as data transformation, processes, and analysis. Saloni's passion for the field drives her to continually learn and stay abreast of emerging technologies and trends, ensuring her contributions are impactful. Her work aims to enrich the discourse in data science, providing valuable insights and fostering a deeper understanding of complex subjects.

All your customer data in one place.