JIRA is one of the most popular software development tool used by Agile Development teams all across the globe. Teams can plan, release, track, report various aspects of the software development cycle of a project. In this blog, you will find out about various ways to move the data from JIRA to Snowflake data warehouse.
What is JIRA?
It is a platform that effortlessly helps teams to plan, release, track, and report various components regarding a software development life-cycle. The platform also helps the team in setting up various workflows to streamline and optimize the development of the project. The platform can be integrated with various current tools like CI/CD pipelines to monitor them and access them seamlessly.
What is Snowflake?
Snowflake platform is an analytical, flexible, intuitive data warehouse service provided as SaaS. It provides a data warehouse that is quicker, simpler to use, and far more flexible than normal data warehouse offerings.
Snowflake’s data warehouse is not built on an already available database or “big data” software platform such as Apache Hadoop. The Snowflake data warehouse uses a different proprietary SQL database engine with a unique architecture designed for the cloud. To the user, Snowflake may have many similarities to remaining enterprise data warehouses but also has additional functionalities and unique capabilities.
Prerequisites
- The Snowflake platform should be set up. An account for your use should be provisioned before beginning. A user guide for Snowflake can be seen here.
- Python needs to be installed and set up in the system as the blog uses python to manually shift data from JIRA to Snowflake data warehouse. Steps to install can be seen here.
- You need to have an account set up on the JIRA platform, connected to one of your projects, from which the data of the components will be retrieved. More information related to JIRA can found here.
If yours anything like the 1000+ data-driven companies that use Hevo, more than 70% of the business apps you use are SaaS applications Integrating the data from these sources in a timely way is crucial to fuel analytics and the decisions that are taken from it. But given how fast API endpoints etc can change, creating and managing these pipelines can be a soul-sucking exercise.
Hevo’s no-code data pipeline platform lets you connect over 150+ sources in a matter of minutes to deliver data in near real-time to your warehouse. What’s more, the in-built transformation capabilities and the intuitive UI means even non-engineers can set up pipelines and achieve analytics-ready data in minutes.
All of this combined with transparent pricing and 24×7 support makes us the most loved data pipeline software in terms of user reviews.
Take our 14-day free trial to experience a better way to manage data pipelines.
Get started for Free with Hevo!
Method 1: Steps to Load Data from JIRA to Snowflake Manually
Once the prerequisites are met, the following steps need to be taken to complete the transfer of data from JIRA to Snowflake data warehouse:
- Connect to JIRA API. For more information here.
- Read the data from the JIRA platform.
- Push the data values to the Snowflake data warehouse table.
The following code performs the above mentioned 3 steps in python:
# This script shows how to use the client in anonymous mode
# against jira.atlassian.com.
from jira import JIRA
import re
# Logging
import logging
logging.basicConfig(
filename='/tmp/snowflake_python_connector.log',
level=logging.INFO)
import snowflake.connector
import json
# By default, the client will connect to a Jira instance started from the Atlassian Plugin SDK
# (see https://developer.atlassian.com/display/DOCS/Installing+the+Atlassian+Plugin+SDK for details).
# Override this with the options parameter.
options = {"server": "https://jira.atlassian.com"}
jira = JIRA(options)
# Get all projects viewable by anonymous users.
projects = jira.projects()
# Sort available project keys, then return the second, third, and fourth keys.
keys = sorted([project.key for project in projects])[2:5]
# Get an issue.
issue = jira.issue("JRA-1330")
# Find all comments made by Atlassians on this issue.
atl_comments = [
comment
for comment in issue.fields.comment.comments
if re.search(r"@atlassian.com$", comment.author.emailAddress)
]
# Add a comment to the issue.
jira.add_comment(issue, "Comment text")
# Change the issue's summary and description.
issue.update(
summary="I'm different!", description="Changed the summary to be different."
)
# Change the issue without sending updates
issue.update(notify=False, description="Quiet summary update.")
# You can update the entire labels field like this
issue.update(fields={"labels": ["AAA", "BBB"]})
# Or modify the List of existing labels. The new label is unicode with no
# spaces
issue.fields.labels.append(u"new_text")
issue.update(fields={"labels": issue.fields.labels})
# Send the issue away for good.
issue.delete()
# Linking a remote jira issue (needs applinks to be configured to work)
issue = jira.issue("JRA-1330")
issue2 = jira.issue("XX-23") # could also be another instance
jira.add_remote_link(issue, issue2)
projects = jira.projects()
jra = jira.project('JRA')
print(jra.name) # 'JIRA'
print(jra.lead.displayName) # 'John Doe [ACME Inc.]'
components = jira.project_components(jra)
[c.name for c in components] # 'Accessibility', 'Activity Stream', 'Administration', etc.
jira.project_roles(jra) # 'Administrators', 'Developers', etc.
versions = jira.project_versions(jra)
[v.name for v in reversed(versions)] # '5.1.1', '5.1', '5.0.7', '5.0.6', etc.
# to send the project components to a snowflake Datawarehouse for example
# Set your account and login information (replace the variables with
# the necessary values). Note that ACCOUNT might also require the
# region and cloud platform where your account is located, in the form of
# '<your_account_name>.<region_id>.<cloud_platform>' (e.g. 'xy12345.east-us-2.azure')
ACCOUNT = '<your_account_name>'
USER = '<your_login_name>'
PASSWORD = '<your_password>'
import os
# Only required if you copy data from your own S3 bucket
AWS_ACCESS_KEY_ID = os.getenv('AWS_ACCESS_KEY_ID')
AWS_SECRET_ACCESS_KEY = os.getenv('AWS_SECRET_ACCESS_KEY')
# Connecting to Snowflake
con = snowflake.connector.connect(
user=USER,
password=PASSWORD,
account=ACCOUNT,
)
# Creating a database, schema, and warehouse if none exists
con.cursor().execute("CREATE WAREHOUSE IF NOT EXISTS tiny_warehouse")
con.cursor().execute("CREATE DATABASE IF NOT EXISTS testdb")
con.cursor().execute("USE DATABASE testdb")
con.cursor().execute("CREATE SCHEMA IF NOT EXISTS testschema")
# Using the database, schema and warehouse
con.cursor().execute("USE WAREHOUSE tiny_warehouse")
con.cursor().execute("USE SCHEMA testdb.testschema")
# Creating a table and inserting data
con.cursor().execute(
"CREATE OR REPLACE TABLE "
"testtable(col1 integer, col2 string)")
con.cursor().execute(
"INSERT INTO testtable(col1, col2) "
"VALUES(%s,%s,%s,%s)")
# Copying data from internal stage (for testtable table)
con.cursor().execute("PUT file:///tmp/data0/file* @%testtable",[c.name for c in components] )
con.cursor().execute("COPY INTO testtable")
# Copying data from external stage (S3 bucket -
# replace <your_s3_bucket> with the name of your bucket)
con.cursor().execute("""
COPY INTO testtable FROM s3://<your_s3_bucket>/data/
STORAGE_INTEGRATION = myint
FILE_FORMAT=(field_delimiter=',')
""".format(
aws_access_key_id=AWS_ACCESS_KEY_ID,
aws_secret_access_key=AWS_SECRET_ACCESS_KEY))
# Querying data
cur = con.cursor()
try:
cur.execute("SELECT col1, col2 FROM testtable")
for (col1, col2) in cur:
print('{0}, {1}'.format(col1, col2))
finally:
cur.close()
# Binding data
con.cursor().execute(
"INSERT INTO testtable(col1, col2) "
"VALUES(%(col1)s, %(col2)s)", {
'col1': 789,
'col2': 'test string3',
})
# Retrieving column names
cur = con.cursor()
cur.execute("SELECT * FROM testtable")
print(','.join([col[0] for col in cur.description]))
# Catching syntax errors
cur = con.cursor()
try:
cur.execute("SELECT * FROM testtable")
except snowflake.connector.errors.ProgrammingError as e:
# default error message
print(e)
# customer error message
print('Error {0} ({1}): {2} ({3})'.format(e.errno, e.sqlstate, e.msg, e.sfqid))
finally:
cur.close()
# Retrieving the Snowflake query ID
cur = con.cursor()
cur.execute("SELECT * FROM testtable")
print(cur.sfqid)
# Closing the connection
con.close()
The above code successfully loads the component data of JIRA to Snowflake data warehouse. You can choose what columns and what data that you require, or you can get the whole data itself by creating the corresponding columns in the data warehouse.
PROJECT | PROJECT_COMPONENT | PROJECT_ROLES | PROJECT_VERSIONS |
JIRA | Accessibility | Administrators | 5.1.1 |
The above is one of the example outputs of a Snowflake data warehouse table, which contains the data received from the JIRA API account. The table can have any number of columns of data that are required from the JIRA account. Accordingly, the data table can be changed to reflect the required columns.
Integrate JIRA to Snowflake
Integrate JIRA to BigQuery
Integrate JIRA to Redshift
Limitations of Loading Data from JIRA to Snowflake Manually
Limitations of the above approach are as follows:
- Effort Intensive: Using custom code to move data from JIRA to Snowflake data warehouse requires you to learn and bring together many different technologies. Given the learning curve involved, your data projects’ timelines can be affected.
- Not Real-Time: The process mentioned above does not help you bring data in real-time. You would have to develop a cron job and write extra code to bring data in real-time.
- No Data Transformation: At times, you would encounter use cases where you need to standardize time zones to perform efficient analytics. The mentioned approach does not cover that.
- Constant Monitoring and Maintenance: In the event, there are some changes in the API at JIRA’s end or Snowflake’s end, that will result in irretrievable data loss. Hence, this approach requires constant monitoring and maintenance of the systems involved.
Method 2: Load Data from JIRA to Snowflake Using Hevo
Hevo enables the lowest time to production for such copy operations, allowing developers to focus on their core business logic rather than waste time on the configuration nightmares involved in setting these up.
Sign up here for a 14-Day Free Trial!
With Hevo, bringing data from JIRA to Snowflake data warehouse becomes a cakewalk. Here are the following steps:
- Step 1: Connect Hevo to Jira by providing the Pipeline Name, API Key, User Email ID, and website name.
- Step 2: Complete Jira to Snowflake migration by providing information about your Snowflake database and its credentials such as database name, username, and password, along with information about Schema associated with your Snowflake database.
Hevo takes care of automatically mapping all your data to relevant tables in the Snowflake data warehouse, giving you access to analysis-ready JIRA data in real-time. Hevo also supports pre-built integrations from Jira along with numerous free data sources.
These are more advantages of Hevo:
- Simplicity: Hevo is an extremely intuitive and easy to use platform that does not require prior technical knowledge. With Hevo, you can start pushing data from JIRA to Snowflake data warehouse in just a few clicks.
- Real-time Data: The real-time streaming design structure of Hevo ensures that you move data from JIRA to Snowflake data warehouse immediately, without any delay. In this way, meaningful analysis in real-time can be derived.
- Reliable Data Load: The fault-tolerant architecture of Hevo ensures that your data is loaded consistently and reliably without any loss of data.
- Scalability: Any scale of data can be handled by Hevo. Additionally, data from multiple sources can be loaded to Snowflake data warehouse using Hevo. Both these features of Hevo can help you scale your data infrastructure as per your data needs.
Give Hevo a try by signing up for a 14-day free trial today.
Conclusion
Snowflake is a great data warehouse platform that is very versatile and can be used to aggregate structured data and derives useful insights. JIRA is a great platform to get various user analytics useful to one’s business needs. Depending on the particular use case and data requirement, you may choose to replicate data from JIRA to Snowflake data warehouse table using one of the approaches detailed in this article. You may build a custom code based data pipeline to transfer data from JIRA to Snowflake data warehouse.
Visit our Website to Explore Hevo
Alternatively, you may use an automated ETL tool like Hevo to quickly move data for analysis. Sign Up for a free 14-day trial to give Hevo a try.
Let’s know your experience of connecting JIRA to Snowflake in the comment section below. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.
Sai is a seasoned technical writer with over four years of experience, focusing on data integration and analysis. He is also passionate about DevOps and Machine Learning. Sai crafts informative and comprehensive content tailored to solving complex business problems related to data management while exploring the intersections of these emerging fields.