Azure Data Factory is one of the most popular Cloud-based Data Integration Services that allows you to create, manage, and schedule Data Pipelines or Workflows. It has a rich set of features and functionalities like Two-way Traceability, Code-free Data Synchronization, CI/CD (Continuous Integration and Continuous Delivery), and Pipeline Orchestration.
One such advanced feature of Azure Data Factory is Azure Data Factory Triggers, which allow you to start Data Pipeline runs on a schedule, over fixed time windows, or in response to storage events. With Azure Data Factory Triggers, you can not only schedule and automate Data Pipelines with ease but also track the success and failure of the resulting Pipeline runs.
In this article, you will learn about Azure Data Factory, Azure Data Factory Triggers, their types, and steps to create and configure Schedule Triggers to execute Data Pipelines.
Prerequisites
A fundamental understanding of Data Pipelines and the Microsoft Azure Data Factory platform.
What Is Azure Data Factory?
Introduced by Microsoft in 2015, Azure Data Factory is a fully-managed, cloud-based platform that enables users to implement ETL (Extract, Transform, Load), ELT (Extract, Load, Transform), and Data Integration tasks. With Azure Data Factory, you can create Data-Driven Workflows, called Pipelines, for orchestrating and automating Data Flows and Data Transformation.
Being a Data Integration Service Platform, Azure Data Factory does not store data itself. Instead, it allows you to create and automate Data-Driven Workflows that coordinate data movement between supported data stores and process the data using compute services in other regions or in on-premises environments.
Azure Data Factory (ADF) works based on four critical stages: Connect and Collect, Transform and Enrich, Publish, and Monitor. In the first two stages, ADF connects and fetches data from multiple Data Sources, copies the collected data into a centralized location, and implements Data Processing tasks to clean and enrich the collected data.
Once the data has been transformed into a usable format, it is published, that is, loaded into a destination data store where downstream applications can consume it. Finally, in the monitoring stage, ADF allows users to track the deployed Data Pipelines and analyze run activity, bottlenecks, and workflow errors.
Read more about Azure Data Factory vs Databricks in the following guide: Azure Data Factory vs Databricks: 4 Critical Key Differences.
Azure Data Factory Trigger Types
Azure Data Factory Triggers are used to schedule Data Pipeline runs without manual intervention. In other words, an Azure Data Factory Trigger is a unit of processing that determines when an end-to-end pipeline execution is started in Azure Data Factory.
Besides the obvious benefit of scheduling the Data Pipeline for future runs, some triggers, such as the Tumbling Window Trigger described below, can also be used to process data for past time periods. Based on the type of the trigger and the criteria defined in it, the Azure Data Factory Trigger determines when the pipeline execution is invoked.
Azure Data Factory Triggers come in three different types: Schedule Trigger, Tumbling Window Trigger, and Event-based Trigger.
Schedule Trigger
The Schedule Trigger is a popular Azure Data Factory Trigger that runs a Data Pipeline according to a predetermined schedule. It offers flexibility by supporting different scheduling intervals: minute(s), hour(s), day(s), week(s), or month(s).
You can set the start and end dates for the Schedule Trigger to be active so that it only runs a Pipeline based on the given time period. Furthermore, you can also use the Schedule Trigger to run on future calendar days and times, such as the 30th of each month, the first and third Monday of each month, and more.
The Schedule Azure Data Factory Triggers are built with a “many to many” relationship in mind, which implies that one Schedule Trigger can run several Data Pipelines, and a single Data Pipeline can be run by multiple Schedule Triggers.
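Calendar-style schedules such as "the 30th of each month" or "the first and third Monday of each month" are expressed in the `recurrence` object of a Schedule Trigger definition. The following is a minimal sketch that builds such an object as a plain Python dictionary. The property names follow the publicly documented Schedule Trigger JSON schema; the helper name `monthly_recurrence` and the start time are illustrative assumptions, not values from this walkthrough.

```python
import json


def monthly_recurrence(start_time, month_days=None, monthly_occurrences=None):
    """Build the `recurrence` part of a Schedule Trigger definition.

    Hypothetical helper for illustration only.
    """
    schedule = {}
    if month_days:
        # Fixed days of the month, e.g. [30] for the 30th.
        schedule["monthDays"] = list(month_days)
    if monthly_occurrences:
        # (weekday, n) pairs, e.g. ("Monday", 1) for the first Monday.
        schedule["monthlyOccurrences"] = [
            {"day": day, "occurrence": n} for day, n in monthly_occurrences
        ]
    return {
        "frequency": "Month",
        "interval": 1,
        "startTime": start_time,  # assumed start time
        "timeZone": "UTC",
        "schedule": schedule,
    }


# The 30th of each month.
on_the_30th = monthly_recurrence("2024-01-01T00:00:00Z", month_days=[30])

# The first and third Monday of each month.
first_third_monday = monthly_recurrence(
    "2024-01-01T00:00:00Z",
    monthly_occurrences=[("Monday", 1), ("Monday", 3)],
)

print(json.dumps(first_third_monday, indent=2))
```

The printed JSON is the shape you would expect to see in the trigger's code view; treat it as a starting point and compare it against what the ADF UI generates for your own trigger.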
Tumbling Window Trigger
The Tumbling Window Azure Data Factory Trigger executes Data Pipelines at a specified time slice or pre-determined periodic time interval. It is significantly more advantageous than Schedule Triggers when working with historical data to copy or migrate data.
Consider the scenario in which you need to replicate data from a Database into a Data Lake on a regular basis, and you want to keep it in separate files or folders for every hour or day.
To implement this use case, you have to set a Tumbling Window Azure Data Factory Trigger for every 1 hour or every 24 hours. The Tumbling Window Trigger passes the start and end times of each time window to the Pipeline, which can use them in its query against the Database so that only the data between those two points in time is returned. Finally, the data for each hour or day can be saved in its own file or folder.
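To make the windowing idea concrete, here is a small, self-contained sketch that splits a time range into contiguous, non-overlapping windows and derives one output folder per window. It only simulates the behaviour in plain Python; in ADF itself the window bounds are exposed to the pipeline through the `@trigger().outputs.windowStartTime` and `@trigger().outputs.windowEndTime` expressions. The function names and the `sales/...` folder layout are hypothetical.

```python
from datetime import datetime, timedelta


def tumbling_windows(start, end, size):
    """Yield (window_start, window_end) pairs covering [start, end).

    Windows are fixed-size, contiguous, and never overlap.
    """
    current = start
    while current + size <= end:
        yield current, current + size
        current += size


def folder_for(window_start):
    # Hypothetical folder layout: one folder per hourly window.
    return window_start.strftime("sales/%Y/%m/%d/%H")


windows = list(
    tumbling_windows(
        start=datetime(2024, 1, 1, 0),
        end=datetime(2024, 1, 1, 3),
        size=timedelta(hours=1),
    )
)

for window_start, window_end in windows:
    # A pipeline would filter the source on this half-open interval.
    print(f"{window_start.isoformat()} -> {window_end.isoformat()}  {folder_for(window_start)}")
```

Because each window ends exactly where the next one begins, every source row falls into exactly one window, which is what makes this trigger type convenient for loading historical data in slices.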
Event-based Trigger
The Event-based Azure Data Factory Trigger runs Data Pipelines in response to blob-related events, such as the creation or deletion of a blob in Azure Blob Storage. Instead of waiting for a scheduled time, the Data Pipelines execute as soon as the event is raised by the storage account.
In addition, Event-based Triggers work not only with Blob Storage but also with Azure Data Lake Storage (ADLS). Similar to Schedule Triggers, Event Triggers also support many-to-many relationships, in which a single Event Trigger can run several Pipelines, and a single Pipeline can be run by multiple Event Triggers.
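As a rough illustration, a blob event trigger definition can be sketched as a Python dictionary like the one below. The property names (`BlobEventsTrigger`, `blobPathBeginsWith`, `blobPathEndsWith`, `scope`, `events`) follow the publicly documented trigger schema, but the helper, the trigger name, the container, the pipeline names, and the placeholder storage account ID are all assumptions made for this example.

```python
import json


def blob_event_trigger(name, storage_account_id, container, pipeline_names, suffix=".csv"):
    """Build a blob event trigger definition (hypothetical helper)."""
    return {
        "name": name,
        "properties": {
            "type": "BlobEventsTrigger",
            "typeProperties": {
                # Only blobs inside this container are considered.
                "blobPathBeginsWith": f"/{container}/blobs/",
                # Only blobs whose name ends with this suffix are considered.
                "blobPathEndsWith": suffix,
                "ignoreEmptyBlobs": True,
                # Resource ID of the storage account that raises the events.
                "scope": storage_account_id,
                "events": ["Microsoft.Storage.BlobCreated"],
            },
            # Many-to-many: one trigger can start several pipelines.
            "pipelines": [
                {"pipelineReference": {"type": "PipelineReference", "referenceName": p}}
                for p in pipeline_names
            ],
        },
    }


event_trigger = blob_event_trigger(
    name="NewCsvFile_ET",  # hypothetical trigger name
    storage_account_id=(
        "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
        "/providers/Microsoft.Storage/storageAccounts/<account-name>"
    ),
    container="landing",  # hypothetical container
    pipeline_names=["IngestPipeline", "AuditPipeline"],  # hypothetical pipelines
)

print(json.dumps(event_trigger, indent=2))
```

Adding `"Microsoft.Storage.BlobDeleted"` to the `events` list would make the same trigger react to deletions as well.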
How to Create and Use Pipeline Executions?
Now that you are familiar with Azure Data Factory Triggers and their types, this section walks through creating and configuring a Schedule Trigger to execute pre-existing Data Pipelines in Azure Data Factory.
For this case, assume that “CopyPipeline_l6c” is a previously created Data Pipeline, which is ready to be scheduled in Azure Data Factory. Since Schedule Triggers can be assigned to many Pipelines (many-to-many relationships), we can assign them to one or more Pipelines that we have created previously.
Follow the steps as described below to implement Azure Data Factory Trigger:
Step 1: To avoid the Data Pipeline failing due to Primary Key violations on repeated runs, you must add a purge or deletion query for the target table of the pipeline named “CopyPipeline_l6c” before you start to create Azure Data Factory Triggers.
Step 2: Select “CopyPipeline_l6c” from the Pipelines section in the Azure Data Factory workspace. Then, move to the Sink tab and select the Sink Dataset as “DestinationDataset_I6c.”
Step 3: Now, enter the command as “DELETE FROM [SalesLT].[CustomerAddress]” in the Pre-copy script section, as shown in the above image.
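The reason for the purge is easy to reproduce outside ADF. The sketch below uses an in-memory SQLite database as a stand-in for the real sink, with a simplified `CustomerAddress` table (no `SalesLT` schema, and an assumed two-column primary key). It shows a repeated copy failing on the primary key, then succeeding once a pre-copy `DELETE` runs first. The `copy_rows` helper is hypothetical and only mimics what a copy activity with a pre-copy script does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CustomerAddress ("
    " CustomerID INTEGER, AddressID INTEGER,"
    " PRIMARY KEY (CustomerID, AddressID))"
)

# Rows that every scheduled run copies from the (imaginary) source.
source_rows = [(1, 10), (2, 20)]


def copy_rows(conn, rows, pre_copy_script=None):
    """Mimic a copy activity: optional pre-copy script, then bulk insert."""
    if pre_copy_script:
        conn.execute(pre_copy_script)
    conn.executemany("INSERT INTO CustomerAddress VALUES (?, ?)", rows)
    conn.commit()


# First run: the table is empty, so the copy succeeds.
copy_rows(conn, source_rows)

# Second run without a purge: the same keys already exist.
try:
    copy_rows(conn, source_rows)
    second_run = "succeeded"
except sqlite3.IntegrityError:
    conn.rollback()
    second_run = "failed"

# Third run with the purge as a pre-copy script: the copy succeeds again.
copy_rows(conn, source_rows, pre_copy_script="DELETE FROM CustomerAddress")
row_count = conn.execute("SELECT COUNT(*) FROM CustomerAddress").fetchone()[0]

print(second_run, row_count)
```

A full `DELETE` is the simplest option for a small lookup-style table; for large tables you may prefer a narrower predicate, which is outside the scope of this walkthrough.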
Step 4: In the next step, add a parameter named “StartDt” to this Data Pipeline so that the Pipeline can receive values passed in by the Azure Data Factory Trigger.
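In JSON terms, a pipeline parameter and the value a trigger passes into it might look roughly like the fragments below, written here as Python dictionaries. The `@pipeline().parameters.<name>` and `@trigger().scheduledTime` expressions are standard ADF expressions; choosing `String` as the parameter type and binding `StartDt` to the scheduled time are assumptions for illustration, since the walkthrough does not state them.

```python
import json

# Fragment of the pipeline definition: declares the parameter (activities omitted).
pipeline_fragment = {
    "name": "CopyPipeline_l6c",
    "properties": {
        "parameters": {
            "StartDt": {"type": "String"}  # assumed parameter type
        }
    },
}

# Expression an activity inside the pipeline would use to read the parameter.
start_dt_expression = "@pipeline().parameters.StartDt"

# Value a Schedule Trigger could pass in: the time the run was scheduled for.
trigger_parameters = {"StartDt": "@trigger().scheduledTime"}

print(json.dumps(pipeline_fragment, indent=2))
print(start_dt_expression)
print(json.dumps(trigger_parameters))
```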
Step 5: Now, you are all set to create a new Schedule Trigger. Open the Triggers tab by clicking on the down arrow next to the Factory Resources section. Then, click on the “+New” button, as shown in the above image.
Step 6: A “New Trigger” dialogue box will open, where you have to enter the configuration parameters like Name, Type, and Start Date for your Azure Data Factory Trigger.
In the Name field, you have to provide a unique name to the Trigger. For this case, we named it “WeeklyTrigger_ST.” Then, you have to select the Type of trigger. As we are about to create a Schedule Trigger, click on the corresponding radio button near the Schedule trigger.
In the Start Date field, enter the start date from when the Schedule Trigger should begin to execute your Pipeline. Then, in the Recurrence field, mention how frequently your trigger is supposed to run the Pipeline. For this case, we provided the recurrence or frequency as “Every 1 week” as shown in the image below:
It is not mandatory to set the Advanced Recurrence options. However, you can configure the Advanced Recurrence options to select a particular weekday to run the Pipeline.
You can also provide the specific timings to schedule your Pipeline executions. For setting a specific time, you have to enter values in the Hours and Minutes section, as shown in the image.
Step 7: Azure Data Factory then computes every combination of the hours and minutes you entered and lists the resulting run times in UTC. In this example there are eight run times, including 12:00 AM, 12:30 AM, 1:00 AM, and 1:30 AM, as shown in the image above.
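The way hours and minutes combine can be sketched with a Cartesian product. The inputs below (hours 0 to 3 and minutes 0 and 30) are assumed values chosen because they yield eight run times; the article does not state which values were actually entered. The `time_slots` helper is hypothetical.

```python
from itertools import product


def time_slots(hours, minutes):
    """Return every hour/minute combination as a 12-hour clock label."""
    slots = []
    for hour, minute in product(sorted(hours), sorted(minutes)):
        display_hour = hour % 12 or 12  # 0 and 12 are both shown as 12
        suffix = "AM" if hour < 12 else "PM"
        slots.append(f"{display_hour}:{minute:02d} {suffix}")
    return slots


# Assumed inputs: four hours and two minute marks give 4 x 2 = 8 run times.
slots = time_slots(hours=[0, 1, 2, 3], minutes=[0, 30])
print(slots)
```

Each hour you add multiplies the number of runs by the number of minute marks, so it is worth checking the generated list before publishing the trigger.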
Step 8: Next, select the No End radio button if you want the Schedule Trigger to continuously run the Pipelines based on the specified time period.
Step 9: Select the “Activated” checkbox so that the Azure Data Factory Trigger becomes active as soon as it is published to Azure Data Factory.
Step 10: In the next step, you have to assign pre-built pipelines to the newly created Schedule Azure Data Factory Trigger. Select the pipeline named “CopyPipeline_l6c” from the Factory Resources section. Then, click on Triggers from the left side and select New/Edit, as shown in the image.
Step 11: The “Add Trigger” dialog box appears, where you have to select the name of the newly created Schedule Trigger from the drop-down menu.
Step 12: Since Schedule Triggers are capable of executing more than one pipeline, you can also assign the trigger named “WeeklyTrigger_ST” to the pre-built Pipeline named “SQL_AQSL_PL.”
After adding a Schedule Trigger to the two different Pipelines, you have to make sure that the Azure Data Factory Trigger is set to Active.
Step 13: Navigate to the Triggers tab, where you can see the newly created Schedule Trigger named “WeeklyTrigger_ST” which is set to active.
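For reference, the trigger configured in the steps above corresponds roughly to the definition sketched below as a Python dictionary. The trigger and pipeline names come from this walkthrough; the start time, the weekday, the hour and minute values, and the `StartDt` binding are assumptions, and the `pipeline_ref` helper is hypothetical. Omitting `endTime` mirrors the "No End" option, and a `runtimeState` of `Started` mirrors an activated trigger.

```python
import json


def pipeline_ref(name, parameters=None):
    """Build one entry of the trigger's `pipelines` list (hypothetical helper)."""
    entry = {"pipelineReference": {"type": "PipelineReference", "referenceName": name}}
    if parameters:
        entry["parameters"] = parameters
    return entry


weekly_trigger = {
    "name": "WeeklyTrigger_ST",
    "properties": {
        "type": "ScheduleTrigger",
        "runtimeState": "Started",  # the trigger is active
        "typeProperties": {
            "recurrence": {
                "frequency": "Week",
                "interval": 1,  # "Every 1 week"
                "startTime": "2024-01-01T00:00:00Z",  # assumed start date
                "timeZone": "UTC",
                # No "endTime" key: the trigger keeps running ("No End").
                "schedule": {
                    "weekDays": ["Monday"],  # assumed weekday
                    "hours": [0, 1, 2, 3],  # assumed hours
                    "minutes": [0, 30],  # assumed minutes
                },
            }
        },
        # One Schedule Trigger attached to two pipelines.
        "pipelines": [
            pipeline_ref("CopyPipeline_l6c", {"StartDt": "@trigger().scheduledTime"}),
            pipeline_ref("SQL_AQSL_PL"),
        ],
    },
}

print(json.dumps(weekly_trigger, indent=2))
```

Comparing this sketch with the JSON shown in the trigger's code view in the ADF UI is a quick way to confirm that the recurrence and pipeline assignments were saved as intended.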
In this article, you learned about Azure Data Factory, Azure Data Factory Triggers, their types, and steps to create and configure Schedule Triggers to execute Data Pipelines. This article mainly focused on configuring the Schedule Triggers to schedule and run pre-built Data Pipelines using Azure Data Factory UI. However, you can also use Azure PowerShell, Azure CLI, Azure Resource Manager Template, .NET SDK, and Python SDK to create and configure Schedule Triggers.
Today, a plethora of organizations rely on Microsoft Azure Data Factory to orchestrate their business processes and create Data Pipelines. Using ADF, organizations can move data from both On-premises and Cloud Source Data Stores to a Centralized Data Store for further analysis. While there are a number of excellent Data Pipeline Tools like ADF available in the market, none comes close to Hevo.
Hevo Data, a No-code Data Pipeline, provides you with a consistent and reliable solution to manage data transfer between a variety of sources and a wide variety of Desired Destinations such as Amazon Redshift, Firebolt, Snowflake, Google BigQuery, PostgreSQL, Databricks, and many more with just a few simple clicks.
Hevo Data with its strong integration with 100+ Data Sources (including 40+ Free Sources) allows you to not only export data from your desired Data Sources & load it to the destination of your choice but also transform & enrich your data to make it analysis-ready.
Hevo also allows the integration of data from Non-native Data Sources using Hevo’s in-built Webhooks Connector. You can then focus on your key business needs and perform insightful analysis.
Want to give Hevo a try? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You may also have a look at the unbeatable pricing, which will assist you in selecting the best plan for your requirements.
Let us know your experience of learning about Azure Data Factory Triggers in the comment box below. We’d be happy to hear your thoughts!