Many companies build their data analytics, data backup, and operational intelligence infrastructure on Amazon web services such as Amazon S3 and Redshift. Because business data today is spread across a multitude of SaaS applications used by different departments, companies need to bring this data into S3 and Redshift in order to build an efficient and effective data infrastructure.
Doing this by hand would tie up precious developer bandwidth for months on custom ETL scripts. Moreover, such scripts tend to be brittle and error-prone, often leading to data leaks. The company is then left with a collection of data source connectors and convoluted code repositories that are expensive to maintain.
All of this, in turn, makes it very hard for businesses to unify data across all the different data sources and bring it to a destination like Amazon S3 or Redshift to aid business objectives.
AppFlow was built by Amazon to help businesses simplify this.
What is Amazon AppFlow?
AppFlow is a cloud service by Amazon that helps businesses move data bidirectionally between a limited set of SaaS applications and AWS services such as Amazon S3 and Redshift. Each ETL task set up to move data is called a “Flow”.
Notably, unlike the other ETL offerings by AWS (AWS Data Pipeline and AWS Glue), AppFlow comes with a low-code, visual interface to extract, transform, and load data.
AppFlow is primarily aimed at moving data from SaaS applications such as Salesforce, Slack, Marketo, etc. Since AppFlow is fully managed, businesses no longer have to take on the tedious task of building custom ETL scripts to transfer this data themselves.
Amazon AppFlow Features
- Low code interface
AppFlow comes with an easy-to-use interface to configure the data flow. Even someone with limited technical knowledge can configure a Flow in AppFlow, which reduces the dependency on the engineering team.
- Data Transformation Abilities
AppFlow supports a few basic data transformations such as masking and filtering. These are also configured in the visual interface and help businesses keep clean data in S3 and Redshift. Using these features, you can mask PII in your data, skip invalid data points, and more.
- Data Transfer Scheduling
You can set up flows that run on a set schedule or frequency, or trigger a flow when an event occurs. There is also an option to do a one-time data transfer. This feature comes in handy if your team relies on a periodic refresh of this data from source to destination.
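To make the masking and filtering described above concrete, here is a minimal Python sketch of what such transformations do to a batch of records. It is not AppFlow code and does not use any AppFlow API; the function names (mask_value, clean_records) and the sample field names are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, Iterable, Iterator, List


def mask_value(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Hide all but the last `visible` characters of a string."""
    if visible <= 0 or len(value) <= visible:
        return mask_char * len(value)
    return mask_char * (len(value) - visible) + value[-visible:]


def clean_records(
    records: Iterable[Dict[str, Any]],
    pii_fields: List[str],
    required_fields: List[str],
) -> Iterator[Dict[str, Any]]:
    """Filter out invalid records and mask PII fields in the rest."""
    for record in records:
        # Filtering: drop records where a required field is missing or empty.
        if any(record.get(field) in (None, "") for field in required_fields):
            continue
        cleaned = dict(record)  # copy so the source record is not modified
        for field in pii_fields:
            # Masking: hide most of the value before it reaches the destination.
            if cleaned.get(field) is not None:
                cleaned[field] = mask_value(str(cleaned[field]))
        yield cleaned


if __name__ == "__main__":
    # Hypothetical sample data; the second record is skipped as invalid.
    sample = [
        {"id": 1, "email": "a@example.com"},
        {"id": 2, "email": ""},
    ]
    for row in clean_records(sample, pii_fields=["email"], required_fields=["email"]):
        print(row)
```

In a visual tool these two steps are configured per field rather than written by hand, but the effect on the data is the same: invalid rows never reach the destination, and sensitive values arrive already masked.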
Amazon AppFlow Pricing
AppFlow cost would depend on two factors:
- Number of flow runs in a month – AppFlow charges $0.001 per flow run
- Data processing fee for each flow – This is calculated based on the volume of data processed each month. This charge varies with the region of your S3/Redshift destinations.
In addition to the above, AWS also levies standard request and storage charges for reading from and writing to AWS services such as Amazon S3. You can read more on AppFlow pricing here.
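The two pricing factors above can be combined into a rough monthly estimate. The Python sketch below uses the $0.001 per flow run figure quoted in this article; the per-GB data processing rate is deliberately left as a parameter because it varies by region, and the value used in the example is a made-up placeholder, not an AWS price. Charging the processing fee per GB is itself an assumption of this sketch. Request and storage charges on S3 or Redshift are not included.

```python
FLOW_RUN_PRICE_USD = 0.001  # per flow run, as quoted in this article


def estimate_monthly_cost(
    flow_runs: int,
    gb_processed: float,
    price_per_gb_usd: float,
) -> float:
    """Rough AppFlow estimate: flow-run charge plus data processing charge.

    `price_per_gb_usd` is an assumption supplied by the caller; look up the
    actual rate for your region on the AppFlow pricing page.
    """
    if flow_runs < 0 or gb_processed < 0 or price_per_gb_usd < 0:
        raise ValueError("inputs must be non-negative")
    run_charge = flow_runs * FLOW_RUN_PRICE_USD
    processing_charge = gb_processed * price_per_gb_usd
    return run_charge + processing_charge


if __name__ == "__main__":
    # Hypothetical workload: 720 flow runs and 50 GB in a month,
    # with a placeholder processing rate of $0.02 per GB.
    print(round(estimate_monthly_cost(720, 50, 0.02), 2))
```

Keeping the regional rate as an explicit argument makes it harder to forget that the processing fee, not the flow-run fee, usually dominates once data volumes grow.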
Amazon AppFlow Limitations
While there are many advantages to using AppFlow, it is not without its limitations. Here are some of the cons of the tool:
- Limited Data Sources
AppFlow currently supports about 14 data sources. AppFlow does not support bringing advertising data from platforms like Google Ads, Facebook Ads, Twitter Ads, or other business applications such as Intercom, Shopify, HubSpot, and more. This can be a deal-breaker.
- Data Load Limitations
Each flow configured allows transferring data from a single object or entity from your SaaS application. Let us take the example of Salesforce: Salesforce allows you to bring data on 800+ objects that include opportunities, deals, customer contacts, accounts, and so on. In order to transfer data from these 800+ Salesforce objects, you would need to set up and configure 800+ flows, one by one. Phew! Imagine doing this for 10+ SaaS sources that you need to bring data from. That is a lot of work.
- Limited Transformations
Amazon AppFlow’s data transformation capabilities are limited to masking data or filtering out bad data. Complex transformations such as currency conversion or date format standardization are not possible.
- No Incremental Update of Data
Each flow that is run on AppFlow dumps the complete data set from source to destination. If you need to transfer only the data that has changed since the last data transfer, there is no way to achieve this. If only a small number of records change in your SaaS applications, this ends up wasting a lot of your AppFlow quota (and money).
- Pricing Discrepancies
With AppFlow, your costs can shoot up quickly if you are not careful. Flow runs resulting from erroneous flow configurations are still billed as successful flow runs. Additionally, every flow run set up to check for new data counts towards the cost, even if no new data is available in the source system for transfer.
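To see how the one-flow-per-object model and per-run billing interact, the following Python sketch counts the flow runs a schedule generates in a month and prices them at the $0.001 per run quoted earlier. The object count and schedule are hypothetical inputs, and the figure covers flow-run charges only; data processing fees come on top.

```python
FLOW_RUN_PRICE_USD = 0.001  # per flow run, as quoted in this article


def monthly_flow_runs(num_objects: int, runs_per_day: int, days: int = 30) -> int:
    """One flow per object, each running `runs_per_day` times a day."""
    return num_objects * runs_per_day * days


def monthly_run_charge(num_objects: int, runs_per_day: int, days: int = 30) -> float:
    """Flow-run charge only; every run is billed even if it moves no new data."""
    return monthly_flow_runs(num_objects, runs_per_day, days) * FLOW_RUN_PRICE_USD


if __name__ == "__main__":
    # Hypothetical setup: 800 objects, each polled hourly for 30 days.
    runs = monthly_flow_runs(800, 24)
    print(runs, round(monthly_run_charge(800, 24), 2))
```

The run count grows with both the number of objects and the polling frequency, so a schedule that looks harmless for a single flow can add up once it is repeated across hundreds of objects.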
Hevo – A Robust Alternative to Amazon AppFlow
Hevo is a No-Code Data Pipeline that helps you load data from any source to any destination in real-time without having to write a single line of code. Hevo helps you overcome all the limitations presented by Amazon AppFlow.
Here are more reasons to try Hevo for your data pipeline needs:
- Minimal Setup Time
Hevo can be set up by anyone on the team as there is no learning curve involved. Hevo has a simple point-and-click visual interface that lets you connect your data source and destination in a jiffy. Your data will be moved to the destination in minutes, in real-time.
- 100s of Data Integrations
Hevo has a large pool of native integrations that includes:
- Sales and marketing applications such as Salesforce, HubSpot, etc.
- Advertising platforms such as Google Ads, Facebook Ads, Twitter Ads, etc.
- Data Analytics Applications such as Google Analytics, Mixpanel, etc.
- Databases such as MySQL, MongoDB, Oracle, PostgreSQL, etc.
This, in turn, makes Hevo the right partner for the data pipeline needs of your growing organization.
- Transfer Data for all Objects in one go
Once the data source is connected, Hevo can extract the data for all entities in a single go. Example: Once you connect Salesforce as a source, Hevo will load all the objects in one shot. This will ensure that you are easily able to load all the data you need.
- Mature Data Transformation Capability
Hevo allows you to enrich, transform, and clean the data on the fly before loading to S3 and Redshift. In addition to the basic data transformations, you can also perform advanced data transformations such as date format standardization, currency conversion, and so on. Hevo also allows you to transform the data after loading it to your destination.
- Automatic Schema Mapping
Once you have connected your data source, Hevo automatically detects the complete schema of the incoming data and maps it to the destination tables. With its AI-powered algorithm, it automatically takes care of data type mapping and adjustments – even when the schema changes at a later point.
- Incremental Data Load
Hevo allows you to transfer only the data that has been changed or modified since the last data sync. This ensures efficient utilization of bandwidth at both your source and destination systems.
- Secure and Reliable Data Integration
Hevo has a fault-tolerant architecture that ensures that the data is moved from the data source to the destination in a secure, consistent, and dependable manner with zero data loss.
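For readers unfamiliar with incremental loading, here is a minimal Python sketch of the general idea using a "last modified" watermark. This is a generic illustration of the technique, not Hevo's implementation; the function name incremental_batch and the updated_at field are assumptions made for the example.

```python
from datetime import datetime
from typing import Any, Dict, Iterable, List, Tuple


def incremental_batch(
    records: Iterable[Dict[str, Any]],
    last_synced_at: datetime,
    ts_field: str = "updated_at",
) -> Tuple[List[Dict[str, Any]], datetime]:
    """Return records changed after the watermark, plus the new watermark."""
    # Keep only records modified since the previous sync.
    changed = [r for r in records if r[ts_field] > last_synced_at]
    # Advance the watermark to the newest change seen; keep it if nothing changed.
    new_watermark = max((r[ts_field] for r in changed), default=last_synced_at)
    return changed, new_watermark


if __name__ == "__main__":
    # Hypothetical source rows; only the second was modified after the last sync.
    source = [
        {"id": 1, "updated_at": datetime(2020, 5, 1, 9, 0)},
        {"id": 2, "updated_at": datetime(2020, 5, 3, 14, 30)},
    ]
    batch, watermark = incremental_batch(source, datetime(2020, 5, 2))
    print([r["id"] for r in batch], watermark)
```

Storing the returned watermark and passing it to the next run is what keeps each sync proportional to the amount of change rather than to the size of the full data set.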
While you are evaluating your options for a seamless data pipeline platform, do try out Hevo by signing up for a 14-day free trial here.