Many companies build their data analytics, data backup, and operational intelligence infrastructure on Amazon Web Services offerings such as Amazon S3 and Redshift. Because business data today is spread across many SaaS applications used by different departments, that data has to be brought into S3 and Redshift before it can support an efficient and effective data infrastructure. Amazon AppFlow was built to simplify this task.
This post covers the features, pricing, and limitations of Amazon AppFlow. It also provides a step-by-step method of connecting AppFlow to a CRM such as Salesforce. Read along to find out more about this popular tool.
Introduction to Amazon AppFlow
Amazon AppFlow is a cloud service by AWS that helps businesses move data bidirectionally between a limited set of SaaS applications and AWS services such as Amazon S3 and Redshift. Each ETL task set up to move data is called a “Flow”.
Notably, unlike the other ETL offerings by AWS (AWS Data Pipeline and AWS Glue), AppFlow comes with a low-code, visual interface to extract, transform, and load data.
AppFlow is primarily aimed at moving data from SaaS applications such as Salesforce, Slack, and Marketo. Since AppFlow is fully managed, businesses no longer have to build and maintain custom ETL scripts to transfer this data themselves.
Features of Amazon AppFlow
Amazon AppFlow provides the following features to its users:
1) Low code interface
AppFlow comes with an easy-to-use interface to configure the data flow. Anyone with limited technical knowledge can configure a Flow in AppFlow. This eliminates the dependency on the engineering team.
2) Data Transformation Abilities
AppFlow supports a few basic data transformation capabilities such as masking and filtering. These are also configured in the visual interface and help businesses keep clean data in S3 and Redshift. Using these features, you can mask PII in your data, skip invalid data points, and more.
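To make masking and filtering concrete, here is a minimal Python sketch of what the two operations do to a batch of records. It is not AppFlow's implementation, and AppFlow itself configures these steps visually. The function names `mask_value` and `clean_records` and the field names in the usage are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List


def mask_value(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Hide all but the last `visible` characters of a string."""
    if len(value) <= visible:
        return mask_char * len(value)
    return mask_char * (len(value) - visible) + value[-visible:]


def clean_records(
    records: Iterable[Dict],
    pii_fields: List[str],
    required_fields: List[str],
) -> List[Dict]:
    """Drop invalid records, then mask the PII fields of the rest."""
    cleaned = []
    for record in records:
        # Filtering: skip records where a required field is missing or empty.
        if any(not record.get(field) for field in required_fields):
            continue
        masked = dict(record)  # copy so the input record is left untouched
        for field in pii_fields:
            if masked.get(field) is not None:
                masked[field] = mask_value(str(masked[field]))
        cleaned.append(masked)
    return cleaned


if __name__ == "__main__":
    # Hypothetical sample data; the second record has no Id and is skipped.
    sample = [
        {"Id": "001", "Phone": "5551234567"},
        {"Id": "", "Phone": "5550000000"},
    ]
    print(clean_records(sample, pii_fields=["Phone"], required_fields=["Id"]))
```

The same idea applies whichever fields you treat as sensitive: filtering decides which records reach the destination, and masking decides how much of a sensitive value is stored.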
3) Data Transfer Scheduling
You can set up flows that run on a set schedule or frequency, or trigger a flow when an event occurs. There is also an option to run a one-time data transfer. This feature comes in handy if your team relies on a periodic refresh of this data from source to destination.
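For teams that create flows programmatically instead of through the console, the three trigger options map to a small configuration object. The sketch below builds such an object in Python. The key names (`triggerType`, `triggerProperties`, `scheduleExpression`) and the values `OnDemand`, `Scheduled`, and `Event` reflect the AppFlow CreateFlow API as I understand it, so treat them as assumptions and verify them against the current API reference. The helper `build_trigger_config` is hypothetical, and the schedule expression is passed through as an opaque string.

```python
from typing import Dict, Optional

# Assumed trigger type names; confirm against the AppFlow API reference.
VALID_TRIGGER_TYPES = {"OnDemand", "Scheduled", "Event"}


def build_trigger_config(
    trigger_type: str, schedule_expression: Optional[str] = None
) -> Dict:
    """Build a trigger configuration dictionary for a flow definition."""
    if trigger_type not in VALID_TRIGGER_TYPES:
        raise ValueError(f"Unknown trigger type: {trigger_type!r}")

    config: Dict = {"triggerType": trigger_type}

    if trigger_type == "Scheduled":
        # A scheduled flow is meaningless without a schedule expression.
        if not schedule_expression:
            raise ValueError("A scheduled flow needs a schedule expression")
        config["triggerProperties"] = {
            "Scheduled": {"scheduleExpression": schedule_expression}
        }
    return config


if __name__ == "__main__":
    # One-time transfer, run on demand.
    print(build_trigger_config("OnDemand"))
    # Periodic refresh; the expression is a placeholder, check its syntax.
    print(build_trigger_config("Scheduled", "rate(5minutes)"))
```

Keeping the validation in one helper means a mistyped trigger type fails before any request is sent to AWS.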
Pricing of Amazon AppFlow
The cost of AppFlow depends on two factors:
- Number of flow runs in a month – AppFlow charges $0.001 per flow run
- Data processing fee for each flow – This is calculated based on the volume of data processed each month. This charge varies with the region of your S3/Redshift destinations.
In addition to the above, AWS also levies standard request and storage charges for reading from and writing to AWS services such as Amazon S3.
You can read more about this on the Amazon AppFlow pricing page.
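As a rough worked example, the two factors can be combined into a simple monthly estimate. The sketch below uses the $0.001 per flow run figure quoted above. The per-GB data processing rate is passed in as a parameter because it varies by region, and the 0.02 value in the usage is only a placeholder, not a quoted price. The function name `estimate_monthly_cost` is hypothetical, and standard S3 request and storage charges are not included.

```python
FLOW_RUN_PRICE_USD = 0.001  # per flow run, as quoted in this article


def estimate_monthly_cost(
    flow_runs: int, gb_processed: float, price_per_gb: float
) -> float:
    """Estimate the monthly AppFlow charge in USD.

    flow_runs     -- number of flow runs in the month
    gb_processed  -- total data volume processed in the month, in GB
    price_per_gb  -- regional data processing rate (look this up for your region)
    """
    if flow_runs < 0 or gb_processed < 0 or price_per_gb < 0:
        raise ValueError("Inputs must be non-negative")
    run_charge = flow_runs * FLOW_RUN_PRICE_USD
    processing_charge = gb_processed * price_per_gb
    return run_charge + processing_charge


if __name__ == "__main__":
    # 1,000 runs and 50 GB at a placeholder rate of 0.02 USD per GB.
    print(round(estimate_monthly_cost(1000, 50, 0.02), 2))
```

With these placeholder inputs, the run charge and the processing charge come to 1 USD each, which shows that for small data volumes the number of runs can matter as much as the amount of data moved.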
Connecting AppFlow to Salesforce
AppFlow can be connected to CRMs for data transfer. One such CRM is Salesforce. Follow the steps below to set up your AppFlow to Salesforce integration:
Step 1: Create Salesforce Login in AppFlow
The first step is to create a login connection for Salesforce in Amazon AppFlow. To do this, sign in to your AWS account and select the Create Flow option on the AppFlow page, as shown in the image below.
Now, in the Specify flow details step, enter a name for your flow in the Flow name field. After that, in the Configure flow step, go to the Source details section and choose Salesforce as your data source, as shown in the image below.
Step 2: Manage the App Policies in Salesforce
You must ensure that the right app policies are activated for Amazon AppFlow in your Salesforce account. Visit your Salesforce account and select the Refresh token is valid until revoked option in the OAuth policies section, as shown in the image below.
Step 3: Handle IP Restrictions in Salesforce
A Salesforce account usually has a list of IP addresses that are allowed to connect to it. You need to ensure that all the Amazon AppFlow IP CIDR blocks for your AWS account are present in that Salesforce list.
Next, enable Change Data Capture (CDC) in your Salesforce account so that flow triggers can start. You can find this setting in Setup by using the Quick Find box, as shown in the image below.
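Before saving the connection, it can help to check the allowlist programmatically. The sketch below uses Python's standard `ipaddress` module to report which required CIDR blocks are not covered by an allowlist. It assumes both lists are available as CIDR strings, which may require converting them from whatever format your Salesforce settings use. The function name `missing_cidr_blocks` is hypothetical, and the addresses in the usage are documentation-only placeholder ranges, not real AppFlow ranges.

```python
import ipaddress
from typing import List


def missing_cidr_blocks(required: List[str], allowed: List[str]) -> List[str]:
    """Return the required CIDR blocks that no allowed block fully covers."""
    allowed_networks = [ipaddress.ip_network(cidr) for cidr in allowed]
    missing = []
    for cidr in required:
        network = ipaddress.ip_network(cidr)
        # A block is covered if it sits entirely inside one allowed network
        # of the same IP version (subnet_of rejects mixed versions).
        covered = any(
            network.version == candidate.version and network.subnet_of(candidate)
            for candidate in allowed_networks
        )
        if not covered:
            missing.append(cidr)
    return missing


if __name__ == "__main__":
    # Placeholder ranges reserved for documentation.
    required_blocks = ["192.0.2.0/25", "198.51.100.0/24"]
    allowlist = ["192.0.2.0/24", "203.0.113.0/24"]
    print(missing_cidr_blocks(required_blocks, allowlist))
```

An empty result means every required block is already covered. Anything returned still needs to be added to the Salesforce list.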
Finally, click Save and your AppFlow to Salesforce connection is ready!
Limitations of Amazon AppFlow
While there are many advantages to using AppFlow, it is not without its limitations. Here are some of the cons of the tool:
- Limited Data Sources: AppFlow currently supports about 14 data sources. AppFlow does not support bringing advertising data from platforms like Google Ads, Facebook Ads, Twitter Ads, or other business applications such as Intercom, Shopify, HubSpot, and more. This can be a deal-breaker.
- Data Load Limitations: Each flow you configure transfers data from a single object or entity in your SaaS application. Take Salesforce as an example: it lets you bring in data from over 800 objects, including opportunities, deals, customer contacts, accounts, and so on. To transfer data from all of these Salesforce objects, you would need to set up and configure more than 800 flows, one by one. Phew! Imagine doing this for 10+ SaaS sources that you need to bring data from. That is a lot of work.
- Limited Transformations: Amazon AppFlow’s data transformation capabilities are limited to masking data or filtering out bad data. Complex transformations such as currency conversion or date format standardization are not possible.
- No Incremental Update of Data: Each flow that is run on AppFlow dumps the complete data set from source to destination. If you need to transfer only the data that has changed since the last transfer, there is no way to achieve this. If only a small number of records change in your SaaS applications, this ends up wasting a lot of your AppFlow quota (and money).
- Pricing Discrepancies: With AppFlow, your costs can easily shoot up if you are not careful. Flow runs resulting from erroneous flow configurations are still counted as successful flow runs. Additionally, every flow run set up to check for new data counts towards the cost, even if no new data is available in the source system for transfer.
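To see how the last two points add up, here is a small Python sketch that estimates the run charges for flows that poll on a fixed schedule. It assumes the $0.001 per flow run price quoted earlier, a 30-day month, and that every scheduled run is billed whether or not it finds new data. The function names are hypothetical, and data processing charges are left out.

```python
FLOW_RUN_PRICE_USD = 0.001  # per flow run, as quoted in this article


def scheduled_runs_per_month(interval_minutes: int, days: int = 30) -> int:
    """Number of runs one flow makes in a month at a fixed interval."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    return (days * 24 * 60) // interval_minutes


def polling_cost(flows: int, interval_minutes: int, days: int = 30) -> float:
    """Run charges in USD for `flows` flows polling at the same interval."""
    total_runs = flows * scheduled_runs_per_month(interval_minutes, days)
    return total_runs * FLOW_RUN_PRICE_USD


if __name__ == "__main__":
    # One flow per Salesforce object, each polling once an hour.
    print(scheduled_runs_per_month(60))
    print(round(polling_cost(800, 60), 2))
```

Under these assumptions, an hourly schedule means 720 runs per flow per month, so 800 flows make 576,000 runs, or roughly 576 USD in run charges alone before any data is processed.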
Conclusion
This article discussed Amazon AppFlow in detail. It introduced you to AppFlow and explained three essential aspects of this ETL tool: features, pricing, and limitations. It also provided a step-by-step process to connect AppFlow to a CRM such as Salesforce. Although Amazon AppFlow is an efficient tool, it still has the limitations discussed above.
Share your experience of understanding Amazon AppFlow in the comments section!
Satyam boasts over two years of adept troubleshooting and deliverable-oriented experience. His client-focused approach has enabled seamless data pipeline management for numerous SMEs and Enterprises. Proficient in Hevo’s ETL architecture and skilled in DBMS sources, he ensures smooth data movement for clients. Satyam leverages automated tools to extract and load data from various databases to warehouses, implementing SQL principles and API calls for day-to-day troubleshooting.