Bringing your client accounts, product lists, sales, marketing leads, and more from Salesforce to Redshift is the first step in building a strong analytics infrastructure. Combining this data with valuable information from other sources within the warehouse can empower you to derive deeper, more meaningful insights.
In this article, we will look at two ways of getting data from Salesforce to Redshift. We will also discuss the pros and cons of these approaches and ways to navigate them.
Methods to move data from Salesforce to Redshift
Data can be copied from Salesforce to Redshift in either of two ways:
Method 1: Use a fully-managed Data Integration Platform like Hevo Data
Hevo can move your data from Salesforce to Redshift in minutes without the need for any coding. This is done using an interactive visual interface. Hevo is also fully managed, so you need not worry about maintenance or monitoring. This will enable you to focus on producing valuable insights from the data.
Method 2: Write custom ETL scripts to move data
You will need to use engineering resources to write the scripts to get data from Salesforce’s API to S3 and then to Redshift. You will also need to maintain the infrastructure for this and monitor the scripts on an ongoing basis.
Let’s look more closely into both of these methods.
Method 1: Copying your data from Salesforce to Redshift using Hevo
- Authenticate and configure your Salesforce data source
- Configure the Redshift warehouse where you want to move your Salesforce data
By automating all the burdensome ETL tasks, Hevo will ensure that your Salesforce data is securely and reliably moved to Amazon Redshift in real-time.
Method 2: Copying your data from Salesforce to Redshift using custom scripts
Let’s have a look at what is entailed in this process:
- First, you need to write scripts for your selected Salesforce APIs. Salesforce was one of the first companies to use cloud computing and develop APIs, and their range of APIs is legendary. As you will be looking to keep your data current, you need to make sure your scripts can fetch updated data. You may even need to set up cron jobs to run these scripts on a schedule
- Working in Redshift, you will need to create tables and columns and map the JSON returned by Salesforce to this schema. You will also have to make sure each JSON data type is mapped to a data type supported by Redshift
- Redshift is not designed for line by line updates, so using an intermediary such as AWS S3 is recommended. If you choose to use S3, you will need to:
- Create a bucket for your data
- Use curl or Postman to issue an HTTP PUT request against the S3 REST API
- Once this has been done your data can be sent to S3
- Finally, you will need to run a COPY command to load your data into Redshift
- This intermediate step is another area you need to monitor. If there are any changes in the Salesforce API, your S3 staging process will need to be updated
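The steps above can be sketched in a short script. This is a minimal illustration, not production code: the instance URL, bucket, table, and IAM role names are placeholders, and authentication, pagination, retries, and error handling are omitted.

```python
import json
from urllib.parse import quote

# Placeholder values -- substitute your own Salesforce instance and API version.
INSTANCE_URL = "https://yourcompany.my.salesforce.com"
API_VERSION = "v58.0"

def build_query_url(instance_url: str, api_version: str, soql: str) -> str:
    """Build a Salesforce REST API query URL from a SOQL statement."""
    return f"{instance_url}/services/data/{api_version}/query?q={quote(soql)}"

def fetch_records(access_token: str, soql: str) -> list:
    """Run a SOQL query against the Salesforce REST API (pagination omitted)."""
    import requests  # third-party: pip install requests
    url = build_query_url(INSTANCE_URL, API_VERSION, soql)
    resp = requests.get(url, headers={"Authorization": f"Bearer {access_token}"})
    resp.raise_for_status()
    return resp.json()["records"]

def to_jsonl(records: list) -> str:
    """Serialize records as newline-delimited JSON for staging in S3."""
    return "\n".join(json.dumps(r) for r in records)

def upload_to_s3(bucket: str, key: str, records: list) -> None:
    """Stage the serialized records in an S3 bucket."""
    import boto3  # third-party: pip install boto3
    boto3.client("s3").put_object(
        Bucket=bucket, Key=key, Body=to_jsonl(records).encode("utf-8")
    )

def copy_statement(table: str, bucket: str, key: str, iam_role: str) -> str:
    """Build the Redshift COPY command that loads the staged JSON file."""
    return (f"COPY {table} FROM 's3://{bucket}/{key}' "
            f"IAM_ROLE '{iam_role}' FORMAT AS JSON 'auto';")
```

In a real pipeline you would execute the generated COPY statement over a Redshift connection (for example via `psycopg2`) after each upload, and schedule the whole flow with cron.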
Challenges to expect when transferring data from Salesforce to Redshift using Custom Code
There are significant downsides to writing thousands of lines of code to copy your data. We all know that custom coding holds the promise of control and flexibility, but we often underestimate the complexity and cost involved.
The next few paragraphs will give you an understanding of the actual downside of custom coding in this instance:
Your Salesforce APIs will need to be monitored for changes and you will need to stay on top of any updates to Redshift. You will also need a data validation system that ensures your data is replicating correctly. This system should also check if your tables and columns in Redshift are being updated as expected.
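A basic building block of such a validation system is comparing per-object row counts between Salesforce and Redshift. A minimal sketch, with illustrative object names and counts:

```python
def count_mismatches(source_counts: dict, dest_counts: dict) -> dict:
    """Compare per-table row counts from Salesforce (source) and Redshift
    (destination); return the tables whose counts disagree."""
    mismatches = {}
    for table, src in source_counts.items():
        dst = dest_counts.get(table, 0)  # a table missing in Redshift counts as 0 rows
        if src != dst:
            mismatches[table] = {"source": src, "destination": dst}
    return mismatches

# Example: 'leads' is out of sync and would trigger an alert.
drift = count_mismatches({"accounts": 120, "leads": 45},
                         {"accounts": 120, "leads": 44})
```

In practice the source counts would come from SOQL `COUNT()` queries and the destination counts from Redshift's system tables, with mismatches pushed to your alerting channel.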
These administrative tasks are a heavy load in today’s agile environment where resources are almost always fully utilized. You will have to dedicate scarce engineering resources just to stay on top of all the possible breakdowns, leaving less scope for new projects to be taken up.
Think about how you would:
- Know if Salesforce has changed an API?
- Know when Redshift is not available for writing?
- Find the resources to rewrite code when needed?
- Find the resources to update Redshift schema in response to new data requests?
Opting for Hevo cuts out all these questions. You will have fast and reliable access to analysis-ready data and can focus your attention on finding meaningful insights.
Advantages of using Hevo
- Code-free ETL or ELT: You need not write or maintain any ETL scripts or cron jobs
- Low setup time: Data is copied in minutes once you have connected Salesforce to Redshift
- 100% Data Accuracy: Hevo reliably delivers your data in real time from Salesforce to Redshift. Its AI-powered, fault-tolerant architecture ensures you will always have accurate and current data readily available
- Automatic Schema Handling: Hevo does automatic schema detection, evolution, and mapping. The platform will detect any change in the incoming Salesforce schema and make the necessary changes in Redshift
- Granular Activity Log and Monitoring: Your data flow is monitored in real time and detailed activity logs are kept. You will also get timely alerts on Slack and email with status reports of data replication, detected schema changes, and more. Hevo’s activity log will let you observe user activities, transfer failures and successes, and more
- Unmatched support via email 24×7 and on Slack
Building your own custom solution to move data from Salesforce to Redshift gives you a huge amount of flexibility. However, this comes with a high and ongoing cost in terms of engineering resources.
Hevo is a fault-tolerant, dependable Data Integration Platform. With Hevo you will work in an environment where you can securely move data from any source to any destination. In addition to Salesforce, you can load data from 100s of other sources using Hevo. The full range of applications, databases, and tools Hevo can integrate with is listed here: (www.hevodata.com/integrations).
Hevo makes data replication from Salesforce to Redshift a cakewalk. Sign up for a 14-day free trial here and see for yourself.