Setting up Salesforce Change Data Capture: 7 Easy Steps

Tutorials • March 16th, 2020

With the rise of cloud-based CRM systems, many organizations now prefer them to building a custom CRM for their use case. Salesforce is one of the most popular of these systems. One challenge with a cloud-based CRM is integrating it with the internal data systems and reporting modules that depend on up-to-date CRM data. The pattern that keeps two such systems synchronized in near real time is called Change Data Capture. Salesforce supports Change Data Capture through its streaming events API.

This post details the steps involved in setting up Salesforce Change Data Capture (Salesforce CDC) and using it to sync data to external database or data warehouse systems.

Introduction to Salesforce

Salesforce is a cloud-based Customer Relationship Management (CRM) application offered as a service. It helps organizations manage leads and accounts, process orders and invoices, and support their customers. Much of its popularity comes from the fact that it relieves organizations of maintaining the infrastructure and software themselves. Salesforce also provides reporting and analytics capabilities in its application suite.

Beyond the default dashboards and reports, Salesforce also provides an application store from where customers can download dashboards and reports created by third-party developers. If that still does not meet your requirements, it also has facilities for implementing custom reports and dashboards. The Salesforce Sales Analytics App uses all the available information to gather valuable insights from the customers.

Understanding Salesforce CDC

Salesforce Change Data Capture lets you receive near-real-time event notifications when Salesforce records change. These messages are published to the event bus, and a client subscribes to them by specifying the required channel. The client application can then process each event, for example by inserting the changed records into an external database.

Data replication from Salesforce to another database typically involves creating a copy of the complete database and subscribing to these events to frequently sync the changing data. Salesforce change events understand the concept of transactions and have enough metadata for the client to identify the grouping and order of the changes that are part of a single transaction. 
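The "full copy plus change events" idea can be sketched in a few lines of Python. This is only an illustration, not Salesforce's or any library's API: the replica is a plain dictionary keyed by record ID, the function name apply_change is made up for this example, and the event layout (payload, ChangeEventHeader, changeType, recordIds) follows the sample message shown later in this article. It also assumes that CREATE and UNDELETE events carry the full record, and that merging the fields of an UPDATE event into the existing row is sufficient.

```python
def apply_change(replica, event):
    """Apply one parsed change event to `replica`, a dict keyed by record ID."""
    payload = event["payload"]
    header = payload["ChangeEventHeader"]
    # Everything in the payload except the header is treated as record data.
    fields = {k: v for k, v in payload.items() if k != "ChangeEventHeader"}
    change_type = header["changeType"]
    for record_id in header["recordIds"]:
        if change_type in ("CREATE", "UNDELETE"):
            # Assumption: the event carries the full record.
            replica[record_id] = dict(fields)
        elif change_type == "UPDATE":
            # Assumption: merging the delivered fields is enough.
            replica.setdefault(record_id, {}).update(fields)
        elif change_type == "DELETE":
            replica.pop(record_id, None)
        else:
            # Anything else is surfaced instead of being silently dropped.
            raise ValueError(f"unhandled change type: {change_type}")
    return replica


# Hypothetical starting point: an (empty) initial copy, then one CREATE event.
replica = {}
create_event = {
    "payload": {
        "ChangeEventHeader": {"changeType": "CREATE", "recordIds": ["a00RM0000114ICTYA2"]},
        "name": "IPhone",
    }
}
apply_change(replica, create_event)
print(replica)
```

A real client would write to the target database instead of a dictionary, but the branching on the change type stays the same.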

In case you are looking to specifically load data to cloud data warehouses like Redshift, Snowflake or BigQuery or to databases such as PostgreSQL or MySQL, click on the links to explore detailed guides to move data to the respective targets.

Simplify Salesforce ETL with Hevo’s No-code Data Pipelines

Hevo Data, a No-code Data Pipeline helps to transfer data from Salesforce (among 100+ sources) to any destination of your choice to visualize it in your desired BI tool. Hevo is fully managed and completely automates the process of not only loading data from your desired source but also enriching the data and transforming it into an analysis-ready form without having to write a single line of code. Its fault-tolerant architecture ensures that the data is handled in a secure, consistent manner with zero data loss.

It provides a consistent & reliable solution to manage data in real-time and always have analysis-ready data in your desired destination. It allows you to focus on key business needs and perform insightful analysis using BI tools. 

Check out what makes Hevo amazing:

  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects schema of incoming data and maps it to the destination schema.
  • Minimal Learning: With its simple and interactive UI, Hevo is easy for new customers to work with and operate.
  • Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
  • Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.

Simplify your data analysis with Hevo today! Sign up here for a 14-day free trial!

Steps to Set up & Subscribe to Salesforce Change Data Capture

Salesforce event messages come in JSON format that contains a schema identifier and a payload. The payload key generally contains the type of operation and value of the fields that have changed. You can set up Salesforce Change Data Capture for your customer data using the following steps:

Step 1: To set up Change Data Capture on an object, go to the ‘Change Data Capture’ section in Setup and search for the object for which Change Data Capture has to be enabled. All objects, including custom objects, are available in this section. In this walkthrough, let’s work with a custom object, sale_item, which has two fields: name and description. Select the required object from the list and add it to the list of selected entities for Change Data Capture.

Step 2: Once this is done, we need to subscribe to the event channel. For this, Salesforce provides an easy-to-use Java tool called EMP Connector that can subscribe to a channel and print out the event messages. To get this tool, first clone its GitHub repository:

git clone https://github.com/forcedotcom/EMP-Connector.git

Step 3: Use the below command, run from the root of the cloned repository, to build the jar with Maven:

mvn clean package

The build places the jar file in the target folder.

Step 4: Salesforce change events are accessed by subscribing to a specific channel URL. To receive events for all the selected objects, use the /data/ChangeEvents URL. For a specific standard object, the URL has the below format.

/data/<Object_Name>ChangeEvent

For change events on a custom object, use the below format.

/data/<Object_Name>__ChangeEvent

In our case, since sale_item is a custom object, we will use the below URL.

 
/data/sale_item__ChangeEvent
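The naming rules above are mechanical enough to capture in a small helper. The Python sketch below is illustrative only: the function name change_event_channel is invented for this article, and it assumes that a custom object is identified by an API name ending in "__c" (as in sale_item__c).

```python
def change_event_channel(object_api_name=None):
    """Return the change event channel for an object.

    With no argument, return the channel that covers all selected objects.
    Assumes custom object API names end in "__c", e.g. sale_item__c.
    """
    if object_api_name is None:
        return "/data/ChangeEvents"
    if object_api_name.endswith("__c"):
        # Custom object: the "__c" suffix is replaced by "__ChangeEvent".
        return "/data/" + object_api_name[:-3] + "__ChangeEvent"
    # Standard object: "ChangeEvent" is appended to the object name.
    return "/data/" + object_api_name + "ChangeEvent"


print(change_event_channel("Account"))       # /data/AccountChangeEvent
print(change_event_channel("sale_item__c"))  # /data/sale_item__ChangeEvent
print(change_event_channel())                # /data/ChangeEvents
```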

Step 5: Open a shell in the repository root and run the below command, which uses the jar file to subscribe to events on the above URL.

java -jar target/emp-connector-0.0.1-SNAPSHOT-phat.jar <username> <password> /data/sale_item__ChangeEvent

Here the username and password are the credentials of the Salesforce user you sign in with. There is also the option to authenticate with an OAuth-based token. Keep the shell that is running the above command open for the next steps.

Step 6: Now that we have subscribed to the event, let’s create a new sale_item object and see what the event is like. Go to the App launcher and select sale_item. Click New and populate the name and description fields. Click Save. 

Step 7: Go to the terminal where EMP Connector is running. You will find messages in the below format.

{
  "schema": "-pszPCNGM45tUPU1ftkjxEA",
  "payload": {
    "LastModifiedDate": "2020-01-25T20:36:12.000Z",
    "OwnerId": "005RM000001vI54mYAE",
    "CreatedById": "005RM000001vI54mYAE",
    "ChangeEventHeader": {
      "commitNumber": 65842604581,
      "commitUser": "005RM000001vI54mYAE",
      "sequenceNumber": 1,
      "entityName": "sale_item__c",
      "changeType": "CREATE",
      "changedFields": [],
      "changeOrigin": "com/salesforce/api/soap/47.0;client=SfdcInternalAPI/",
      "transactionKey": "00051c2e-a75a-3f97-03fc-cdf4e16d9d3c",
      "commitTimestamp": 1569443783000,
      "recordIds": [
        "a00RM0000114ICTYA2"
      ]
    },
    "CreatedDate": "2020-01-25T20:36:12.000Z",
    "name": "IPhone",
    "description": "IPhone black with triple camera",
    "LastModifiedById": "005RM000001vI54mYAE"
  },
  "event": {
    "replayId": 15053
  }
}

In the above message, the changeType field is important because it tells you what kind of change happened. For ordinary changes it takes one of four values: CREATE, UPDATE, DELETE, or UNDELETE. (Salesforce also defines gap and overflow types, such as GAP_UPDATE and GAP_OVERFLOW, for changes that could not be delivered as regular change events.) The transactionKey identifies the transaction that contained the change, and the sequenceNumber gives the order of the change within that transaction. The replayId can be used to replay the same event for up to 3 days.
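To make the structure concrete, here is a minimal Python sketch that pulls these fields out of a message. It is an illustration rather than part of EMP Connector: the function name summarize_event and the shape of the returned dictionary are choices made for this example, and the sample is a trimmed copy of the message above.

```python
import json

# A trimmed copy of the sample message above (fewer fields, same structure).
SAMPLE_EVENT = """
{
  "schema": "-pszPCNGM45tUPU1ftkjxEA",
  "payload": {
    "ChangeEventHeader": {
      "entityName": "sale_item__c",
      "changeType": "CREATE",
      "changedFields": [],
      "transactionKey": "00051c2e-a75a-3f97-03fc-cdf4e16d9d3c",
      "sequenceNumber": 1,
      "recordIds": ["a00RM0000114ICTYA2"]
    },
    "name": "IPhone",
    "description": "IPhone black with triple camera"
  },
  "event": {"replayId": 15053}
}
"""


def summarize_event(raw_message):
    """Split one change event into header details and record field values."""
    message = json.loads(raw_message)
    payload = message["payload"]
    header = payload["ChangeEventHeader"]
    # Everything in the payload except the header is a record field.
    fields = {k: v for k, v in payload.items() if k != "ChangeEventHeader"}
    return {
        "entity": header["entityName"],
        "change_type": header["changeType"],
        "record_ids": header["recordIds"],
        "transaction_key": header["transactionKey"],
        "sequence_number": header["sequenceNumber"],
        "replay_id": message["event"]["replayId"],
        "fields": fields,
    }


summary = summarize_event(SAMPLE_EVENT)
print(summary["change_type"], summary["record_ids"], summary["replay_id"])
print(summary["fields"])
```

Storing the last processed replay_id alongside the target data is one way to know where to resume from after the subscriber restarts.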

This is how you can set up Salesforce Change Data Capture.

Let’s now look at some of the challenges that a typical developer will face while setting up Salesforce Change Data Capture manually.

Limitations of the Custom-Code Approach for Salesforce Change Data Capture

  • The logical next step in this flow is to take the above feed and write the queries that insert the records into the target database. In most cases that is easier said than done, because it needs an expert developer who knows the intricacies of both Salesforce and the target database. A way to avoid such a learning curve is to use an easy-to-implement solution like Hevo Data, which can help set up Salesforce Data Integration with numerous databases or data warehouses in just a few clicks.
  • To create a replica with guaranteed consistency, the client application code must contain logic to handle the transaction key and sequence number. The logic should monitor the transaction key of each message and add all the messages of the same transaction to a container, which is saved to the target database once the transaction key changes. The sequence numbers must be checked to ensure the order. They may have gaps if the transaction touched objects that the user does not have permission to access.
  • In case you need to transform the data that is extracted from Salesforce before moving to the target database or data warehouse – you will need to write additional code to achieve this.

Conclusion

This article showed you how to set up Salesforce Change Data Capture and explained the concepts behind each step so that you can implement them efficiently. These methods, however, can be challenging, especially for a beginner, and this is where Hevo saves the day. Hevo Data, a No-code Data Pipeline, helps you transfer data from a source of your choice, such as Salesforce, in a fully automated and secure manner without having to write and maintain custom code. With its strong integration with 100+ sources & BI tools, Hevo allows you to not only export & load data but also transform & enrich it and make it analysis-ready in a jiffy.

What are your thoughts about setting up Salesforce Change Data Capture? Let us know in the comments.
