Marc Benioff, together with Parker Harris, Dave Moellenhoff, and Frank Dominguez, founded Salesforce in 1999, and today it is one of the most dominant software companies on the planet. Salesforce is a Cloud-based Customer Relationship Management (CRM) platform that gives businesses a full-fledged CRM solution without the effort of building one themselves. However, you don’t want your Executives to engage with potential leads and customers without proper context. This is where the Databricks Salesforce integration comes in.

Databricks is a Cloud-based Data Engineering platform founded by the creators of Apache Spark. Businesses widely use it to store, transform, and visualize large amounts of data from various sources. A Databricks Salesforce connection keeps lead and prospect data up to date by synchronizing data from the Data Warehouse/Lake to Salesforce CRM. This ensures that you have all the necessary information and context before engaging with leads and prospects.

This article provides a step-by-step guide to establishing a Databricks Salesforce integration for your business. But before getting into moving data from Databricks to Salesforce, let’s briefly discuss both platforms.

What is Databricks?

Databricks is a popular Cloud-based Data Engineering platform developed by the creators of Apache Spark. It deals with large amounts of data and allows you to easily extract valuable insights from it. With its main focus on Big Data and Analytics, it also assists you in the development of AI (Artificial Intelligence) and ML (Machine Learning) solutions. Machine Learning libraries such as TensorFlow, PyTorch, and others can be used for training and developing Machine Learning models.

Databricks offers an interactive workspace with a Zero-Management Cloud platform that allows Data Analysts, Developers, and Data Scientists to efficiently interact and extract insights from large volumes of siloed data. DataFrames and Spark SQL libraries can be used to interact with structured data. Furthermore, it can be easily integrated with third-party applications such as BI (Business Intelligence) tools to provide critical insights.

Databricks is used across a wide range of industries, including Healthcare, Finance, Media and Entertainment, Retail, etc., to run large-scale production operations.

Key Features of Databricks

Databricks comprises a variety of features that help users work more efficiently on the Machine Learning Lifecycle. Some of the key features of Databricks include:

  • Interactive Notebooks: Databricks’ interactive notebooks provide users with a variety of languages (such as Python, Scala, R, and SQL) and tools for accessing, analyzing, and extracting new insights.
  • Integrations: Databricks can be easily integrated with a variety of tools and IDEs (Integrated Development Environments), including PyCharm, IntelliJ, Visual Studio Code, etc., to make Data Pipelining more structured.
  • Multiple Data Formats: Users can retrieve data in universally accepted formats such as CSV, XML, or JSON by integrating Databricks with other Cloud data platforms like Google BigQuery, Google Cloud Storage, Snowflake, and others.
  • Optimized Spark Engine: Databricks gives you access to the most recent versions of Apache Spark. Backed by the availability and scalability of multiple Cloud service providers, it makes it easy to set up clusters and build a fully managed Apache Spark environment.
  • Delta Lake: Databricks houses an open-source Transactional Storage layer that can be used for the whole data lifecycle. This layer brings data scalability and reliability to your existing Data Lake.

What is Salesforce?

For plenty of Entrepreneurs, Business Owners, and Corporations, Salesforce is the king of CRM. Salesforce is the most popular and robust Cloud-Based CRM software designed to support organizations in managing their Sales and Marketing data.

Salesforce helps you accomplish several Marketing Goals by storing and keeping track of all your Customer Data, Contact Data, and Marketing Leads. You can also generate Sales Forecast Reports to help prioritize and convert your leads. Salesforce also supports Email Integration with applications like Microsoft Outlook, Gmail, etc. In short, it covers most of what a business needs to run its operations and manage its customers.

While companies primarily use Salesforce as a CRM tool, Salesforce now offers many more services such as Sales Cloud, Marketing Cloud, Mobile Connectivity, etc. Most companies are modernizing and moving into the Cloud. Salesforce’s services allow businesses to use Cloud Technology to better connect with clients, customers, and partners. This gives them flexibility, scalability, and a fully connected workforce.

Salesforce brings the elements a business needs onto a single automation platform with embedded intelligence. It follows a subscription-based model and offers a variety of pricing options, ranging from $25 to about $300 per user per month.

Key Features of Salesforce

Let’s look at some of the key features of Salesforce responsible for its immense popularity:

  • Account Management: Salesforce gives companies a complete picture of their customers. At any moment, users can access Activity Logs, Customer Conversations, Contacts, Internal Account Discussions, and other data.
  • Lead Management: Salesforce helps firms track leads and optimize campaign performance across all marketing platforms. As a result, they are better equipped to make decisions about how and where to spend their marketing budget.
  • Analytics and Forecasting: Salesforce comprises customizable dashboards that display key performance metrics and reporting capabilities for your business. These analytics can assist you in making more informed business decisions.
  • AI Capabilities: Salesforce enhances its core services with extra features like Application Integration and Artificial Intelligence. This makes it easy to build custom-branded mobile apps, integrate data from ERPs, IoT, and Databases, and leverage predictive capabilities for Service and Sales applications.
  • Exceptional Support: Salesforce has a carefully curated library of industry-specific expert assistance. This comprises Process Flows, Apps, Templates, and Components designed to address any issue the user may have.

You can combine your Salesforce data with data coming from various other sources in your Data Warehouse to get valuable insights for your business. Automated tools like Hevo Data can integrate data from Salesforce and other sources into a Data Warehouse of your choice.

Why Migrate from Databricks to Salesforce?

It is not an unusual sight to have multiple disparate sources of data from the Product, Marketing, and Sales departments in a company’s Data Infrastructure. Many organizations deploy a central repository like Databricks to unify and manage all the Data Sources at a single location. A Databricks Salesforce connection allows you to further bring all your customer data from your Data Repository into the hands of your Sales & Support Teams.

You can add incredible value to your organization after successfully establishing the Databricks Salesforce integration. Here’s what you’ll get after connecting Databricks and Salesforce.

  • A Databricks Salesforce connection offers the Sales Teams rich first-party data helping them close new business opportunities.
  • It allows the Marketing Teams to leverage the insightful customer data to create customized, high-quality Email drip campaigns.
  • Product Teams can use valuable data to understand and enhance the customer experience.
  • It pushes product data to Salesforce so that Account Managers can track progress and see which actions are being taken in the app.
  • A Databricks Salesforce connection helps reduce churn by synchronizing health scores and churn events to Salesforce CRM for the Account Managers to track.

Now that you’re familiar with Databricks and Salesforce, let’s dive straight into the Databricks Salesforce connection.

Databricks Salesforce Integration

Getting data into Databricks is a fairly simple task. The difficult and less publicized part is getting data out of Databricks and loading it into operational tools like Salesforce. This article on Databricks Salesforce integration focuses on exporting CSV data from Databricks with the Databricks CLI and loading it into Salesforce with the Salesforce Data Import Wizard.

Follow the steps below to move your data from Databricks to Salesforce.

Step 1: Use Databricks CLI to Export CSV

Databricks provides users with a CLI (Command-Line Interface) to interact with Databricks Clusters. You can easily access your CSV file in the Databricks File System (DBFS) using the Databricks CLI and export it to your desired location. You can use the cp command to export the selected CSV files from the DBFS into your local storage.

Run the following command to call the Databricks File System command:

databricks fs

Now, you need the cp command to copy the file. You can use options for additional configuration.

 cp         copies files to or from DBFS
      -r, --recursive

Run the cp command as shown below:

databricks fs cp dbfs:/myfolder/extract.csv ./extract.csv

This command copies the extract.csv file from the DBFS folder and stores it in the current local directory.

You can add the -r option if your use case requires you to do this recursively for an entire folder.

databricks fs cp -r dbfs:/myfolder .

Note: The dot “.” in the end represents the current local folder.
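
If the export has to run on a schedule rather than be typed by hand, the same cp call can be wrapped in a small script. The following is a minimal sketch in Python, assuming the Databricks CLI is installed and already configured for your workspace. The helper names build_cp_command and export_from_dbfs are hypothetical, and the DBFS path is the illustrative one from this step.

```python
import subprocess


def build_cp_command(source, destination, recursive=False):
    """Build the argument list for `databricks fs cp`."""
    command = ["databricks", "fs", "cp"]
    if recursive:
        # -r copies an entire folder instead of a single file.
        command.append("-r")
    command.extend([source, destination])
    return command


def export_from_dbfs(source, destination, recursive=False):
    """Run the copy; raises CalledProcessError if the CLI exits with an error."""
    subprocess.run(build_cp_command(source, destination, recursive), check=True)


# Example call (requires a configured Databricks CLI):
# export_from_dbfs("dbfs:/myfolder/extract.csv", "./extract.csv")
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when a path contains spaces.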

Step 2: Use Salesforce Data Import Wizard to Import CSV

Now that you’ve successfully exported the CSV file from Databricks, it’s time to import it into Salesforce to complete the Databricks Salesforce connection. Data Import Wizard is Salesforce’s native feature that allows users to easily import up to 50,000 records at a time. Follow the steps below to upload your CSV file.

  • Go to “Setup” in Salesforce and search for “Data Import Wizard” in the “Quick Find” bar.
  • Click on “Data Import Wizard”.
  • Go through the prompt information and click on “Launch Wizard”.
  • Now, you’ll be prompted to select the type of data that you’re importing. You can either choose “Standard Objects” to import accounts, leads, contacts, solutions, articles, etc., or “Custom Objects” to import custom data.
  • Next, decide your type of import: “Add new records”, “Update existing records”, or “Add new and update existing records”.
  • Fill in the rest of the required fields depending on your use case. 
  • You will then be prompted to upload your CSV file.
  • Select either Comma or Tab for “Values Separated By”, depending on the delimiter used in your file.
  • Once you’re done with the configuration, click on “Next”.
  • Now, for Data Consistency, you need to map the fields between your source CSV data and target data.
  • Review the mapping and start your import.
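
The mapping step tends to go faster when the CSV headers already resemble the target field names. The following is a minimal sketch in Python, assuming hypothetical warehouse column names on the left and placeholder Salesforce field names on the right. Neither side comes from this article, so replace both with the names used in your own export and org. The helper rename_headers is also hypothetical.

```python
import csv

# Hypothetical mapping: export column name -> target field name.
FIELD_MAP = {
    "first_name": "FirstName",
    "last_name": "LastName",
    "company_name": "Company",
    "email_address": "Email",
}


def rename_headers(src_path, dst_path, field_map):
    """Copy a CSV, renaming mapped columns and dropping unmapped ones."""
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        # Fail early if the export lacks a column the mapping expects.
        missing = [c for c in field_map if c not in (reader.fieldnames or [])]
        if missing:
            raise ValueError(f"columns missing from export: {missing}")
        writer = csv.DictWriter(dst, fieldnames=list(field_map.values()))
        writer.writeheader()
        for row in reader:
            writer.writerow({field_map[c]: row[c] for c in field_map})
```

Dropping unmapped columns is a design choice made here for brevity. If you want to keep them, extend the mapping rather than relying on the wizard to ignore them.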
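
Because the Data Import Wizard accepts up to 50,000 records at a time, a larger export has to be split before upload. The following is a minimal sketch in Python that writes numbered chunk files, each repeating the header row. The helper name split_csv and the output naming scheme are hypothetical.

```python
import csv

MAX_RECORDS = 50_000  # Data Import Wizard limit per import, as stated in this step


def split_csv(src_path, out_prefix, max_records=MAX_RECORDS):
    """Split a CSV into files of at most max_records data rows, each with the header."""
    paths = []
    out = None
    writer = None
    with open(src_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:
            return paths  # empty input: nothing to write
        try:
            for index, row in enumerate(reader):
                if index % max_records == 0:
                    # Start a new chunk file and repeat the header in it.
                    if out is not None:
                        out.close()
                    path = f"{out_prefix}_{len(paths) + 1}.csv"
                    out = open(path, "w", newline="", encoding="utf-8")
                    writer = csv.writer(out)
                    writer.writerow(header)
                    paths.append(path)
                writer.writerow(row)
        finally:
            if out is not None:
                out.close()
    return paths
```

For example, split_csv("extract.csv", "extract_part") would produce extract_part_1.csv, extract_part_2.csv, and so on, and each file can then be uploaded through the wizard in turn.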

That’s it! With some basic knowledge of the Databricks CLI, you can easily establish a Databricks Salesforce connection. However, this method has its limitations, not the least of which is the time it takes every time you need to export and load a CSV. You can opt for a third-party solution if you don’t want to spend a lot of time importing CSV files and resolving data issues.


Databricks Salesforce integration is a good option to unleash the full potential of your customer data. Having all your customer data in Salesforce allows your Sales & Support Teams to make the most of it by studying and understanding the customer patterns, context, and trends. This article introduced you to Databricks and Salesforce and later provided you with a step-by-step guide to establishing a Databricks Salesforce connection. 

If you are an advanced user of Salesforce, you are most probably dealing with a lot of Data Sources, both internally and from other Software-as-a-Service (SaaS) offerings. Having all this data in a central repository like Databricks helps in analysis and accelerates the Decision-Making Process. Connecting Salesforce to Databricks or some other warehouse destination using a Data Integration tool like Hevo can save you a lot of time and effort.

Hevo Data, with its strong integration with 150+ Sources such as Salesforce and with BI tools, allows you to not only export data from sources and load it into destinations like Databricks, but also transform and enrich your data and make it analysis-ready, so that you can focus on your key business needs and perform insightful analysis using BI tools.

Give Hevo Data a try and sign up for a 14-day free trial today. Hevo offers plans & pricing for different use cases and business needs, so check them out!

Share your experience of working around Databricks Salesforce connection in the comments section below.

Raj Verma
Business Analyst, Hevo Data

Raj is a skilled data analyst with a strong passion for data analysis and architecture, having a flair for writing technical content as well. With extensive experience in handling marketing data, Raj has adeptly navigated abstract business problems to derive actionable insights that drive significant results.
