
Salesforce (Legacy) walkthrough

Sending Salesforce data into a target

important

Salesforce v49 is supported.

Using Data Integration, you can pull data from Salesforce and send that data into your target database.

Procedure

  1. Navigate to the Data Integration Account.
  2. Click Create New River from the top right-hand corner of the Data Integration page.
  3. Choose Data Source to Target as your river type.
  4. In the General Info tab, enter the river's name and description.
  5. Navigate to the Source tab. Find Salesforce in the list of data sources and select it.
  6. Define a Salesforce connection (this is the connection created earlier in the process).
note

If you do not yet have a Salesforce connection in your Data Integration account, you can create a new connection by clicking Create New Connection.

Pulling data from Salesforce

  • Salesforce data is organized in tables called entities.
  • The entities can be regular Salesforce entities or custom ones.
  • You can pull all the data from a given table, or only part of it, by filtering on an incremental field. For example, to retrieve the accounts created since 1.1.25, filter on the createdDate field and pull only records whose value is later than 1.1.25.
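As a rough illustration, an incremental pull like the one described above can be expressed as a SOQL query. This is a hand-written sketch, not code generated by Data Integration; the entity, fields, and date are illustrative:

```python
# Minimal sketch of building an incremental SOQL query; the entity, fields,
# and date below are illustrative, not tied to a specific river.
def incremental_query(entity, fields, incremental_field, since):
    """Build a SOQL query that pulls only rows whose incremental
    field is later than `since` (SOQL datetime literals are unquoted)."""
    return (f"SELECT {', '.join(fields)} FROM {entity} "
            f"WHERE {incremental_field} > {since}")

q = incremental_query("Account", ["Id", "Name"],
                      "CreatedDate", "2025-01-01T00:00:00Z")
# → "SELECT Id, Name FROM Account WHERE CreatedDate > 2025-01-01T00:00:00Z"
```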

Bulk, SOQL, and Metadata

There are three ways to extract data from Salesforce with Data Integration:

  1. Bulk API - The newer, preferred way to extract large sets of data from Salesforce. It is limited to 10,000 batches in a 24-hour sliding window.
  2. SOAP API/SOQL - Extract data utilizing the SOAP API; this method tends to be slower than the Bulk API.
  3. Metadata - A report that extracts metadata information on one or more entities. Each row in a metadata report represents the definition of a field from the selected entities. The 'picklist values' field holds the naming conventions between the API and the UI for the closed list of picklist values. The Metadata report is useful when comparing API to UI naming conventions.

Additional features for Bulk API mode

The Bulk API has an additional feature, available for the All extraction method only, called PK Chunking.

  • PK Chunking is an automatic primary key chunking that splits bulk queries on large tables into chunks based on the record IDs, or primary keys of the queried records.
  • Supported objects: Account, Campaign, CampaignMember, Case, CaseHistory, Contact, Event, EventRelation, Lead, LoginHistory, Opportunity, Task, User, and custom objects. A custom object is any object whose API name ends with __c.
  • The available range for PK chunking is between 100,000 and 250,000 records per chunk. PK chunking speeds up the extraction of large datasets.
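For reference, when PK Chunking is requested directly against Salesforce's Bulk API, it is enabled through the Sforce-Enable-PKChunking request header. A minimal sketch, assuming only the chunk-size range stated above:

```python
# Sketch of the request header Salesforce's Bulk API uses to enable
# PK Chunking; the valid range below follows the limits stated above.
MIN_CHUNK, MAX_CHUNK = 100_000, 250_000

def pk_chunking_header(chunk_size):
    """Return the header dict enabling PK Chunking with the given chunk size."""
    if not MIN_CHUNK <= chunk_size <= MAX_CHUNK:
        raise ValueError(f"chunk size must be between {MIN_CHUNK} and {MAX_CHUNK}")
    return {"Sforce-Enable-PKChunking": f"chunkSize={chunk_size}"}
```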

Bulk API limitations

  • Batches for data loads can consist of a single file not larger than 10 MB.
  • A batch can contain a maximum of 10,000 records.
  • A batch can contain a maximum of 10,000,000 characters for all the data in a batch.
  • A field can contain a maximum of 32,000 characters.
  • A maximum of 10,000 batches is allowed in a 24-hour sliding window.
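The record and character limits above can be checked before a batch is submitted. A hedged sketch, not Data Integration's code; the 10 MB file limit would be checked on the serialized payload and is omitted here:

```python
# Pre-flight check against the Bulk API per-batch limits listed above.
MAX_RECORDS = 10_000        # records per batch
MAX_CHARS = 10_000_000      # characters across all data in a batch
MAX_FIELD_CHARS = 32_000    # characters per field

def batch_within_limits(records):
    """records: list of dicts mapping field name to string value."""
    values = [v for rec in records for v in rec.values()]
    return (len(records) <= MAX_RECORDS
            and sum(len(v) for v in values) <= MAX_CHARS
            and all(len(v) <= MAX_FIELD_CHARS for v in values))
```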

Configuring a Salesforce river

Choose your River mode from the following:

  • Multi-Tables: Load multiple tables (entities) simultaneously from Salesforce to your target.
  • Single: Choose a single table (entity) to load into a single target.

Multi-table mode

Load multiple tables simultaneously from Salesforce to your target. You can choose Bulk API or SOQL API as the Extraction API.

Extraction API

After selecting the Extraction API, the metadata in the Mapping tab is pulled according to your selection.

note

When switching between these options, the metadata of tables and columns in the Mapping tab will be updated accordingly.

Auto-detect new fields in each run

By default, Data Integration refreshes the extracted tables' metadata before each run, so new fields are added automatically when the river pulls data. Turning this option off makes the river run with its saved metadata, without refreshing it before execution; you can then update the metadata manually by clicking Reload Metadata in the Mapping tab. When new fields are added to the column mapping, the target names and data types of existing mappings remain unchanged.
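The merge behavior described above can be sketched as follows. The function and field names are hypothetical, not Data Integration's API; the point is only that new fields are appended while saved mappings stay untouched:

```python
# Hypothetical sketch of auto-detecting new fields: fields missing from the
# saved mapping are appended; existing mappings keep their saved target
# names and data types.
def merge_metadata(saved_mapping, fresh_fields):
    merged = dict(saved_mapping)          # existing mappings stay unchanged
    for name, dtype in fresh_fields.items():
        if name not in merged:
            merged[name] = {"target": name, "type": dtype}
    return merged

saved = {"Id": {"target": "id", "type": "string"}}
fresh = {"Id": "string", "Rating__c": "picklist"}   # Rating__c is new
merged = merge_metadata(saved, fresh)
# "Id" keeps its saved target name; "Rating__c" is newly added
```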

Mapping

  • In the Mapping tab, select the entities to load.
  • Click Edit to edit an individual entity's table settings.

Table settings in a Multi-table mode

On the Table Settings tab, you can perform the following:

  • Change the loading mode.
  • Change the extraction method. If you select Incremental, you can define which field will be used to define the increment.
  • Filter by an expression used as a WHERE clause to fetch the selected data from the table.
  • Set PK Chunking for entities that support primary key chunking.
  • Enable the option to include deleted rows in the extracted data.

Filter

Apply any filter to act as a WHERE clause while pulling the data.

note
  • Pull all the data without filters from Salesforce, and then filter it using the Logic in Data Integration. If you decide to use filters in Salesforce, make sure your filters follow the syntax supported by Salesforce.
  • Combine multiple filters using the AND and OR operators.
  • Number and string values must be quoted, while dates and boolean values must not be quoted. For example: billingcountry='United States' OR billingpostalcode='48226' AND isdeleted=FALSE
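Filters like the example above can also be composed programmatically. A small sketch with illustrative helper names, assuming the usual SOQL precedence where AND binds tighter than OR:

```python
# Illustrative helpers for combining filter conditions with AND / OR.
def and_(*conds):
    return " AND ".join(conds)

def or_(*conds):
    return " OR ".join(conds)

where = or_("billingcountry='United States'",
            and_("billingpostalcode='48226'", "isdeleted=FALSE"))
# → "billingcountry='United States' OR billingpostalcode='48226' AND isdeleted=FALSE"
```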

Include deleted rows

This option includes the rows that were marked as deleted by the Salesforce soft-delete mechanism. In this case, the isDeleted field's value will be True.

warning

Selecting this option can affect the performance of the river.

Entity

Select the entity for which you want to pull the data. Click the input to get a list of all available entities in the given Salesforce account.

Extract method

Using Data Integration, you can pull your data incrementally or pull the entirety of the data that exists in the table:

  • All: Fetch all the data in the table, in chunks.
  • Incremental: Pull only the rows selected by an incremental column in the table.
  • Incremental field: Click the input to get a list of all available columns in the selected entity. This must be a field with incremental values that can be filtered, such as dates, running numbers, and more.
  • Incremental type: After selecting the incremental field, select what kind of increment the selected incremental field is. It can be Date, Timestamp, or Running Number.
  • Time period: Select the time range of data to pull from the selected entity.
note

Data Integration manages the increments across runs using the maximum value found in the data, so each run retrieves everything since the previous run and no data gaps occur. You only need to configure your river once.

To pull the data from Salesforce:

  1. Select the Start Date: Data Integration pulls only data with the selected incremental field later than this start date.
  2. Select the End Date: Data Integration pulls only data with the selected incremental field earlier than this end date. Leave the End Date field empty to retrieve data up to the time the river runs.
  3. After the river runs, the start date will be updated with the value of the end date, and the end date will be updated with an empty value. The next run will extract data later than the current end date.
  4. Include End Value: Enable it to include records with the end value in the results. If you turn off this checkbox, then those records will be pulled in the next run.
note

The Start Date does not advance if a River run is unsuccessful. If you want to remove this default setting, click More options and select the checkbox to advance the start date even if the River run is unsuccessful (Not recommended).
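The window-advancement steps above can be sketched as follows; the function is illustrative, not Data Integration's API:

```python
# Sketch of how the incremental window advances between runs, per the
# steps above: on success, the end date becomes the next start date.
def advance_window(start, end, run_succeeded):
    if not run_succeeded:
        return start, end   # by default, the start date does not advance on failure
    return end, None        # empty end date: next run pulls up to its own run time

start, end = advance_window("2025-01-01", "2025-02-01", run_succeeded=True)
# → start == "2025-02-01", end is None
```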

Interval chunk size

Salesforce may have difficulty returning large amounts of data, so splitting a request into smaller ones can help. With interval chunks, the given start date and end date are divided into smaller time periods, and the data is pulled separately for each period.

important
  • The results count in both cases (pulling data without interval chunks and with any selected interval chunks) will be the same. Chunking is only for improving the performance of the connection against the Salesforce API.
  • If using the interval chunks, make sure to pull the date column to identify the time of each record in the results.
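Interval chunking can be sketched as splitting the start/end range into fixed-size windows. The chunk size below is illustrative, and the sketch is not Data Integration's internal code:

```python
# Split a [start, end) date range into smaller interval chunks. Chunking
# changes the size of each request, not the overall result set.
from datetime import datetime, timedelta

def interval_chunks(start, end, chunk):
    """Yield (chunk_start, chunk_end) pairs covering [start, end)."""
    cur = start
    while cur < end:
        nxt = min(cur + chunk, end)
        yield cur, nxt
        cur = nxt

chunks = list(interval_chunks(datetime(2025, 1, 1), datetime(2025, 1, 31),
                              timedelta(days=7)))
# 30 days → 5 chunks: four 7-day windows plus a 2-day remainder
```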

Mapping attribute

Select the fields to pull from the selected entity.

  1. Click Auto Mapping to execute the mapping process.
  2. Data Integration samples the data from Salesforce and presents the fields in the given table (regular fields and custom fields). Each field has its own type. This type cannot be changed.
  3. You can search for a specific field in the table and then remove it.
  4. Remove one or more fields: Mark each field by clicking the button to the left of the field, and then click the trash icon in the top-right corner of the mapping table.
  5. To clear all the fields in the mapping table, click the "V" sign in the top left-hand corner to mark all the fields in the table and then click the Delete icon.
  6. Any field in the mapping table will be pulled from the entity table. If there are unnecessary fields, remove them from the mapping table to improve API performance.
  7. The fields in this mapping table will be copied to the target attributes mapping when clicking Auto Mapping.

Non-queryable objects

Queryable objects are Salesforce objects whose data can be retrieved through the Salesforce API via queries. Although most standard and custom objects in Salesforce are queryable by default, there are certain exceptions due to security considerations.

Data Integration does not display the non-queryable objects in your Salesforce object list. To learn more about the list of non-queryable objects, refer to the non-queryable objects topic.

Activity logs

The Activity Logs provide insight into the processes running in a Salesforce River.
