SharePoint walkthrough

Note: The SharePoint connector is currently in Early Access. For more information, refer to Feature release stages.

The SharePoint connector lets you extract files from Microsoft SharePoint Online document libraries and load them into a supported target. It supports CSV, Excel, JSON Lines, and pass-through file types.

Before you begin, ensure you have configured a SharePoint connection.

Key concepts

Incremental sync: Retrieves only files added or modified since the last Data Flow run. The connector uses Microsoft's delta token mechanism to detect changes. The first run performs a full scan and saves a delta token; subsequent runs use that token to fetch only new or changed files. The connector manages this automatically.

Delta token: A pointer issued by Microsoft Graph that marks the state of a SharePoint document library at a specific point in time. The connector stores and manages this token automatically after each run.
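
The incremental behaviour described above can be illustrated with a small in-memory simulation. This is a minimal sketch, not the connector's implementation, and it does not call Microsoft Graph. The `FakeLibrary` class, its `delta` method, and the integer token are hypothetical stand-ins chosen for illustration; a real delta token is an opaque value issued by Microsoft.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class FakeLibrary:
    """Hypothetical stand-in for a document library that keeps a change log."""

    changes: List[str] = field(default_factory=list)  # ordered log of touched paths

    def touch(self, path: str) -> None:
        """Record that a file was added or modified."""
        self.changes.append(path)

    def delta(self, token: Optional[int]) -> Tuple[List[str], int]:
        """Return the files changed since `token`, plus a new token.

        A token of None means there was no previous run, so every file
        in the log is returned (the full scan).
        """
        start = 0 if token is None else token
        changed = sorted(set(self.changes[start:]))
        return changed, len(self.changes)


library = FakeLibrary()
library.touch("reports/q1.csv")
library.touch("reports/q2.csv")

# First run: no saved token, so the whole library is scanned.
first_batch, saved_token = library.delta(None)

# Files change between runs.
library.touch("reports/q2.csv")  # modified
library.touch("reports/q3.csv")  # added

# Second run: only changes recorded after the saved token are returned.
second_batch, saved_token = library.delta(saved_token)

print(first_batch)   # ['reports/q1.csv', 'reports/q2.csv']
print(second_batch)  # ['reports/q2.csv', 'reports/q3.csv']
```

The unchanged file `reports/q1.csv` is absent from the second batch, which is the point of saving the token between runs.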

Supported file types

CSV: parsed with configurable delimiter, quote character, header rows, and newline delimiter options.
Excel (.xlsx)
JSON Lines
Other: pass-through mode. The connector moves the file to the target as-is without parsing or schema inference. Use the File Pattern field to filter by extension.
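
To make the CSV options concrete, the following sketch shows how a delimiter, quote character, number of header rows to skip, and newline delimiter could each affect parsing. It uses Python's standard `csv` module and is not the connector's parser. The `parse_csv` helper and its parameter names are hypothetical.

```python
import csv


def parse_csv(text, delimiter=",", quote_char='"', header_rows_to_skip=0, newline="\n"):
    """Split `text` into rows using the given options (hypothetical helper).

    Simplification: the text is split on `newline` before parsing, so a
    quoted field that itself contains the newline delimiter is not handled.
    """
    lines = text.split(newline)
    # Drop the empty string left behind by a trailing newline delimiter.
    if lines and lines[-1] == "":
        lines.pop()
    rows = list(csv.reader(lines, delimiter=delimiter, quotechar=quote_char))
    # Discard the leading header rows.
    return rows[header_rows_to_skip:]


# Semicolon-delimited sample with one header row and Windows line endings.
sample = 'id;name\r\n1;"Smith; Jane"\r\n2;Lee\r\n'
rows = parse_csv(sample, delimiter=";", header_rows_to_skip=1, newline="\r\n")

print(rows)  # [['1', 'Smith; Jane'], ['2', 'Lee']]
```

The quoted value `"Smith; Jane"` stays in one field even though it contains the delimiter, which is the role of the quote character.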

Procedure

In Data Integration, click Create Data Flow.
Select Source to Target Flow as your Data Flow type.
In the source list, select SharePoint.
In the Source Connection field, select an existing connection or click New Connection to create one. You can also click Edit to modify the selected connection or Test Connection to verify it.
In the Site URL field, enter the full SharePoint site URL, for example, https://contoso.sharepoint.com/sites/finance. Click the field to load available sites using autocomplete, or type the URL directly.
In the Folder Path field, enter the folder to extract files from, or click the field to browse available folders using autocomplete. Subfolders are included automatically.
In the Extract Method dropdown, select one of the following:
- All: performs a full sync on every run.
- Incremental: retrieves only new or changed files since the last run.
Optionally, configure the following fields:
- Prefix: narrows the scope of extraction to a file path prefix, for example, reports/newReports.
- File Pattern: filters files by name pattern using * as a prefix or suffix, for example, *.csv.
- Number of Files to Pull: limits the number of files retrieved per run. Leave empty to retrieve all. Use a specific number for test runs only.
In the File Type toggle, select the format that matches your source files: CSV, Excel, JSON Lines, or Other.
If you selected CSV, configure the following additional options:
- Delimiter: the character that separates fields. Default: ,
- Quote Char: the character used to enclose field values. Default: "
- Header Rows To Skip: the number of header rows to ignore before reading data.
- Newline Delimiter: the character used to indicate the end of a row.
Select Compressed File if your source files are compressed.
Select Handling Special Characters if your files contain special characters.
Note: Enabling Handling Special Characters may impact performance due to additional processing.
On the Target tab, select a target from the Select Data Target panel and complete the target-specific configuration.
On the Settings tab, configure the following options as needed:
- Schedule: click Schedule Me! to set up an automatic run schedule. Schedules run in UTC.
- Timeouts: set how long the Data Flow can run before it times out. The default is 12 hours.
- Notifications: enable email notifications for On Failure, On Warnings, or On Run Time Threshold events.
Enter a name for the Data Flow and click Save.
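
The optional Prefix, File Pattern, and Number of Files to Pull fields in the procedure above each narrow the set of files that a run retrieves. The sketch below shows one way these filters could combine. It is illustrative only: the `select_files` and `matches_pattern` helpers are hypothetical. It also assumes that the pattern is applied to the file name rather than the full path, that matching is case-sensitive, and that the limit is applied last.

```python
from posixpath import basename


def matches_pattern(name, pattern):
    """Match `name` against a pattern with * as a prefix or suffix."""
    if pattern.startswith("*"):
        return name.endswith(pattern[1:])    # "*.csv" matches names ending in ".csv"
    if pattern.endswith("*"):
        return name.startswith(pattern[:-1])  # "sales*" matches names starting with "sales"
    return name == pattern                    # no wildcard: exact match


def select_files(paths, prefix="", pattern="*", limit=None):
    """Apply the prefix, then the name pattern, then the optional file limit."""
    selected = [
        path
        for path in paths
        if path.startswith(prefix) and matches_pattern(basename(path), pattern)
    ]
    return selected if limit is None else selected[:limit]


paths = [
    "reports/newReports/jan.csv",
    "reports/newReports/feb.csv",
    "reports/newReports/notes.txt",
    "reports/archive/old.csv",
]

all_csv = select_files(paths, prefix="reports/newReports", pattern="*.csv")
test_run = select_files(paths, prefix="reports/newReports", pattern="*.csv", limit=1)

print(all_csv)   # ['reports/newReports/jan.csv', 'reports/newReports/feb.csv']
print(test_run)  # ['reports/newReports/jan.csv']
```

Leaving `limit` as `None` corresponds to leaving Number of Files to Pull empty, so every matching file is returned.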

Configuration reference

Field	Required	Description
Source Connection	Yes	Select an existing connection, or click New Connection to create one.
Site URL	Yes	The full SharePoint site URL, for example, `https://contoso.sharepoint.com/sites/finance`. Supports autocomplete.
Folder Path	Yes	The folder to extract files from. Subfolders are included automatically. Supports autocomplete.
Extract Method	No	All: performs a full sync on every run. Incremental: retrieves only new or changed files since the last run.
Prefix	No	A file path prefix to narrow the scope of extraction, for example, `reports/newReports`.
File Pattern	No	A pattern to match specific file names. Use `*` as a prefix or suffix, for example, `*.csv`.
Number of Files to Pull	No	Limits the number of files retrieved per run. Leave empty to retrieve all. Use a specific number for test runs only.
File Type	No	Toggle to select the format: CSV, Excel, JSON Lines, or Other.
Delimiter	No	Visible when File Type is CSV. The character that separates fields. Default: `,`
Quote Char	No	Visible when File Type is CSV. The character used to enclose field values. Default: `"`
Header Rows To Skip	No	Visible when File Type is CSV. The number of header rows to skip before reading data.
Newline Delimiter	No	Visible when File Type is CSV. The character used to indicate the end of a row.
Compressed File	No	Enable if your source files are compressed.
Handling Special Characters	No	Enable to process files that contain special characters. Enabling this option may impact performance due to additional processing.
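
As a rough illustration of the table above, the sketch below checks a settings dictionary for the three required fields and fills in the documented CSV defaults. The connector is configured through the UI, so the dictionary keys, the `check_config` helper, and the connection name are all hypothetical and do not represent an actual configuration format.

```python
# Hypothetical key names mirroring the fields marked "Yes" in the table.
REQUIRED_FIELDS = ("source_connection", "site_url", "folder_path")

# Defaults taken from the Delimiter and Quote Char rows.
DEFAULTS = {"delimiter": ",", "quote_char": '"'}


def check_config(config):
    """Return the names of required fields that are missing or empty."""
    return [name for name in REQUIRED_FIELDS if not config.get(name)]


config = {
    "source_connection": "my-sharepoint-connection",  # hypothetical name
    "site_url": "https://contoso.sharepoint.com/sites/finance",
    "extract_method": "Incremental",
    "file_type": "CSV",
}

missing = check_config(config)
effective = {**DEFAULTS, **config}  # explicit settings override the defaults

print(missing)                 # ['folder_path']
print(effective["delimiter"])  # ,
```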

Tips and common issues

A Global Administrator must grant admin consent during Azure setup. Without this approval, the connector returns a 403 permission error.
The Client Secret value is visible only once in the Azure portal. Copy and store it immediately after creation.
The Site URL must follow the format https://{tenant}.sharepoint.com/sites/{site-name}.
When File Type is set to Other, use the File Pattern field to filter by extension, for example, *.pdf.
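
The Site URL format mentioned in the tips above can be checked before it is entered. The sketch below uses a regular expression for that purpose. The `parse_site_url` helper is hypothetical, and the character sets allowed for the tenant and site name are assumptions, not rules taken from Microsoft or from the connector.

```python
import re

# Assumed shape: https://{tenant}.sharepoint.com/sites/{site-name}
SITE_URL_RE = re.compile(
    r"^https://(?P<tenant>[A-Za-z0-9-]+)\.sharepoint\.com/sites/(?P<site>[^/?#]+)/?$"
)


def parse_site_url(url):
    """Return (tenant, site_name) if `url` matches the expected format, else None."""
    match = SITE_URL_RE.match(url.strip())
    if match is None:
        return None
    return match.group("tenant"), match.group("site")


print(parse_site_url("https://contoso.sharepoint.com/sites/finance"))  # ('contoso', 'finance')
print(parse_site_url("https://contoso.sharepoint.com/finance"))        # None
```

The second URL is rejected because it lacks the `/sites/` segment.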