SharePoint walkthrough

The SharePoint connector lets you extract files from Microsoft SharePoint Online document libraries and load them into a supported target. It supports CSV, Excel, JSON Lines, and pass-through file types.

Before you begin, ensure you have configured a SharePoint connection.

Key concepts

Incremental sync: Retrieves only files added or modified since the last Data Flow run. The connector uses Microsoft's delta token mechanism to detect changes. The first run performs a full scan and saves a delta token; subsequent runs use that token to fetch only new or changed files. The connector manages this automatically.

Delta token: A pointer issued by Microsoft Graph that marks the state of a SharePoint document library at a specific point in time. The connector stores and manages this token automatically after each run.
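
As a rough illustration of the delta token mechanism, the sketch below pages through a Microsoft Graph drive delta query and returns the changed items together with the link to save for the next run. It is not the connector's implementation. `collect_changes`, `fetch_json`, `FULL_SCAN_URL`, and the fake pages are hypothetical names for this example, and authentication is omitted. The only Graph behavior it relies on is that delta responses carry `@odata.nextLink` while more pages remain and `@odata.deltaLink` on the last page.

```python
from typing import Callable, Optional

# Hypothetical entry point for a first (full) scan; DRIVE_ID is a placeholder.
FULL_SCAN_URL = "https://graph.microsoft.com/v1.0/drives/DRIVE_ID/root/delta"


def collect_changes(fetch_json: Callable[[str], dict],
                    saved_delta_link: Optional[str] = None):
    """Return (changed_items, new_delta_link) for one run.

    fetch_json is any function that GETs a URL and returns the parsed
    JSON body (an assumption of this sketch, not a connector API).
    """
    # No saved link means a first run, so start a full scan.
    # Later runs resume from the link saved by the previous run.
    url = saved_delta_link or FULL_SCAN_URL
    items = []
    while True:
        page = fetch_json(url)
        items.extend(page.get("value", []))
        if "@odata.nextLink" in page:
            # More pages remain in this run.
            url = page["@odata.nextLink"]
        else:
            # Last page: the delta link marks the library state for next time.
            return items, page.get("@odata.deltaLink")


# Fake responses standing in for Microsoft Graph, so the sketch runs offline.
FAKE_PAGES = {
    FULL_SCAN_URL: {"value": [{"name": "a.csv"}], "@odata.nextLink": "page2"},
    "page2": {"value": [{"name": "b.csv"}], "@odata.deltaLink": "delta1"},
    "delta1": {"value": [{"name": "b.csv"}], "@odata.deltaLink": "delta2"},
}

first_items, token = collect_changes(FAKE_PAGES.__getitem__)
next_items, token2 = collect_changes(FAKE_PAGES.__getitem__, token)
```

In this fake data the first run sees both files, while the second run, resumed from the saved link, sees only the file that changed.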

Supported file types

  • CSV: parsed with configurable delimiter, quote character, header rows, and newline delimiter options.
  • Excel (.xlsx)
  • JSON Lines
  • Other: pass-through mode. The connector moves the file to the target as-is without parsing or schema inference. Use the File Pattern field to filter by extension.
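
To make the File Pattern filter concrete, here is a minimal sketch of matching file names against a pattern that uses `*` as a prefix or suffix. The function `matches_pattern` is hypothetical, and the exact rules the connector applies (for example, case sensitivity) are assumptions; this version compares names case-sensitively.

```python
def matches_pattern(file_name: str, pattern: str) -> bool:
    """Match a name against a pattern with an optional leading or trailing *."""
    if len(pattern) > 1 and pattern.startswith("*") and pattern.endswith("*"):
        # *text* : the name must contain the text.
        return pattern[1:-1] in file_name
    if pattern.startswith("*"):
        # *.pdf : the name must end with the text after the star.
        return file_name.endswith(pattern[1:])
    if pattern.endswith("*"):
        # report* : the name must start with the text before the star.
        return file_name.startswith(pattern[:-1])
    # No wildcard: require an exact match.
    return file_name == pattern


files = ["q1.csv", "q1.pdf", "summary_2024.pdf", "notes.txt"]
pdfs = [f for f in files if matches_pattern(f, "*.pdf")]
summaries = [f for f in files if matches_pattern(f, "summary*")]
```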

Procedure

  1. In Data Integration, click Create Data Flow.
  2. Select Source to Target Flow as your Data Flow type.
  3. In the source list, select SharePoint.
  4. In the Source Connection field, select an existing connection or click New Connection to create one. You can also click Edit to modify the selected connection or Test Connection to verify it.
  5. In the Site URL field, enter the full SharePoint site URL, for example, https://contoso.sharepoint.com/sites/finance. Click the field to load available sites using autocomplete, or type the URL directly.
  6. In the Folder Path field, enter the folder to extract files from, or click the field to browse available folders using autocomplete. Subfolders are included automatically.
  7. In the Extract Method dropdown, select one of the following:
    • All: performs a full sync on every run.
    • Incremental: retrieves only new or changed files since the last run.
  8. Optionally, configure the following fields:
    • Prefix: narrows the scope of extraction to a file path prefix, for example, reports/newReports.
    • File Pattern: filters files by name pattern using * as a prefix or suffix, for example, *.csv.
    • Number of Files to Pull: limits the number of files retrieved per run. Leave empty to retrieve all. Use a specific number for test runs only.
  9. In the File Type toggle, select the format that matches your source files: CSV, Excel, JSON Lines, or Other.
  10. If you selected CSV, configure the following additional options:
    • Delimiter: the character that separates fields. Default: ,
    • Quote Char: the character used to enclose field values. Default: "
    • Header Rows To Skip: the number of header rows to ignore before reading data.
    • Newline Delimiter: the character used to indicate the end of a row.
  11. Select Compressed File if your source files are compressed.
  12. Select Handling Special Characters if your files contain special characters.
    Note: Enabling Handling Special Characters may impact performance due to additional processing.

  13. On the Target tab, select a target from the Select Data Target panel and complete the target-specific configuration.
  14. On the Settings tab, configure the following options as needed:
    • Schedule: click Schedule Me! to set up an automatic run schedule. Schedules run in UTC.
    • Timeouts: set how long the Data Flow can run before it times out. The default is 12 hours.
    • Notifications: enable email notifications for On Failure, On Warnings, or On Run Time Threshold events.
  15. Enter a name for the Data Flow and click Save.
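
The CSV options in step 10 can be pictured with Python's standard `csv` module. This is only a sketch of what each option means, not how the connector parses files. `parse_csv_text` and its parameter names are hypothetical, and it assumes that Header Rows To Skip drops the first N rows of the file. Empty lines are dropped, and quoted fields that contain the newline delimiter are not handled.

```python
import csv


def parse_csv_text(text, delimiter=",", quote_char='"',
                   header_rows_to_skip=0, newline_delimiter="\n"):
    """Parse CSV text using the four options described in step 10."""
    # Split into rows on the configured newline delimiter, dropping empty lines.
    lines = [line for line in text.split(newline_delimiter) if line != ""]
    # csv.reader accepts any iterable of strings, one string per row.
    reader = csv.reader(lines, delimiter=delimiter, quotechar=quote_char)
    rows = list(reader)
    # Drop the leading rows the caller asked to skip.
    return rows[header_rows_to_skip:]


sample = "Exported report\r\nid;name\r\n1;'Smith; John'\r\n2;Lee\r\n"
rows = parse_csv_text(sample, delimiter=";", quote_char="'",
                      header_rows_to_skip=1, newline_delimiter="\r\n")
```

Here the quote character keeps `Smith; John` together as one field even though it contains the delimiter.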

Configuration reference

| Field | Required | Description |
| --- | --- | --- |
| Source Connection | Yes | Select an existing connection, or click New Connection to create one. |
| Site URL | Yes | The full SharePoint site URL, for example, https://contoso.sharepoint.com/sites/finance. Supports autocomplete. |
| Folder Path | Yes | The folder to extract files from. Subfolders are included automatically. Supports autocomplete. |
| Extract Method | No | All performs a full sync. Incremental retrieves only new or changed files since the last run. |
| Prefix | No | A file path prefix to narrow the scope of extraction, for example, reports/newReports. |
| File Pattern | No | A pattern to match specific file names. Use * as a prefix or suffix, for example, *.csv. |
| Number of Files to Pull | No | Limits the number of files retrieved per run. Leave empty to retrieve all. Use a specific number for test runs only. |
| File Type | No | Toggle to select the format: CSV, Excel, JSON Lines, or Other. |
| Delimiter | No | Visible when File Type is CSV. The character that separates fields. Default: , |
| Quote Char | No | Visible when File Type is CSV. The character used to enclose field values. Default: " |
| Header Rows To Skip | No | Visible when File Type is CSV. The number of header rows to skip before reading data. |
| Newline Delimiter | No | Visible when File Type is CSV. The character used to indicate the end of a row. |
| Compressed File | No | Enable if your source files are compressed. |
| Handling Special Characters | No | Enable to process files that contain special characters. Enabling this option may impact performance due to additional processing. |

Tips and common issues

  • A Global Administrator must grant admin consent during Azure setup. Without this approval, the connector returns a 403 permission error.
  • The Client Secret value is visible only once in the Azure portal. Save it immediately after creation.
  • The Site URL must follow the format https://{tenant}.sharepoint.com/sites/{site-name}.
  • When File Type is set to Other, use the File Pattern field to filter by extension, for example, *.pdf.
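
A quick pre-check of the Site URL format from the tips above can catch typos before a run. The sketch below is an assumption-laden example: `is_valid_site_url` is a hypothetical helper, and the characters it allows in the tenant name and the optional trailing slash are guesses rather than documented rules.

```python
import re

# https://{tenant}.sharepoint.com/sites/{site-name}, with an optional trailing slash.
SITE_URL_RE = re.compile(
    r"^https://[A-Za-z0-9-]+\.sharepoint\.com/sites/[^/\s]+/?$"
)


def is_valid_site_url(url: str) -> bool:
    """Return True if the URL matches the expected site URL shape."""
    return SITE_URL_RE.match(url) is not None


ok = is_valid_site_url("https://contoso.sharepoint.com/sites/finance")
wrong_scheme = is_valid_site_url("http://contoso.sharepoint.com/sites/finance")
missing_sites = is_valid_site_url("https://contoso.sharepoint.com/finance")
```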