Google BigQuery as a target
Configure Google BigQuery as a target in Source to Target Rivers to load data from supported source systems.
Prerequisites
Before you begin, ensure that you have:
- A valid Google BigQuery connection
- A target dataset that already exists in BigQuery
- Permission to create and load tables in the dataset
Refer to the Create a Google BigQuery connection topic to establish a connection.
Configuring Google BigQuery as the target
The configuration is organized into two tabs:
- Target
- Schema
Use the Target tab to specify where data is loaded in Google BigQuery and to configure optional staging and metadata settings.
- Navigate to the Target tab and select Google BigQuery (Target).
- Under Target Connection, select your BigQuery connection.
- Click Test Connection to ensure the target credentials are valid.
- In Data Loading Settings, select the destination Dataset ID from the drop-down menu.
- Click Advanced Settings to expand the configuration panel.
- (Optional) Turn on Add Rivery Metadata to include tracking columns like Rivery_last_update in your target table.
- Configure the Custom File Zone (Staging):
  - Bucket Name: Select the cloud storage bucket used for intermediate staging. For example, data-pipeline-staging.
  - File Zone Path: Define the directory structure using variables or text. For example, {river_name}_{river_id}.
  - File Zone Folders Period Partition: Select the folder partitioning frequency, such as By Day, from the drop-down menu.
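To illustrate how a File Zone Path template such as {river_name}_{river_id} combines with a By Day period partition, here is a minimal sketch; the variable names and folder layout are assumptions for illustration, not the product's internal implementation.

```python
# Sketch: resolve a File Zone Path template into a concrete staging prefix,
# appending a date-based folder for "By Day" period partitioning.
# The variable set and folder layout are illustrative assumptions.
from datetime import date

def resolve_file_zone_path(template: str, river_name: str, river_id: str,
                           partition: date) -> str:
    path = template.format(river_name=river_name, river_id=river_id)
    # "By Day" partitioning appends year/month/day folders.
    return f"{path}/{partition:%Y/%m/%d}"

print(resolve_file_zone_path("{river_name}_{river_id}", "orders_sync", "r42",
                             date(2024, 5, 1)))
```

With a By Month or By Year setting, only the coarser folder levels would be appended.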
Configure how Data Integration retrieves data from your selected source based on your synchronization requirements. At this stage, select the extraction method to use:
- Standard Extraction is suitable for scheduled or periodic data loads.
- CDC provides near-real-time replication by capturing INSERT, UPDATE, and DELETE events from SAP.
Standard Extraction
Standard extraction flow
Change Data Capture (CDC)
Configure CDC migration
- Performs a one-time full load of historical data. Automatically disables after the River runs successfully.
- Skips historical data and captures only new changes going forward.
Select tables for CDC
Common table configuration
Applies to both Standard Extraction and CDC extraction methods.
Configure table mapping
Use this procedure to map source columns to the target table and configure match keys, partitioning, and clustering.
Procedure
- From the table configuration page, select the Mapping tab.
- Review the source columns and verify the Target Column Name, Type, and Mode.
- Clear the checkbox for any column you do not want to load to the target.
- (Optional) Use Search to locate a column or click Reload Metadata to refresh the column list.
- To define match keys:
- Click Match Key.
- Select one or more columns and move them to Match Key Columns.
- Ensure at least one match key is defined when using Upsert Merge loading mode.
- To configure partitioning:
- Locate a column of type DATE or TIMESTAMP.
- Select the Partition checkbox.
- In Partition Granularity, select Year, Month, Day, or Hour (Hour is available only for TIMESTAMP columns).
- To configure clustering:
- Click Cluster.
- Select up to four columns and move them to the cluster list.
- Arrange the columns in the required order.
- (Optional) Click Add Calculated Column to define a derived column.
- Click Apply Changes.
Configure table source settings
To define how data is extracted from the selected source table, perform the following steps:
Procedure
- From the table configuration page, click the Table Source Settings tab.
- In the Extraction Method section, select one of the following:
- Incremental – Extracts only new or updated records based on the configured incremental field.
- All – Extracts all records from the table.
- In the Exporter Chunk Size field, specify the number of records to export per chunk.
- (Optional) In the Filters field, enter a filter expression to limit the data extracted from the source.
- Click Apply Changes to save the table source settings.
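Conceptually, an incremental extraction with a filter narrows the source query to rows newer than the last loaded value of the incremental field. The following is a sketch only; the table, column, and filter names are hypothetical, and the actual query is generated by the platform.

```
-- Hypothetical shape of an incremental extraction with a filter expression.
SELECT *
FROM source_schema.orders
WHERE updated_at > @last_loaded_updated_at  -- incremental field watermark
  AND status = 'COMPLETED';                 -- user-defined filter expression
```

Selecting All instead of Incremental would drop the watermark condition and extract every row, subject only to the filter.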
Configure table target settings
Use this procedure to define how data is written to the target table.
Procedure
-
From the table configuration page, click the Table Target Settings tab.
-
In the Target Table Name field, enter the table name.
-
Under Split Tables By, select one of the following:
- Don’t Split: Loads all data into a single target table.
- Timestamp: Splits data into multiple target tables based on a timestamp value.
- Expression: Splits data based on a custom expression.
-
In the SQL Dialect field, select the SQL dialect to use:
- Standard SQL (default)
- Legacy SQL (if supported by the target system)
Switching between Legacy SQL and Standard SQL
Legacy SQL and Standard SQL differ in supported features, syntax, and behavior. These differences must be carefully considered when switching between dialects.
Table reference syntax
Standard SQL uses the following format:
```
bigquery-public-data.samples.shakespeare
```
For more information, refer to the BigQuery documentation on migrating from Legacy SQL to Standard SQL.
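For comparison, the same public table referenced in each dialect; this is a sketch of the documented conventions (square brackets with a colon in Legacy SQL, backticks with dots in Standard SQL):

```
-- Legacy SQL: square brackets, colon between project and dataset
SELECT word FROM [bigquery-public-data:samples.shakespeare] LIMIT 5;

-- Standard SQL: backticks, dots throughout
SELECT word FROM `bigquery-public-data.samples.shakespeare` LIMIT 5;
```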
RECORD (STRUCT) fields
In Standard SQL, columns of type RECORD are automatically flattened. In Legacy SQL, RECORD columns are not flattened.
When switching from Legacy SQL to Standard SQL:
- A single RECORD column is expanded into multiple flattened columns.
- The resulting table schema may change.
If your target table already exists, do not switch back to Legacy SQL mode.
Append and merge considerations
Append and merge operations may behave differently between SQL dialects. Differences in record handling can result in unexpected behavior or data loss.
If a table contains partitions or clusters:
- Queries using Legacy SQL syntax fail.
- Rivers configured with Legacy SQL cannot query partitioned or clustered tables.
Record handling in Standard SQL
When using Standard SQL, record handling depends on whether the logic step is mapped and whether flattening is enabled.
Behavior matrix
| Logic step status | Flattening disabled | Flattening enabled |
| --- | --- | --- |
| Logic step not mapped | Records are loaded as-is | You are prompted to map the step |
| Logic step mapped | You are prompted to modify mapping or disable flattening | Records are flattened |

Behavior summary
- If the logic step is not mapped and flattening is disabled, records are loaded as-is.
- If the logic step is not mapped and flattening is enabled, mapping is required.
- If the logic step is mapped and flattening is enabled, records are flattened.
- If the logic step is mapped and flattening is disabled, you must modify the mapping or disable flattening.
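The matrix above can be condensed into a small decision helper. This is a sketch for clarity only; the returned strings paraphrase the documented behaviors.

```python
def record_handling(step_mapped: bool, flattening_enabled: bool) -> str:
    """Return the Standard SQL record-handling behavior for a logic step."""
    if not step_mapped:
        # Unmapped step: records pass through unless flattening forces mapping.
        return ("prompt: map the step" if flattening_enabled
                else "load records as-is")
    # Mapped step: flattening either applies or conflicts with the mapping.
    return ("flatten records" if flattening_enabled
            else "prompt: modify mapping or disable flattening")

print(record_handling(step_mapped=False, flattening_enabled=False))
```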
Creating or modifying partitions and clusters on an existing table
BigQuery does not support modifying partitioning or clustering on an existing table. To change these settings, the table must be dropped and recreated.
Procedure
Step 1: Create a temporary table
Create a temporary table containing the existing data and define the desired partitioning and clustering.
```
CREATE TABLE <dataset_name>.<tmp_table_name>
PARTITION BY TIMESTAMP_TRUNC(<timestamp_column>, {DAY | HOUR | MONTH | YEAR})
CLUSTER BY <column1>, <column2> AS
SELECT *
FROM <dataset_name>.<table_name>;
```
Example:
```
CREATE TABLE dwh.temp_users
PARTITION BY TIMESTAMP_TRUNC(inserted_at, DAY)
CLUSTER BY user_id AS
SELECT *
FROM dwh.users;
```
Step 2: Drop the original table
After confirming that the data exists in the temporary table, drop the original table.
Until this process is complete, the original table is unavailable.
```
DROP TABLE <dataset_name>.<table_name>;
```
Example:
```
DROP TABLE dwh.users;
```
Step 3: Copy the temporary table
Using the BigQuery console:
- Select the temporary table in the Resources pane.
- Click Copy table.
- In the Destination section:
- Choose the destination dataset
- Enter the original table name
- Complete the copy operation.
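As an alternative to the console, BigQuery also offers a DDL form of table copy that preserves partitioning and clustering. The following is a hedged sketch using the example names from this procedure; verify the statement's availability in your project before relying on it.

```
-- Copy the temporary table back under the original name.
-- CREATE OR REPLACE avoids a separate drop of any leftover table.
CREATE OR REPLACE TABLE dwh.users
COPY dwh.temp_users;
```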
Step 4: Drop the temporary table
```
DROP TABLE <dataset_name>.<tmp_table_name>;
```
Example:
```
DROP TABLE dwh.temp_users;
```
-
Enable Override Default Target Settings to customize target behavior for this table.
-
In the Table Loading Mode field, select how data is loaded into the target:
- Upsert Merge (Default): Matches rows based on the selected key columns and updates existing records or inserts new ones.
- Append Only: Appends incoming data to the target table without updating existing rows.
- Overwrite: Replaces the existing data in the target table with the incoming data during each load.
note
If no key columns are selected, the system automatically applies Append Only loading mode.
Configure merge method for a target table
Use this procedure to define how records are merged into the target table when the Upsert Merge loading mode is selected.
-
In the Merge Method field, select one of the following options:
- Merge (Default): Inserts new rows when they do not exist in the target and updates existing rows using a MERGE clause.
- Switch – Merge: Replaces the entire target table with the incoming data during the load operation.
warning
The values in the selected Merge Key columns must be unique within the source data. If duplicate keys are present in the source, the BigQuery river fails with the following error:
UPDATE/MERGE must match at most one source row for each target row.
-
Click Apply Changes to save the merge configuration.
warning
When using Merge (Optimized with Partition Pruning), updating a partition column value can create duplicate records because the original row may exist in a different partition.
Recommendation:
- Avoid changing partition column values.
- If partition changes are required, ensure only one record exists per key, or use Switch – Merge to rewrite the table.
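The Merge (Default) method corresponds to the standard BigQuery MERGE upsert pattern. The following sketch uses hypothetical table and column names; the actual statement is generated from your match key selection.

```
-- Sketch of the upsert the Merge method performs (names are hypothetical).
MERGE `dwh.users` AS target
USING `dwh.users_staging` AS source
ON target.user_id = source.user_id        -- match key column(s)
WHEN MATCHED THEN
  UPDATE SET target.email = source.email,
             target.updated_at = source.updated_at
WHEN NOT MATCHED THEN
  INSERT (user_id, email, updated_at)
  VALUES (source.user_id, source.email, source.updated_at);
```

This is also where the uniqueness warning above applies: if `dwh.users_staging` contains two rows with the same `user_id`, the MERGE fails because one target row would match multiple source rows.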