BigQuery as a source walkthrough
You can establish a new Data Flow within Data Integration, proceeding to the selection of extraction modes and the administration of your data storage. You can effectively harness Data Integration capabilities to extract data from BigQuery and seamlessly merge it into your data ecosystem.
Prerequisite
Ensure to create a connection for BigQuery source within Data Integration.
Data Flow modes
When using BigQuery as a source, you can select between two Data Flow modes:
- Multi-Tables (Standard extraction)
- Custom Query
Multi-tables (Standard extraction)
This mode in Data Integration combines data from different tables into a single schema before transferring it to the destination. It establishes table relationships to ensure consistent linking and loading. Data Integration Multi-Tables Data Flow mode mainly employs SQL queries for transformations, with scheduling or manual triggering options.
For more information about using Multi-tables (Standard extraction), refer to Databases Data Flow Modes topic.
Custom query
Data Integration Custom Query Data Flow mode empowers you to input data into the platform via personalized SQL queries, offering control over data loading and transformations. You can specify data and transformations precisely, using SQL, pulling from databases or data warehouses. Data Integration then schedules data for automatic or on-demand loading, ensuring real-time data access.
-
When using a custom query, Data Integration treats specific data types as strings. Data Integration encloses the data types labeled as TO_JSON_STRING in double quotes when converting them into strings.
-
When opting for the Incremental Extract mode, you have only 2 options: Datetime and Running number.
To retrieve further details about using Custom Query, refer to Databases Data Flow Modes topic.
Limitations
-
Data export bucket region - The region designated in the 'Region' input for the connection must align with the region of the Data export bucket (Custom Filezone).
-
Project-based connection - Data Integration enables connections on a per-region basis. When dealing with multiple BigQuery projects, establish a distinct connection for each project.
-
Mapping for primary and foreign keys - Data Integration automatically detects mapping match keys for native BigQuery Primary and Foreign key fields only. Data Integration may not recognize custom keys during the mapping process.
-
Partition and cluster fields - The mapping process within Data Integration does not allocate Partition and Cluster Fields.
-
Absence of deleted records indication - Data Integration does not flag removed data, so deleted records do not appear as absent within the Target.