Google Cloud Storage connection
Create a service account and its JSON key, create a Google Cloud Storage bucket, and gather the credentials you need to use Google Cloud Storage with Data Integration.
Prerequisites
Ensure you have signed up for Google Cloud Platform and have a user with Admin access to the Google Cloud console. If you do not have an account, sign up for Google Cloud first.
Creating a service account user for Data Integration
Data Integration uses Google Cloud Storage to upload your source data. Create a service account in the Google Cloud Platform console with access to the relevant bucket and BigQuery project.
Procedure
-
Log in to the Google Cloud Platform console.
-
Go to IAM & Admin > Service Accounts.
-
Click Create Service Account.
-
In the Service Account section, set your Service Account name (for example, Data_Integration User) and click Create and Continue.
-
Grant the service account access to the project and note its email address:
a. Assign the "BigQuery Admin" role.
b. Click "Add Another Role" and assign the "Storage Admin" role.
c. Copy your Service Account Email for later use in Data Integration.
-
Create a key for the service account:
a. Go to the service account page, locate the service you created, and click on it.
b. On the service account details page, click the Keys tab.
c. Click Add Key.
d. Choose key type JSON and click Create.
e. Your JSON secret key is downloaded.
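Before uploading the key to Data Integration, it can help to confirm that the downloaded file is a complete service account key. The following is a minimal sketch using only the Python standard library. The function names and the file name in the usage note are hypothetical; the field names (type, project_id, private_key, client_email) are the ones a standard service account JSON key contains.

```python
import json
from pathlib import Path

# Fields that a standard service account JSON key file contains.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def load_key(path: str) -> dict:
    """Read and parse the downloaded JSON key file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def check_service_account_key(key: dict) -> list[str]:
    """Return a list of problems found in a parsed key; an empty list means it looks usable."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    # A key created with key type JSON for a service account has type "service_account".
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    return problems
```

For example, check_service_account_key(load_key("data-integration-key.json")) returns an empty list for a well-formed key, and the client_email value in the file is the Service Account Email that Data Integration asks for later.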
Enabling the Google Cloud Storage JSON API
Procedure
-
Go to APIs & Services and click ENABLE APIS AND SERVICES.
-
Search for Google Cloud Storage JSON API and click Enable API.
Creating a Google Cloud Storage bucket
Data Integration needs a Google Cloud Storage bucket to serve as a FileZone, where your data is held before it is loaded into BigQuery. You can also use the FileZone bucket or its objects as a base for other Hadoop or Apache Spark operations in Google Dataproc, or in your other services. Create a Google Cloud Storage bucket for Data Integration as follows.
Procedure
-
Sign in to the Google Cloud Platform console.
-
Go to Storage > Browse and click Create Bucket.
-
In the Create Bucket page:
a. Set Bucket Name, for example:
project_name_data_integration_file_zone
b. Set your bucket to be Regional (Multi-Region is not stable for loading) and choose your preferred location.
c. Click Create.
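Bucket names must follow the Cloud Storage naming rules, so it can be useful to check a candidate name before entering it in the Create Bucket page. The sketch below is a simplified check, not a complete implementation of every rule (for example, it does not detect names shaped like IP addresses or misspellings of "google"). The function name is hypothetical.

```python
import re

# Lowercase letters, digits, dashes, underscores, and dots;
# the name must start and end with a letter or digit.
_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9._-]*[a-z0-9]$")


def bucket_name_problems(name: str) -> list[str]:
    """Return the naming problems found; an empty list means the name passes this simplified check."""
    problems = []
    if len(name) < 3:
        problems.append("shorter than 3 characters")
    if "." in name:
        # Dotted names may be up to 222 characters, with each component at most 63.
        if len(name) > 222 or any(len(part) > 63 for part in name.split(".")):
            problems.append("dotted name or one of its components is too long")
    elif len(name) > 63:
        problems.append("longer than 63 characters")
    if not _NAME_RE.match(name):
        problems.append("invalid characters, or does not start and end with a letter or digit")
    if name.startswith("goog") or "google" in name:
        problems.append("must not start with 'goog' or contain 'google'")
    return problems
```

The example name from the procedure, project_name_data_integration_file_zone, passes this check, while a name with uppercase letters such as My_Bucket does not.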
Configuring your Google Cloud Storage bucket in Data Integration
Procedure
-
Navigate to the Data Integration Account.
-
Click Connections and select + New Connection.
-
From the source list, choose Google Cloud Storage.
-
Enter the details of your Google Cloud Platform service account:
a. Connection Name
b. Description (Optional)
c. Project Id: This is available in the Google Cloud Platform Home section.
d. Project Number (Optional): This is available in the Google Cloud Platform Home section.
e. Service Account email: The Service Account Email you copied when you created the service account.
f. Choose file: The JSON key file you downloaded when you created the service account key.
g. Region: The region of your bucket.
h. Default bucket: The default bucket Data Integration uses (the bucket you created).
-
Click Test Connection. If the connection succeeds, save the connection.
If you cannot get a valid connection set up, contact support.
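If the connection test fails, one thing worth checking is whether the values typed into the connection form agree with the JSON key file. The sketch below is an illustration only: the class and function names are hypothetical and are not part of Data Integration, and it assumes the bucket lives in the same project in which the service account was created.

```python
from dataclasses import dataclass


@dataclass
class GcsConnectionSettings:
    """The values entered in the connection form (hypothetical container)."""
    connection_name: str
    project_id: str
    service_account_email: str
    region: str
    default_bucket: str
    description: str = ""      # optional in the form
    project_number: str = ""   # optional in the form


def key_mismatches(settings: GcsConnectionSettings, key: dict) -> list[str]:
    """Compare the form values with the parsed JSON key and list any differences."""
    issues = []
    if settings.project_id != key.get("project_id"):
        issues.append("Project Id does not match project_id in the key file")
    if settings.service_account_email != key.get("client_email"):
        issues.append("Service Account email does not match client_email in the key file")
    return issues
```

An empty list means the Project Id and Service Account email are consistent with the key; it does not prove that the roles or the bucket are set up correctly.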
Known issues
In some cases, the Storage Admin role does not include the storage.buckets.get permission by default.
In this case, edit your GCP roles: duplicate the Storage Admin role by clicking Create from Role, ensure that the custom role you create includes the storage.buckets.get permission, and then assign this custom role to your service account instead of Storage Admin.
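The permission check described above can be sketched in a few lines. The example below assumes you have already obtained the list of permissions in the role you copied; the short list in the usage note is a placeholder, not the real contents of Storage Admin. The function names are hypothetical, and the dictionary keys (title, description, stage, includedPermissions) follow the layout of an IAM custom role definition.

```python
# The permission this section says may be missing.
REQUIRED_PERMISSIONS = {"storage.buckets.get"}


def missing_permissions(role_permissions) -> list[str]:
    """Return the required permissions that a role does not include."""
    return sorted(REQUIRED_PERMISSIONS - set(role_permissions))


def custom_role_definition(base_permissions, title: str) -> dict:
    """Build a custom role definition from a base list plus the required permissions."""
    permissions = sorted(set(base_permissions) | REQUIRED_PERMISSIONS)
    return {
        "title": title,
        "description": "Copy of Storage Admin with storage.buckets.get",
        "stage": "GA",
        "includedPermissions": permissions,
    }
```

For instance, missing_permissions(["storage.objects.get"]) reports storage.buckets.get as missing, while the includedPermissions of a definition built with custom_role_definition always contain it.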