Setting up Google Cloud Storage as a target

Setting up Google Cloud Storage (GCS) as your target data platform

Overview

Welcome to Getting Started with Data Integration and Google Cloud Storage (GCS).

This guide shows you how to create a Google Cloud service account JSON key, how to create a Google Cloud Storage bucket, and how to get the right credentials for using Google Cloud Storage with Data Integration. At the end of the guide, you will do a quick setup in Data Integration to connect your Google Cloud Storage.

Before you use this guide, ensure you have signed up for Google Cloud Platform and have a console admin user.

If you don’t have one of these prerequisites, you can start here.

Create a service account user for Data Integration

Data Integration uploads your source data to a Google Cloud Storage bucket. You therefore need to create a service account user in the Google Cloud Platform Console that has access to the relevant bucket and to the relevant BigQuery project.

So, first of all, let’s create a user for Data Integration.

How do we do that?

  1. Sign in to the Google Cloud Platform Console.
  2. Go to IAM & Admin > Service Accounts and click CREATE SERVICE ACCOUNT.
  3. In the wizard:
    1. Set your Service Account name (e.g., Data_Integration User) and click CREATE AND CONTINUE.
    2. Grant the service account access to the project by setting roles: click the drop-down list and select BigQuery Admin, then click ADD ANOTHER ROLE and repeat the process for Storage Admin.
    3. Copy your Service Account ID / Email from the service account list. You will enter it later in a Data Integration connection.
  4. Now let's create a key for the service account:
    1. Go to the service account screen, locate the service account you've just created, and click it.
    2. In the new service account screen, click Keys.
    3. Click ADD KEY.
    4. Choose the key type JSON and click CREATE.
    5. Your JSON secret key will be downloaded. Keep it in a safe place.
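
Before entering the key's details into Data Integration, it can help to sanity-check the downloaded file. The following is a minimal sketch in Python using only the standard library. The function names and the file name in the usage note are hypothetical; the field names checked (type, project_id, private_key, client_email) are those normally present in a Google service account JSON key.

```python
import json

# Fields that a Google service account JSON key normally contains.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def find_key_problems(key: dict) -> list:
    """Return a list of problems; an empty list means the key looks usable."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected key type: {key['type']}")
    return problems


def load_key(path: str) -> dict:
    """Read the downloaded JSON key file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, find_key_problems(load_key("data-integration-key.json")) returns an empty list when all expected fields are present. The check only inspects the file's structure; it does not contact Google Cloud or prove that the key is still active.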

Enable Cloud Storage and GCS API

  1. Go to APIs & Services and click ENABLE APIS AND SERVICES.
  2. Search for Google Cloud Storage JSON API and click Enable API.

Create a Google Cloud Storage Bucket

Data Integration needs a Google Cloud Storage bucket to serve as a FileZone, a staging area where your data lands before it is loaded into BigQuery. You can also use the FileZone bucket or its objects as a base for other Hadoop or Apache Spark operations run by Google Dataproc or by your other services.

So, let's create a Google Cloud Storage bucket for Data Integration:

  1. Sign in to the Google Cloud Platform Console.
  2. Go to Storage > Browse, and click CREATE BUCKET.
  3. In the wizard:
    1. Set Bucket Name, for example: project_name_data_integration_file_zone
    2. Set your bucket to be Regional (Multi-Region is not stable for loading) and choose your preferred location.
    3. Click CREATE.
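
The same bucket can also be created from a script instead of the console. The sketch below assumes the google-cloud-storage Python package is installed and that the JSON key downloaded earlier is available locally. The function names are hypothetical, and the name check covers only a subset of Google's bucket naming rules.

```python
import re

# Subset of the Cloud Storage naming rules: 3-63 characters, lowercase letters,
# digits, dashes, underscores, and dots, starting and ending with a letter or digit.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Check a candidate bucket name against the simplified rules above."""
    if not _BUCKET_NAME.fullmatch(name):
        return False
    # Names may not start with "goog" or contain "google".
    return not name.startswith("goog") and "google" not in name


def create_file_zone_bucket(key_path: str, bucket_name: str, location: str):
    """Create a bucket in a single region, authenticating with the JSON key."""
    # Imported here so the name check works without the client library installed.
    from google.cloud import storage

    if not is_valid_bucket_name(bucket_name):
        raise ValueError(f"invalid bucket name: {bucket_name!r}")
    client = storage.Client.from_service_account_json(key_path)
    # Passing a single region (for example "us-central1") gives a regional bucket.
    return client.create_bucket(bucket_name, location=location)
```

Passing a single region as the location, rather than a multi-region such as "US", matches the Regional setting recommended in the wizard steps.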

Configure your Google Cloud Storage bucket in Data Integration

Let’s create a new connection for your Google Cloud Storage. Enter the credentials for your Google Cloud Platform service account:

  1. Connection Name
  2. Project Id (can be found in the Home section of the Google Cloud Platform Console)
  3. Project Number (optional; can be found in the Home section of the Google Cloud Platform Console)
  4. Service Account Email: the Service Account ID that you copied from the service account list after completing the wizard.
  5. Region: the region in which your bucket was created
  6. Set your custom file zone to save the data in your own staging area (Optional).
  7. Click Test Connection at the bottom to test! Once a valid connection is made, save the connection.
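
Two of the values in the form can be read straight from the JSON key file instead of being copied from the console. The following is a minimal sketch; the function name and the dictionary layout are illustrative only and are not part of any Data Integration API. It assumes the standard project_id and client_email fields of a service account key.

```python
import json


def connection_fields(key: dict, connection_name: str, region: str) -> dict:
    """Map values from the service account key onto the connection form fields."""
    return {
        "Connection Name": connection_name,
        "Project Id": key["project_id"],
        "Service Account Email": key["client_email"],
        "Region": region,
    }


def connection_fields_from_file(key_path: str, connection_name: str, region: str) -> dict:
    """Load the JSON key from disk and build the same field mapping."""
    with open(key_path, encoding="utf-8") as handle:
        return connection_fields(json.load(handle), connection_name, region)
```

The Region value still has to come from the bucket you created, since the key file does not record it.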

Known issues

  • Sometimes the Storage Admin role does not include the storage.buckets.get permission by default. In this case, edit your GCP roles: duplicate the Storage Admin role, make sure the custom role you create has the storage.buckets.get permission, and then assign this custom role to your service account instead of Storage Admin (see the Create a service account user for Data Integration section of this document).
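
To confirm whether the service account actually holds this permission, you can ask Cloud Storage directly. The sketch below assumes the google-cloud-storage Python package is installed; the function names are hypothetical.

```python
REQUIRED_PERMISSIONS = ["storage.buckets.get"]


def missing_permissions(required, granted):
    """Return the required permissions that are absent from the granted list."""
    granted_set = set(granted)
    return [permission for permission in required if permission not in granted_set]


def check_bucket_permissions(key_path: str, bucket_name: str):
    """Ask Cloud Storage which of the required permissions the service account holds."""
    # Imported here so the helper above works without the client library installed.
    from google.cloud import storage

    client = storage.Client.from_service_account_json(key_path)
    # client.bucket() only builds a local reference and makes no API call,
    # so it works even when storage.buckets.get is missing.
    bucket = client.bucket(bucket_name)
    granted = bucket.test_iam_permissions(REQUIRED_PERMISSIONS)
    return missing_permissions(REQUIRED_PERMISSIONS, granted)
```

An empty result means the permission is in place; a result containing storage.buckets.get suggests that the custom role described above is needed.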

Conclusion

This guide showed you how to create a service account user for Data Integration and a Cloud Storage bucket. You now have a Google Cloud Storage connection that you can use in any river that targets it, and also as a source.