Skip to main content
Feedback

PostgreSQL connection

This document offers a detailed guide on how to create a PostgreSQL connection, providing step-by-step instructions.

Prerequisites

  • You must first Whitelist our IPs - IP whitelisting is a security measure that permits network access exclusively to designated IP addresses. Whitelist IPs are only used for sources that have Internet access and a public IP address.
  • Configure PostgreSQL for Data Integration - Establish a user account specifically for Data Integration and assign the necessary privileges to the user.

Configure PostgreSQL for Data Integration

Before integrating PostgreSQL with Data Integration, certain configuration steps need to be followed. Note that the instructions provided here are for Standard extraction. If Change Data Capture extraction is required, refer to the specific configuration documentation.

Create a user for Data Integration

Execute the following SQL command to create a new user for Data Integration, replacing <'myUsername'> with the desired username and <'password'> with an encrypted password:

CREATE USER `<myUsername>` WITH ENCRYPTED PASSWORD '`<password>`';
info

For users considering Change Data Capture usage, please note that special characters such as +, ?, #, &, {, } are not permitted in passwords. However, you can use the following characters: -, ., _, ~, $, and !.

Grant permissions to the user

Grant the new user access to the table or schema from which you want to extract data. Use the following commands, replacing <'database'>, <'schema'>, and <'table'> with the appropriate names:

  • Grant connect privilege on the database:
    GRANT CONNECT ON DATABASE `<database>` TO `<myUsername>`;
  • Grant usage privilege on the schema:
    GRANT USAGE ON SCHEMA `<schema>` TO `<myUsername>`;
  • Grant select privilege on all tables in the schema:
    GRANT SELECT ON ALL TABLES IN SCHEMA `<schema>` TO `<myUsername>`;
    note

    If you wish to assign privileges to specific columns, you can utilize the following SQL script:

    GRANT SELECT (Column_1, Column_2, Column_3, Primary_Key, ...) ON `<schema.table_name>` TO `<myUsername>`;

    Ensure to replace "Column_1" and the others with your chosen column names. If a Primary Key exists in the table, be sure to include it in the script.

  • Set default privileges for future tables in the schema:
    ALTER DEFAULT PRIVILEGES IN SCHEMA `<schema>` GRANT SELECT ON TABLES TO `<myUsername>`;

Establishing a connection

  1. Choose your Connection Name.
  2. Enter Host.
  3. Fill in the Port Number.
  4. Enter your Database name.
  5. Input your Username and Password.
  6. Use the Test Connection function to see if your connection is up to the task. If the connection succeeded, you can now use this connection in Data Integration.

Disconnect handling

There are two approaches to dealing with database disconnects:

  • Optimistic - The optimistic approach to dealing with stale / closed connections is to let SQLAlchemy handle disconnects as they happen, at which point all connections in the pool are invalidated, which means they're assumed to be stale and will be refreshed upon the next checkout. In general, the pessimistic approach adds some overhead but provides a more stable connection, whereas the optimistic approach is faster but is recommended for databases that are less likely to be disconnected or restarted.

  • Pessimistic - This approach entails issuing a test statement on the SQL connection at the beginning of each connection pool checkout to ensure that the database connection is still operational.

SSH connection

Data Integration has the capability to establish connections with on-premises servers by employing a hybrid approach for data processing. While we suggest using the SSH Tunneling Method, it is important to note that it is not mandatory.

Data Integration can establish connections to on-prem servers by employing a hybrid data processing approach. While we suggest utilizing the SSH Tunneling Method, it's essential to know that its adoption is not obligatory.

To use the SSH Tunneling via Data Integration, configure your SSH Tunnel server in your network. This server will have the ability to get SSH connections from Data Integration IPs, and will be able to connect to your database server via the internal IP or address. Data Integration connects to that SSH server using a private key.

After you've configured SSH tunneling, you can set it up in the Data Integration Connection:

image.png

SSL connection

In order to establish a connection, certain PostgreSQL hosts may necessitate the use of an SSL certificate for validation and enabling.

note
  • Utilizing an SSL connection is optional.
  • SSL connections are exclusively available when using PostgreSQL databases hosted on Amazon RDS/Aurora and Google Cloud SQL.

Custom file zone

Setting up a Custom File Zone in Data Integration is an optional feature, with the default option relying on the Managed File Zone provided by Data Integration, which requires no setup. The primary benefit of establishing a Custom File Zone lies in enabling organizations to guarantee that their data is stored within their designated file zones, instead of being stored in Data Integration Managed File Zone.

To use a Custom File Zone, follow the instructions provided below:

  1. Toggle the Custom File Zone to true.
  2. You have the option to choose the preconfigured FileZone Connection by clicking on FileZone Connection (ensure that only Custom File Zone in Amazon S3 is compatible).
  3. Choose a Default Bucket from the drop-down list.
  4. Use the Test Connection function to see if your connection is up to the task. If the connection was successful, click Save.
On this Page