
Configuring Python Logic Step

Python provides quick and easy data manipulation. This feature is part of the Logic River type, which lets you create a data transformation process using steps, variables, containers, conditions, loops, and actions.

info

The Data Integration Python feature is enabled by default for Enterprise, Pro Plus, and Professional customers. If you do not have access, contact your Data Integration representative directly or reach our team via the Contact Us page.

Use cases

  • Transformation of data.
  • Initiate an analytics process (Tableau extract, Sisense ElastiCube, etc.).
  • Use Python to connect to APIs (not possible via Action River).
  • Run a Python script to start the machine learning process.

Supported version and packages

Data Integration currently only supports Python 3.8.4. In the future, more versions will be supported.

Packages

By default, Data Integration installs the most recent versions of the following packages:

  1. NumPy
  2. Pandas
  3. Matplotlib
  4. Regex
  5. Requests
  6. wordcloud
  7. scikit-learn
  8. SQLAlchemy
  9. tableau-api-lib
  10. Tableau Server Client (tableauserverclient)

To install another package, add it to the Install Additional Packages section using the following syntax:

package-name==version
warning
  • A package already in the Packages list cannot be upgraded or downgraded.
  • If the package version is left blank, the most recent version identified in pip will be installed.
  • Packages that require additional drivers or other system requirements to be installed on Linux are not supported.
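The `package-name==version` syntax and the blank-version rule can be illustrated with a small local parser. This is only a sketch: the function name `parse_requirement` and the example entries are illustrative, and it does not reproduce how the platform itself reads the Install Additional Packages section.

```python
def parse_requirement(line):
    """Split a 'package-name==version' entry into (name, version).

    The version is None when it is left blank, the case in which the
    most recent version identified in pip is installed.
    """
    name, _, version = line.strip().partition("==")
    return name.strip(), (version.strip() or None)


# Illustrative entries only; replace them with the packages you need.
entries = ["openpyxl==3.1.2", "beautifulsoup4"]
parsed = [parse_requirement(entry) for entry in entries]
print(parsed)
```

Pinning a version, as in the first entry, keeps River runs reproducible when a new release of the package is published.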

Working with variables

The Python step supports dynamic variables. These can be used to link Rivers, automate activities, and pass data.

Procedure

  1. Navigate to the Data Integration console.

  2. Click the River tab from the left-hand menu.

  3. Click ADD River and choose Logic River, or select an existing Logic River from the list.

  4. Select the Variables tab in the upper right corner.

  5. Set up a variable (temp is an example).
note
  • To create an array, select the Contains Multiple Values checkbox.
  • If Clear Value On Start is selected, the variable is reset to its original value each time the River starts a new run, regardless of any changes made to it during the previous run.
  • Multiple variables can be defined.
  • Users can encrypt sensitive values stored in River variables.
  6. Click Apply Changes.
  7. To use the variable(s), run this script:
from rivery_variables import temp
print(temp)
info

Run this script if you're working with multiple variables:

from rivery_variables import temp, variable2, variable3
print(temp, variable2, variable3)
  • Use this script to save and overwrite the variable value (only strings):
from rivery_variables import temp
temp.save("insert here new content of variable")
  • Run this script if you wish to concatenate multiple variables (only strings):
from rivery_variables import variable1, variable2
concatenated_variable = f'{variable1}_{variable2}'
  • In the Logic River Variables window, use square brackets to save an array as the variable's value, then select the Contains Multiple Values checkbox.

If you want to use the variable as an array, run this script (only arrays):

from rivery_variables import temp
temp.save([5,6,7,8])
note

When using a Multiple Value variable, enclose the array in square brackets only. Quotation marks (as used with strings) are not accepted.

  • This script runs a loop over a variable with multiple values.

Variables: image.png

Script:

from rivery_variables import temp, temp_Dict
for var in temp:
    print(var)

for key, value in temp_Dict.items():
    print(key)
    print(value)

Encrypted variables

Users have the option to encrypt sensitive data kept in River variables.

The variable's value is encrypted and hidden when the Encrypt option is used. In-line Python code cannot decrypt encrypted variables, and encrypted variables cannot store new values.

To learn more, refer to Encrypted Variables.

How to use environment variables

Environment variables can be used throughout the entire platform. In Python steps, Environment variables are read-only: their values can be changed only in the Variables tab of the main menu, not at any point within the River.

Procedure

  1. Navigate to the Data Integration console.

  2. Click the River tab from the left-hand menu.

  3. Click ADD River and choose Logic River, or select an existing Logic River from the list.

  4. Click the Variables tab in the main menu.

  5. Click + Add Variable.
  6. Add a value to the variable.
note

To create an array, enclose the value in square brackets.

  7. To use the variable(s), run this Python script:
from rivery_environment_variables import Test
print(Test)
note
  • If you import both Environment and River variables and two of them share the same name, only the second import takes effect, because it rebinds the name. Here's an example:
from rivery_variables import Temp_Var
from rivery_environment_variables import Temp_Var

Here, Temp_Var refers only to the value imported from rivery_environment_variables.

  • If River and Environment variables share the same name, you can import them under aliases and use both in that Logic step. Other Logic steps won't recognize these aliases. Here's an example:
from rivery_variables import Temp_Var as Temp1
from rivery_environment_variables import Temp_Var as Temp2

print(Temp1)
print(Temp2)
  • Environment variables in Python logic steps are "read-only" variables; their values cannot be changed or saved.
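The name clash described above is ordinary Python import behavior and can be reproduced locally without the platform. The sketch below uses the standard-library modules math and cmath, which both export a function named sqrt, as stand-ins for the River and Environment variable modules. The stand-in modules are an assumption made purely for illustration.

```python
# Two modules export the same name; the second import rebinds it.
from math import sqrt
from cmath import sqrt  # "sqrt" now refers to cmath.sqrt only

shadowed = sqrt(4)  # a complex number, because cmath.sqrt won

# Aliases keep both objects reachable under distinct names.
from math import sqrt as real_sqrt
from cmath import sqrt as complex_sqrt

real_result = real_sqrt(4)
complex_result = complex_sqrt(4)

print(type(shadowed).__name__, type(real_result).__name__)
```

As with the variable aliases, the names real_sqrt and complex_sqrt exist only in the script that defines them.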

Resources

Python lets you manipulate large volumes of data, which is why Data Integration offers seven distinct resource types:

image.png

Each logic step is linked with a resource type, which can be found here: image.png

note

Before a Python script is run, there is a one-minute server startup period.

Logs

The Logic River collects all Python log output and posts it to the Logic step's logs.

To get the log, perform the following:

  1. Select Activity in the upper left corner.
info

You can also find the log in the main menu's Activities tab by searching for the name of your River. Follow the on-screen instructions:

  2. Click the arrow.

  3. Click Download Logs.

note
  • Data Integration does not modify the logs, so they are displayed exactly as the Python code emits them.
  • Python logs guidelines.
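Because the logs are shown exactly as the script emits them, a consistent line format makes the downloaded file easier to scan. The following is a minimal sketch using the standard logging module. The helper name `make_logger` and the logger name are illustrative, and the sketch assumes the step captures output written through a standard stream handler. It deliberately avoids importing sys, which is listed under Limitations.

```python
import logging


def make_logger(name, stream=None):
    """Return a logger that writes timestamped lines to a stream.

    With stream=None, logging.StreamHandler falls back to standard error.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    if not logger.handlers:  # avoid duplicate lines if called twice
        handler = logging.StreamHandler(stream)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


log = make_logger("python_step")
log.info("Loaded %d rows", 120)
log.warning("Skipped %d rows with missing keys", 3)
```

If only standard output turns out to be captured in your environment, plain print calls with the same message layout serve the same purpose.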

Python pricing

The BDU (Boomi Data Units) consumed by the Python Logic step is calculated by adding a charge for the script's total execution time to a charge for the amount of network usage.

note

The Python Logic Step BDU (logicode_bdu) will be charged regardless of the run status of the Logic Step.

The Python pricing is based on:

  1. Execution time of the user's Python script (seconds)
  2. The server size chosen to execute the script (see the table below)
  3. Network bandwidth: 0.4 BDU for every 100 MB of data transferred
Server Size   BDU per Minute   BDU per Hour
XS            0.021            1.2
S             0.041            2.5
M             0.082            4.9
L             0.165            9.9
XL            0.329            19.7
XXL           0.3884           23.304
XXXL          0.492            29.52
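As a rough worked example of how the three pricing inputs combine, the sketch below estimates the BDU for a single run from the per-minute rates in the server size table. It assumes the charge is prorated linearly by the second and applies no rounding or minimums, neither of which is confirmed here. It also does not account for the server startup period, and the function name `estimate_bdu` is illustrative.

```python
# Per-minute BDU rates read from the server size table.
BDU_PER_MINUTE = {
    "XS": 0.021, "S": 0.041, "M": 0.082, "L": 0.165,
    "XL": 0.329, "XXL": 0.3884, "XXXL": 0.492,
}
NETWORK_BDU_PER_100_MB = 0.4


def estimate_bdu(server_size, seconds, megabytes):
    """Estimate the BDU of one run (assumes linear proration, no rounding)."""
    compute = BDU_PER_MINUTE[server_size] * seconds / 60
    network = NETWORK_BDU_PER_100_MB * megabytes / 100
    return compute + network


# A 10-minute run on an M server that transfers 250 MB:
# 0.082 * 10 for compute plus 0.4 * 2.5 for network.
estimate = estimate_bdu("M", seconds=600, megabytes=250)
print(round(estimate, 3))
```

Treat the result as an approximation and compare it with the value reported in the Logic Step Details.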

The BDU can be found in the Logic Step Details:

  1. Click Logic Steps.
  2. Logic Step Details is now available.

Recommendations

Data Integration recommends the following to avoid repeated River runs and unnecessary BDU charges:

  1. Mock the variables using pytest, unittest, or other testing libraries. Run the scripts on your local computer and check the results.
  2. Double-check that you've installed all the necessary packages in the River's settings.
  3. Add as much logging as is practical to help verify script outcomes.
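One way to follow the first recommendation is to register a stand-in for the variables module before the script imports it. The sketch below uses unittest.mock from the standard library. The stand-in module, its plain string values, and the function name `step_script` are assumptions for local testing only. The objects the platform supplies may behave differently, for example by offering a save method.

```python
import types
from unittest import mock

# Hypothetical stand-in for the platform-provided module; the values are
# plain Python strings chosen for this test.
fake_variables = types.ModuleType("rivery_variables")
fake_variables.variable1 = "sales"
fake_variables.variable2 = "2024"


def step_script():
    """The script body you would paste into the Python Logic step."""
    from rivery_variables import variable1, variable2
    return f"{variable1}_{variable2}"


# The import resolves to the stand-in only inside this block.
with mock.patch.dict("sys.modules", {"rivery_variables": fake_variables}):
    result = step_script()

print(result)
```

Passing "sys.modules" to patch.dict as a string keeps the script free of a direct sys import, so the same file does not trip the import restriction if it is later pasted into a River.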

Limitations

  • Avoid using the space character (" ") when naming a River Variable. Otherwise, the River Variable will not be saved, and the River will fail to run.
  • Variables can hold up to 13 MB or 1,000 rows of data, whichever limit is reached first. (This constraint applies to all logic steps, not only Python.)
  • Data Integration CLI does not support the Python Logic step.
  • The use of the os and sys libraries is strictly prohibited due to security regulations. Even commented-out imports of these libraries are not allowed.
  • Python reserved words cannot be used in Data Integration, as they may cause syntax errors and conflicts during River runs. Here is a comprehensive list of all the reserved words in Python:

image.png
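The reserved words can also be listed from Python itself with the standard keyword module, which is a convenient way to check a variable name before creating it. The helper `is_safe_variable_name` is illustrative. It assumes that a valid Python identifier is an acceptable name, which is stricter than the space-character rule stated above.

```python
import keyword


def is_safe_variable_name(name):
    """Return True if name is a valid identifier and not a reserved word."""
    return name.isidentifier() and not keyword.iskeyword(name)


# Reserved words for the interpreter that runs this script.
print(keyword.kwlist)

checks = {name: is_safe_variable_name(name) for name in ["temp", "my var", "class"]}
print(checks)
```

The contents of keyword.kwlist vary slightly between Python versions, so run the check with the same version the platform supports.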
