Migrating and constructing pipeline flows for DataStage

Last updated: Jul 10, 2025

The following steps and limitations apply to migrated Sequence Jobs and flows that are constructed directly with the pipeline canvas.

For a list of general pipeline issues, see Known issues and limitations for DataStage.

Migrated flows

For more information on each component, see Pipeline components for DataStage.

Wait for file
Manually reselect or configure the file path. When the node is used as a helper node for a cross loop, the default timeout value is 23:59:59. Manually update the value, or set it to 00:00:00 for no timeout.
Wait for all
Replaces Sequencer (all) and Nested condition.
Wait for any
Replaces Sequencer (any).
Terminate pipeline
Replaces Terminator.
Final message text is not supported.
Terminate loop
Controls the loop status and marks it as complete or failed. If the loop node has the result control_break_node_id after it finishes, the loop terminated before all iterations completed. The Terminate loop node is added during migration when there is only one condition link from the parent node to the End loop node. For the Terminate loop node, only one of the condition links on the parent node can be true.
Loop in sequence
Replaces Start/end loop.
Run DataStage job
Replaces Job activity for parallel jobs.
The List type is mapped to Enum. Path is mapped to the File type. For information, see Configuring global objects for Orchestration Pipelines.
Run Pipelines job
Replaces Job activity for sequence jobs. For information, see Run Pipelines job.
Run Bash script
You must replace single quotes around environment variables with double quotes so they are not treated as string literals.

In DataStage, mounting volumes to copy the scripts or files for pipeline in Bash node is not supported. To reference the files in Bash node, see Referencing files in Bash node (DataStage).
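For example, single quotes suppress variable substitution while double quotes allow it. In this sketch, JOB_RUN_ID is a hypothetical environment variable for illustration, not one that DataStage defines:

```shell
# Hypothetical environment variable supplied to the Bash node.
export JOB_RUN_ID="run_42"

# Single quotes treat $JOB_RUN_ID as a literal string:
echo 'Job: $JOB_RUN_ID'    # prints: Job: $JOB_RUN_ID

# Replace with double quotes so the value is substituted:
echo "Job: $JOB_RUN_ID"    # prints: Job: run_42
```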

Set user variables
Replaces User variable. User variables are defined on the global level. For more information, see Configuring global objects for Orchestration Pipelines.
Error handling
Replaces Exception handler.
Use error.status and error.status_message to get the failed node's information. Use ds.GetErrorSource() and ds.GetErrorNumber() to get the error source and error number.

Set and get user status

To set the user status in a DataStage job, call the built-in function SetUserStatus from the Expression builder in the Transformer stage. SetUserStatus can be called only from Triggers in the Transformer stage; it cannot be used in input column derivations.

To get the status in a pipeline that calls the DataStage job with a Run DataStage job node, use the built-in function ds.GetUserStatus(tasks.<node name>) with the name of the Run DataStage job node. You can also access it in the job results with tasks.<node name>.user_status.

To set the user status in a pipeline, you must add it as a variable with the Set user variables node and select Make user variable value available as a pipeline result, which makes it an output parameter that other pipelines can access. Another pipeline can then use a Run pipeline job node to call the pipeline that set the user status, and get the user status with tasks.<node name>.results.output_parameters.<user status parameter name>.

If SetUserStatus is called in a child pipeline, migration creates a global user variable named user_status and selects the option Make user variable value available as a pipeline result. In the parent pipeline, it also replaces the expression that gets the status of the child pipeline, <activity name>.$UserStatus, with tasks.<node name>.results.output_parameters.user_status.
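As a sketch of the expressions above — the node names run_job_1 and run_pipeline_1 and the parameter name user_status are hypothetical placeholders:

```text
# Get the user status of a job run by the Run DataStage job node "run_job_1":
ds.GetUserStatus(tasks.run_job_1)

# Equivalent access through the job results:
tasks.run_job_1.user_status

# Read a user status that a child pipeline exposed as a pipeline result,
# where "run_pipeline_1" is the Run pipeline job node that called it:
tasks.run_pipeline_1.results.output_parameters.user_status
```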

Constructed flows

Run DataStage job
The DSJobRunEnvironmentName environment variable specifies the runtime environment for DataStage jobs. You can add the DSJobRunEnvironmentName environment variable to the Run DataStage job node to override the default runtime environment that is set at a project level or a job level for a specific run.
For example, if you want to run a specific job in the ds-px-large runtime environment, you can override the default ds-px-small runtime environment that is set at a project level or a job level. Under the Input tab of the Run DataStage job node, add DSJobRunEnvironmentName as an environment variable in the Environment variables section, and set its value to ds-px-large.
If you add an environment variable in the Environment variables section of the Run DataStage job node, and the same environment variable is already set at the DataStage flow level, the value that is passed from the Run DataStage job node takes precedence.
Run Bash script
Echo statements must use double quotes to access the value of a variable. For example, echo "$variablename" replaces $variablename with the value of the variable, while echo '$variablename' just echoes the literal text $variablename.
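A minimal demonstration of the difference, with variablename as a placeholder:

```shell
variablename="hello"

echo "$variablename"   # double quotes: prints the value, hello
echo '$variablename'   # single quotes: prints the literal text $variablename
```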