Informatica Reference

Archive for the ‘Informatica Functions’ Category

Returns the position of a character set in a string, counting from left to right.

Syntax

INSTR( string, search_value [,start [,occurrence [,comparison_type ]]] )

Argument
Required/Optional
Description
string
Required
The string must be a character string. Passes the value you want to evaluate. You can enter any valid transformation expression. The results of the expression must be a character string. If not, INSTR converts the value to a string before evaluating it.
search_value
Required
Any value. The search value is case sensitive. The set of characters you want to search for. The search_value must match a part of the string. For example, if you write INSTR(‘Alfred Pope’, ‘Alfred Smith’) the function returns 0.
You can enter any valid transformation expression. If you want to search for a character string, enclose the characters you want to search for in single quotation marks, for example ‘abc’.
start Required Must be an integer value. The position in the string where you want to start the search. You can enter any valid transformation expression.
The default is 1, meaning that INSTR starts the search at the first character in the string.If the start position is 0, INSTR searches from the first character in the string. If the start position is a positive number, INSTR locates the start position by counting from the beginning of the string. If the start position is a negative number, INSTR locates the start position by counting from the end of the string. If you omit this argument, the function uses the default value of 1.
occurrence Required A positive integer greater than 0. You can enter any valid transformation expression. If the search value appears more than once in the string, you can specify which occurrence you want to search for. For example, you would enter 2 to search for the second occurrence from the start position.
If you omit this argument, the function uses the default value of 1, meaning that INSTR searches for the first occurrence of the search value. If you pass a decimal, the PowerCenter Integration Service rounds it to the nearest integer value. If you pass a negative integer or 0, the session fails.
comparison_type Optional The string comparison type, either linguistic or binary, when the PowerCenter Integration Service runs in Unicode mode. When the PowerCenter Integration Service runs in ASCII mode, the comparison type is always binary.
Linguistic comparisons take language-specific collation rules into account, while binary comparisons perform bitwise matching. For example, the German sharp s character matches the string “ss” in a linguistic comparison, but not in a binary comparison. Binary
comparisons run faster than linguistic comparisons.
Must be an integer value, either 0 or 1:
– 0: INSTR performs a linguistic string comparison.
– 1: INSTR performs a binary string comparison.
Default is 0.

Return Value

Integer if the search is successful. Integer represents the position of the first character in the search_value, counting from left to right.

1. 0 if the search is unsuccessful.
2. NULL if a value passed to the function is NULL.

Return Value

The following expression returns the position of the first occurrence of the letter ‘a’, starting at the beginning of each company name. Because the search_value argument is case sensitive, it skips the ‘A’ in ‘Blue Fin Aqua Center’, and returns the position for the ‘a’ in ‘Aqua’:

INSTR( COMPANY, ‘a’ )

COMPANY
RETURN VALUE
Blue Fin Aqua Center 13
Maco Shark Shop 2
Scuba Gear 5
Frank’s Dive Shop 3
VIP Diving Club 0

The following expression returns the position of the second occurrence of the letter ‘a’, starting at the beginning of each company name. Because the search_value argument is case sensitive, it skips the ‘A’ in ‘Blue Fin Aqua Center’, and returns 0:

INSTR( COMPANY, ‘a’, 1, 2 )

COMPANY
RETURN VALUE
Blue Fin Aqua Center 0
Maco Shark Shop 8
Scuba Gear 9
Frank’s Dive Shop 0
VIP Diving Club 0

The following expression returns the position of the second occurrence of the letter ‘a’ in each company name, starting from the last character in the company name. Because the search_value argument is case sensitive, it skips the ‘A’ in ‘Blue Fin Aqua Center’, and returns 0:

INSTR( COMPANY, ‘a’, -1, 2 )

COMPANY
RETURN VALUE
Blue Fin Aqua Center 0
Maco Shark Shop 2
Scuba Gear 5
Frank’s Dive Shop 0
VIP Diving Club 0

The following expression returns the position of the first character in the string ‘Blue Fin Aqua Center’ (starting from the last character in the company name):

INSTR( COMPANY, ‘Blue Fin Aqua Center’, -1, 1 )

COMPANY
RETURN VALUE
Blue Fin Aqua Center 1
Maco Shark Shop 0
Scuba Gear 0
Frank’s Dive Shop 0
VIP Diving Club 0

Using Nested INSTR

You can nest the INSTR function within other functions to accomplish more complex tasks.

The following expression evaluates a string, starting from the end of the string. The expression finds the last (rightmost) space in the string and then returns all characters to the left of it:

SUBSTR( CUST_NAME,1,INSTR( CUST_NAME,’ ‘ ,-1,1 ))

CUST NAME
RETURN VALUE
PATRICIA JONES PATRICIA
MARY ELLEN SHAH MARY ELLEN

The following expression removes the character ‘#’ from a string:

SUBSTR( CUST_ID, 1, INSTR(CUST_ID, ‘#’)-1 ) || SUBSTR( CUST_ID, INSTR(CUST_ID, ‘#’)+1 )

CUST_ID
RETURN VALUE
ID#33 ID33
#A3577 A3577
SS #712403399 SS 712403399

Returns one of two values you specify, based on the results of a condition.

Syntax

IIF( condition, value1 [,value2] )

Argument
Required/Optional
Description
condition
Required
The condition you want to evaluate. You can enter any valid transformation expression that evaluates to TRUE or FALSE.
value1
Required
Any datatype except Binary. The value you want to return if the condition is TRUE. The return value is always the datatype specified by this argument. You can enter any valid transformation expression, including another IIF expression.
value2 Optional Any datatype except Binary. The value you want to return if the condition is FALSE. You can enter any valid transformation expression, including another IIF expression.

Unlike conditional functions in some systems, the FALSE (value2) condition in the IIF function is not required. If you omit value2, the function returns the following when the condition is FALSE: 1. 0 if value1 is a Numeric datatype. 2. Empty string if value1 is a String datatype. 3. NULL if value1 is a Date/Time datatype. For example, the following expression does not include a FALSE condition and value1 is a string datatype so the PowerCenter Integration Service returns an empty string for each row that evaluates to FALSE: IIF( SALES > 100, EMP_NAME )

SALES
EMP_NAME
RETURN VALUE
150 John Smith
John Smith
50 Pierre Bleu
” (empty string)
120 Sally Green Sally Green
NULL Greg Jones ” (empty string)

Return Value value1 if the condition is TRUE. value2 if the condition is FALSE. For example, the following expression includes the FALSE condition NULL so the PowerCenter Integration Service returns NULL for each row that evaluates to FALSE: IIF( SALES > 100, EMP_NAME, NULL )

SALES
EMP_NAME
RETURN VALUE
150 John Smith
John Smith
50 Pierre Bleu
” (empty string)
120 Sally Green Sally Green
NULL Greg Jones ” (empty string)

If the data contains multibyte characters and the condition argument compares string data, the return value depends on the code page and data movement mode of the PowerCenter Integration Service.

IIF and Datatypes

When you use IIF, the datatype of the return value is the same as the datatype of the result with the greatest precision.

For example, you have the following expression:
IIF( SALES < 100, 1, .3333 )

The TRUE result (1) is an integer and the FALSE result (.3333) is a decimal. The Decimal datatype has greater precision than Integer, so the datatype of the return value is always a Decimal.

When you run a session in high precision mode and at least one result is Double, the datatype of the return value is Double.

Special Uses of IIF

Use nested IIF statements to test multiple conditions. The following example tests for various conditions and returns 0 if sales is 0 or negative:

IIF( SALES > 0, IIF( SALES < 50, SALARY1, IIF( SALES < 100, SALARY2, IIF( SALES < 200, SALARY3, BONUS))), 0 )

Use IIF in update strategies. For example:

IIF( ISNULL( ITEM_NAME ), DD_REJECT, DD_INSERT)

Alternative to IIF

Use “DECODE”  instead of IIF in many cases. DECODE may improve readability. The following shows how you use DECODE instead of IIF using the first example from the previous section:

DECODE( TRUE, SALES > 0 and SALES < 50, SALARY1, SALES > 49 AND SALES < 100, SALARY2, SALES > 99 AND SALES < 200, SALARY3, SALES > 199, BONUS)

You can often use a Filter transformation instead of IIF to maximize session performance.

The transformation language includes a group of variable functions to update the current value of a mapping variable throughout the session. When you run a workflow, the PowerCenter Integration Service evaluates the start and current value of a variable at the beginning of the session based on the final value of the variable from the last session run. Use the following variable functions:

  1. SetCountVariable
  2. SetMaxVariable
  3. SetMinVariable
  4. SetVariable

Use different variable functions with a variable based on the aggregation type of the variable.

When using mapping variables in sessions with multiple partitions, use variable functions to determine the final value of the variable for each partition. At the end of the session, the PowerCenter Integration Service performs the aggregate function across all partitions to determine one final value to save to the repository. Unless overridden, it uses the saved value as the start value of the variable for the next time you use this session.

For example, you use SetMinVariable to set a variable to the minimum evaluated value. The PowerCenter Integration Service calculates the minimum current value for the variable for each partition. Then at the end of the session, it finds the minimum current value across all partitions and saves that value into the repository.

Use SetVariable only once for each mapping variable in a pipeline. When you create multiple partitions in a pipeline, the PowerCenter Integration Service uses multiple threads to process that pipeline. If you use this function more than once for the same variable, the current value of a mapping variable may have indeterministic results.

Counts the rows evaluated by the function and increments the current value of a mapping variable based on the count. Increases the current value by one for each row marked for insertion. Decreases the current value by one for each row marked for deletion. Keeps the current value the same for each row marked for update or reject. Returns the new current value.

At the end of a successful session, the PowerCenter Integration Service saves the last current value to the repository. When used with a session that contains multiple partitions, the PowerCenter Integration Service generates different current values for each partition. At the end of the session, it determines the total count for all partitions and saves the total to the repository. Unless overridden, it uses the saved value as the initial value of the variable for the next time you use this session.

Use the SETCOUNTVARIABLE function only once for each mapping variable in a pipeline. The PowerCenter Integration Service processes variable functions as it encounters them in the mapping. The order in which the PowerCenter Integration Service encounters variable functions in the mapping may not be the same for every session run. This may cause inconsistent results when you use the same variable function multiple times in a mapping.

Use SETCOUNTVARIABLE with mapping variables with a Count aggregation type. Use SETCOUNTVARIABLE in the following transformations:

  • Expression
  • Filter
  • Router
  • Update Strategy

The PowerCenter Integration Service does not save the final value of a mapping variable to the repository when any of the following are true:

  • The session fails to complete.
  • The session is configured for a test load.
  • The session is a debug session.
  • The session runs in debug mode and is configured to discard session output.

Syntax
SETCOUNTVARIABLE( $$Variable )

Argument
Required/
Optional
Description
$$Variable
Required
Name of the mapping variable you want to set. Use mapping variables with a count aggregation type.

Return Value

TRUE if the data matches the pattern.

Example

You have a mapping that updates a slowly changing dimension table containing distributor information. The following expression counts the number of current distributors with the mapping variable $$CurrentDistributors and returns the current value to the CUR_DIST port. It increases the count by one for each inserted row, decreases the count for each deleted row, and keeps the count the same for all updated or rejected rows. The initial value of $$CurrentDistributors from the previous session run is 23.

SETCOUNTVARIABLE ($$CurrentDistributors)

(row marked for)
DIST_ID
DISTRIBUTOR
CUR_DIST
(update)
000015
MSD Inc.
23
(insert)
000024
Darkroom Co.
24
(insert)
000025
Howard’s Supply
25
(update)
000003
JNR Ltd.
25
(delete)
000024
Darkroom Co.
24
(insert)
000026
Supply.com
25

At the end of the session, the PowerCenter Integration Service saves ‘25’ to the repository as the current value for $$CurrentDistributors. The next time the session runs, the Integration Service evaluates the initial value to $$CurrentDistributors to ‘25’.

The PowerCenter Integration Service saves the same value for $$CurrentDistributors to the repository for sessions with multiple partitions as for sessions with a single partition.

Sets the current value of a mapping variable to the higher of two values: the current value of the variable or the value you specify. Returns the new current value. The function executes only if a row is marked as insert. SETMAXVARIABLE ignores all other row types and the current value remains unchanged.

At the end of a successful session, the PowerCenter Integration Service saves the final current value to the repository. When used with a session that contains multiple partitions, the PowerCenter Integration Service generates different current values for each partition. At the end of the session, it saves the highest current value across all partitions to the repository. Unless overridden, it uses the saved value as the initial value of the variable for the next session run.

When used with a string mapping variable, SETMAXVARIABLE returns the higher string based on the sort order selected for the session.

Use the SETMAXVARIABLE function only once for each mapping variable in a pipeline. The PowerCenter Integration Service processes variable functions as it encounters them in the mapping. The order in which the PowerCenter Integration Service encounters variable functions in the mapping may not be the same for every session run. This can cause inconsistent results when you use the same variable function multiple times in a mapping.

Use SETMAXVARIABLE with mapping variables with a Max aggregation type. Use SETMAXVARIABLE in the following transformations:

  • Expression
  • Filter
  • Router
  • Update Strategy

The PowerCenter Integration Service does not save the final value of a mapping variable to the repository when any of the following conditions are true:

  • The session fails to complete.
  • The session is configured for a test load.
  • The session is a debug session.
  • The session runs in debug mode and is configured to discard session output.

Syntax

SETMAXVARIABLE( $$Variable, value )

Argument
Required/
Optional
Description
$$Variable
Required
Name of the mapping variable you want to set. Use mapping variables with Max aggregation
type.
value
Required
The value you want the PowerCenter Integration Service to compare against the current value of the variable. You can enter any valid transformation expression that evaluates to a datatype compatible with the datatype of the variable.

Return Value

The higher of two values: the current value of the variable or the value you specified. The return value is the new current value of the variable.

When value is NULL the PowerCenter Integration Service returns the current value of $$Variable.

Examples

The following expression compares the number of items purchased in each transaction with a mapping variable $$MaxItems. It sets $$MaxItems to the higher of two values and returns the historically highest number of items purchased in a single transaction to the MAX_ITEMS port. The initial value of $$MaxItems from the previous session run is 22.

SETMAXVARIABLE ($$MAXITEMS, ITEMS)

TRANSACTION
ITEMS
MAX_ITEMS
0100002
12
22
0100003
5
22
0100004
18
22
0100005
35
35
0100006
5
35
0100007
14
35

At the end of the session, the PowerCenter Integration Service saves ‘35’ to the repository as the maximum current value for $$MaxItems. The next time the session runs, the PowerCenter Integration Service evaluates the initial value to $$MaxItems to ‘35’.

If the same session contains three partitions, the PowerCenter Integration Service evaluates $$MaxItems for each partition. Then, it saves the largest value to the repository. For example, the last evaluated value for $$MaxItems in each partition is as follows:

Partition      Final Current Value for $$MaxItems
Partition 1      35
Partition 2      23
Partition 3      22

Sets the current value of a mapping variable to the lower of two values: the current value of the variable or the value you specify. Returns the new current value. The SETMINVARIABLE function executes only if a row is marked as insert. SETMINVARIABLE ignores all other row types and the current value remains unchanged.

At the end of a successful session, the PowerCenter Integration Service saves the final current value to the repository. When used with a session that contains multiple partitions, the PowerCenter Integration Service generates different current values for each partition. At the end of the session, it saves the lowest current value across all partitions to the repository. Unless overridden, it uses the saved value as the initial value of the variable for the next session run.

When used with a string mapping variable, SETMINVARIABLE returns the lower string based on the sort order selected for the session.

Use the SETMINVARIABLE function only once for each mapping variable in a pipeline. The PowerCenter Integration Service processes variable functions as it encounters them in the mapping. The order in which the PowerCenter Integration Service encounters variable functions in the mapping may not be the same for every session run. This may cause inconsistent results when you use the same variable function multiple times in a mapping.

Use SETMINVARIABLE with mapping variables with a Min aggregation type. Use SETMINVARIABLE in the following transformations:

  • Expression
  • Filter
  • Router
  • Update Strategy

The PowerCenter Integration Service does not save the final value of a mapping variable to the repository when any of the following conditions are true:

Syntax
SETMINVARIABLE( $$Variable, value )

Argument
Required/
Optional
Description
$$Variable
Required
Name of the mapping variable you want to set. Use with mapping variables with Min aggregation type.
value
Required
The value you want the PowerCenter Integration Service to compare against the current value of the variable. You can enter any valid transformation expression that evaluates to a datatype compatible with the datatype of the variable.

Return Value

The lower of two values: the current value of the variable or the value you specified. The return value is the new current value of the variable.

When value is NULL, the PowerCenter Integration Service returns the current value of $$Variable.

Example

The following expression compares the price of an item with a mapping variable $$MinPrice. It sets $$MinPrice to the lower of two values and returns the historically lowest item price to the MIN_PRICE port. The initial value of $$MinPrice from the previous session run is 22.50.

SETMINVARIABLE ($$MinPrice, PRICE)

DATE
PRICE 
MIN_PRICE
05/01/2000 09:00:00
23.50
22.50
05/01/2000 10:00:00
27.00
22.50
05/01/2000 11:00:00
26.75
22.50
05/01/2000 12:00:00
25.25
22.50
05/01/2000 13:00:00
22.00
22.00
05/01/2000 14:00:00
22.75
22.00
05/01/2000 15:00:00
23.00
22.00
05/01/2000 16:00:00
24.25
22.00
05/01/2000 17:00:00
24.00
22.00

At the end of the session, the PowerCenter Integration Service saves 22.00 to the repository as the minimum current value for $$MinPrice. The next time the session runs, the PowerCenter Integration Service evaluates the initial value to $$MinPrice to 22.00.

If the same session contains three partitions, the PowerCenter Integration Service evaluates $$MinPrice for each partition. Then, it saves the smallest value to the repository. For example, the last evaluated value for $$MinPrice in each partition is as follows:

Partition      Final Current Value for $$MinPrice
Partition 1      22.00
Partition 2      22.00
Partition 3      22.00

Sets the current value of a mapping variable to a value you specify. Returns the specified value. The SETVARIABLE function executes only if a row is marked as insert or update. SETVARIABLE ignores all other row types and the current value remains unchanged.

At the end of a successful session, the PowerCenter Integration Service compares the final current value of the variable to the start value of the variable. Based on the aggregate type of the variable, it saves a final current value to the repository. Unless overridden, it uses the saved value as the initial value of the variable for the next session run.

Use the SETVARIABLE function only once for each mapping variable in a pipeline. The PowerCenter Integration Service processes variable functions as it encounters them in the mapping. The order in which the PowerCenter Integration Service encounters variable functions in the mapping may not be the same for every session run. This may cause inconsistent results when you use the same variable function multiple times in a mapping.

Use SETVARIABLE in the following transformations:

  • Expression
  • Filter
  • Router
  • Update Strategy

The PowerCenter Integration Service does not save the final value of a mapping variable to the repository when any of the following conditions are true:

  • The session fails to complete.
  • The session is configured for a test load.
  • The session is a debug session.
  • The session runs in debug mode and is configured to discard session output.

Syntax
SETVARIABLE( $$Variable, value )

Argument
Required/
Optional
Description
$$Variable
Required
Name of the mapping variable you want to set. Use with mapping variables with Max/Min aggregation type.
value
Required
The value you want to set the current value of the variable to. You can enter any valid transformation expression that evaluates to a datatype compatible with the datatype of the variable.


Return Value

Current value of the variable.

When value is NULL, the PowerCenter Integration Service returns the current value of $$Variable.

Examples

The following expression sets a mapping variable $$Time to the system date at the time the PowerCenter Integration Service evaluates the row and returns the system date to the SET_$$TIME port:

SETVARIABLE ($$Time, SYSDATE)

TRANSACTION
ITEMS
SET_$$TIME
0100002
534.23
10/10/2000 01:34:33
0100003
699.01
10/10/2000 01:34:34
0100004
97.50
10/10/2000 01:34:35
0100005
116.43
10/10/2000 01:34:36
0100006
323.95
10/10/2000 01:34:37

At the end of the session, the PowerCenter Integration Service saves 10/10/2000 01:34:37 to the repository as the last evaluated current value for $$Timestamp.

The next time the session runs, the PowerCenter Integration Service evaluates all references to $$Timestamp to 10/10/2000 01:34:37.

The following expression sets the mapping variable $$Timestamp to the timestamp associated with the row and returns the timestamp to the SET_$$TIMESTAMP port:

SETVARIABLE ($$Time, TIMESTAMP)

TRANSACTION
TIMESTAMP
TOTAL
SET_$$TIMESTAMP
0100002
10/01/2000 12:01:01
534.23
10/01/2000 12:01:01
0100003
10/01/2000 12:10:22
699.01
10/01/2000 12:10:22
0100004
10/01/2000 12:16:45
97.50
10/01/2000 12:16:45
0100005
10/01/2000 12:23:10
116.43
10/01/2000 12:23:10
0100006
10/01/2000 12:40:31
323.95
10/01/2000 12:40:31

At the end of the session, the PowerCenter Integration Service saves 10/01/2000 12:40:31 to the repository as the last evaluated current value for $$Timestamp.

The next time the session runs, the PowerCenter Integration Service evaluates the initial value of $$Timestamp to
10/01/2000 12:40:31.

At the end of the session, the PowerCenter Integration Service saves 10/01/2000 12:40:31 to the repository as the last evaluated current value for $$Timestamp.