Informatica Reference

Archive for the ‘Transformations’ Category

  1. When Informatica’s transformation does not provide the exact functionality we need, we can develop complex functions with in a dynamic link library or Unix shared library.
  2. To obtain this kind of extensibility, we can use Transformation Exchange (TX) dynamic invocation interface built into Power mart/Power Center.
  3. Using TX, you can create an External Procedure Transformation and bind it to an External Procedure that you have developed.
  4. Two types of External Procedures are available

COM External Procedure (Only for WIN NT/2000)
Informatica External Procedure ( available for WINNT, Solaris, HPUX etc)

Components of TX:
(a) External Procedure
This exists separately from Informatica Server. It consists of C++, VB code written by developer. The code is compiled and linked to a DLL or Shared memory, which is loaded by the Informatica Server at runtime.

(b) External Procedure Transformation
This is created in Designer and it is an object that resides in the Informatica Repository. This serves in many ways

  • This contains metadata describing External procedure
  • This allows an External procedure to be references in a mapping by adding an instance of an External Procedure transformation.

All External Procedure Transformations must be defined as reusable transformations.

Therefore you cannot create External Procedure transformation in designer. You can create only with in the transformation developer of designer and add instances of the transformation to mapping.

Tips for Aggregator Transformations

Use sorted input to decrease the use of aggregate caches.

Sorted input reduces the amount of data cached during the session and improves session performance. Use this option with the Sorter transformation to pass sorted data to the Aggregator transformation.

Limit connected input/output or output ports.

Limit the number of connected input/output or output ports to reduce the amount of data the Aggregator transformation stores in the data cache.

Filter the data before aggregating it.

If you use a Filter transformation in the mapping, place the transformation before the Aggregator transformation to reduce unnecessary aggregation

Tips for Filter Transformations

Use the Filter transformation early in the mapping.

Steps to Create a Filter Transformation 157 To maximize session performance, keep the Filter transformation as close as possible to the sources in the mapping. Rather than passing rows that you plan to discard through the mapping, you can filter out unwanted data early in the flow of data from sources to targets.

Use the Source Qualifier transformation to filter.

The Source Qualifier transformation provides an alternate way to filter rows. Rather than filtering rows from within a mapping, the Source Qualifier transformation filters rows when read from a source. The main difference is that

the source qualifier limits the row set extracted from a source, while the Filter transformation limits the row set sent to a target. Since a source qualifier reduces the number of rows used throughout the mapping, it provides better performance.

Tips for Joiner Transformations

Perform joins in a database when possible.

Performing a join in a database is faster than performing a join in the session. In some cases, this is not possible, such as joining tables from two different databases or flat file systems. If you want to perform a join in a database,

use the following options:

¨ Create a pre-session stored procedure to join the tables in a database.

¨ Use the Source Qualifier transformation to perform the join.

Join sorted data when possible.

You can improve session performance by configuring the Joiner transformation to use sorted input. When you configure the Joiner transformation to use sorted data, the Integration Service improves performance by minimizing disk input and output. You see the greatest performance improvement when you work with large datasets.

For an unsorted Joiner transformation, designate the source with fewer rows as the master source.

For optimal performance and disk storage, designate the source with the fewer rows as the master source. During a session, the Joiner transformation compares each row of the master source against the detail source. The fewer unique rows in the master, the fewer iterations of the join comparison occur, which speeds the join process.

For a sorted Joiner transformation, designate the source with fewer duplicate key values as the master source.

For optimal performance and disk storage, designate the source with fewer duplicate key values as the master source. When the Integration Service processes a sorted Joiner transformation, it caches rows for one hundred keys at a time. If the master source contains many rows with the same key value, the Integration Service must cache more rows, and performance can be slowed.

Tips for Lookup Transformations

Add an index to the columns used in a lookup condition.

If you have privileges to modify the database containing a lookup table, you can improve performance for both cached and uncached lookups. This is important for very large lookup tables. Since the Integration Service needs to query, sort, and compare values in these columns, the index needs to include every column used in a lookup condition.

Place conditions with an equality operator (=) first.

If you include more than one lookup condition, place the conditions in the following order to optimize lookup performance:

¨ Equal to (=)

¨ Less than (<), greater than (>), less than or equal to (<=), greater than or equal to (>=)

¨ Not equal to (!=)

Cache small lookup tables.

Improve session performance by caching small lookup tables. The result of the lookup query and processing is the same, whether or not you cache the lookup table.

Join tables in the database.

If the lookup table is on the same database as the source table in the mapping and caching is not feasible, join the tables in the source database rather than using a Lookup transformation.

Use a persistent lookup cache for static lookups.

If the lookup source does not change between sessions, configure the Lookup transformation to use a persistent lookup cache. The Integration Service then saves and reuses cache files from session to session, eliminating the time required to read the lookup source.

Call unconnected Lookup transformations with the :LKP reference qualifier.

When you write an expression using the :LKP reference qualifier, you call unconnected Lookup transformations only. If you try to call a connected Lookup transformation, the Designer displays an error and marks the mapping invalid.

Configure a pipeline Lookup transformation to improve performance when processing a relational or flat file lookup source.

You can create partitions to process a relational or flat file lookup source when you define the lookup source as a source qualifier. Configure a non-reusable pipeline Lookup transformation and create partitions in the partial pipeline that processes the lookup source.

Tips for Lookup Caches

Cache small lookup tables.

Improve session performance by caching small lookup tables. The result of the lookup query and processing is the same, whether or not you cache the lookup table.

Use a persistent lookup cache for static lookup tables.

If the lookup table does not change between sessions, configure the Lookup transformation to use a persistent lookup cache. The Integration Service then saves and reuses cache files from session to session, eliminating the time required to read the lookup table.

Tips for Stored Procedure Transformations

Do not run unnecessary instances of stored procedures.

Each time a stored procedure runs during a mapping, the session must wait for the stored procedure to complete

in the database. You have two possible options to avoid this:

  • Reduce the row count. Use an active transformation prior to the Stored Procedure transformation to reduce the number of rows that must be passed the stored procedure. Or, create an expression that tests the values before passing them to the stored procedure to make sure that the value does not really need to be passed.
  • Create an expression. Most of the logic used in stored procedures can be easily replicated using expressions in the Designer.


  • Active and Connected Transformation.
  • The Source Qualifier transformation represents the rows that the Power Center Server reads when it runs a session.
  • It is only transformation that is not reusable.
  • Default transformation except in case of XML or COBOL files.

Tasks performed by Source Qualifier:

  • Join data originating from the same source database: We can join two or more tables with primary key-foreign key relationships by linking the sources to one Source Qualifier transformation.
  • Filter rows when the Power Center Server reads source data: If we Include a filter condition, the Power Center Server adds a WHERE clause to the Default query.
  • Specify an outer join rather than the default inner join: If we include a User-defined join, the Power Center Server replaces the join information Specified by the metadata in the SQL query.
  • Specify sorted ports: If we specify a number for sorted ports, the
  • Power Center Server adds an ORDER BY clause to the default SQL query.
  • Select only distinct values from the source: If we choose Select Distinct, the Power Center Server adds a SELECT DISTINCT statement to the default SQL query.
  • Create a custom query to issue a special SELECT statement for the Power Center Server to read source data: For example, you might use a Custom query to perform aggregate calculations. The entire above are possible in Properties Tab of Source Qualifier transformation.

SQ PROPERTIES TAB

1) SOURCE FILTER:

We can enter a source filter to reduce the number of rows the Power Center Server queries.

Note: When we enter a source filter in the session properties, we override the customized SQL query in the Source Qualifier transformation.

2) NUMBER OF SORTED PORTS:

When we use sorted ports, the Power Center Server adds the ports to the ORDER BY clause in the default query.

By default it is 0. If we change it to 1, then the data will be sorted by column that is at the top in SQ. Example: DEPTNO in above figure.

  1. If we want to sort as per ENAME, move ENAME to top.

If we change it to 2, then data will be sorted by top two columns.

3) SELECT DISTINCT:

If we want the Power Center Server to select unique values from a source, we can use the Select Distinct option.

  • Just check the option in Properties tab to enable it.

4) PRE and POST SQL Commands

  • The Power Center Server runs pre-session SQL commands against the source database before it reads the source.
  • It runs post-session SQL commands against the source database after it writes to the target.
  • Use a semi-colon (;) to separate multiple statements.

5) USER DEFINED JOINS

Entering a user-defined join is similar to entering a custom SQL query. However, we only enter the contents of the WHERE clause, not the entire query.

We can specify equi join, left outer join and right outer join only. We cannot specify full outer join. To use full outer join, we need to write SQL Query.

6) SQL QUERY

For relational sources, the Power Center Server generates a query for each Source Qualifier transformation when it runs a session. The default query is a SELECT statement for each source column used in the mapping. In other words, the Power Center Server reads only the columns that are connected to another Transformation.

In mapping above, we are passing only SAL and DEPTNO from SQ_EMP to Aggregator transformation. Default query generated will be:

  • SELECT EMP.SAL, EMP.DEPTNO FROM EMP

Note: If we do not cancel the SQL query, the Power Center Server overrides the default query with the custom SQL query.

We can enter an SQL statement supported by our source database. Before entering the query, connect all the input and output ports we want to use in the mapping.

Important Points:

  • When creating a custom SQL query, the SELECT statement must list the port names in the order in which they appear in the transformation.
  • Example: DEPTNO is top column; DNAME is second in our SQ   mapping.
  • So when we write SQL Query, SELECT statement have name DNAME first, DNAME second and so on. SELECT DEPT.DEPTNO, DEPT.DNAME
  • Once we have written a custom query like above, then this query will Always be used to fetch data from database. In our example, we used WHERE SAL>2000. Now if we use Source Filter and give condition SAL) 1000 or any other, then it will not work. Informatica will always use the custom query only.
  • Make sure to test the query in database first before using it in SQL Query. If query is not running in database, then it won’t work in Informatica too.
  • Also always connect to the database and validate the SQL in SQL query editor.

Refer : http://informaticatutorials-naveen.blogspot.com/2011/04/source-qualifier-tf_21.html

  • Active and connected transformation.

We can filter rows in a mapping with the Filter transformation. We pass all the rows from a source transformation through the Filter transformation, and then enter a Filter condition for the transformation. All ports in a Filter transformation are input/output and only rows that meet the condition pass through the Filter Transformation.

How to filter out rows with null values?

To filter out rows containing null values or spaces, use the ISNULL and IS_SPACES Functions to test the value of the port. For example, if we want to filter out rows that  Contain NULLs in the FIRST_NAME port, use the following condition:

IIF (ISNULL (FIRST_NAME), FALSE, TRUE)

This condition states that if the FIRST_NAME port is NULL, the return value is FALSE and the row should be discarded. Otherwise, the row passes through to the next Transformation.

Performance tuning:

Filter transformation is used to filter off unwanted fields based on conditions we Specify.

  1. Use filter transformation as close to source as possible so that unwanted data gets Eliminated sooner.
  2. If elimination of unwanted data can be done by source qualifier instead of filter,Then eliminate them at Source Qualifier itself.
  3. Use conditional filters and keep the filter condition simple, involving TRUE/FALSE or 1/0.

Refer : http://informaticatutorials-naveen.blogspot.com/2011/04/filter-transformation_20.html

  • Connected and Active Transformation
  • The Sorter transformation allows us to sort data.
  • We can sort data in ascending or descending order according to a specified sort key.
  • We can also configure the Sorter transformation for case-sensitive sorting, and specify whether the output rows should be distinct.

When we create a Sorter transformation in a mapping, we specify one or more ports as a sort key and configure each sort key port to sort in ascending or descending order. We also configure sort criteria the Power Center Server applies to all sort key ports and the system resources it allocates to perform the sort operation.

The Sorter transformation contains only input/output ports. All data passing through the Sorter transformation is sorted according to a sort key. The sort key is one or more ports that we want to use as the sort criteria.

Sorter Transformation Properties

1. Sorter Cache Size:

The Power Center Server uses the Sorter Cache Size property to determine the maximum amount of memory it can allocate to perform the sort operation. The Power Center Server passes all incoming data into the Sorter transformation Before it performs the sort operation.

  • We can specify any amount between 1 MB and 4 GB for the Sorter cache size.
  • If it cannot allocate enough memory, the Power Center Server fails the Session.
  • For best performance, configure Sorter cache size with a value less than or equal to the amount of available physical RAM on the Power Center Server machine.
  • Informatica recommends allocating at least 8 MB of physical memory to sort data using the Sorter transformation.

2. Case Sensitive:

The Case Sensitive property determines whether the Power Center Server considers case when sorting data. When we enable the Case Sensitive property, the Power Center Server sorts uppercase characters higher than lowercase characters.

3. Work Directory

Directory Power Center Server uses to create temporary files while it sorts data.

4. Distinct:

Check this option if we want to remove duplicates. Sorter will sort data according to all the ports when it is selected.

Performance Tuning:

Sorter transformation is used to sort the input data.

  1. While using the sorter transformation, configure sorter cache size to be larger than the input data size.
  2. Configure the sorter cache size setting to be larger than the input data size while Using sorter transformation.
  3. At the sorter transformation, use hash auto keys partitioning or hash user keys Partitioning.

Refer : http://informaticatutorials-naveen.blogspot.com/2011/04/sorter-transformation_20.html

  • Active and connected transformation.

A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data. A Filter transformation tests data for one condition and drops the rows of data that do not meet the Condition. However, a Router transformation tests data for one or more conditions And gives you the option to route rows of data that do not meet any of the conditions to a default output group.

Mapping A uses three Filter transformations while Mapping B produces the same result with one Router transformation.

A Router transformation consists of input and output groups, input and output ports, group filter conditions, and properties that we configure in the Designer.

Working with Groups

A Router transformation has the following types of groups:

  • Input: The Group that gets the input ports.
  • Output: User Defined Groups and Default Group. We cannot modify or delete Output ports or their properties.

User-Defined Groups: We create a user-defined group to test a condition based on incoming data. A user-defined group consists of output ports and a group filter Condition. We can create and edit user-defined groups on the Groups tab with the Designer. Create one user-defined group for each condition that we want to specify.

The Default Group: The Designer creates the default group after we create one new user-defined group. The Designer does not allow us to edit or delete the default group. This group does not have a group filter condition associated with it. If all of the conditions evaluate to FALSE, the IS passes the row to the default group.

Refer : http://informaticatutorials-naveen.blogspot.com/2011/04/router-transformation_20.html

  • Active and connected transformation

The Rank transformation allows us to select only the top or bottom rank of data. It Allows us to select a group of top or bottom values, not just one value.

During the session, the Power Center Server caches input data until it can perform The rank calculations.

Rank Transformation Properties :

  • Cache Directory where cache will be made.
  • Top/Bottom Rank as per need
  • Number of Ranks Ex: 1, 2 or any number
  • Case Sensitive Comparison can be checked if needed
  • Rank Data Cache Size can be set
  • Rank Index Cache Size can be set

Rank Index

The Designer automatically creates a RANKINDEX port for each Rank transformation. The Power Center Server uses the Rank Index port to store the ranking position for Each row in a group.

For example, if we create a Rank transformation that ranks the top five salaried employees, the rank index numbers the employees from 1 to 5.

  • The RANKINDEX is an output port only.
  • We can pass the rank index to another transformation in the mapping or directly to a target.
  • We cannot delete or edit it.

Defining Groups

Rank transformation allows us to group information. For example: If we want to select the top 3 salaried employees of each Department, we can define a group for Department.

  • By defining groups, we create one set of ranked rows for each group.
  • We define a group in Ports tab. Click the Group By for needed port.
  • We cannot Group By on port which is also Rank Port.

Refer : http://informaticatutorials-naveen.blogspot.com/2011/04/rank-transformation_20.html