Informatica PowerCenter Pushdown Optimization
Informatica Pushdown Optimization Option increases performance by providing the flexibility to push transformation processing to the most appropriate processing resource.
Using Pushdown Optimization, data transformation logic can be pushed to source database or target database or through the PowerCenter server.
It converts the transformation logic into SQL statements, which can directly execute on database. This minimizes the need of moving data between servers and utilizes the power of database engine.
How Pushdown Optimization Works?
When you run a session configured for pushdown optimization, the Integration Service analyzes the mapping and transformations to determine the transformation logic it can push to the database.
The Integration Service converts the transformation logic into SQL statements and sends to the source or the target database to perform the data transformation.
Different Type Pushdown Optimization
You can configure pushdown optimization in the following ways,
- Source-side pushdown optimization
- Target-side pushdown optimization
- Full pushdown optimization
1. Source-side pushdown optimization:
When you run a session configured for source-side pushdown optimization, the Integration Service analyzes the mapping from the source to the target
The Integration Service generates a SELECT statement based on the transformation logic for each transformation it can push to the database.
When you run the session, the Integration Service pushes all transformation logic that is valid to push to the database by executing the generated SQL statement. Then, it reads the results of this SQL statement and continues to run the session.
2. Target-side pushdown optimization
When you run a session configured for target-side pushdown optimization, the Integration Service analyzes the mapping from the target to the source.
The Integration Service generates an INSERT, DELETE, or UPDATE statement based on the transformation logic for each transformation it can push to the database, starting with the first transformation in the pipeline it can push to the database.
The Integration Service processes the transformation logic up to the point that it can push the transformation logic to the target database. Then, it executes the generated SQL.
3. Full pushdown optimization
The Integration Service pushes as much transformation logic as possible to both source and target databases.
To use full pushdown optimization, the source and target must be on the same database.
When you run a session configured for full pushdown optimization, the Integration Service analyzes the mapping starting with the source and analyzes each transformation in the pipeline until it analyzes the target.
It generates SQL statements that are executed against the source and target database based on the transformation logic it can push to the database.
Informatica PowerCenter Partitioning
In addition to a better ETL design, it is obvious to have a session optimized with no bottlenecks to get the best session performance.
After optimizing the session performance, we can further improve the performance by exploiting the under-utilized hardware power. This refers to parallel processing and we can achieve this in Informatica PowerCenter using Partitioning Sessions.
What is Session Partitioning
The Informatica PowerCenter Partitioning Option increases the performance of PowerCenter through parallel data processing.
Partitioning option will let you split the large data set into smaller subsets which can be processed in parallel to get a better session performance.
Type of Session Partitions
Different types of partition algorithms are available,
1. Database partitioning: The Integration Service queries the database system for table partition information. It reads partitioned data from the corresponding nodes in the database.
2. Round-Robin Partitioning: Using this partitioning algorithm, the Integration service distributes data evenly among all partitions. Use round-robin partitioning when you need to distribute rows evenly and do not need to group data among partitions.
3. Hash Auto-Keys Partitioning: The PowerCenter Server uses a hash function to group rows of data among partitions. When hash auto-key partition is used, the Integration Service uses all grouped or sorted ports as a compound partition key. You can use hash auto-keys partitioning at or before Rank, Sorter, and unsorted Aggregator transformations to ensure that rows are grouped properly before they enter these transformations.
4. Hash User-Keys Partitioning: Hash user keys. The Integration Service uses a hash function to group rows of data among partitions based on a user-defined partition key. You choose the ports that define the partition key.
5. Key Range Partitioning: With this type of partitioning, you specify one or more ports to form a compound partition key for a source or target. The Integration Service then passes data to each partition depending on the ranges you specify for each port.
We can invoke the user interface for session partition as shown in below image from your session using the menu Mapping -> Partitions.
Difference Between Informatica 8 ,9 and 10
New Features on Informatica 9:
1. Informatica 9 includes the Informatica Developer and Informatica Analyst client tools.
2. Multiple rows return
Lookups can now be configured as an Active transformation to return Multiple Rows. We can configure the Lookup transformation to return all rows that match a lookup condition. A Lookup transformation is an active transformation when it can return more than one row for any given input row.
3. Enhanced support for lookup transformations, including cache updates, database deadlock resilience, SQL overrides for un cached lookups, and multiple rows return
4. Limit the Session Log
You can limit the size of session logs for real-time sessions. You can limit the size by time or by file size. You can also limit the number of log files for a session.
5. Single admin console for data quality (Informatica developer tool), PowerCenter, Power exchange and data services.
6. Informatica has the capability to pull data from IMS, DB2 on series and series and from other several other legacy systems (Mainframe) environment like VSAM, Datacom, IDMS etc.
7. Informatica supports open interfaces such as web services and can integrate with other tools that support these as well including BPM tool.
8. Informatica 9 will continue support feature of running the same workflow simultaneously.
Leave a comment