Suppose you want to update a column to 0 wherever that column currently contains a negative value.
Let us also assume that the table has over 2 million rows where that column is negative.
The usual way to write this is a single UPDATE statement. The issue with that query is that it takes a lot of time, since it touches over 2 million rows, and it locks the table for the duration of the update.
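As a sketch, the single-statement version might look like the following (the table name T_DIM_CUST and the column BALANCE are placeholders for illustration, not names from the original scenario):

```sql
-- Set every negative value to 0 in one statement.
-- T_DIM_CUST and BALANCE are hypothetical names.
-- On a 2-million-row hit, this runs as one long transaction
-- and holds row locks on every affected row until commit.
UPDATE T_DIM_CUST
   SET BALANCE = 0
 WHERE BALANCE < 0;

COMMIT;
```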
We discussed a design approach for this scenario in one of our prior articles.
Here in this updated article, let's discuss a different approach to updating large tables using an Informatica mapping.
One of the issues we come across during ETL design is updating large tables.
This is a very common ETL scenario, especially when you deal with large volumes of data, such as loading an SCD Type 2 dimension.
If you have exclusive use of the server, for example during a migration, splitting the work up and processing smaller chunks in parallel lets you at least see and monitor progress as you go. Submitting the data in chunks is a reasonable approach because it allows you to perform the migration in steps, control redo (and archive) log size, and pause and resume the update as needed based on server load.
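A minimal sketch of the chunked approach, assuming an Oracle database and the same hypothetical T_DIM_CUST/BALANCE names as above: each iteration updates at most 100,000 rows and commits, which bounds the redo/undo generated per transaction and gives you a natural point to pause between chunks.

```sql
-- Update negative values in chunks of 100,000 rows, committing after
-- each chunk so redo/undo stays bounded and progress can be monitored.
-- Table and column names are hypothetical.
BEGIN
  LOOP
    UPDATE T_DIM_CUST
       SET BALANCE = 0
     WHERE BALANCE < 0
       AND ROWNUM <= 100000;      -- cap the rows touched per iteration

    EXIT WHEN SQL%ROWCOUNT = 0;   -- stop once no negative rows remain
    COMMIT;                       -- release locks, flush redo per chunk
  END LOOP;
  COMMIT;
END;
/
```

Because each chunk is its own transaction, you can stop the block between iterations and resume later without losing the work already committed.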
There are a couple of other things you can try to make the query run faster: when a full table scan is involved and you have enough resources, running it in parallel can be quite beneficial.

In the mapping, use T_DIM_CUST_CUST_ID, the column looked up from the target table, to identify the records to be inserted or updated: if it is NULL, the record will be set for insert; otherwise, the record will be set for update. Now, using a Router Transformation, route the records to the INSERT/UPDATE path. Below is the Router Group Filter Condition, and you can see how the mapping looks in the image below (the mapping image does not contain any transformation logic). MERGE INTO T_DIM_CUST USING T_DIM_CUST_TEMP ON T_DIM_CUST.
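The MERGE statement above is truncated mid-clause. A complete version might look like the following sketch, assuming the target and temp tables join on a CUST_ID column; the join key and the non-key column CUST_NAME are assumptions for illustration, not taken from the original mapping:

```sql
-- Hypothetical completion of the truncated MERGE: update customers that
-- already exist in the target, insert the ones that do not.
-- CUST_ID (join key) and CUST_NAME are assumed column names.
MERGE INTO T_DIM_CUST TGT
USING T_DIM_CUST_TEMP SRC
   ON (TGT.CUST_ID = SRC.CUST_ID)
 WHEN MATCHED THEN
   UPDATE SET TGT.CUST_NAME = SRC.CUST_NAME
 WHEN NOT MATCHED THEN
   INSERT (TGT.CUST_ID, TGT.CUST_NAME)
   VALUES (SRC.CUST_ID, SRC.CUST_NAME);
```

On the Informatica side, the router group filter conditions would typically follow the same NULL check, for example ISNULL(T_DIM_CUST_CUST_ID) for the INSERT group and NOT ISNULL(T_DIM_CUST_CUST_ID) for the UPDATE group (again assuming that port name).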