21 Updating the Target Schema

Scenario

You are in charge of managing a data warehouse that has been in production for a few months. The data warehouse was originally created using two source schemas, Human Resources (HR) and Order Entry (OE) and was loaded into the Warehouse (WH) target schema. Recently you were made aware of two changes to tables in the HR and OE schemas. The WH schema must be updated to reflect these changes.

Change #1: The first change was made to the HR schema as show in Figure 21–1. The length of the REGION_NAME column in the REGIONS table was extended to 100 characters.

Figure 21-1 Changed REGIONS Table

Description of "Figure 21-1 Changed REGIONS Table"
Change #2: The second change was made to the OE schema as shown in Figure 21–2. A row called LOT_SIZE_NUMBER was added to the ORDER_ITEMS table with a precision of 8 and scale of 0.

Figure 21-2 Changed ORDER_ITEMS Table

Description of "Figure 21-2 Changed ORDER_ITEMS Table"

Solution

In order to update the WH schema, you must first determine the impact of these changes and then create and execute a plan for updating the target schema. The following steps provide an outline for what you need to do:

Step 1: Identify Changed Source Objects

Step 2: Determine the Impact of the Changes

Step 3: Reimport Changed Objects

Step 4: Update Objects in the Data Flow

Step 5: Redesign your Target Schema

Step 6: Re-Deploy Scripts

Step 7: Test the New ETL Logic

Step 8: Update Your Discoverer EUL

Step 9: Execute the ETL Logic

Case Study

Step 1: Identify Changed Source Objects

The first step in rolling out changes to your data warehouse is to identify the changes in source objects. In order to do this, you must have a procedure or system in place that can notify you when changes are made to source objects.

In our scenario, you were made aware by the group managing the HR and OE schemas that some objects had been changed. There were two changes, the first was made to the HR schema. The REGION_NAME column was extended from 25 to 100 characters to accommodate longer data. The second change was made to the OE schema. The LOT_SIZE_NUMBER column was added and needs to be integrated into the WH schema.

Step 2: Determine the Impact of the Changes

After you have identified the changes, you must determine their impact on your target schema.

For Change #1, made to the HR schema, you need to update any dependent objects. This entails reimporting the REGIONS table and then updating any objects that use the REGION_NAME column. To identify dependent objects, you can use the Impact Analysis Diagram. You also need to update any mappings that use this table.

For Change #2, made to the OE schema, in addition to reimporting the table and updating mappings, you need to find a way to integrate the new column into the WH schema. Since the column was added to keep track of the number of parts or items in one unit of sales, add a measure called NUMBER_OF_IND_UNITS to the SALES cube in the WH schema and have this measure for each order. Then you need to connect this new column to the SALES cube.

Step 3: Reimport Changed Objects

Since two source objects have changed, you must start by reimporting their metadata definitions into your workspace. Select both the REGIONS table in the HR schema and the ORDER_ITEMS table in the OE schema from the navigation tree and use the Metadata Import Wizard to reimport their definitions.

Warehouse Builder automatically detects that this is an update and proceeds by only updating changed definitions. The Import Results dialog box that displays at the end of the import process displays the details of the synchronization. Click OK to continue the import and commit your changes to the workspace. If you do not want to continue with the import, click Undo.

Step 4: Update Objects in the Data Flow

If the change in the source object altered only existing objects and attributes, such as Change #1 in the HR schema, use Impact Analysis diagrams to identify objects that need to be reconciled.

In our scenario, we need to reconcile the column length in all objects that depend on the REGIONS table to ensure that the data continues to load properly.

To update objects in the data flow:

Select the REGIONS table in the HR schema from the navigation tree. Select View and then click Impact.

The Metadata Dependency Manager opens and the Impact Analysis diagram reveals that the CUSTOMER dimension in the WH schema is the only object impacted by the REGIONS table.

This step requires that you have already set up the Repository Browser. For more information on setting this up, see Oracle Warehouse Builder Installation and Administration Guide.
Open the CUSTOMER dimension in the Data Object Editor and update the Region Name level attribute to 100 character length.
Open the MAP_CUSTOMER mapping that connects the source to the target. For both the REGIONS table operator and the CUSTOMER dimension operator, perform an inbound synchronization from data object to mapping operator.

The mapping operators must be synchronized with the mapping objects they represent in order to generate code based on the updated objects.

You have now completed updating the metadata associated with Change #1.

For Change #2, since it introduced a new column, you do not need to update the data flow the same way you did for Change #1. Make sure you perform an inbound synchronization on all the mappings that use an ORDER_ITEMS table operator. From the Impact Analysis Diagram for the ORDER_ITEMS table shown in Figure 21–3, we can see that this is only the mapping MAP_SALES.

Figure 21-3 Impact Analysis Diagram for ORDER_ITEMS

Description of "Figure 21-3 Impact Analysis Diagram for ORDER_ITEMS"

Step 5: Redesign your Target Schema

Since Change #2 introduced the new LOT_SIZE_NUMBER column to the ORDER_ITEMS table, you need to redesign your WH target schema to incorporate this new data into your cube. You can do this by adding a new measure called NUMBER_OF_IND_UNITS to your SALES cube.

To redesign the target schema:

Add the measure NUMBER_OF_IND_UNITS with the NUMBER data type, precision of 8, and scale of 0 to the SALES cube.
View the lineage diagram for the SALES cube to determine which mappings contain the SALES cube. Perform an inbound synchronization on all SALES cube mapping operators.
Open the mapping MAP_SALES and ensure that the table ORDER_ITEMS is synchronized inbound.
Connect the LOT_SIZE_NUMBER column in the ORDER_ITEMS table to the JOIN, and then to the SETOP, and then add it to the AGG operators. Ensure that you are doing a sum operation in the AGG operator.
Finally, connect the LOT_SIZE_NUMBER output attribute of the AGG operator to the NUMBER_OF_IND_UNITS input attribute of the SALES cube.

Step 6: Re-Deploy Scripts

After the mappings have been debugged, use the Design Center to regenerate and re-deploy scripts. Use the Control Center Manager to discover the default deployment action. Warehouse Builder detects the type of deployment to run.

Step 7: Test the New ETL Logic

After you have reconciled all objects and ensured that the WH target schema has been updated to reflect all changes, test the ETL logic that is be generated from the mappings. Use the Mapping Debugger to complete this task. If you find any errors, resolve them and re-deploy the scripts.

Step 8: Update Your Discoverer EUL

If you are using Discoverer as your reporting tool, proceed by updating your EUL.

To update your Discoverer EUL:

Identify the objects that need to be updated in the EUL because of changes made to their structure or data. In this case, the changed objects are the REGIONS and SALES_ITEMS tables and the SALES cube.
In the Project Explorer, select all the objects identified in step 1, right-click and select Derive.

The Perform Derivation Wizard displays and updates these object definitions in the Business Definition Module that contains these objects.
Expand the Item Folders node in the Business Definition Module that contains these changed objects.
Select the objects identified in Step 1, right-click and select Deploy.

The changes to the objects are updated in the Discover EUL.

Step 9: Execute the ETL Logic

After the mappings have been deployed, execute and load data to the target.