overcoming the limitations of set row
The SET ROW clause has been available since Oracle 9i. This clause enables us to update a target record using a PL/SQL record and streamlined syntax of either of the following forms.
UPDATE table_name SET ROW = plsql_record WHERE ... FORALL i IN array_of_records.FIRST .. array_of_records.LAST UPDATE table_name SET ROW = array_of_records(i) WHERE ...
On the face of it, this feature appears to be quite useful, especially in reducing the amount of code we need to write and maintain. However, there are two (major) shortcomings with this feature, as follows.
- the target table's primary key is updated on each record touched by the SET ROW clause, even though its value has probably not changed (or at least it shouldn't change); and
- it is unlikely that many of the columns will have changed their values, so for updates to a small number of columns in a target table, this clause does far too much work. Most of the updates will be performed against columns that were not intended (i.e. their values haven't changed).
This article will outline two techniques to avoid both of these issues while using the SET ROW clause. For readers who are unfamiliar with the SET ROW syntax and its shortcomings, read this oracle-developer.net article.
setup
For the examples in this article, we will create a simple table with 5 columns based on 1,000 records from ALL_OBJECTS, as follows. Note that we have a single-column primary key. This table will serve as both the source and target table for our examples.
SQL> CREATE TABLE update_table (id PRIMARY KEY, name, type, owner, created) 2 AS 3 SELECT ROWNUM, object_name, object_type, owner, created 4 FROM all_objects 5 WHERE ROWNUM <= 1000;
Table created.
a note on the examples
The examples that follow are deliberately simplified. We will assume that the sample table has been partially loaded and 2 of the 5 columns in each record must be updated by some "post-processing" (this potentially expensive technique is surprisingly quite common in many ETL systems). We suppose that these 2 columns must undergo some complex transformations that make this operation too complicated for a single SQL UPDATE.
All examples will use bulk fetching and binding. Again, to keep it simple, we will bulk fetch our data in one pass, rather than use LOOP and LIMIT (an absolute necessity when dealing with large volumes of data).
set row limitations
We will begin with an example of the limitations of the SET ROW clause. We will fetch the source data into an array of records and transform the two target columns in each record, before applying a single bulk update with the SET ROW clause.
SQL> DECLARE 2 3 CURSOR c_source IS 4 SELECT * 5 FROM update_table; 6 7 TYPE aat_whole_row IS TABLE OF c_source%ROWTYPE 8 INDEX BY PLS_INTEGER; 9 10 TYPE aat_id IS TABLE OF update_table.id%TYPE 11 INDEX BY PLS_INTEGER; 12 13 aa_rows aat_whole_row; 14 aa_ids aat_id; 15 16 BEGIN 17 18 /* Fetch in one go as we are using small demo volumes... */ 19 OPEN c_source; 20 FETCH c_source BULK COLLECT INTO aa_rows; 21 CLOSE c_source; 22 23 /* Process batch of data... */ 24 FOR i IN 1 .. aa_rows.COUNT LOOP 25 26 --[complex processing goes here]-- 27 aa_rows(i).name := LOWER(aa_rows(i).name); 28 aa_rows(i).type := LOWER(aa_rows(i).type); 29 30 /* Assign IDs to PK array... */ 31 aa_ids(i) := aa_rows(i).id; 32 33 END LOOP; 34 35 /* FORALL update using SET ROW. Will update primary key... */ 36 FORALL i IN 1 .. aa_rows.COUNT 37 UPDATE update_table 38 SET ROW = aa_rows(i) 39 WHERE id = aa_ids(i); 40 41 DBMS_OUTPUT.PUT_LINE( SQL%ROWCOUNT || ' rows updated.' ); 42 43 END; 44 /
1000 rows updated.
Some comments on this example are as follows.
- Lines 27-28: these represent our "complex logic" that transforms the data for the two update columns;
- Line 31: we maintain a separate array of primary keys to enable us to target each update correctly; and
- Line 38: we use the SET ROW clause to update each target record directly from its corresponding PL/SQL record.
If we trace this operation using SQL Trace or event 10046, we see the following recursive SQL.
===================== PARSING IN CURSOR #40 len=100 dep=1 uid=57 oct=6 lid=57 tim=87989151677 hv=2983097369 ad='20e931b4' UPDATE UPDATE_TABLE SET ID = :B1 ,NAME = :B2 ,TYPE = :B3 ,OWNER = :B4 ,CREATED = :B5 WHERE ID = :B1 END OF STMT =====================
This trace file clearly highlights the problem with SET ROW. We have updated the primary key ID column (even though we did not modify it). We have also needlessly updated the CREATED and OWNER columns (also using their existing values). We have potentially high levels of redundancy and resource waste here. Not only do we update the primary key (which will generate locks on child tables) but we also generate more undo and redo than is necessary, thereby reducing the efficiency, concurrency and overall scalability of our system.
solutions to the problem
There are two methods we can adopt to make the SET ROW clause useful and avoid the issues described above. We will see both methods below, starting with a technique that will work on 10g upwards (this will compile on 9i but a known bug generates an ORA-03113 error on execution). Both methods take advantage of the ability to perform DML on an in-line view, introduced in Oracle 9i. We will use this technique to project only the columns we wish to update and use the SET ROW clause on these partial records accordingly.
solution for 10g+
As stated above, this workaround will generate an ORA-03113 on 9i so should be only be used for 10g upwards. In this workaround, we will fetch the source data and create a primary key array in the same manner as before. In addition, we will copy the transformed data into a new array of records based on just the two columns we wish to update. This third array will be used in the SET ROW clause against an in-line projection of the UPDATE_TABLE. The example is as follows.
SQL> DECLARE 2 3 CURSOR c_source IS 4 SELECT * 5 FROM update_table; 6 7 TYPE aat_whole_row IS TABLE OF c_source%ROWTYPE 8 INDEX BY PLS_INTEGER; 9 10 TYPE rt_update_cols IS RECORD 11 ( name update_table.name%TYPE 12 , type update_table.type%TYPE 13 ); 14 15 TYPE aat_update_cols IS TABLE OF rt_update_cols 16 INDEX BY PLS_INTEGER; 17 18 TYPE aat_id IS TABLE OF update_table.id%TYPE 19 INDEX BY PLS_INTEGER; 20 21 aa_rows aat_whole_row; -- bulk fetch source data 22 aa_ids aat_id; -- primary key array 23 aa_updates aat_update_cols; -- update data array 24 25 BEGIN 26 27 /* Fetch in one go as we are using small demo volumes... */ 28 OPEN c_source; 29 FETCH c_source BULK COLLECT INTO aa_rows; 30 CLOSE c_source; 31 32 /* Process batch of data... */ 33 FOR i IN 1 .. aa_rows.COUNT LOOP 34 35 --[complex processing goes here]-- 36 aa_rows(i).name := LOWER(aa_rows(i).name); 37 aa_rows(i).type := LOWER(aa_rows(i).type); 38 39 /* Assign IDs to PK array... */ 40 aa_ids(i) := aa_rows(i).id; 41 42 /* Assign updateable data to update array... */ 43 aa_updates(i).name := aa_rows(i).name; 44 aa_updates(i).type := aa_rows(i).type; 45 46 END LOOP; 47 48 /* 49 || FORALL update using SET ROW. 50 || This will not update primary key... 51 */ 52 FORALL i IN 1 .. aa_rows.COUNT 53 UPDATE ( SELECT name, type 54 FROM update_table 55 WHERE id = aa_ids(i) ) 56 SET ROW = aa_updates(i); 57 58 DBMS_OUTPUT.PUT_LINE( SQL%ROWCOUNT || ' rows updated.' ); 59 60 END; 61 /
1000 rows updated. PL/SQL procedure successfully completed.
Some comments on this technique are as follows.
- Lines 10-16: we define a record type based on the columns we wish to update and then base an associative array type on this structure;
- Lines 43-44: once the transformations are complete, we copy the update data to an array that will be used in our SET ROW clause later on;
- Lines 53-55: we update an in-line projection of just the target columns, limiting each in-line view by primary key; and
- Line 56: we update the projection of two columns using SET ROW based on a PL/SQL record of two attributes.
If we examine the SQL Trace file, we can see that Oracle only updates the columns we have modified. We have avoided both the primary key updates and the redundant updates against columns that do not change.
===================== PARSING IN CURSOR #7 len=89 dep=1 uid=57 oct=6 lid=57 tim=81572960151 hv=2362111366 ad='21cee5a8' UPDATE ( SELECT NAME, TYPE FROM UPDATE_TABLE WHERE ID = :B1 ) SET NAME = :B2 ,TYPE = :B3 END OF STMT =====================
Note that this 10g+ solution targets the updates via the primary key values. This means, therefore, that the source data doesn't necessarily have to come from the table that is to be updated, even though this is how the example is structured.
solution for 9i release 2+
We will now see a similar technique that will work on 9i Release 2 (9.2) upwards. This alternative solution uses ROWIDs, rather than primary keys, to target the updates. This method works when the source and target tables are the same or the target table is at least included in the source cursor (i.e. to be able to fetch the ROWIDs). There are some subtle differences to the previous 10g+ solution, described after the following example.
SQL> DECLARE 2 3 CURSOR c_source IS 4 SELECT ROWID AS rid, name, type 5 FROM update_table; 6 7 TYPE aat_whole_row IS TABLE OF c_source%ROWTYPE 8 INDEX BY PLS_INTEGER; 9 10 TYPE rt_update_cols IS RECORD 11 ( name update_table.name%TYPE 12 , type update_table.type%TYPE 13 ); 14 15 TYPE aat_update_cols IS TABLE OF rt_update_cols 16 INDEX BY PLS_INTEGER; 17 18 TYPE aat_rowid IS TABLE OF ROWID 19 INDEX BY PLS_INTEGER; 20 21 aa_rows aat_whole_row; -- bulk fetch source data 22 aa_rowids aat_rowid; -- ROWID array 23 aa_updates aat_update_cols; -- update data array 24 25 BEGIN 26 27 /* Fetch in one go as we are using small demo volumes... */ 28 OPEN c_source; 29 FETCH c_source BULK COLLECT INTO aa_rows; 30 CLOSE c_source; 31 32 /* Process batch of data... */ 33 FOR i IN 1 .. aa_rows.COUNT LOOP 34 35 --[complex processing goes here]-- 36 aa_rows(i).name := LOWER(aa_rows(i).name); 37 aa_rows(i).type := LOWER(aa_rows(i).type); 38 39 /* Assign ROWID to ROWID array... */ 40 aa_rowids(i) := aa_rows(i).rid; 41 42 /* Assign updateable columns to update array... */ 43 aa_updates(i).name := aa_rows(i).name; 44 aa_updates(i).type := aa_rows(i).type; 45 46 END LOOP; 47 48 /* 49 || FORALL update using SET ROW and ROWIDs. 50 || This will not update primary key... 51 */ 52 FORALL i IN 1 .. aa_rows.COUNT 53 UPDATE ( SELECT name, type 54 FROM update_table ) 55 SET ROW = aa_updates(i) 56 WHERE ROWID = aa_rowids(i) ; 57 58 DBMS_OUTPUT.PUT_LINE( SQL%ROWCOUNT || ' rows updated.' ); 59 60 END; 61 /
1000 rows updated. PL/SQL procedure successfully completed.
Note the following differences with this 9i technique.
- Lines 3-5: the source cursor includes the ROWID of the table that is to be updated;
- Lines 7-8: we base our associative array type on the source cursor%ROWTYPE as in the 10g+ example. Note that support for this this was introduced in the 9.2.0.3 patchset. Readers running 9.2.0.1 or 9.2.0.2 will therefore need to create a record type that matches the cursor columns and base the array type on this instead;
- Lines 18-19: we create an associative array type of ROWID, rather than the primary key column;
- Line 40: the ROWID from each element is copied to a separate array for use in the update; and
- Line 53-56: we update the projection of two columns with SET ROW based on a PL/SQL record of two attributes, but this time the predicate (based on ROWID) is outside the in-line view.
This technique works on 9i for two main reasons. First, the predicate is outside the in-line view, which avoids the ORA-03113 issue. Second, ROWID is a pseudo-column so does not need to be projected in the in-line view to be referenced outside. If we wished to use the primary key instead of the ROWID, then we would need to project it. Of course, this would in turn require us to include the primary key in the PL/SQL update records, thereby negating the value of this technique.
To confirm that this technique works as designed, we can examine the SQL Trace file, which shows us the following recursive SQL. Again we confirm that only the two target columns are updated.
===================== PARSING IN CURSOR #3 len=92 dep=1 uid=59 oct=6 lid=59 tim=81106093837 hv=2939398511 ad='1e7f2930' UPDATE ( SELECT NAME, TYPE FROM UPDATE_TABLE ) SET NAME = :B1 ,TYPE = :B2 WHERE ROWID = :B1 END OF STMT =====================
validating the techniques
The recursive SQL statments in the trace files are reasonably conclusive that no superfluous updates are occurring when we execute our SET ROW solutions. For further validation that this technique avoids all primary key updates, readers can try the following tests.
- first, create a child table with a foreign key to the sample table;
- second, lock this child table in exclusive mode (LOCK TABLE child_table IN EXCLUSIVE MODE); and
- third, in a new session, attempt to run each of the examples in this article.
Readers will find that the two solutions presented in this article will complete without issue because they do not update the primary key. If they did, they would be blocked by the lock on the child table in the other session, because a primary key update will attempt to take share locks on child tables (which is one of the reasons why indexed foreign keys are so important). The "limitations" example will be blocked by the lock on the child table because it updates every column including the primary key, as we saw earlier.
acknowledgements
Thanks to David Sharp for the ROWID workaround for 9i.
source code
The source code for the examples in this article can be downloaded from here.
Adrian Billington, June 2007
Back to Top