overcoming the limitations of set row

The SET ROW clause has been available since Oracle 9i. This clause enables us to update a target record using a PL/SQL record and streamlined syntax of either of the following forms.

   UPDATE table_name
   SET ROW = plsql_record
   WHERE  ...
   
   FORALL i IN array_of_records.FIRST .. array_of_records.LAST
      UPDATE table_name
      SET ROW = array_of_records(i)
      WHERE  ...

On the face of it, this feature appears to be quite useful, especially in reducing the amount of code we need to write and maintain. However, there are two (major) shortcomings with this feature, as follows.

the target table's primary key is updated on each record touched by the SET ROW clause, even though its value has probably not changed (or at least it shouldn't change); and
it is unlikely that many of the columns will have changed their values, so for updates to a small number of columns in a target table, this clause does far too much work. Most of the updates will be performed against columns that were not intended (i.e. their values haven't changed).

This article will outline two techniques to avoid both of these issues while using the SET ROW clause. For readers who are unfamiliar with the SET ROW syntax and its shortcomings, read this oracle-developer.net article.

setup

For the examples in this article, we will create a simple table with 5 columns based on 1,000 records from ALL_OBJECTS, as follows. Note that we have a single-column primary key. This table will serve as both the source and target table for our examples.

SQL> CREATE TABLE update_table (id PRIMARY KEY, name, type, owner, created)
  2  AS
  3     SELECT ROWNUM, object_name, object_type, owner, created
  4     FROM   all_objects
  5     WHERE  ROWNUM <= 1000;

Table created.

a note on the examples

The examples that follow are deliberately simplified. We will assume that the sample table has been partially loaded and 2 of the 5 columns in each record must be updated by some "post-processing" (this potentially expensive technique is surprisingly quite common in many ETL systems). We suppose that these 2 columns must undergo some complex transformations that make this operation too complicated for a single SQL UPDATE.

All examples will use bulk fetching and binding. Again, to keep it simple, we will bulk fetch our data in one pass, rather than use LOOP and LIMIT (an absolute necessity when dealing with large volumes of data).

set row limitations

We will begin with an example of the limitations of the SET ROW clause. We will fetch the source data into an array of records and transform the two target columns in each record, before applying a single bulk update with the SET ROW clause.

SQL> DECLARE
  2
  3     CURSOR c_source IS
  4        SELECT *
  5        FROM   update_table;
  6
  7     TYPE aat_whole_row IS TABLE OF c_source%ROWTYPE
  8        INDEX BY PLS_INTEGER;
  9
 10     TYPE aat_id IS TABLE OF update_table.id%TYPE
 11        INDEX BY PLS_INTEGER;
 12
 13     aa_rows aat_whole_row;
 14     aa_ids  aat_id;
 15
 16  BEGIN
 17
 18     /* Fetch in one go as we are using small demo volumes... */
 19     OPEN c_source;
 20     FETCH c_source BULK COLLECT INTO aa_rows;
 21     CLOSE c_source;
 22
 23     /* Process batch of data... */
 24     FOR i IN 1 .. aa_rows.COUNT LOOP
 25
 26        --[complex processing goes here]--
 27        aa_rows(i).name := LOWER(aa_rows(i).name);
 28        aa_rows(i).type := LOWER(aa_rows(i).type);
 29
 30        /* Assign IDs to PK array... */
 31        aa_ids(i) := aa_rows(i).id;
 32
 33     END LOOP;
 34
 35     /* FORALL update using SET ROW. Will update primary key... */
 36     FORALL i IN 1 .. aa_rows.COUNT
 37        UPDATE update_table
 38        SET    ROW = aa_rows(i)
 39        WHERE  id  = aa_ids(i);
 40
 41     DBMS_OUTPUT.PUT_LINE( SQL%ROWCOUNT || ' rows updated.' );
 42
 43  END;
 44  /

1000 rows updated.

Some comments on this example are as follows.

Lines 27-28: these represent our "complex logic" that transforms the data for the two update columns;
Line 31: we maintain a separate array of primary keys to enable us to target each update correctly; and
Line 38: we use the SET ROW clause to update each target record directly from its corresponding PL/SQL record.

If we trace this operation using SQL Trace or event 10046, we see the following recursive SQL.

=====================
PARSING IN CURSOR #40 len=100 dep=1 uid=57 oct=6 lid=57 tim=87989151677 hv=2983097369 ad='20e931b4'
UPDATE UPDATE_TABLE SET ID = :B1 ,NAME = :B2 ,TYPE = :B3 ,OWNER = :B4 ,CREATED = :B5 WHERE ID = :B1 
END OF STMT
=====================

This trace file clearly highlights the problem with SET ROW. We have updated the primary key ID column (even though we did not modify it). We have also needlessly updated the CREATED and OWNER columns (also using their existing values). We have potentially high levels of redundancy and resource waste here. Not only do we update the primary key (which will generate locks on child tables) but we also generate more undo and redo than is necessary, thereby reducing the efficiency, concurrency and overall scalability of our system.

solutions to the problem

There are two methods we can adopt to make the SET ROW clause useful and avoid the issues described above. We will see both methods below, starting with a technique that will work on 10g upwards (this will compile on 9i but a known bug generates an ORA-03113 error on execution). Both methods take advantage of the ability to perform DML on an in-line view, introduced in Oracle 9i. We will use this technique to project only the columns we wish to update and use the SET ROW clause on these partial records accordingly.

solution for 10g+

As stated above, this workaround will generate an ORA-03113 on 9i so should be only be used for 10g upwards. In this workaround, we will fetch the source data and create a primary key array in the same manner as before. In addition, we will copy the transformed data into a new array of records based on just the two columns we wish to update. This third array will be used in the SET ROW clause against an in-line projection of the UPDATE_TABLE. The example is as follows.

SQL> DECLARE
  2
  3     CURSOR c_source IS
  4        SELECT *
  5        FROM   update_table;
  6
  7     TYPE aat_whole_row IS TABLE OF c_source%ROWTYPE
  8        INDEX BY PLS_INTEGER;
  9
 10     TYPE rt_update_cols IS RECORD
 11     ( name update_table.name%TYPE
 12     , type update_table.type%TYPE
 13     );
 14
 15     TYPE aat_update_cols IS TABLE OF rt_update_cols
 16        INDEX BY PLS_INTEGER;
 17
 18     TYPE aat_id IS TABLE OF update_table.id%TYPE
 19        INDEX BY PLS_INTEGER;
 20
 21     aa_rows     aat_whole_row;    -- bulk fetch source data
 22     aa_ids      aat_id;           -- primary key array
 23     aa_updates  aat_update_cols;  -- update data array
 24
 25  BEGIN
 26
 27     /* Fetch in one go as we are using small demo volumes... */
 28     OPEN c_source;
 29     FETCH c_source BULK COLLECT INTO aa_rows;
 30     CLOSE c_source;
 31
 32     /* Process batch of data... */
 33     FOR i IN 1 .. aa_rows.COUNT LOOP
 34
 35        --[complex processing goes here]--
 36        aa_rows(i).name := LOWER(aa_rows(i).name);
 37        aa_rows(i).type := LOWER(aa_rows(i).type);
 38
 39        /* Assign IDs to PK array... */
 40        aa_ids(i)          := aa_rows(i).id;
 41
 42        /* Assign updateable data to update array... */
 43        aa_updates(i).name := aa_rows(i).name;
 44        aa_updates(i).type := aa_rows(i).type;
 45
 46     END LOOP;
 47
 48     /*
 49     || FORALL update using SET ROW.
 50     || This will not update primary key...
 51     */
 52     FORALL i IN 1 .. aa_rows.COUNT
 53        UPDATE ( SELECT name, type
 54                 FROM   update_table
 55                 WHERE  id = aa_ids(i) )
 56        SET    ROW = aa_updates(i);
 57
 58     DBMS_OUTPUT.PUT_LINE( SQL%ROWCOUNT || ' rows updated.' );
 59
 60  END;
 61  /

1000 rows updated.

PL/SQL procedure successfully completed.

Some comments on this technique are as follows.

Lines 10-16: we define a record type based on the columns we wish to update and then base an associative array type on this structure;
Lines 43-44: once the transformations are complete, we copy the update data to an array that will be used in our SET ROW clause later on;
Lines 53-55: we update an in-line projection of just the target columns, limiting each in-line view by primary key; and
Line 56: we update the projection of two columns using SET ROW based on a PL/SQL record of two attributes.

If we examine the SQL Trace file, we can see that Oracle only updates the columns we have modified. We have avoided both the primary key updates and the redundant updates against columns that do not change.

=====================
PARSING IN CURSOR #7 len=89 dep=1 uid=57 oct=6 lid=57 tim=81572960151 hv=2362111366 ad='21cee5a8'
UPDATE ( SELECT NAME, TYPE FROM UPDATE_TABLE WHERE ID = :B1 ) SET NAME = :B2 ,TYPE = :B3 
END OF STMT
=====================

Note that this 10g+ solution targets the updates via the primary key values. This means, therefore, that the source data doesn't necessarily have to come from the table that is to be updated, even though this is how the example is structured.

solution for 9i release 2+

We will now see a similar technique that will work on 9i Release 2 (9.2) upwards. This alternative solution uses ROWIDs, rather than primary keys, to target the updates. This method works when the source and target tables are the same or the target table is at least included in the source cursor (i.e. to be able to fetch the ROWIDs). There are some subtle differences to the previous 10g+ solution, described after the following example.

SQL> DECLARE
  2
  3     CURSOR c_source IS
  4        SELECT ROWID AS rid, name, type
  5        FROM   update_table;
  6
  7     TYPE aat_whole_row IS TABLE OF c_source%ROWTYPE
  8        INDEX BY PLS_INTEGER;
  9
 10     TYPE rt_update_cols IS RECORD
 11     ( name update_table.name%TYPE
 12     , type update_table.type%TYPE
 13     );
 14
 15     TYPE aat_update_cols IS TABLE OF rt_update_cols
 16        INDEX BY PLS_INTEGER;
 17
 18     TYPE aat_rowid IS TABLE OF ROWID
 19        INDEX BY PLS_INTEGER;
 20
 21     aa_rows     aat_whole_row;    -- bulk fetch source data
 22     aa_rowids   aat_rowid;        -- ROWID array
 23     aa_updates  aat_update_cols;  -- update data array
 24
 25  BEGIN
 26
 27     /* Fetch in one go as we are using small demo volumes... */
 28     OPEN c_source;
 29     FETCH c_source BULK COLLECT INTO aa_rows;
 30     CLOSE c_source;
 31
 32     /* Process batch of data... */
 33     FOR i IN 1 .. aa_rows.COUNT LOOP
 34
 35        --[complex processing goes here]--
 36        aa_rows(i).name := LOWER(aa_rows(i).name);
 37        aa_rows(i).type := LOWER(aa_rows(i).type);
 38
 39        /* Assign ROWID to ROWID array... */
 40        aa_rowids(i)        := aa_rows(i).rid;
 41
 42        /* Assign updateable columns to update array... */
 43        aa_updates(i).name := aa_rows(i).name;
 44        aa_updates(i).type := aa_rows(i).type;
 45
 46     END LOOP;
 47
 48     /*
 49     || FORALL update using SET ROW and ROWIDs.
 50     || This will not update primary key...
 51     */
 52     FORALL i IN 1 .. aa_rows.COUNT
 53        UPDATE ( SELECT name, type
 54                 FROM   update_table )
 55        SET   ROW = aa_updates(i)
 56        WHERE ROWID = aa_rowids(i) ;
 57
 58     DBMS_OUTPUT.PUT_LINE( SQL%ROWCOUNT || ' rows updated.' );
 59
 60  END;
 61  /

1000 rows updated.

PL/SQL procedure successfully completed.

Note the following differences with this 9i technique.

Lines 3-5: the source cursor includes the ROWID of the table that is to be updated;
Lines 7-8: we base our associative array type on the source cursor%ROWTYPE as in the 10g+ example. Note that support for this this was introduced in the 9.2.0.3 patchset. Readers running 9.2.0.1 or 9.2.0.2 will therefore need to create a record type that matches the cursor columns and base the array type on this instead;
Lines 18-19: we create an associative array type of ROWID, rather than the primary key column;
Line 40: the ROWID from each element is copied to a separate array for use in the update; and
Line 53-56: we update the projection of two columns with SET ROW based on a PL/SQL record of two attributes, but this time the predicate (based on ROWID) is outside the in-line view.

This technique works on 9i for two main reasons. First, the predicate is outside the in-line view, which avoids the ORA-03113 issue. Second, ROWID is a pseudo-column so does not need to be projected in the in-line view to be referenced outside. If we wished to use the primary key instead of the ROWID, then we would need to project it. Of course, this would in turn require us to include the primary key in the PL/SQL update records, thereby negating the value of this technique.

To confirm that this technique works as designed, we can examine the SQL Trace file, which shows us the following recursive SQL. Again we confirm that only the two target columns are updated.

=====================
PARSING IN CURSOR #3 len=92 dep=1 uid=59 oct=6 lid=59 tim=81106093837 hv=2939398511 ad='1e7f2930'
UPDATE ( SELECT NAME, TYPE FROM UPDATE_TABLE ) SET NAME = :B1 ,TYPE = :B2 WHERE ROWID = :B1 
END OF STMT
=====================

validating the techniques

The recursive SQL statments in the trace files are reasonably conclusive that no superfluous updates are occurring when we execute our SET ROW solutions. For further validation that this technique avoids all primary key updates, readers can try the following tests.

first, create a child table with a foreign key to the sample table;
second, lock this child table in exclusive mode (LOCK TABLE child_table IN EXCLUSIVE MODE); and
third, in a new session, attempt to run each of the examples in this article.

Readers will find that the two solutions presented in this article will complete without issue because they do not update the primary key. If they did, they would be blocked by the lock on the child table in the other session, because a primary key update will attempt to take share locks on child tables (which is one of the reasons why indexed foreign keys are so important). The "limitations" example will be blocked by the lock on the child table because it updates every column including the primary key, as we saw earlier.

acknowledgements

Thanks to David Sharp for the ROWID workaround for 9i.

source code

The source code for the examples in this article can be downloaded from here.

Adrian Billington, June 2007