associative arrays in oracle 9i release 2

Arrays have been available in PL/SQL since its very early versions, when Oracle called them "PL/SQL Tables". Of course, they behave nothing like a table because they are essentially an array structure, certainly in terms of how we interact with them. Somewhere around Oracle 8, PL/SQL Tables were re-badged as "index-by tables", which referred to the syntax we used to declare the array types with (TYPE array IS TABLE OF datatype INDEX BY BINARY INTEGER). This rename was probably because Oracle added "collections", which referred to the then-new nested table and varray SQL types.

With the release of 9.2 (9i Release 2), Oracle has re-badged index-by tables as "associative arrays" and it is these that this article will focus on. They are still PL/SQL-only arrays as we know them, with the difference being that we can now index them in three ways instead of one.

indexing associative arrays

Associative arrays in Oracle 9.2 can be indexed by any of the following types:

BINARY_INTEGER;
PLS_INTEGER; and
VARCHAR2.

PLS_INTEGER is a supposedly faster or equivalent implementation of BINARY_INTEGER but the last of these indexing methods is more likely to draw our attention. For the first time, we can index arrays by strings in PL/SQL. This simple fact makes arrays in PL/SQL far more flexible than in previous versions with more practical use for our program data.

index by integer

Indexing PL/SQL arrays by integer is as old as the technology itself, so we shall not spend much time on this. The following example performs a simple timing comparison of loading and accessing two arrays: one indexed by BINARY_INTEGER and the other by PLS_INTEGER.

SQL> DECLARE
  2
  3     TYPE aat_binary IS TABLE OF VARCHAR2(10)
  4        INDEX BY BINARY_INTEGER;
  5     aa_binary aat_binary;
  6
  7     TYPE aat_pls IS TABLE OF VARCHAR2(10)
  8        INDEX BY PLS_INTEGER;
  9     aa_pls aat_pls;
 10
 11     v_dummy VARCHAR2(10);
 12
 13  BEGIN
 14
 15     timer.snap;
 16     FOR i IN 1 .. 1000000 LOOP
 17        aa_binary(i) := RPAD('X',10);
 18     END LOOP;
 19     timer.show('Load BINARY');
 20
 21     timer.snap;
 22     FOR i IN 1 .. 1000000 LOOP
 23        aa_pls(i) := RPAD('X',10);
 24     END LOOP;
 25     timer.show('Load PLS');
 26
 27     timer.snap;
 28     FOR i IN 1 .. 1000000 LOOP
 29        v_dummy := aa_binary(i);
 30     END LOOP;
 31     timer.show('Access BINARY');
 32
 33     timer.snap;
 34     FOR i IN 1 .. 1000000 LOOP
 35        v_dummy := aa_pls(i);
 36     END LOOP;
 37     timer.show('Access PLS');
 38
 39  END;
 40  /

[Load BINARY] 1.56 seconds
[Load PLS] 1.81 seconds
[Access BINARY] 0.70 seconds
[Access PLS] 0.71 seconds

PL/SQL procedure successfully completed.

The timings show that there is very little difference between the two indexing types so we should not be too concerned about converting all of our BINARY_INTEGER arrays to PLS_INTEGER!

index by string

Prior to the introduction of bulk fetching and binding in Oracle 8i (that requires arrays or collections to be used), PL/SQL developers generally made little use of arrays. Since then, we have become far more comfortable with using these structures, both for array processing and for small sets of program or reference data. String-indexed associative arrays provide us with more possibilities for array-usage in PL/SQL. Before we see some examples of their flexibility, we will see how to assign and use these structures.

In the following example, we will assign a small associative array with some hard-coded literal index values. We will also see how we define the type in our declaration. The index can be VARCHAR2 or a subtype of VARCHAR2, up to the usual PL/SQL limit of 32767 bytes. The CHAR type cannot be used.

SQL> DECLARE
  2
  3     TYPE aat_days_in_month IS TABLE OF NUMBER
  4        INDEX BY VARCHAR2(30);
  5
  6     aa_days aat_days_in_month;
  7
  8  BEGIN
  9
 10     /* Literal assignments... */
 11     aa_days('JANUARY') := 31;
 12     aa_days('FEBRUARY') := EXTRACT(DAY FROM LAST_DAY(ADD_MONTHS(TRUNC(SYSDATE,'YEAR'),1)));
 13     -- and so on --
 14     aa_days('DECEMBER') := 31;
 15
 16     /* Literal access... */
 17     DBMS_OUTPUT.PUT_LINE(
 18        'There are ' || aa_days('FEBRUARY') || ' days in February this year.'
 19        );
 20
 21  END;
 22  /

There are 28 days in February this year.

PL/SQL procedure successfully completed.

Note that the array will be organised according to the alphabetical order of the index values, although the array is not contiguous as such. This means we cannot use FOR LOOP or FORALL with string-indexed associative arrays. Instead, we step though the array using the NEXT (or PRIOR if going in reverse) pseudo-methods as follows.

SQL> DECLARE
  2
  3     TYPE aat_days_in_month IS TABLE OF NUMBER
  4        INDEX BY VARCHAR2(30);
  5
  6     aa_days aat_days_in_month;
  7
  8     v_start DATE;
  9     v_indx  VARCHAR2(30);
 10     v_days  NUMBER;
 11
 12  BEGIN
 13
 14     /* Looping assignment... */
 15     FOR i IN 0 .. 11 LOOP
 16
 17        v_start := ADD_MONTHS(TRUNC(SYSDATE,'YEAR'),i);
 18        v_indx := TRIM(TO_CHAR(v_start,'MONTH'));
 19        v_days := EXTRACT(DAY FROM LAST_DAY(v_start));
 20
 21        aa_days(v_indx) := v_days;
 22
 23     END LOOP;
 24
 25     /* Looping access... */
 26     v_indx := aa_days.FIRST;
 27     WHILE v_indx IS NOT NULL LOOP
 28        DBMS_OUTPUT.PUT_LINE(
 29           'Days in ' || v_indx || ' = ' || aa_days(v_indx)
 30           );
 31        v_indx := aa_days.NEXT(v_indx);
 32     END LOOP;
 33
 34  END;
 35  /

Days in APRIL = 30
Days in AUGUST = 31
Days in DECEMBER = 31
Days in FEBRUARY = 28
Days in JANUARY = 31
Days in JULY = 31
Days in JUNE = 30
Days in MARCH = 31
Days in MAY = 31
Days in NOVEMBER = 30
Days in OCTOBER = 31
Days in SEPTEMBER = 30

PL/SQL procedure successfully completed.

We can see that the elements are returned in index-order, rather than the order in which we assigned them. We cannot change this behaviour without intervening ourselves (for example, we could prepend each month with its month number from 01 to 12) or we could maintain a separate ordering "index". But generally we do not require string-indexed associative arrays to maintain any specific order.

caching reference data with associative arrays

Many developers will read about string-indexed associative arrays and immediately see the potential for caching lookup data. This has been a common application for the older index-by tables, but these had the limitation of requiring integer-based keys to the lookups (though there were some complicated "solutions" to this using hashing of strings to get a numeric index for string keys).

We will look at how we can use string-indexed associative arrays to cache reference data and compare the lookups for performance against direct table access. Of course, it should be noted at this stage that if the reference table can be joined to the source query, then this will be preferable. Cached lookups, using associative arrays, tend to be a useful tuning technique when the original table lookup is encapsulated in a shared function or when the reference table cannot be joined directly to the driving query.

In the following example we will create a small lookup table based on ALL_TABLES. This will have approximately 900 rows. We will then create a small package with two functions to lookup this reference table and use these to run some time comparisons. First we can create the "reference table" as follows.

SQL> CREATE TABLE lookup_table
  2  CACHE
  3  AS
  4     SELECT owner || '.' || table_name AS primary_key
  5     ,      DBMS_RANDOM.STRING('u',50) AS description
  6     FROM   all_tables;

Table created.

SQL> ALTER TABLE lookup_table ADD (
  2     CONSTRAINT lookup_table_pk
  3     PRIMARY KEY (primary_key)
  4     );

Table altered.

SQL> exec DBMS_STATS.GATHER_TABLE_STATS(USER,'LOOKUP_TABLE',cascade=>TRUE);

PL/SQL procedure successfully completed.

SQL> SELECT COUNT(*) FROM lookup_table;

  COUNT(*)
----------
       929

1 row selected.

Our lookup functions are contained in the following package. Note that the cache-based function (using the associative array) needs to be initialised. We include the code for this in the package initialisation section, which is executed the first time the package is called.

SQL> CREATE PACKAGE lookup_pkg AS
  2
  3     FUNCTION table_lookup(
  4              p_key IN lookup_table.primary_key%TYPE
  5              ) RETURN lookup_table.description%TYPE;
  6
  7     FUNCTION array_lookup(
  8              p_key IN lookup_table.primary_key%TYPE
  9              ) RETURN lookup_table.description%TYPE;
 10
 11  END lookup_pkg;
 12  /

Package created.

SQL> CREATE PACKAGE BODY lookup_pkg AS
  2
  3     TYPE aat_lookup IS TABLE OF lookup_table.description%TYPE
  4        INDEX BY lookup_table.primary_key%TYPE;
  5
  6     ga_lookups aat_lookup;
  7
  8     -----------------------------------------------------------
  9
 10     FUNCTION table_lookup(
 11              p_key IN lookup_table.primary_key%TYPE
 12              ) RETURN lookup_table.description%TYPE IS
 13
 14        v_return lookup_table.description%TYPE;
 15
 16     BEGIN
 17
 18        SELECT description INTO v_return
 19        FROM   lookup_table
 20        WHERE  primary_key = p_key;
 21        RETURN v_return;
 22
 23     EXCEPTION
 24        WHEN NO_DATA_FOUND THEN
 25           RETURN NULL;
 26     END table_lookup;
 27
 28     -----------------------------------------------------------
 29
 30     FUNCTION array_lookup(
 31              p_key IN lookup_table.primary_key%TYPE
 32              ) RETURN lookup_table.description%TYPE IS
 33     BEGIN
 34        RETURN ga_lookups(p_key);
 35     EXCEPTION
 36        WHEN NO_DATA_FOUND THEN
 37           RETURN NULL;
 38     END array_lookup;
 39
 40     -----------------------------------------------------------
 41
 42  BEGIN
 43
 44     /* Initialisation... */
 45     FOR r IN (SELECT primary_key, description FROM lookup_table) LOOP
 46        ga_lookups(r.primary_key) := r.description;
 47     END LOOP;
 48
 49  END lookup_pkg;
 50  /

Package body created.

We can now run a comparison of the two functions, using an oracle-developer.net version of Tom Kyte's RUNSTATS utility. We will run the associative array example first because we want this to include the cost of pre-loading the array. We will execute each function approximately 90,000 times by "inflating" the data we use as our source for the lookup keys.

SQL> exec runstats_pkg.rs_start;

PL/SQL procedure successfully completed.

SQL> DECLARE
  2     v_description lookup_table.description%TYPE;
  3  BEGIN
  4     FOR r IN (SELECT primary_key AS key
  5               FROM   lookup_table
  6               ,     (SELECT ROWNUM FROM dual CONNECT BY ROWNUM <= 100))
  7     LOOP
  8        v_description := lookup_pkg.array_lookup(r.key);
  9     END LOOP;
 10  END;
 11  /

PL/SQL procedure successfully completed.

Now we can run the same example but using the table lookup.

SQL> exec runstats_pkg.rs_middle;

PL/SQL procedure successfully completed.

SQL> DECLARE
  2     v_description lookup_table.description%TYPE;
  3  BEGIN
  4     FOR r IN (SELECT primary_key AS key
  5               FROM   lookup_table
  6               ,     (SELECT ROWNUM FROM dual CONNECT BY ROWNUM <= 100))
  7     LOOP
  8        v_description := lookup_pkg.table_lookup(r.key);
  9     END LOOP;
 10  END;
 11  /

PL/SQL procedure successfully completed.

SQL> exec runstats_pkg.rs_stop(1000);

Run1 ran in 273 hsecs
Run2 ran in 844 hsecs
Run1 ran in 32.35% of the time


Name                                  Run1        Run2        Diff
STAT..calls to get snapshot sc           8      92,906      92,898
STAT..execute count                      6      92,906      92,900
STAT..index fetch by key                 0      92,900      92,900
STAT..recursive calls               93,406     186,306      92,900
STAT..rows fetched via callbac           0      92,900      92,900
STAT..table fetch by rowid               0      92,900      92,900
LATCH.library cache                    557     185,990     185,433
LATCH.library cache pin                237     185,879     185,642
STAT..buffer is not pinned cou           1     185,801     185,800
LATCH.cache buffers chains           3,064     280,588     277,524
STAT..consistent gets                  937     279,635     278,698
STAT..session logical reads            937     279,635     278,698
STAT..consistent gets - examin           0     278,700     278,700


Run1 latches total versus run2 -- difference and pct
Run1        Run2        Diff        Pct
5,740     653,197     647,457       .88%

PL/SQL procedure successfully completed.

We can see that the associative array lookup is approximately three times faster in this example. We can repeat the comparison but this time in SQL (using Autotrace to limit the screen output).

SQL> set autotrace traceonly statistics

SQL> exec runstats_pkg.rs_start;

PL/SQL procedure successfully completed.

SQL> SELECT lookup_pkg.array_lookup(primary_key) AS description
  2  FROM   lookup_table
  3  ,     (SELECT ROWNUM FROM dual CONNECT BY ROWNUM <= 100);

92900 rows selected.


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        935  consistent gets
          0  physical reads
          0  redo size
    1261660  bytes sent via SQL*Net to client
      68622  bytes received via SQL*Net from client
       6195  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
      92900  rows processed

SQL> exec runstats_pkg.rs_middle;

PL/SQL procedure successfully completed.

SQL> SELECT lookup_pkg.table_lookup(primary_key) AS description
  2  FROM   lookup_table
  3  ,     (SELECT ROWNUM FROM dual CONNECT BY ROWNUM <= 100);

92900 rows selected.


Statistics
----------------------------------------------------------
      92900  recursive calls
          0  db block gets
     279635  consistent gets
          0  physical reads
          0  redo size
    1261660  bytes sent via SQL*Net to client
      68622  bytes received via SQL*Net from client
       6195  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
      92900  rows processed

SQL> exec runstats_pkg.rs_stop(1000);

Run1 ran in 224 hsecs
Run2 ran in 836 hsecs
Run1 ran in 26.79% of the time


Name                                  Run1        Run2        Diff
STAT..index fetch by key                 0      92,900      92,900
STAT..recursive calls                  503      93,403      92,900
STAT..rows fetched via callbac           0      92,900      92,900
STAT..table fetch by rowid               0      92,900      92,900
STAT..execute count                      4      92,905      92,901
STAT..calls to get snapshot sc           6      92,909      92,903
STAT..buffer is not pinned cou           1     185,802     185,801
LATCH.library cache pin             24,827     210,648     185,821
LATCH.library cache                 43,434     229,269     185,835
STAT..consistent gets - examin           0     278,700     278,700
STAT..consistent gets                  935     279,638     278,703
STAT..session logical reads            935     279,638     278,703
LATCH.cache buffers chains           1,870     280,588     278,718


Run1 latches total versus run2 -- difference and pct
Run1        Run2        Diff        Pct
82,726     733,358     650,632     11.28%

PL/SQL procedure successfully completed.

In SQL, using the cached lookup is almost four times faster than repeated table access. The RUNSTATS output shows the amount of additional work done by the table lookups, whereby the time taken by the associative array access is largely accounted for by the context-switching incurred by accessing PL/SQL functions from SQL.

a final note on cached lookups

Caching lookups using associative arrays, indexed by string or integer, is an extremely useful technique for tuning PL/SQL and SQL when our business rules or reference data are encapsulated in functions. Of course, the quickest method for accessing reference data is to join to the lookup table directly in our cursors and SQL statements. The following example is a useful reminder of why we should try to use SQL as much as possible. The SQL-only solution takes less than half the time of our quickest associative array solutions.

SQL> set autotrace traceonly statistics

SQL> set timing on

SQL> SELECT (SELECT lt2.description
  2          FROM   lookup_table lt2
  3          WHERE  lt1.primary_key = lt2.primary_key) AS description
  4  FROM   lookup_table lt1
  5  ,     (SELECT ROWNUM FROM dual CONNECT BY ROWNUM <= 100);

92900 rows selected.

Elapsed: 00:00:01.00

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       3724  consistent gets
          0  physical reads
          0  redo size
    1261660  bytes sent via SQL*Net to client
      68622  bytes received via SQL*Net from client
       6195  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
      92900  rows processed

source code

The source code for the examples in this article can be downloaded from here.

Adrian Billington, May 2002

oracle-developer.net