binding in-lists in 10g

This article demonstrates a new method for binding IN-lists in 10g, using the new collection condition MEMBER OF. Also included is a common pre-10g method and a performance comparison (with dramatic results which leads me to conclude that MEMBER OF is much simpler to use but should only be used with very small sets or lists).

Incidentally, how to bind IN-lists (i.e. how to pass a list of values in one parameter and have Oracle treat the single parameter as a list) is a common problem cited on many web forums. Unfortunately, what many developers do is construct a dynamic SQL statement and concatenate a delimited string to the end to represent the list of values. In addition to shifting from static to dynamic SQL, this approach also makes it impossible to use bind variables and, for form-based applications, also leaves the application wide open to SQL injection. An alternative to this is to turn the delimited string into a collection first and reference the collection in the SQL.

In this article, we will use examples that pass collections of data as IN-lists, rather than lists of values composed as delimited strings.

setup

First we will create some simple collection types to use in our examples, as follows.

SQL> CREATE OR REPLACE TYPE varchar2_ntt AS TABLE OF VARCHAR2(4000);
  2  /

Type created.

SQL> CREATE OR REPLACE TYPE number_ntt AS TABLE OF NUMBER;
  2  /

Type created.

in-list binding in previous versions

The correct way to bind an in-list, regardless of Oracle version from 8.0 onwards, is to have the list-values passed as a collection. We can then cast the collection to a pseudo-table (using the TABLE operator). We will see an example of this below. First we will create a simple procedure to find all objects in a list of categories. The list of categories will be passed to our procedure as a collection.

SQL> CREATE PROCEDURE which_objects( which_types_in IN varchar2_ntt ) AS
  2  BEGIN
  3     FOR r IN (SELECT object_name
  4               FROM   user_objects
  5               WHERE  object_type IN (SELECT column_value
  6                                      FROM   TABLE(which_types_in)))
  7     LOOP
  8        DBMS_OUTPUT.PUT_LINE( r.object_name );
  9     END LOOP;
 10  END;
 11  /

Procedure created.

Our client program generates the "in-list" collection and calls the procedure, as follows.

SQL> DECLARE
  2     nt_types varchar2_ntt := varchar2_ntt('TABLE','TYPE','PACKAGE');
  3  BEGIN
  4     which_objects( which_types_in => nt_types );
  5  END;
  6  /
TGT
PLSQL_PROFILER_DATA
PLSQL_PROFILER_UNITS
SRC
PROF_REPORT_UTILITIES
PLSQL_PROFILER_RUNS
PKG
XPLAN_OT
XPLAN_NTT
XPLAN
NUMBER_NTT
VARCHAR2_NTT
ETL_PKG

PL/SQL procedure successfully completed.

in-list binding in 10g

We will now see the 10g alternative using the new MEMBER OF condition for collections. The syntax is very simple and doesn't require us to cast the collection to a pseudo-table.

SQL> CREATE PROCEDURE which_objects_10g( which_types_in IN varchar2_ntt ) AS
  2  BEGIN
  3     FOR r IN (SELECT object_name
  4               FROM   user_objects
  5               WHERE  object_type MEMBER OF which_types_in)
  6     LOOP
  7        DBMS_OUTPUT.PUT_LINE( r.object_name );
  8     END LOOP;
  9  END;
 10  /

Procedure created.

We call this version of the procedure in the same way, as follows.

SQL> DECLARE
  2     nt_types varchar2_ntt := varchar2_ntt('TABLE','TYPE','PACKAGE');
  3  BEGIN
  4     which_objects_10g( which_types_in => nt_types );
  5  END;
  6  /
TGT
PLSQL_PROFILER_DATA
PLSQL_PROFILER_UNITS
SRC
PROF_REPORT_UTILITIES
PLSQL_PROFILER_RUNS
PKG
XPLAN_OT
XPLAN_NTT
XPLAN
NUMBER_NTT
VARCHAR2_NTT
ETL_PKG

PL/SQL procedure successfully completed.

a simple performance test

We will compare the new and old methods for IN-list binding under a couple of simple scenarios, as follows:

test one: filtering a table based on an in-list

For our first performance test, we will compare methods for binding an IN-list to a SQL statement, using a version of Tom Kyte's RUNSTATS utility. We will create a table with 1 million rows and query it with an IN-list collection of approximately 100 keys. First, we create the sample table as follows.

SQL> CREATE TABLE million_rows
  2  AS
  3     SELECT ROWNUM AS id
  4     ,      RPAD('x',100) AS data
  5     FROM   dual
  6     CONNECT BY ROWNUM < 1000000;

Table created.

SQL> ALTER TABLE million_rows ADD
  2     CONSTRAINT million_rows_pk
  3     PRIMARY KEY (id);

Table altered.

SQL> BEGIN
  2     DBMS_STATS.GATHER_TABLE_STATS(
  3        USER, 'MILLION_ROWS', estimate_percent=>100
  4        );
  5  END;
  6  /

PL/SQL procedure successfully completed.

We will now create a small procedure to compare the new and old methods. This procedure will load a collection with a parameterised sample of IDs to test the TABLE operator and MEMBER OF condition at different volumes.

SQL> CREATE PROCEDURE compare_in_list( sample_in IN NUMBER ) IS
  2
  3     nt_ids number_ntt := number_ntt();
  4
  5     TYPE million_rows_aat IS TABLE OF million_rows%ROWTYPE
  6        INDEX BY PLS_INTEGER;
  7     aa_rows million_rows_aat;
  8
  9  BEGIN
 10
 11     /* Fetch a sample of data to use as the in-list... */
 12     EXECUTE IMMEDIATE
 13        'SELECT id FROM million_rows SAMPLE (' || sample_in || ')'
 14     BULK COLLECT INTO nt_ids;
 15
 16     DBMS_OUTPUT.PUT_LINE( 'IN-list size : ' || nt_ids.COUNT );
 17
 18     runstats_pkg.rs_start;
 19
 20     /* Fetch the matching records using the TABLE method... */
 21     SELECT * BULK COLLECT INTO aa_rows
 22     FROM   million_rows
 23     WHERE  id IN (SELECT column_value FROM TABLE(CAST(nt_ids AS number_ntt)));
 24
 25     runstats_pkg.rs_middle;
 26
 27     /* ...and the 10g MEMBER method... */
 28     SELECT * BULK COLLECT INTO aa_rows
 29     FROM   million_rows
 30     WHERE  id MEMBER OF nt_ids;
 31
 32     runstats_pkg.rs_stop(1000);
 33
 34  END;
 35  /

Procedure created.

We will test with a small sample of the MILLION_ROWS table (representing an IN-list of approximately 100 rows), as follows.

SQL> exec compare_in_list( 0.01 );
IN-list size : 107
Run1 ran in 110 hsecs
Run2 ran in 1606 hsecs
Run1 ran in 6.85% of the time


Name                                  Run1        Run2        Diff
STAT..recursive cpu usage                3       1,230       1,227
LATCH.multiblock read objects            0       2,156       2,156
LATCH.simulator hash latch             111       3,907       3,796
STAT..physical reads cache pre           0      14,112      14,112
STAT..free buffer inspected            128      15,169      15,041
STAT..consistent gets                  323      15,396      15,073
STAT..consistent gets from cac         323      15,396      15,073
STAT..session logical reads            323      15,396      15,073
STAT..free buffer requested            101      15,195      15,094
STAT..physical reads                   101      15,195      15,094
STAT..physical reads cache             101      15,195      15,094
LATCH.cache buffers lru chain          105      15,264      15,159
STAT..no work - consistent rea           1      15,385      15,384
STAT..table scan blocks gotten           0      15,385      15,385
LATCH.object queue header oper         210      30,458      30,248
LATCH.cache buffers chains             623      46,141      45,518
STAT..table scan rows gotten             0   1,000,000   1,000,000


Run1 latches total versus run2 -- difference and pct
        Run1        Run2        Diff        Pct
       1,435      99,876      98,441      1.44%

PL/SQL procedure successfully completed.

We can see a dramatic difference in performance between the two methods, with the new MEMBER OF condition taking 16 times longer than when we queried the collection as a pseudo-table. The key to this lack of performance is evident in the statistics, particularly the "table scan rows gotten". This indicates that Oracle is using a full table scan with the MEMBER OF method. We can verify this below with EXPLAIN PLAN (although we will use a literal collection rather than a bind variable for convenience).

SQL> EXPLAIN PLAN SET STATEMENT_ID = 'MEMBER' FOR
  2     SELECT *
  3     FROM   million_rows
  4     WHERE  id MEMBER OF number_ntt(1,2,3);

Explained.

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, 'MEMBER'));

PLAN_TABLE_OUTPUT
--------------------------------------------------
Plan hash value: 1095651555

-------------------------------------------------- ...
| Id  | Operation         | Name         | Rows  | ...
-------------------------------------------------- ...
|   0 | SELECT STATEMENT  |              | 50000 | ...
|*  1 |  TABLE ACCESS FULL| MILLION_ROWS | 50000 | ...
-------------------------------------------------- ...

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ID"MEMBER OF"NUMBER_NTT"(1,2,3))

13 rows selected.

As indicated by the RUNSTATS results, Oracle has chosen a full table scan for the query, with an estimated cardinality of 50,000 (a 20% selectivity) for the filter. We can compare this with the CBO's treatment of a normal IN-list, as follows.

SQL> EXPLAIN PLAN SET STATEMENT_ID = 'IN' FOR
  2     SELECT *
  3     FROM   million_rows
  4     WHERE  id IN (1,2,3);

Explained.

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, 'IN'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------
Plan hash value: 127701680

---------------------------------------------------------------- ...
| Id  | Operation                    | Name            | Rows  | ...
---------------------------------------------------------------- ...
|   0 | SELECT STATEMENT             |                 |     3 | ...
|   1 |  INLIST ITERATOR             |                 |       | ...
|   2 |   TABLE ACCESS BY INDEX ROWID| MILLION_ROWS    |     3 | ...
|*  3 |    INDEX RANGE SCAN          | MILLION_ROWS_PK |     3 | ...
---------------------------------------------------------------- ...

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("ID"=1 OR "ID"=2 OR "ID"=3)

15 rows selected.

Oracle is able to recognise the list of literal values in this case and make the correct cardinality estimates, leading to an index range scan. Of course, this article is about parameterised IN-lists, so if we compare the new MEMBER method with the original TABLE operator method, we can see that Oracle handles the collection in a very different way.

SQL> EXPLAIN PLAN SET STATEMENT_ID = 'TABLE' FOR
  2     SELECT *
  3     FROM   million_rows
  4     WHERE  id IN (SELECT column_value FROM TABLE(number_ntt(1,2,3)));

Explained.

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, 'TABLE'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
Plan hash value: 3090652353

--------------------------------------------------------------------------- ...
| Id  | Operation                               | Name            | Rows  | ...
--------------------------------------------------------------------------- ...
|   0 | SELECT STATEMENT                        |                 |   255 | ...
|   1 |  NESTED LOOPS                           |                 |   255 | ...
|   2 |   SORT UNIQUE                           |                 |       | ...
|   3 |    COLLECTION ITERATOR CONSTRUCTOR FETCH|                 |       | ...
|   4 |   TABLE ACCESS BY INDEX ROWID           | MILLION_ROWS    |     1 | ...
|*  5 |    INDEX UNIQUE SCAN                    | MILLION_ROWS_PK |     1 | ...
--------------------------------------------------------------------------- ...

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("ID"=VALUE(KOKBF$))

17 rows selected.

With the TABLE operator method, Oracle chooses a nested loops join using the unique index on MILLION_ROWS, hence the major performance improvement. This is due to the semantic difference between the two methods. With the TABLE operator, Oracle unnests the values from the collection and then uses these to probe the MILLION_ROWS_PK index. With the MEMBER condition, however, we only have a filter condition, not a join. We are asking for records from MILLION_ROWS where the ID is in the collection, rather than the other way round. This means that Oracle must probe the collection with values from the MILLION_ROWS.ID column. This has a dramatic effect on performance and there seems to be no way to reverse this behaviour.

There are a number of ways we can try to correct the cardinality or force Oracle to use an index, for example:

With any of these methods, Oracle will not rewrite our query in any way. There simply is no way around the fact that we are filtering MILLION_ROWS against the collection. The best we can achieve is an index full scan, forced by an appropriate hint, as follows.

SQL> EXPLAIN PLAN SET STATEMENT_ID = 'INDEX' FOR
  2     SELECT /*+ INDEX(million_rows) */ *
  3     FROM   million_rows
  4     WHERE  id MEMBER OF number_ntt(1,2,3);

Explained.

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, 'INDEX'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------
Plan hash value: 857746450

--------------------------------------------------------------- ...
| Id  | Operation                   | Name            | Rows  | ...
--------------------------------------------------------------- ...
|   0 | SELECT STATEMENT            |                 | 50000 | ...
|   1 |  TABLE ACCESS BY INDEX ROWID| MILLION_ROWS    | 50000 | ...
|*  2 |   INDEX FULL SCAN           | MILLION_ROWS_PK | 50000 | ...
--------------------------------------------------------------- ...

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("ID"MEMBER OF"NUMBER_NTT"(1,2,3))

14 rows selected.

It appears that an index range scan is simply not available as an access path when using this new condition, because Oracle will not reverse the direction of the filter.

The above performance tests were conducted using many probes of a small collection. It follows, therefore, that the larger the collection, the greater the disparity in performance between TABLE and MEMBER, with the latter becoming almost unusable. Some example timings are as follows.

Sample Size IN-List Size IN TABLE() (s) MEMBER OF (s)
0.001 5 0.1 5.1
0.01 107 1.1 16.1
0.1 984 13.7 107.2
1.0 10,000 21.4 >1,200

Note that this behaviour was last tested on an 11g Release 1 database (11.1.0.6) and it is unchanged.

test two: probing an in-list with a single value

The previous tests measured the performance of a parameterised IN-list in SQL, which is the most common application for this technique. However, we can also use the MEMBER condition in PL/SQL tests. For our second performance test, we will repeatedly probe a parameterised list with a single value, again comparing the original TABLE operator with the new MEMBER condition. For completeness, we will also measure the performance of a loop through the parameterised list to search for the single value. We will use an anonymous block for these tests, as follows.

SQL> DECLARE
  2     in_list  number_ntt := number_ntt();
  3     v_cnt    PLS_INTEGER;
  4     v_hits   PLS_INTEGER := 0;
  5  BEGIN
  6
  7     /* Add 100 elements to the collection "IN-list"... */
  8     in_list.EXTEND(100);
  9     FOR i IN 1 .. 100 LOOP
 10        in_list(i) := i;
 11     END LOOP;
 12
 13     /* TABLE operator... */
 14     timer.snap;
 15     FOR i IN 1 .. 10000 LOOP
 16        SELECT COUNT(*) INTO v_cnt FROM TABLE(in_list) WHERE column_value = i;
 17        IF v_cnt > 0 THEN
 18           v_hits := v_hits + 1;
 19        END IF;
 20     END LOOP;
 21     timer.show('TABLE  (hits=' || v_hits || ')');
 22
 23     /* MEMBER condition... */
 24     v_hits := 0;
 25     timer.snap;
 26     FOR i IN 1 .. 10000 LOOP
 27        IF i MEMBER OF in_list THEN
 28           v_hits := v_hits + 1;
 29        END IF;
 30     END LOOP;
 31     timer.show('MEMBER (hits=' || v_hits || ')');
 32
 33     /* Collection loop... */
 34     v_hits := 0;
 35     timer.snap;
 36     FOR i IN 1 .. 10000 LOOP
 37        FOR ii IN 1 .. in_list.COUNT LOOP
 38           IF i = in_list(ii) THEN
 39              v_hits := v_hits + 1;
 40              EXIT;
 41           END IF;
 42        END LOOP;
 43     END LOOP;
 44     timer.show('LOOP   (hits=' || v_hits || ')');
 45
 46  END;
 47  /
[TABLE  (hits=100)] 1.22 seconds
[MEMBER (hits=100)] 0.04 seconds
[LOOP   (hits=100)] 0.21 seconds

PL/SQL procedure successfully completed.

As there is no table access involved, we have used a simple TIMER package to measure relative performance. We can see that for this type of IN-list probing, the new MEMBER condition is more efficient and effective than any other method, so it might be worth using this technique under certain circumstances.

further reading

For a good discussion of techniques for binding IN-lists prior to 10g, particularly when they are supplied as delimited strings instead of collections, see this article by William Robertson. For more information on new collection conditions and operators in 10g, see this oracle-developer.net article. The TIMER and RUNSTATS packages used in the performance tests for this article can be found on the Utilities page of this site.

source code

The source code for the examples in this article can be downloaded from here.

Adrian Billington, June 2004 (updated September 2008)

Back to Top