Being Too Clever For Your Own Good

I was with a client the other week and was asked to look at a particular SQL statement that kept failing due to a lack of TEMP space. It looked something like this (names changed to protect the innocent, etc.):

CREATE TABLE lookup_table
AS
SELECT
    destn,
    studref,
    min(pi) pi
FROM (
SELECT
     pct1.DES       as container_type
    ,pc1.ID         as pi
    ,pc1.SPN        as pn
    ,pct2.DES       as Desc2
    ,pc2.DES        as studref
    ,pct3.des       as desc3
    ,pc3.dest_code  as destn
FROM
     container pc1
    ,element  e1
    ,container_type pct1
    ,element  e2
    ,container pc2
    ,container_type pct2
    ,element  e3
    ,container pc3
    ,container_type pct3
WHERE
    pc1.CONTAINER_TYPE_ID = 3
AND e1.CONTAINER_ID       = pc1.id
AND pct1.ID               = pc1.CONTAINER_TYPE_ID
AND e2.id                 = e1.PARENT_ID
AND pc2.ID                = e2.CONTAINER_ID
AND pct2.id               = pc2.CONTAINER_TYPE_ID
AND e3.id                 = e2.PARENT_ID
AND pc3.ID                = e3.CONTAINER_ID
AND pct3.id               = pc3.CONTAINER_TYPE_ID
) pages
GROUP BY destn, studref
;

which when executed gave the following error:

CREATE TABLE lookup_table
*
ERROR at line 1:
ORA-12801: error signaled in parallel query server P030
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

Now of course my first reaction was "get the TEMP tablespace extended" but it had already just been extended to 8GB (made up of 4 tempfiles). So what could be causing the issue?
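A handy thing to do at this point (shown here as a sketch against the standard dictionary views; from 10g onwards V$TEMPSEG_USAGE is the preferred name for V$SORT_USAGE) is to watch which sessions are actually eating TEMP while the statement runs:

```sql
-- Which sessions are using space in TEMP right now, and how much?
-- V$SORT_USAGE reports usage in blocks; multiply by the tablespace
-- block size to get bytes.
SELECT s.sid,
       s.username,
       u.tablespace,
       u.segtype,
       u.blocks
FROM   v$sort_usage u,
       v$session    s
WHERE  s.saddr = u.session_addr
ORDER  BY u.blocks DESC;
```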

The first thing I did was to run an explain plan on the query, as my suspicion was that the joins were perhaps going to be hash joins, which can make heavy use of the TEMP tablespace if the hash can't be built in memory (defined by the HASH_AREA_SIZE parameter). The explain plan looked like this:

SQL> SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY());

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | TQ |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------
| 0 | CREATE TABLE STATEMENT | | 1 | 120 | 406 | | | |
| 1 | LOAD AS SELECT | | | | | | | |
| 2 | PX COORDINATOR | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10004 | 1 | 120 | 406 | Q1,04 | P->S | QC (RAND)
| 4 | SORT GROUP BY | | 1 | 120 | 406 | Q1,04 | PCWP | |
| 5 | PX RECEIVE | | 1 | 120 | 405 | Q1,04 | PCWP | |
| 6 | PX SEND HASH | :TQ10003 | 1 | 120 | 405 | Q1,03 | P->P | HASH |
| 7 | NESTED LOOPS | | 1 | 120 | 405 | Q1,03 | PCWP | |
| 8 | HASH JOIN | | 1 | 112 | 405 | Q1,03 | PCWP | |
| 9 | PX RECEIVE | | 1 | 100 | 305 | Q1,03 | PCWP | |
| 10 | PX SEND BROADCAST | :TQ10002 | 1 | 100 | 305 | Q1,02 | P->P | BROADCAST |
| 11 | NESTED LOOPS | | 1 | 100 | 305 | Q1,02 | PCWP | |
| 12 | NESTED LOOPS | | 1 | 87 | 305 | Q1,02 | PCWP | |
| 13 | NESTED LOOPS | | 1 | 77 | 305 | Q1,02 | PCWP | |
| 14 | HASH JOIN | | 1 | 65 | 305 | Q1,02 | PCWP | |
| 15 | PX RECEIVE | | 1 | 47 | 206 | Q1,02 | PCWP | |
| 16 | PX SEND BROADCAST | :TQ10001 | 1 | 47 | 206 | Q1,01 | P->P | BROADCAST |
| 17 | HASH JOIN | | 1 | 47 | 206 | Q1,01 | PCWP | |
| 18 | BUFFER SORT | | | | | Q1,01 | PCWC | |
| 19 | PX RECEIVE | | 1 | 26 | 0 | Q1,01 | PCWP | |
| 20 | PX SEND BROADCAST | :TQ10000 | 1 | 26 | 0 | | S->P | BROADCAST |
| 21 | NESTED LOOPS | | 1 | 26 | 0 | | | |
| 22 | INDEX UNIQUE SCAN | PK_CONTAINER_TYPE | 1 | 13 | 0 | | |
| 23 | INDEX FULL SCAN | PK_CONTAINER_TYPE | 1 | 13 | | | |
| 24 | PX BLOCK ITERATOR | | 2708K| 54M| 202 | Q1,01 | PCWC | |
| 25 | TABLE ACCESS FULL | CONTAINER | 2708K| 54M| 202 | Q1,01 | PCWP | |
| 26 | PX BLOCK ITERATOR | | 2340K| 40M| 96 | Q1,02 | PCWC | |
| 27 | TABLE ACCESS FULL | ELEMENT | 2340K| 40M| 96 | Q1,02 | PCWP | |
| 28 | TABLE ACCESS BY INDEX ROWID| ELEMENT | 1 | 12 | 2 | Q1,02 | PCW
| 29 | INDEX UNIQUE SCAN | PK_ELEMENT | 1 | | 1 | Q1,02 | PCWP |
| 30 | TABLE ACCESS BY INDEX ROWID | CONTAINER | 1 | 10 | 2 | Q1,02 | PC
| 31 | INDEX UNIQUE SCAN | PK_CONTAINER | 1 | | 1 | Q1,02 | PCWP |
| 32 | INDEX UNIQUE SCAN | PK_CONTAINER_TYPE | 1 | 13 | 0 | Q1,02 | PCWP |
| 33 | PX BLOCK ITERATOR | | 2340K| 26M| 96 | Q1,03 | PCWC | |
| 34 | TABLE ACCESS FULL | ELEMENT | 2340K| 26M| 96 | Q1,03 | PCWP | |
| 35 | TABLE ACCESS BY INDEX ROWID | CONTAINER | 1 | 8 | 2 | Q1,03 | PCWP |
| 36 | INDEX UNIQUE SCAN | PK_CONTAINER | 1 | | 1 | Q1,03 | PCWP | |
----------------------------------------------------------------------------------------------------

A couple of things jumped out at me: firstly, parallel query is being used (spot the PX RECEIVE, PX SEND BROADCAST and PX BLOCK ITERATOR operations), and secondly, there are several hash joins going on (three, to be precise). So, a couple more bits of information to find out: roughly how big are the tables we're joining (the smaller input to each join determines the size of the hash table that gets built), and how big is the HASH_AREA_SIZE?

SQL> show parameter hash_area_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
hash_area_size                       integer     131072

SQL> select bytes from dba_segments where owner='STAGING' and segment_name = 'CONTAINER';

     BYTES
----------
 318767104

SQL> select bytes from dba_segments where owner='STAGING' and segment_name = 'ELEMENT';

     BYTES
----------
 150994944


SQL> select bytes from dba_segments where owner='STAGING' and segment_name = 'CONTAINER_TYPE';

     BYTES
----------
     65536

So, from looking at these figures, the HASH_AREA_SIZE isn't very big, but then neither is the smallest table (CONTAINER_TYPE), and according to this article you only need about 1.6 x the size of the smallest (driving) table available in the hash area.
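As a rough sanity check (the 1.6 multiplier is taken on trust from that article, and this ignores what each PX slave does individually), the figures above can be plugged straight in:

```python
# Does the smallest (driving) table's hash fit in HASH_AREA_SIZE,
# using the ~1.6x rule of thumb?
HASH_AREA_SIZE = 131072        # bytes, from "show parameter hash_area_size"
CONTAINER_TYPE_BYTES = 65536   # smallest table, from DBA_SEGMENTS

required = 1.6 * CONTAINER_TYPE_BYTES
print(round(required))             # 104858 bytes needed
print(HASH_AREA_SIZE >= required)  # True - a serial hash join should fit
```

So a single, serial hash join on CONTAINER_TYPE should fit comfortably in memory, which makes the TEMP blow-out all the more puzzling.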

From speaking to a couple of people, the main finger of suspicion was pointing towards parallel query. The consensus was that parallel query multiplies the number of hash joins actually taking place by the number of parallel slaves in use, which can end up consuming far more memory and TEMP space than building a single hash table would. Looking at the initialization parameters, parallel query was enabled, with quite a high PARALLEL_MAX_SERVERS:

SQL> show parameter parallel_max_servers

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_max_servers                 integer     240

SQL> show parameter parallel_threads_per_cpu

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_threads_per_cpu             integer     2

The server we were using had 12 SPARC processors, and all the tables and indexes were defined using PARALLEL = DEFAULT, so they should pick up the default degree of parallelism for the instance, which should be 12 (processors) x 2 (parallel threads per CPU) = 24. So what was happening when we ran our query? I ran it again, but in a separate session queried V$PX_PROCESS to see how many slaves were running:

SQL> SELECT * FROM v$px_process;

SERV STATUS          PID SPID            SID    SERIAL#
---- --------- ---------- --------- ------- ----------
P023 IN USE 52 11852 318 441
P012 IN USE 38 11830 327 7723
P000 IN USE 24 10078 348 2064
P004 IN USE 29 10086 357 1433
P006 IN USE 31 10090 366 2312
P007 IN USE 32 10092 367 431
P018 IN USE 44 11842 371 1366
P003 IN USE 28 10084 372 292
P008 IN USE 33 10095 373 453
P014 IN USE 40 11834 374 500
P001 IN USE 26 10080 381 348
P011 IN USE 36 10111 401 1170
P017 IN USE 43 11840 414 1002
P020 IN USE 46 11846 416 21671
P002 IN USE 27 10082 445 7451
P019 IN USE 45 11844 465 344
P016 IN USE 42 11838 474 626
P009 IN USE 34 10097 489 723
P010 IN USE 35 10106 503 254
P013 IN USE 39 11832 504 265
P015 IN USE 41 11836 505 1848
P021 IN USE 47 11848 512 6012
P005 IN USE 30 10088 519 170
P022 IN USE 49 11850 546 1334

24 rows selected.

which was what I was expecting given the above settings; however, running the query at this degree of parallelism caused the statement to fail with the same lack of temp space. So what if we reduced the DOP - would that have any effect?

SQL> drop table lookup_table;

Table dropped.

Elapsed: 00:00:00.04
SQL> CREATE TABLE lookup_table parallel (degree 24)
  2  AS
  3  SELECT
  4      destn,
  5      studref,
  6      min(pi) pi
  7  FROM (
  8  SELECT
  9       pct1.DES       as container_type
 10      ,pc1.ID         as pi
 11      ,pc1.SPN        as pn
 12      ,pct2.DES       as Desc2
 13      ,pc2.DES        as studref
 14      ,pct3.des       as desc3
 15      ,pc3.dest_code  as destn
 16  FROM
 17       container pc1
 18      ,element  e1
 19      ,container_type pct1
 20      ,element  e2
 21      ,container pc2
 22      ,container_type pct2
 23      ,element  e3
 24      ,container pc3
 25      ,container_type pct3
 26  WHERE
 27      pc1.CONTAINER_TYPE_ID = 3
 28  AND e1.CONTAINER_ID       = pc1.id
 29  AND pct1.ID               = pc1.CONTAINER_TYPE_ID
 30  AND e2.id                 = e1.PARENT_ID
 31  AND pc2.ID                = e2.CONTAINER_ID
 32  AND pct2.id               = pc2.CONTAINER_TYPE_ID
 33  AND e3.id                 = e2.PARENT_ID
 34  AND pc3.ID                = e3.CONTAINER_ID
 35  AND pct3.id               = pc3.CONTAINER_TYPE_ID
 36  ) pages
 37  GROUP BY destn, studref
 38  ;
CREATE TABLE lookup_table parallel (degree 24)
*
ERROR at line 1:
ORA-12801: error signaled in parallel query server P030
ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

What was interesting here was that when I queried V$PX_PROCESS in a different session whilst this statement was running, 48 slaves were listed. Doug Burns explains why in his Parallel Query paper: Oracle kicks off two sets of PX slaves for a given operation, one set acting as a producer of rows and the other as a consumer, which is why 48 PX slaves appeared in V$PX_PROCESS at a DOP of 24 - and the query still ran out of temp space. So what if we ran it again, with a DOP of 12?
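Pulling the slave-count arithmetic together (using the parameter values from earlier and the two-slave-set model from Doug Burns' paper):

```python
# Default DOP for PARALLEL = DEFAULT objects, and the resulting
# number of PX slaves once producer and consumer sets are counted.
cpu_count = 12                 # SPARC processors in the server
parallel_threads_per_cpu = 2   # from "show parameter"

default_dop = cpu_count * parallel_threads_per_cpu
print(default_dop)             # 24 - the slave count first seen in V$PX_PROCESS

# Two slave sets (producer + consumer) per parallel operation:
print(2 * default_dop)         # 48 - the slave count at an explicit DOP of 24
print(2 * 12)                  # 24 - the slave count at a DOP of 12
```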

SQL> drop table lookup_table;

SQL> CREATE TABLE lookup_table parallel (degree 12)
  2  AS
  3  SELECT
  4      destn,
  5      studref,
  6      min(pi) pi
  7  FROM (
  8  SELECT
  9       pct1.DES       as container_type
 10      ,pc1.ID         as pi
 11      ,pc1.SPN        as pn
 12      ,pct2.DES       as Desc2
 13      ,pc2.DES        as studref
 14      ,pct3.des       as desc3
 15      ,pc3.dest_code  as destn
 16  FROM
 17       container pc1
 18      ,element  e1
 19      ,container_type pct1
 20      ,element  e2
 21      ,container pc2
 22      ,container_type pct2
 23      ,element  e3
 24      ,container pc3
 25      ,container_type pct3
 26  WHERE
 27      pc1.CONTAINER_TYPE_ID = 3
 28  AND e1.CONTAINER_ID       = pc1.id
 29  AND pct1.ID               = pc1.CONTAINER_TYPE_ID
 30  AND e2.id                 = e1.PARENT_ID
 31  AND pc2.ID                = e2.CONTAINER_ID
 32  AND pct2.id               = pc2.CONTAINER_TYPE_ID
 33  AND e3.id                 = e2.PARENT_ID
 34  AND pc3.ID                = e3.CONTAINER_ID
 35  AND pct3.id               = pc3.CONTAINER_TYPE_ID
 36  ) pages
 37  GROUP BY destn, studref
 38  ;

Table created.

Elapsed: 00:05:51.07

and this time it is created without any problem. So now that we know that a DOP of 12, and consequently 24 PX slaves, is OK, how about taking the DEGREE clause out of the CREATE TABLE statement and instead lowering the PARALLEL_MAX_SERVERS parameter to "throttle" parallelism to a maximum of 24 slaves? Would that work?

SQL> conn sys/password@orcl as sysdba
Connected.
SQL> alter system set parallel_max_servers = 24 scope=memory;

System altered.

Elapsed: 00:00:00.00
SQL> conn STAGING/STAGING@orcl
Connected.

SQL> CREATE TABLE lookup_table
  2  AS
  3  SELECT
  4      destn,
  5      studref,
  6      min(pi) pi
  7  FROM (
  8  SELECT
  9       pct1.DES       as container_type
 10      ,pc1.ID         as pi
 11      ,pc1.SPN        as pn
 12      ,pct2.DES       as Desc2
 13      ,pc2.DES        as studref
 14      ,pct3.des       as desc3
 15      ,pc3.dest_code  as destn
 16  FROM
 17       container pc1
 18      ,element  e1
 19      ,container_type pct1
 20      ,element  e2
 21      ,container pc2
 22      ,container_type pct2
 23      ,element  e3
 24      ,container pc3
 25      ,container_type pct3
 26  WHERE
 27      pc1.CONTAINER_TYPE_ID = 3
 28  AND e1.CONTAINER_ID       = pc1.id
 29  AND pct1.ID               = pc1.CONTAINER_TYPE_ID
 30  AND e2.id                 = e1.PARENT_ID
 31  AND pc2.ID                = e2.CONTAINER_ID
 32  AND pct2.id               = pc2.CONTAINER_TYPE_ID
 33  AND e3.id                 = e2.PARENT_ID
 34  AND pc3.ID                = e3.CONTAINER_ID
 35  AND pct3.id               = pc3.CONTAINER_TYPE_ID
 36  ) pages
 37  GROUP BY destn, studref
 38  ;

Table created.

Elapsed: 00:08:34.07

OK, that's good, although the query time has gone up - this could just be down to load on the server, though. The principle is sound: by setting PARALLEL_MAX_SERVERS to a lower value, we can limit the amount of parallelism taking place. How about going the whole hog and turning off PQ completely? How would that affect things?

SQL> conn sys/password@orcl as sysdba
Connected.
SQL> alter system set parallel_max_servers = 0 scope=memory;

System altered.

Elapsed: 00:00:00.00
SQL> conn STAGING/STAGING@orcl
Connected.
SQL> drop table lookup_table;

Table dropped.

Elapsed: 00:00:00.01
SQL> CREATE TABLE lookup_table
  2  AS
  3  SELECT
  4      destn,
  5      studref,
  6      min(pi) pi
  7  FROM (
  8  SELECT
  9       pct1.DES       as container_type
 10      ,pc1.ID         as pi
 11      ,pc1.SPN        as pn
 12      ,pct2.DES       as Desc2
 13      ,pc2.DES        as studref
 14      ,pct3.des       as desc3
 15      ,pc3.dest_code  as destn
 16  FROM
 17       container pc1
 18      ,element  e1
 19      ,container_type pct1
 20      ,element  e2
 21      ,container pc2
 22      ,container_type pct2
 23      ,element  e3
 24      ,container pc3
 25      ,container_type pct3
 26  WHERE
 27      pc1.CONTAINER_TYPE_ID = 3
 28  AND e1.CONTAINER_ID       = pc1.id
 29  AND pct1.ID               = pc1.CONTAINER_TYPE_ID
 30  AND e2.id                 = e1.PARENT_ID
 31  AND pc2.ID                = e2.CONTAINER_ID
 32  AND pct2.id               = pc2.CONTAINER_TYPE_ID
 33  AND e3.id                 = e2.PARENT_ID
 34  AND pc3.ID                = e3.CONTAINER_ID
 35  AND pct3.id               = pc3.CONTAINER_TYPE_ID
 36  ) pages
 37  GROUP BY destn, studref
 38  ;

Table created.

Elapsed: 00:04:31.08

So Parallel Query's the culprit then, eh? If you turn it down a notch, the query now runs within the TEMP space, and if you turn it off completely, it actually runs faster than if it's enabled. Er... well no actually, there's one more twist to it.

Whilst we were performing heroics with PARALLEL_MAX_SERVERS and explain plans, one of the other DBAs noticed that the CONTAINER_TYPE table was redundant in the SQL statement - it was included in the join, but the columns it provided could be sourced entirely from the other two tables. CONTAINER_TYPE could therefore be removed from the join, potentially reducing the amount of memory and temp space taken up by the hash join. The SQL was rewritten to remove this table:

CREATE TABLE lookup_table
AS
SELECT
     pc3.dest_code  as destn
    ,pc2.DES        as studref
    ,MIN(pc1.ID)    as pi
FROM
     container pc1
    ,element  e1
    ,element  e2
    ,container pc2
    ,element  e3
    ,container pc3
WHERE
    pc1.CONTAINER_TYPE_ID = 3
AND e1.CONTAINER_ID       = pc1.id
AND e2.id                 = e1.PARENT_ID
AND pc2.ID                = e2.CONTAINER_ID
AND e3.id                 = e2.PARENT_ID
AND pc3.ID                = e3.CONTAINER_ID
GROUP BY pc3.dest_code, pc2.des
;

and then run again - with a successful execution and a run time of just 1 minute 17 seconds, which was quite an improvement. This, of course, was with PARALLEL_MAX_SERVERS set to 0, so what would happen if it was increased back to its original value of 240?

SQL> conn sys/password@orcl as sysdba
Connected.
SQL> alter system set parallel_max_servers = 240 scope=memory;

System altered.

Elapsed: 00:00:00.02
SQL> conn STAGING/STAGING@orcl
Connected.
SQL> CREATE TABLE lookup_table
  2  AS
  3  SELECT
  4       pc3.dest_code  as destn
  5      ,pc2.DES        as studref
  6      ,MIN(pc1.ID)    as pi
  7  FROM
  8       container pc1
  9      ,element  e1
 10      ,element  e2
 11      ,container pc2
 12      ,element  e3
 13      ,container pc3
 14  WHERE
 15      pc1.CONTAINER_TYPE_ID = 3
 16  AND e1.CONTAINER_ID       = pc1.id
 17  AND e2.id                 = e1.PARENT_ID
 18  AND pc2.ID                = e2.CONTAINER_ID
 19  AND e3.id                 = e2.PARENT_ID
 20  AND pc3.ID                = e3.CONTAINER_ID
 21  GROUP BY pc3.dest_code, pc2.des
 22  ;

Table created.

Elapsed: 00:01:12.04
SQL>

And there you go - it runs within the temp space constraints, and runs (just about) faster than with PARALLEL_MAX_SERVERS set to 0.
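One aside worth mentioning (not something we tried on the day): rather than altering PARALLEL_MAX_SERVERS instance-wide, which throttles everybody, parallelism can be switched off for just the session running the statement:

```sql
-- Disable parallel query (and parallel DDL, since this is a CTAS)
-- for this session only, leaving the instance settings alone.
ALTER SESSION DISABLE PARALLEL QUERY;
ALTER SESSION DISABLE PARALLEL DDL;

-- ... run the CREATE TABLE ... AS SELECT here ...

-- Re-enable afterwards (or simply end the session).
ALTER SESSION ENABLE PARALLEL QUERY;
ALTER SESSION ENABLE PARALLEL DDL;
```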

So, the moral of the story? Well, if you're having issues with temp space, you're hash-joining several big tables with parallel query enabled, and you've got lots of processors, consider reducing the degree of parallelism as a quick way to limit the amount of temp space being used. Alternatively, consider increasing HASH_AREA_SIZE if you've got memory to spare. However - and this is the real moral of the story - before you get all fancy and start playing around with initialisation parameters, DOPs and so forth, don't ignore the little guy over in the corner, waving his hand and saying "perhaps this statement could be written better". Probably the biggest improvement you'll ever get comes from taking the time to write the query more efficiently in the first place, rather than relying on clever quick fixes to paper over the problem.

Any comments, aspects I've missed or misinterpreted etc., let me know and I'll update as necessary.