Performance Issues with DML Error Logging and Conventional Path Inserts
December 4th, 2005 by Mark Rittman
I’ve recently been putting together an article for Oracle Magazine on the new
DML Error Logging feature in Oracle 10gR2, which allows you to add a LOG ERRORS
clause to most DML statements so that rows that would otherwise cause the
statement to fail are instead copied to an error logging table. A typical use of
LOG ERRORS is like this:
INSERT /*+ APPEND */
INTO table_a
SELECT *
FROM table_b
LOG ERRORS
REJECT LIMIT UNLIMITED;
The rows that would cause an error are instead written to a table called
ERR$_TABLE_A that gets created for you by the DBMS_ERRLOG.CREATE_ERROR_LOG
procedure. I won’t go into the full details here but Tim Hall has put together a
nice write-up that you can view here.
Anyway, for the article I put a set of examples together, where I used the
data in the SH.SALES table to create a “source table” copy in a new
schema, created another copy based off of this as a “target table”,
then introduced some errors in to the source table and constraints into the
target so that I could show how the feature worked. Finally I used the
DBMS_ERRLOG package to create the error logging table.
SQL*Plus: Release 10.2.0.1.0 - Production on Sun Dec 4 12:38:54 2005
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
SQL> create sequence sales_id_seq;
Sequence created.
SQL> create table sales_src
2 as
3 select sales_id_seq.nextval as "SALES_ID"
4 , prod_id
5 , cust_id
6 , channel_id
7 , time_id
8 , promo_id
9 , amount_sold
10 , quantity_sold
11 from sh.sales
12 /
Table created.
SQL> create table sales_target
2 as
3 select *
4 from sales_src
5 where 1=0
6 /
Table created.
SQL> alter table sales_src
2 modify promo_id null
3 /
Table altered.
SQL> select min(sales_id)
2 , max(sales_id)
3 from sales_src
4 /
MIN(SALES_ID) MAX(SALES_ID)
------------- -------------
1 918843
SQL> update sales_src
2 set promo_id = null
3 where sales_id in (5000,6000,7000)
4 /
3 rows updated.
SQL> update sales_src
2 set amount_sold = 0
3 where sales_id between 1000 and 1005
4 /
6 rows updated.
SQL> commit;
Commit complete.
SQL> alter table sales_target
2 add constraint amount_sold_chk
3 check (amount_sold > 0)
4 enable
5 validate
6 /
Table altered.
SQL> desc sales_src
Name Null? Type
----------------------------------------- -------- ----------------------------
SALES_ID NUMBER
PROD_ID NOT NULL NUMBER
CUST_ID NOT NULL NUMBER
CHANNEL_ID NOT NULL NUMBER
TIME_ID NOT NULL DATE
PROMO_ID NUMBER
AMOUNT_SOLD NOT NULL NUMBER(10,2)
QUANTITY_SOLD NOT NULL NUMBER(10,2)
SQL> desc sales_target
Name Null? Type
----------------------------------------- -------- ----------------------------
SALES_ID NUMBER
PROD_ID NOT NULL NUMBER
CUST_ID NOT NULL NUMBER
CHANNEL_ID NOT NULL NUMBER
TIME_ID NOT NULL DATE
PROMO_ID NOT NULL NUMBER
AMOUNT_SOLD NOT NULL NUMBER(10,2)
QUANTITY_SOLD NOT NULL NUMBER(10,2)
SQL> begin
2 dbms_errlog.create_log_table('SALES_TARGET');
3 end;
4 /
dbms_errlog.create_log_table('SALES_TARGET');
*
ERROR at line 2:
ORA-06550: line 2, column 16:
PLS-00302: component 'CREATE_LOG_TABLE' must be declared
ORA-06550: line 2, column 4:
PL/SQL: Statement ignored
SQL> begin
2 dbms_errlog.create_error_log('SALES_TARGET');
3 end;
4 /
PL/SQL procedure successfully completed.
Next then I demonstrated a direct path insert from the source into the target
table, using the LOG ERRORS clause.
SQL> insert /*+ APPEND */ 2 into sales_target 3 select * 4 from sales_src 5 log errors 6 reject limit unlimited 7 / 918834 rows created. Elapsed: 00:00:05.20 SQL> truncate table sales_target; Table truncated. Elapsed: 00:00:00.39 SQL> truncate table err$_sales_target; Table truncated. Elapsed: 00:00:00.07
I then compared it to a PL/SQL routine that you’d normally use to handle this
sort of situation.
SQL> DECLARE
2 TYPE array IS TABLE OF sales_target%ROWTYPE
3 INDEX BY BINARY_INTEGER;
4 sales_src_arr ARRAY;
5 errors NUMBER;
6 bulk_error EXCEPTION;
7 l_cnt NUMBER := 0;
8 PRAGMA exception_init
9 (bulk_error, -24381);
10 CURSOR c IS
11 SELECT * FROM sales_src;
12 BEGIN
13 OPEN c;
14 LOOP
15 FETCH c
16 BULK COLLECT
17 INTO sales_src_arr
18 LIMIT 100;
19 BEGIN
20 FORALL i IN 1 .. sales_src_arr.count
21 SAVE EXCEPTIONS
22 INSERT INTO sales_target VALUES sales_src_arr(i);
23 EXCEPTION
24 WHEN bulk_error THEN
25 errors :=
26 SQL%BULK_EXCEPTIONS.COUNT;
27 l_cnt := l_cnt + errors;
28 FOR i IN 1..errors LOOP
29 DBMS_OUTPUT.PUT_LINE
30 ('Error iteration ' ||
31 SQL%BULK_EXCEPTIONS(i).ERROR_INDEX ||
32 ' Oracle error is ' ||
33 SQL%BULK_EXCEPTIONS(i).ERROR_CODE );
34 END LOOP;
35 END;
36 EXIT WHEN c%NOTFOUND;
37
38 END LOOP;
39 CLOSE c;
40 DBMS_OUTPUT.PUT_LINE
41 ( l_cnt || ' total errors' );
42 END;
43 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:26.01
The example results came back as you’d expect – the direct path insert using DML error logging took about 5 seconds for 900k rows, whilst the PL/SQL routine, which used BULK COLLECT, FORALL and SAVE EXCEPTIONS, took about 26 seconds. As an experiment, I also ran the LOG ERRORS version using just a conventional path insert, like this:
SQL> insert into sales_target 2 select * 3 from sales_src 4 log errors 5 reject limit unlimited 6 / 918834 rows created. Elapsed: 00:00:58.79 SQL> truncate table sales_target; Table truncated. Elapsed: 00:00:02.46
The conventional path version came back in just under a minute – considerably
longer than the PL/SQL version. I must admit I put this down to the fact that I
wasn’t in direct path mode, coupled with the amount of rows I was loading, and
just mentally filed it away as one of those things.
I passed the article out for review to Tim Hall, Doug Burns and Tom Kyte as
this was the first “server tech” article I’d written for Oracle
Magazine, and I thought it worth getting a couple of extra opinions before
submitting it for publishing. The reviews were OK except that Tom picked up on
the fact that the conventional path version ran so slow – this seemed too slow
for him and something didn’t look right. Tom said he’d take a look at it but
also suggested that I contact Jonathan Lewis, who might have an idea on this.
I got in touch with Jonathan who said that he’d not had a chance to take a
proper look at DML error logging yet, but said that the figures suggested that
the conventional path load might have managed to fall back into some single-row processing mode for undo, redo and latching.
Jonathan suggested that I run the tests again, but this time using a view and a
statistics gathering package that he’d written to get a better idea of the
activity during the DML execution. I’ve had a look on Jonathan’s site and I
think that the view and code that I used is on this “Snapshots
Revisited” page, so you can take a look and run the tests yourselves if
you like.
Going back to SQL*Plus, I installed the view and package from Jonathan’s code
and ran the tests again, but this time snapshotting the stats to see if clues
suggested themselves:
SQL> exec snap_my_stats.start_snap; PL/SQL procedure successfully completed. Elapsed: 00:00:00.11 SQL> insert /*+ APPEND */ 2 into sales_target 3 select * 4 from sales_src 5 log errors 6 reject limit unlimited 7 / 918834 rows created. Elapsed: 00:00:07.00 SQL> exec snap_my_stats.end_snap; --------------------------------- Session stats - 04-Dec 12:57:32 Interval:- 43 seconds --------------------------------- Name Value ---- ----- opened cursors cumulative 178 user calls 12 recursive calls 2,496 recursive cpu usage 30 session logical reads 12,823 CPU used when call started 296 CPU used by this session 298 DB time 703 user I/O wait time 218 session uga memory 65,464 session pga memory 393,216 enqueue requests 303 enqueue releases 301 physical read total IO requests 73 physical read total multi block requests 56 physical read total bytes 10,158,080 physical write total IO requests 213 physical write total multi block requests 211 physical write total bytes 41,500,672 db block gets 6,814 db block gets from cache 1,748 db block gets direct 5,066 consistent gets 6,009 consistent gets from cache 6,009 consistent gets - examination 246 physical reads 1,230 physical reads cache 1,230 physical read IO requests 68 physical read bytes 10,076,160 db block changes 1,439 consistent changes 57 physical writes 5,066 physical writes direct 5,066 physical write IO requests 213 physical write bytes 41,500,672 physical writes non checkpoint 5,066 free buffer requested 1,438 commit cleanouts 119 commit cleanouts successfully completed 119 CR blocks created 110 physical reads cache prefetch 1,162 calls to kcmgas 175 calls to get snapshot scn: kcmgss 560 redo entries 1,196 redo size 178,172 undo change vector size 43,096 data blocks consistent reads - undo records applied 57 no work - consistent read gets 5,424 rollbacks only - consistent read gets 57 deferred (CURRENT) block cleanout applications 114 IMU Flushes 1 table scans (short tables) 55 table scans (long tables) 2 table scan rows gotten 933,374 table scan blocks gotten 5,253 table fetch by rowid 8 cluster key scans 116 cluster key scan block gets 226 rows fetched via callback 6 index fetch by key 122 heap block compress 1 session cursor cache hits 170 cursor authentications 1 buffer is not pinned count 318 workarea executions - optimal 2 parse time cpu 6 parse time elapsed 48 parse count (total) 178 parse count (hard) 3 execute count 187 bytes sent via SQL*Net to client 794 bytes received via SQL*Net from client 1,225 SQL*Net roundtrips to/from client 8 PL/SQL procedure successfully completed. Elapsed: 00:00:01.07 SQL> truncate table sales_target; Table truncated. Elapsed: 00:00:00.46 SQL> truncate table err$_sales_target; Table truncated.
Well that was the direct path version. What about the conventional path
version?
Elapsed: 00:00:00.21 SQL> exec snap_my_stats.start_snap; PL/SQL procedure successfully completed. Elapsed: 00:00:00.15 SQL> insert into sales_target 2 select * 3 from sales_src 4 log errors 5 reject limit unlimited 6 / 918834 rows created. Elapsed: 00:00:57.45 SQL> exec snap_my_stats.end_snap; --------------------------------- Session stats - 04-Dec 12:59:34 Interval:- 86 seconds --------------------------------- Name Value ---- ----- opened cursors cumulative 232 user calls 12 recursive calls 2,927 recursive cpu usage 105 session logical reads 983,732 CPU used when call started 2,793 CPU used by this session 2,793 DB time 5,752 user I/O wait time 522 messages sent 264 enqueue requests 890 enqueue releases 888 physical read total IO requests 120 physical read total multi block requests 116 physical read total bytes 26,796,032 db block gets 973,216 db block gets from cache 973,216 consistent gets 10,516 consistent gets from cache 10,516 consistent gets - examination 4,002 physical reads 3,271 physical reads cache 3,271 physical read IO requests 120 physical read bytes 26,796,032 db block changes 1,872,467 consistent changes 57 change write time 1,511 free buffer requested 16,066 pinned buffers inspected 2 hot buffers moved to head of LRU 45 free buffer inspected 11,997 commit cleanouts 119 commit cleanouts successfully completed 119 CR blocks created 110 write clones created in foreground 7 physical reads cache prefetch 3,151 calls to kcmgcs 5,048 calls to kcmgas 181 calls to get snapshot scn: kcmgss 1,076 redo entries 945,196 redo size 263,407,300 redo buffer allocation retries 16 redo log space requests 19 redo log space wait time 1,014 undo change vector size 58,811,884 data blocks consistent reads - undo records applied 57 no work - consistent read gets 5,532 rollbacks only - consistent read gets 57 rollback changes - undo records applied 6 deferred (CURRENT) block cleanout applications 114 active txn count during cleanout 3,756 cleanout - number of ktugct calls 3,756 IMU Flushes 1 table scans (short tables) 109 table scans (long tables) 2 table scan rows gotten 934,130 table scan blocks gotten 5,361 table fetch by rowid 8 cluster key scans 116 cluster key scan block gets 226 rows fetched via callback 6 index fetch by key 122 session cursor cache hits 224 buffer is pinned count 913,776 buffer is not pinned count 318 parse time cpu 5 parse time elapsed 5 parse count (total) 232 parse count (hard) 3 execute count 241 bytes sent via SQL*Net to client 810 bytes received via SQL*Net from client 1,210 SQL*Net roundtrips to/from client 8 PL/SQL procedure successfully completed. Elapsed: 00:00:01.06 SQL> truncate table sales_target; Table truncated. Elapsed: 00:00:02.15 SQL> truncate table err$_sales_target; Table truncated.
Note the difference in the redo generated between the two versions – 178Kb of
redo for the direct path insert, compared to 263Mb with the conventional path
insert.
I also then ran a couple of extra tests; firstly I removed the constraints
from the target table and ran the conventional path version again, this time
without using LOG ERRORS:
SQL> alter table sales_target
2 modify column promo_id null
3 /
modify column promo_id null
*
ERROR at line 2:
ORA-00905: missing keyword
Elapsed: 00:00:00.09
SQL> alter table sales_target
2 modify promo_id null
3 /
Table altered.
Elapsed: 00:00:00.34
SQL> alter table sales_target
2 drop constraint amount_sold_chk
3 /
Table altered.
Elapsed: 00:00:00.07
SQL> exec snap_my_stats.start_snap;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.15
SQL> insert into sales_target
2 select *
3 from sales_src
4 /
918843 rows created.
Elapsed: 00:00:14.07
SQL> exec snap_my_stats.end_snap;
---------------------------------
Session stats - 04-Dec 13:02:10
Interval:- 54 seconds
---------------------------------
Name Value
---- -----
opened cursors cumulative 197
user calls 12
recursive calls 2,776
recursive cpu usage 38
session logical reads 59,375
CPU used when call started 458
CPU used by this session 458
DB time 1,418
user I/O wait time 783
messages sent 43
enqueue requests 766
enqueue releases 764
physical read total IO requests 221
physical read total multi block requests 197
physical read total bytes 41,443,328
db block gets 46,747
db block gets from cache 46,747
consistent gets 12,628
consistent gets from cache 12,628
consistent gets - examination 1,371
physical reads 5,059
physical reads cache 5,059
physical read IO requests 221
physical read bytes 41,443,328
db block changes 38,770
consistent changes 57
change write time 77
free buffer requested 10,653
dirty buffers inspected 590
hot buffers moved to head of LRU 7
free buffer inspected 8,689
commit cleanouts 110
commit cleanouts successfully completed 110
CR blocks created 110
physical reads cache prefetch 4,838
shared hash latch upgrades - no wait 19
calls to kcmgcs 5,048
calls to kcmgas 165
calls to get snapshot scn: kcmgss 5,816
redo entries 28,400
redo size 43,719,772
redo buffer allocation retries 3
redo log space requests 3
redo log space wait time 126
undo change vector size 2,463,156
data blocks consistent reads - undo records applied 57
no work - consistent read gets 5,474
rollbacks only - consistent read gets 57
deferred (CURRENT) block cleanout applications 110
active txn count during cleanout 1,120
cleanout - number of ktugct calls 1,120
IMU Flushes 1
table scans (short tables) 58
table scans (long tables) 2
table scan rows gotten 933,416
table scan blocks gotten 5,259
table fetch by rowid 24
cluster key scans 115
cluster key scan block gets 225
index fetch by key 116
index scans kdiixs1 19
session cursor cache hits 182
buffer is pinned count 913,776
buffer is not pinned count 372
workarea executions - optimal 2
parse time cpu 11
parse time elapsed 125
parse count (total) 182
parse count (hard) 1
execute count 198
bytes sent via SQL*Net to client 810
bytes received via SQL*Net from client 1,176
SQL*Net roundtrips to/from client 8
sorts (memory) 7
sorts (rows) 16
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.10
SQL> truncate table sales_target;
Table truncated.
This time the conventional path insert generated 43Mb of redo – this should
be considered the benchmark figure for a conventional path insert, therefore the
LOG ERRORS clause is generating just under 200Mb of extra redo, which explains
why it takes so much longer to run. What about then if I remove all the errors
from the data, then reinstate the LOG ERRORS clause – will having this clause in
itself cause the extra redo to be generated?
Elapsed: 00:00:05.03
SQL> alter table sales_target
2 modify promo_id not null
3 /
Table altered.
Elapsed: 00:00:00.09
SQL> alter table sales_target
2 add constraint amount_sold_chk
3 check (amount_sold > 0)
4 enable
5 validate
6 /
Table altered.
Elapsed: 00:00:00.09
SQL> delete from sales_src
2 where amount_sold <=0
3 /
6 rows deleted.
Elapsed: 00:00:03.09
SQL> delete from sales_src
2 where promo_id is null
3 /
3 rows deleted.
Elapsed: 00:00:00.35
SQL> commit;
Commit complete.
Elapsed: 00:00:00.03
SQL> exec snap_my_stats.start_snap;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.18
SQL> insert into sales_target
2 select *
3 from sales_src
4 log errors
5 reject limit unlimited
6 /
918834 rows created.
Elapsed: 00:01:11.04
SQL> exec snap_my_stats.end_snap;
---------------------------------
Session stats - 04-Dec 13:08:45
Interval:- 109 seconds
---------------------------------
Name Value
---- -----
opened cursors cumulative 230
user calls 12
recursive calls 2,909
recursive cpu usage 103
session logical reads 983,504
CPU used when call started 2,920
CPU used by this session 2,922
DB time 7,113
user I/O wait time 1,659
messages sent 258
enqueue requests 877
enqueue releases 875
physical read total IO requests 225
physical read total multi block requests 199
physical read total bytes 41,590,784
db block gets 973,166
db block gets from cache 973,166
consistent gets 10,338
consistent gets from cache 10,338
consistent gets - examination 3,812
physical reads 5,077
physical reads cache 5,077
physical read IO requests 225
physical read bytes 41,590,784
db block changes 1,872,411
consistent changes 57
change write time 1,788
free buffer requested 17,868
hot buffers moved to head of LRU 26
free buffer inspected 11,199
commit cleanouts 110
commit cleanouts successfully completed 110
CR blocks created 110
write clones created in foreground 7
physical reads cache prefetch 4,852
calls to kcmgcs 5,048
calls to kcmgas 172
calls to get snapshot scn: kcmgss 1,076
redo entries 945,174
redo size 263,400,952
redo buffer allocation retries 24
redo log space requests 18
redo log space wait time 853
undo change vector size 58,810,660
data blocks consistent reads - undo records applied 57
no work - consistent read gets 5,536
rollbacks only - consistent read gets 57
deferred (CURRENT) block cleanout applications 110
active txn count during cleanout 3,587
cleanout - number of ktugct calls 3,587
IMU Flushes 1
table scans (short tables) 113
table scans (long tables) 2
table scan rows gotten 934,186
table scan blocks gotten 5,369
table fetch by rowid 5
cluster key scans 110
cluster key scan block gets 220
rows fetched via callback 1
index fetch by key 111
session cursor cache hits 224
cursor authentications 1
buffer is pinned count 913,767
buffer is not pinned count 306
parse time cpu 6
parse time elapsed 90
parse count (total) 230
parse count (hard) 2
execute count 231
bytes sent via SQL*Net to client 810
bytes received via SQL*Net from client 1,210
SQL*Net roundtrips to/from client 8
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.03
SQL> select count(*)
2 from err$_sales_target;
COUNT(*)
----------
0
Elapsed: 00:00:00.12
SQL>
So yes, just adding the LOG ERRORS clause causes the extra redo to be
generated, even though no errors have been written to the error logging table. According
to Jonathan, what is in effect happening is that when you switch to conventional
path with LOG ERRORS, the “block at a time” optimisation for undo and
redo dissappears. In effect, the conventional path switches to single row mode,
which you can tell by
- the session logical I/Os in the stats = number of rows
- redo entries = number of rows
- db block changes = 2* number of rows (one for the row, one for the undo
entry
It’s worth saying here that I’m working with 10.2.0.1 on Windows XP SP2 in
noarchivelog mode (MR – I originally had this as 10.1.0.2, a typo I corrected on 5/11). I’ll raise it as a bug on metalink on Monday (assuming it’s a
bug and not an intended feature) and we’ll see what happens. Thanks again to
Jonathan and Tom for pointing it out, very interesting and it shows the value of
getting your article peer reviewed.


December 4th, 2005 at 10:10 pm
Hi Mark,
nice to see you on such low level of BI:)
A small addendum: a care should be taken if the number of exceptional rows is not very low, i.e. if a non trivial number of rows is written in the err$ table. See test case below comparing the LOG ERRORS feature with a self implemented check on a 200k rows table with 100k exceptions. (Note the difference in recursive calls – one per exception row – and db block gets statistics; in addition to redo size).
Yes, I know exceptions *should* be rare, but this could be a negative feedback, you have a really bad data and you get a really slow running job.
Regard,
Jaromir
– table src has 200k rows; the half of then is not valid and is written in the err$ table
– process high number of exceptions is relatively expensive
December 5th, 2005 at 9:44 pm
Hi Jaromir
Good to hear from you!
Thanks for the heads up on the number of exception rows issue – I kind of expected that this might be an issue, thanks for the test case.
I don’t know why the comments were rejected – they all got through and I uploaded one of them this morning. I’ve got an issue with the software used to run my site, this probably sounds like it playing up again. Thanks for persevering though.
all the best
Mark
December 6th, 2005 at 10:04 pm
These reverse engineering articles are pretty tricky stuff, i find. I didn’t realise how much work and thought they require before I started on them myself.
Good job Mark: have you had a look through trace files for the differences between the scenarios? I always find them illuminating.
January 31st, 2008 at 5:02 pm
[...] wrote about DML Error Logging in a blog post last year and in this article for Oracle Magazine. One of the drawbacks in the 10gR2 implementation of DML [...]