Oracle Performance Tuning Using Event 10046

If you've got an interest in Oracle performance tuning, you'll probably have seen references to something called event 10046. Event 10046 is an undocumented, but well known feature within the Oracle database that generates detailed information on how a statement is parsed, what the values of bind variables are, and what wait events occurred during a particular session. Activating Event 10046 gives us a more detailed trace that that provided by alter session set sql_trace=true and lets the DBA determine exactly why the database is performing in a certain way, by documenting how a statement is parsed and noting what wait events occured during the statement execution.

Wait events are the things that stop your SQL statement being processed quickly, and are typically caused by statements being hard parsed, data having to be retrieved or written to disks which are suffering from contention, and other such events which in theory can be minimised in a well designed and tuned system. You've probably come across the top five wait events section in STATSPACK which gives you an idea which parts of your system are causing bottlenecks, and activating extended trace using event 10046 allows us to analyze each individual wait event, in the context of the set of SQL statements executed in a particular session.

Two of the main advocates for using event 10046 are Cary Millsap and Jeff Holt, who's company, Hotsos, has kindly made available a detailed paper on analysing Oracle performance using event 10046. Entitled "Oracle System Performance Analysis Using Oracle Event 10046", the paper introduces the technique as:

"The Oracle database kernel is instrumented with over four hundred so-called pseudo-error debugging events. [1] One of the most important of these events for the system performance analyst is event 10046, enable SQL statement timing. Activating this event for an Oracle session instructs the Oracle kernel to print detailed timing information for that session to an Oracle trace file. A sample of 10046 data is provided later in this document.

Event 10046 is an attribute of an Oracle session that you may set at different levels, depending upon the type of diagnostic output you wish to obtain. Many database administrators are already familiar with Oracle's sql_trace facility, which emits performance information about Oracle parse, execute, fetch, commit, and rollback database calls. Using sql_trace is actually the equivalent of using event 10046 set at level 1.

However, there are entire categories of system performance problems that cannot be diagnosed with level-1 data alone. For example, you cannot unambiguously diagnose and repair a system plagued by contention for latches, locks, networks, or even storage devices by looking only at SQL trace data. Event 10046 takes Oracle performance instrumentation a significant step further by detailing the Oracle kernel's executions of over two hundred internal function calls."

Activating event 10046 is actually quite straightforward for your current session, and takes the form of

alter session set events '10046 trace name context forever, level num';

Where num is either 1,4,8 or 12, with 1 being the same as a normal set sql_trace=true, 4 including values of bind variables, 12 including wait events, and 12 including both bind variables and wait events.

The paper then goes on to explain how event 10046 is activated and deactivated, and then how to find and interpret the trace file. Interpreting the trace file seems to be the interesting bit, with several approaches possible. The most basic way is to go through the raw trace files manually, with the article pointing to a useful Metalink note No. 39817.1 entitled "Interpreting Raw SQL_TRACE and DBMS_SUPPORT.START_TRACE output" which goes through the basics of interpreting the file. Also, If you've got Tom Kyte's "Expert One-On-One" book, turn to Chapter 10 page 464 where there's a whole section on using and interpreting raw trace files. Of course you can also put the results through TKPROF, which from version 9.0.1 also processes trace results generated using the 10046 event. Interpreting the trace file is of course where the skill lies and beyond the scope of this article, but take a look through the article and Tom's book if you've got it, or just do a google search for "10046 event interpreting" and you'll come across plenty of useful examples.

The reason that so much attention has been placed on the event 10046 is that it gives us an opportunity to tune performance by examining individual wait events, rather than relying on the traditional tuning ratios that often hide the real cause of poor database performance. As such, it's quite a revolutionary approach and not one you'll generally find in official Oracle documentation. However, if you're interesting in reading more about event 10046, take a look at Cary Millsap's and Jeff Holt's book "Optimizing Oracle Performance" which is the reference text for wait event tuning, together with a forthcoming book by Stephen Andert called "Oracle Wait Event Tuning" which looks due out in October 2004. OTN also ran an article by Robin Schumacher entitled "Exploring the Oracle Database 10g Wait Interface" on some new V$ views in Oracle Database 10g that provide an interface into database wait events, and this article by Thiru Vadivelu gives a nice introduction to SQL tracing in general.

Lastly, as an alternative to generating and interpreting (what can be) cumbersome raw trace files, Don Burleson recently wrote an Oramag article entitled "Perfect Pitch" which advocates an additiona system wait tuning approach that carries out real-time analysis of physical I/O wait events using views such as V$SYSTEM_EVENT and V$SESSION_WAIT, capturing details about the objects that experience physical read waits, which, when identified, can then be used together with Statspack to identify problem SQL for tuning. According to the article,

"System wait tuning has become very popular because it can show you those wait events that are the primary bottleneck for your system. Some experts like the 10046 wait event (level 8 and higher) analysis technique and Oracle MetaLink now has an analysis tool called trcanlzr.sql (see Next Steps) to interpret bottlenecks via 10046 trace dumps. However, some Oracle professionals find dumps cumbersome and prefer to sample real-time wait events.

When doing wait analysis, it is critical to remember that all Oracle databases experience wait events, and that the presence of waits does not always indicate a problem. In fact, all well-tuned databases have some bottleneck. (For example, a computationally intensive database may be CPU-bound and a data warehouse may be bound by disk-read waits.) In theory, any Oracle database will run faster if access to hardware resources associated with waits is increased.

This article explores a small subset of wait analysis, but it illustrates a critical concept of Oracle tuning: the fact that all Oracle databases wait on some kind of system resource, and it's the Oracle professional's job to determine whether the database is I/O bound, CPU bound, memory bound, or bound waiting on latches or locks. When the source of the bottleneck has been identified, the savvy Oracle professional must then determine the causes of these events and attempt to remove them."

All in all, it's an interesting area and it certainly seems that system wait tuning, using the event 10046 and the real-time v$ wait event views, is the preferred 'scientific approach' for diagnosing and improving system performance. If you're interested in tuning, take a look at the articles when you get a chance, and see whether the event 10046 could help you in diagnosing system bottlenecks.

UPDATE: I've just remembered that Nicholas Goodman dropped me a line the other day to let me know that Veritas are making several chapters from "Optimizing Oracle Performance" available as a free download. It's a marketing exercise so you'll probably need to register, but it's a good chance to take a look at part of the book for free.