Easter Roundup
March 28th, 2005 by Mark Rittman
It’s the Easter break over here at the moment, and I’m off work for the next
week with a bit of paternity leave. Things are a bit hectic as I’m sure you can
imagine, with a two and a half-year old and a two week old baby, but it’s all
good fun and a nice break from travelling all over the country and staying in
hotels. Although I’ve not been able to update the blog much, I’ve kept an eye on
what’s been going on and bookmarked a few sites to look at later on.
First up, if you took an interest in the predictive modelling debate on
Asktom,
Don Burleson’s forum and the
Dizwell forum,
you might be interested in a similar debate going on at Don’s forum around
Empirical vs. Research
DBAs. Personally I don’t agree with Don’s assertion, that Oracle research
primarily conducted on laptops using SQL scripts and single-user systems does
not translate to the world of multi-terabyte, highly concurrent systems, and
therefore the best way to tune a database is to use observation and correlation
to "tease out" the inner workings of the system, as my belief is that the danger
in basing decisions on intuition, correlation and "hunches" is that you make the
right decision for the wrong reasons, and then go on to apply this "lucky guess"
to subsequent situations where in fact it doesn’t work, and you don’t know why
because you don’t know why it worked in the first place. The danger when you
propose "silver bullets"
without supporting evidence as to why they work is that the approach can’t
properly be reviewed, and if the internals of the Oracle database change, and
you don’t retest them against the new version, the danger is that this approach
may no longer be valid and it runs the risk of becoming an
Oracle myth.
The other problem I have with the empirical approach is that it relies on
many, many years of experience and observation of different types of Oracle
database, and the "sixth sense" you therefore develop for coming up with
performance tuning fixes, and therefore it’s an approach that by definition
cannot be taught to novices in any reasonable amount of time, and cannot be
carried out by anyone other than very experienced DBAs. By contrast, the
scientific method, an example of which is the "response time" tuning method put
forward by Cary Millsap and Jeff Holt in
"Optimising Oracle Performance", can be taught and used successfully right
from the start as it bases its approach on instrumenting your system at a low
level, recording and measuring what’s going on, applying just the right fix to
address the problem, and then recording and measuring again. Unlike the
empirical approach - observe at a high-level, make some correlations, try an
approach, see what happens, if it works, great, otherwise try something else -
the scientific approach, although it might involve a bit more work at the start,
and a closer understanding of Oracle internals - at least involves basing
recommendations on facts, not "lore", which sounds to me a better way to tune.
The other problem as I see it with the empirical approach - which let’s remember
bases it’s approach on "it works but I don’t know why" - is that many of the
solutions put forward by proponents of this approach are soon after
discredited by
members of the scientific community, with lots of documentary evidence to say
why. It’s all well and good to say that research done by DBAs using laptops and
SQL scripts doesn’t prove anything, but if they then prove that your
empirically-derived solutions don’t work, and you don’t offer any evidence of
your own to back up your original assertion and instead change the debate to one
about
credentials and education, then this just looks like a
smokescreen and
undermines the strength of your original argument. All credit though to Don for
having this debate though, for taking the time to put forward his thoughts and
for publishing opinions on his website that both support and contradict his
approach - Don often attracts a lot of (unneccessary) flak but it’s debates like
this that help (relative) novices like me understand the bigger picture. Also,
my gut feeling is that most of the time Don is actually right in that his
approach did work, but what I can’t understand is why (with his undoubted
knowledge and experience) he doesn’t just back up his assertions with
reproducible test cases (even if these involved simulating an instance with a
high concurrent load), which would validate his empirical observations and
negate most of the arguments against what he’s proposing. As I’ve said before,
all of this is way beyond my level of experience, but it seems obvious to me
that there’s an easy way to address the concerns of the scientific Oracle
community which is to provide test cases and evidence to back up every
empirically derived solution, and until that’s done there’ll always be debate
about whether a hunch or a correlation is in fact valid.
Moving on now, a bunch of new Oracle blogs have started up recently that are
worth bookmarking and taking a look at.
Mike Ault and Robert Freeman,
who work with Don at BC Oracle Consulting,
have started new blogs, and Don himself has got his own
personal blog that talks about his
life outside of oracle consulting. I met Don, Mike and John Garmany last year
whilst at Openworld and whilst we operate in different areas and presumably have
different approaches, they’re all actually great fun and good company to go out
with - thanks Don for the meal! Doug Burns, who you might remember as the author
of "Suck It Dry - Tuning Parallel
Execution", an excellent paper on parallel execution that I mentioned last
year, has started a blog with
one of the first postings on the
art-vs.-science debate. Note also Doug’s
Technical Papers and
Presentations page with a number of other interesting Oracle papers. Also,
not new but new to me is a weblog by
Jonathan Gennick, author of
many OTN articles and a number of O’Reilly books.
Jonathan Lewis has put another article up on his
Miscellaneous page,
this time on
"Bottlenecks", which as usual is worth a quick read and reminded me to take
a proper look at hash clusters, which can improve the performance of
applications that retrieve values from lookup tables - more details can be found
in the
online documentation,
this article by
Connor McDonald and
this
one by Don Burleson. This article also jogged my memory about something I’d
read about star schema dimension tables as IOTs (index-organised tables) - this
was originally a
comment on one
of my blog postings, and I subsequently
looked it up at Asktom where it appears to be a valid approach to improve
performance.
Rounding up those articles I’ve bookmarked and not had a chance to check out
properly. "Introducing
Materialized Views as mechanism for Business Rule implementation (complex
declarative constraints)" on the Amis blog looks like another thorough
investigation into Oracle SQL that could prove useful on a future project.
"An
Oracle 10g Upgrade Case Study: Looking at System Performance Before and After
the Upgrade" would have been useful a few weeks ago for me when I was asked
to detail the benefits (and potential issues) around an 8i to 10g migration.
Robin Schumacher (who I also met at the same
dinner with Don) recently wrote an article for OTN entitled
"Response Time Analysis Made Easy in Oracle Database
10g" that examines some of the new instrumentation that you
get with Oracle 10g, and over at Oracle-base.com there’s an excellent round-up
of the
performance tuning enhancements in Oracle Database 10g.
Finally, mirroring the art-vs.-science debate in Oracle performance tuning
circles, within the data warehousing world there’s a
similar,
perennial argument going on around the best way to design data warehouses.
One faction, based around the approach put forward by
Bill Inmon, believes that data warehouses
are no different to regular databases and are best designed in a
third-normal-form, as this is the most flexible and performant way to build a
relational database. The other faction, based around the writings of
Ralph Kimball, believe that the most
important thing is query performance and making the data model easy for users to
navigate, and build their warehouses using star schemas instead. Me - I’m a
Kimballite, if only because Oracle’s data warehouse performance enhancements
presuppose a star schema and
carry out joins and data retrieval more efficiently when you’ve got one -
but if you’re interested there’s a
debate going on
over at the B-Eye Network, and Kent Graziano (author of the
Agile Methods paper I
reviewed recently, and a supported of the Inmon "Enterprise Data Warehouse"
architecture) pointed me in the direction of
this article
which sums up the Inmon EDW philosophy.
Back in a few days.

April 6th, 2005 at 3:35 pm
Hi Mark,
You managed to pick up the first page of the unfinished HTML version of the PX paper and link to it! Sloppy work on my part, but at least your blog posting made me pull my finger out and get it finished. So http://doug.burns.tripod.com/px.html actually contains the full article now! There’s also a PDF version on the site too.
Thanks for your support,
Doug