Wednesday, April 13, 2011

It's always X, except when it's not

I drill into my DBAs heads that the first step in tuning any process is to get a trace.  Don't show me the query plan until you show me what the database actually executed.  The trace is the fundamental clue in any performance investigation.  The wait events tell you exactly what's going on and there is no question where the bulk of the time is being spent.

When we see a large number of waits on the "db file sequential read" in my environment, it usually means that the wrong index is being used.  Not usually in the 5 out of 10 times sense, but usually in the degree of 99 times out of 100.  Most of the time we find that the developer has "discovered" hints and is trying to outsmart the optimizer or the optimizer just picked the wrong index.  Very rarely, a poor response time for "db file sequential read" may mean we have a bad disk somewhere in the volume.

We had an issue the other day where a particular query had a high number of "db file sequential read" waits.  None of the normal things applied and we needed to dig a little deeper.  After looking at the raw trace file, we saw that the query was waiting on a "db file sequential read" from the UNDO datafiles.  Oh yeah, the other .05% of the time it's a consistent read problem!

Understand what the causes for a particular event are.  Sometimes there is more than one cause for Oracle to throw a particular event.  Also, the event thrown may be dependent on what else is going on in the database at the time.  Also keep and open mind that even though it's X 99% of the time, there is a slight possibility it could be Y.

No comments: