Friday, May 18, 2007

Oracle 10g upgrade gotcha

Everybody knows the Cost Based Optimizer needs decent statistics to figure out what his query plan should be. Sure, it takes time, but everybody does it. Some estimate, some compute, some think they have a better way.

After upgrading from to, one of our systems suddenly slowed down. Not directly after the upgrade, but three days later. After some investigation, we found that some of the statistics had been estimated and not computed. That was pretty much a shock because my analyze method usually specifies estimate_percent=>null to make sure we compute. (Note, I already figured out that the default estimate_percent changed from 9i to 10g, so that wasn't it.)

I checked my custom analyze method and it reported that the table was never analyzed by me.

Hmmm, now it's getting real interesting. Just for kicks, I reran dbms_stats.gather_table_stats without an estimate_percent parameter and it estimated at about 8.3%. Aha! Must be using the AUTO_SAMPLE_SIZE for some reason.

So why?

Then I looked at the last_analyzed time of the tables that had been estimated. They all were run between 22:00 and 01:00. Aha, a schedule. Then I looked in dba_scheduler_jobs and there it was. A new job that Oracle automatically sets up to analyze stale objects.

When I checked in the morning, the objects didn't report as being stale. Then, after the day's data, Oracle decided the statistics were stale and analyzed the tables. When I went to gather statistics for stale objects the next morning, they had already been analyzed and weren't stale anymore. Except the statistics weren't optimal which made our queries "go bad".

Fortunately, there is a way you can disable this job just by tweaking the schedule.



mdinh said...

I am just curious.

Would changing optimizer_features_enable correct the problem?

optimizer_features_enable =
Change to:

Don said...

Heh I ran into the same surprise when upgrading from 9i to 10gR2. In my case I was more than happy to just stop doing my manual calls and let the gather_stats_job do it for me.

Jeff Hunter said...

mdinh said..
Would changing optimizer_features_enable correct the problem?

Probably not, since it's not really an optimizer problem, but a statistics gathering problem.

Don said...
...I was more than happy to just stop doing my manual calls and let the gather_stats_job do it for me.

In my environment, I've got two problems with that. First off, the automatic statistics collection does the wrong thing. I'm not saying I can't change that behavior, but as a quick fix, I just disabled it. Also, the statistics gathering was running at a time where I needed maximum resources. Again, not something I can't change, but a quick fix was in order to get back to normal.

Anonymous said...

Your blog entry said "pretty much a shock?".

As far as I know the oracle default to attempt to automatically gather statistics is one of the most widely known consequences of any upgrade path to 10.x.

Noons said...

another thing you might want to look at is the lock statistics procedure of dbms_stats.

I've found it useful in handling a small number of odd cases of tables that must have a fixed set of stats, not what the auto-gather finds during the maintenance window.