Tuesday, June 14, 2005

Building a better dbms_stats

"Everybody Knows" that to get the Cost Based Optimizer (CBO) to choose a good execution plan you need accurate statistics. In the ideal world you would have up-to-date statistics at every moment. However, as tables shrink and grow their statistics no longer reflect the true condition of the table. The optimizer may choose a "wrong" path based on the information it has.

Prior to 8.1.5, calculating statistics used to be a trade off of time spent calculating statistics vs. the amount of time saved when your queries ran faster. Oracle provided the ANALYZE command or the dbms_utility package that you used to calculate statistics on your data. Automating the process of collecting statistics was either a script generating a bunch of ANALYZE commands or a call to dbms_utility.

The enhancements to the optimizer in 8.1.5 made calculating statistics a necessary evil. Oracle provided dbms_stats to assist with statistics gathering. My first attempt at using dbms_stats was replacing calls to dbms_utility with calls to dbms_stats. Needless to say, this method failed horribly wrong for a couple reasons:
  1. dbms_stats didn't work exactly the same as dbms_utility.
  2. dbms_stats calculation took longer than dbms_utility.
Once I learned what options for dbms_stats were good for my environment, I setup my analyze jobs and let them run. After a while, the weekly statistics calculation was taking 20 hours to run. Surely there must be a better way. My four requirements for a "better" dbms_stats were:
  • The new method must be able to analyze more than one object at a time. The degree option of dbms_stats is great, but I found my CPUs were at most 50% busy during statistics gathering, even with degree=2*CPU. I've got multiple processors and I want them 100% busy during statistics gathering.
  • The new method must be able to give me feedback on where it is and how long it took to analyze each object. I don't want to issue a command and wait for 5 hours to tell me "done".
  • 20 hours is totally unacceptable. I was analyzing every table every weekend when probably 90% of them never changed. I'll admit it, I was lazy when converting from another version.
  • When the optimizer chose a radically different execution path, I had to be able to "restore" the previous stats.

I read more about dbms_stats. I talked to people. I read what others were doing for their statistics collection. I read just about every link on AskTom about dbms_stats. I experimented.

First, I turned on monitoring and am letting Oracle tell me what objects I need to gather statistics for. This will immediately speed up my process because I don't have to analyze tables that don't change. Less work = faster.

Next, I configured job queues so I could use dbms_job. The job queues will give me the ability to run multiple analyze jobs at the same time.

Last, I had to write a little code. I basically broke up the processing into two pieces; one step to figure out what needs to be analyzed and submit those jobs to the job queue and another step to analyze the object. Below is the pseudo code I used to figure out which objects had to be analyzed:

BEGIN

Get A list of objects (o) that need analyzing;
FOREACH o LOOP
Record attributes about the object you are going to analyze;
Setup a DBMS_JOB;
DBMS_JOB.SUBMIT;
END LOOP;

END;

Then, each dbms_job was a simple call to a piece of code that used this logic:
BEGIN
Record the time this job started;

Export the stats for this object to a "history" table;
IF the table is partitioned THEN
IF the object to analyze is the partitioned table/index THEN
dbms_stats.gather_table/index_stats(
ownname=>o.ownname,
tabname=>o.tabname,
granularity=>'GLOBAL');
ELSE
dbms_stats.gather_table/index_stats(
ownname=>o.ownname,
tabname=>o.tabname,
partname =>o.partname);
END IF;
ELSE
dbms_stats.gather_table/index_stats(
ownname=>o.ownname,
tabname=>o.tabname);
END;
Record time job ended;
END;
I'm testing this in development right now to see how it compares to the old method. Right now it looks promising.

2 comments:

Anonymous said...

Jeff,

I'm trying to apply this logic to a process I'm currently manually execute

begin
for x in ( select username
from dba_users
where username not in ( 'SYS', 'SYSTEM' ) )
loop
dbms_stats.gather_SCHEMA_stats(
ownname=>'user_one',
tabname=>'big_fact_table',
partname=>'NULL',
estimate_percent=> 25,
degree => 3,
cascade => TRUE)
end loop;
end;

Would you please advise how to enable this inside the logic your showing?

Alf

Jeff Hunter said...

I don't understand what you are trying to do. Why would you loop through all your users and keep passing the same owner/table_name?