Thursday, May 24, 2007

What is going on in Oracle's QA department?

I've been an Oracle DBA since the days of 7.0.16 (on Netware, but that's a story for a different day). I'd like to think I have a decent amount of experience with Oracle and it's nuances. Sure, some things are quirky and don't work the way you think they should, but that's just getting used to the software. I can count on one hand the number of one-off patches I had to install over a base patchset in 7.x to 9.2. (I'm talking RDBMS here, not Oracle Apps.)

I specifically waited to move to 10g until it had been out a while. Our move to 10.2.0.3 has been quite disappointing. Sure, we did a lot of testing. But production is different than testing. After about 30 days of one db being live, we've had to install three one-off patches. And we just got slammed with two more in the last couple days to bring the total up to five. If we were on 10.2.0.0, I'd understand, but 10.2.0.3? Five freaking one-off patches.

C'mon Oracle, get your act together. We're trying to run a business here.

Friday, May 18, 2007

Oracle 10g upgrade gotcha

Everybody knows the Cost Based Optimizer needs decent statistics to figure out what his query plan should be. Sure, it takes time, but everybody does it. Some estimate, some compute, some think they have a better way.

After upgrading from 9.2.0.7 to 10.2.0.3, one of our systems suddenly slowed down. Not directly after the upgrade, but three days later. After some investigation, we found that some of the statistics had been estimated and not computed. That was pretty much a shock because my analyze method usually specifies estimate_percent=>null to make sure we compute. (Note, I already figured out that the default estimate_percent changed from 9i to 10g, so that wasn't it.)

I checked my custom analyze method and it reported that the table was never analyzed by me.

Hmmm, now it's getting real interesting. Just for kicks, I reran dbms_stats.gather_table_stats without an estimate_percent parameter and it estimated at about 8.3%. Aha! Must be using the AUTO_SAMPLE_SIZE for some reason.

So why?

Then I looked at the last_analyzed time of the tables that had been estimated. They all were run between 22:00 and 01:00. Aha, a schedule. Then I looked in dba_scheduler_jobs and there it was. A new job that Oracle automatically sets up to analyze stale objects.

When I checked in the morning, the objects didn't report as being stale. Then, after the day's data, Oracle decided the statistics were stale and analyzed the tables. When I went to gather statistics for stale objects the next morning, they had already been analyzed and weren't stale anymore. Except the statistics weren't optimal which made our queries "go bad".

Fortunately, there is a way you can disable this job just by tweaking the schedule.

BEGIN
DBMS_SCHEDULER.DISABLE('GATHER_STATS_JOB');
END;
/

Monday, May 14, 2007

Keywords

One thing that Statcounter provides me is how often certain keywords are found in my blog. Some of the more popular are:
  • ORA-27054: I assume this is by people trying to backup using RMAN over NFS. I hope they found my problem and most importantly, my solution.
  • Some combination of DB Links and ORA-01017: I'm sure people found my distributed issues with 9i and 10g working together. But that's not always the issue. If you are using what I call "unauthenticated" database links (no password in the definition) then the user authentication on both the source and remote database need to be the same. If we're talking an OS Authenticated user, then that user has to be OS Authenticated on both. If it's a regular db user, the passwords have to match.
  • Lots of people looking for some varient of backup and methodology. I hope they found lots of info in the Backup Top 11, but I don't think I've ever described my backup methodology for Oracle. Maybe in a blog post in the future.
  • I get lots of hits on the Quick and Dirty Backup for MySQL. Seems like a lot of people are looking for an easy solution to MySQL without having to fork over some cash for backup software.
  • I also get lots of hits on The MySQL to an Oracle DBA series, but not too many through keyword searches.
  • For the guy who's searching "mysql php round differently", I'd look at your datatype in the database to make sure you're using the proper number type.
  • Looking for information on "_db_writer_flush_imu"? Better know what you're doing.
  • Oh, and the person looking for "ingres porn server"...good luck on that.

Wednesday, May 09, 2007

Recognize your mistakes

I was reading an article in Business Week by Jack and Suzy Welch last night about the qualities of successful managers. Of the various qualities, one in particular stood out:
We would also add two other qualities to the must-have list. One is a heavy-duty resilience, a requirement because anyone who is really in the game messes up at some point. You're not playing hard enough if you don't! But when your turn comes, don't make the all-too-human mistake of thinking getting ahead is about minimizing what happened. The most successful people in any job always own the failures, learn from them, regroup, and then start again with renewed speed, vigor, and conviction.

That's pretty good advice. Admit your mistakes, correct them, and move on. In my company, that's just the way it works. We don't waste time finger-pointing. If there was a failure, we put appropriate measures in place so it doesn't happen again and move on.

Gary Kaltbaum frequently says "Recognize your mistakes early." If you make a mistake, understand how you got into this situation and correct it as quickly as possible. Don't let your mistake fester and become a bigger problem down the road.