Friday, March 31, 2006

New Blog

Welcome Beth to the Oracle Blogging community. Check out her blog, Confessions of a database geek.

Thursday, March 30, 2006

Oracle SQL Developer

I got used to using Oracle's Raptor when it first came out. It had some quirks, but in general it filled a niche that was missing in the Linux/Unix development environment.

I have installed the production version of Oracle SQL Developer (aka Raptor) and have let a limited group of users know about it. It seems like a lot of the Java exception errors have been fixed and the pointer control is pretty good.

One thing that I found was it is registers a funky PID in v$session.process when you login. The v$session view usually shows a fake PID (1234) for other JDBC Thin sessions, but somehow SQL Developer registers the process id as "pid@hostname". That definitely helps in identifying sessions.

OS Authenticated users don't work, but I wouldn't expect them to work with the Thin driver anyway. Maybe there's an option to use the OCI driver where available?

If the beta users find it stable, I'll unleash it on the general public Monday.

Wednesday, March 29, 2006

Community Support

I truly believe that you can get great advice from the community at large in technology matters. I, myself, have received lots of advice from quality sites and qualified people over the years. The first time I posted on c.d.o.s I received a reply from some guy named Tom Kyte that suggested I use a new feature of Oracle 8 called "Instead-of" triggers.

I try to make a meaningful contribution where I can. I think there's a lot of people out there that foster this commuity spirit. That's why I was a little frustrated when I saw Tim Hall's blog this morning. Here's a guy who has more than enough things to do and is working on a draft article on how you might install Oracle on FC5 that happens to get indexed by Google. And then he gets lambasted in the newsgroups. Not cool. Not cool at all....

Monday, March 27, 2006

Comparison of MySQL and Oracle Client install

In my environment, I like to install the server software on the server and install a client version of the software in a common area. If you need to link to certain libraries or use programs to connect to the database, I'll put them in /usr/local for everybody to use. My authentication is in the database, so I don't care if you can run a program but not connect to the database.

It's no secret that I'm a MySQL newbie and I've had my share of issues installing the MySQL client. That was kind of a special case since I wasn't using a mainstream compiler, but even so, I was able to get the problem fixed in relatively short order.

One of my tasks lately has been to upgrade the Oracle client to version 10.2. I've installed the client software a couple dozen times and the server software maybe 100 times if I want to be conservative. So when I fired up the installer, I wasn't expecting many surprises.

As expected, I installed the base 10.2.0.1 release in about 20 minutes. I then applied the 10.2.0.2 patch on top of it in another 10 minutes. I ran sqlplus as oracle and was able to connect to a database, so I pushed it out to a group of test users.

Two seconds later: "Ah, sqlplus doesn't work."

"I just ran it, your environment variables must not be setup correctly."

"Ah, sqlplus still doesn't work."

So I logout as oracle and try it from my workstation:
sqlplus system@foo.bar
ld.so.1: sqlplus: fatal: /usr/local/oracle/lib/libsqlplus.so: Permission denied
Killed

sonofa...WTF can be wrong?

Sure enough, I didn't have permission on the library. I logged in as oracle, and I had permission on the file, but world didn't have any permissions.

Hmm, That never happened before. I must have screwed something up.

I re-installed and had the same problem.

I logged a TAR to Oracle support and seems like it's a known bug (4516865).

Another great example of why you get so much more value with Oracle than MySQL. Sigh.

Saturday, March 25, 2006

Working on my ORA-00202

I’ve been going crazy fighting an ORA-00202 error for the last couple of weeks.

Whenever we ran a Level 0 backup, some of the monitoring processes were getting a timeout while waiting on the control file. The worse part is the error wasn’t consistent. About 90 minutes into a 3 hour backup we would get this error message thrown in the alert.log.

The message in the trace file seemed to suggest one of the control files was corrupt. Our initial research showed that when we brought the database down in immediate mode, that “diff” reported the control files were indeed different.

My first inclination was that somehow rman was corrupting the control file during the backup. I know, pretty unlikely, but that’s where the symptoms pointed. I created a TAR and provided some info to OCS about my problem. I also learned about the debug option of rman (which is pretty cool, although I couldn’t tell you how to read the file). OCS confirmed that rman wasn’t corrupting the control file (like they would say it was anyway). They suggested it was our monitoring software that was causing a problem. Since I have this software running against multiple databases and this is the only one we’re having a problem with, I didn’t think that made much sense either.

I started investigating the workload on the server. This instance is not used very heavily and is on a pretty beefy server. While the backup was running, the vmstat statistics showed that the filesystem cache kept increasing and the available memory was decreasing. Eventually, the system exhibited classic swapping symptoms, but without the typical indicators.

“Hmm”, I thought, “that shouldn’t happen because of direct I/O.”

Just for kicks, I verified that direct I/O was on by shutting down the database, starting it up in NOMOUNT, starting strace on the dbwriter process, and opening the database. I know that if the datafiles are opened with the O_DIRECT flag, you’re using direct I/O.

But wait, there was no O_DIRECT flag in the trace file.

Aha. I searched Metalink and sure enough, 9.2.0.7 doesn’t include direct I/O out of the box. I already had the filesystemio_options=setall, but apparently, you have to apply two patches on top of 9.2.0.7 for Linux. I applied the two patches in a new $OH, started the database in the new $OH, and retested my backups. Sure enough, the ORA-00202 error went away. My theory is that this "semi-swapping" caused extreme slowdowns on the system which caused the monitoring software to experience long waits for the controlfile. I’m still not clear on why the control files were different when I brought the db down, but the database could be started with either the “good” or the “bad” control file.

Friday, March 24, 2006

How the mighty have fallen

I spent a fair amout of time in a regional telecom company back in the mid to late 90's. We did several services (paging, internet ISP, cable, frame relay, etc.) but the big cash cow was long distance.

The Telecommunications Act of 1996 (TA96) was supposed to change all that. That bill opened up the stranglehold that the baby bells had on the local markets and was supposed to offer up competition for the consumer. By the time I left that company in 2000, the telecom industry was one of the strongest industries in the country.

A couple years later, the company I worked for got gobbled up by some company in Georgia.

The mighty AT&T was bought by SBC, a company it originally spun off.

The troubles at WorldCom/MCI are well publicized.

Now I see Lucent and Alcatel are thinking of getting together.

All this in six short years. Wow, how time flies.

Thursday, March 23, 2006

At least it's not Dook...

OK, the Tarheels might not be National Champs this year, but neither will the Dookies.

Sunday, March 19, 2006

Necessary Changes

You may notice the blog has undergone a change.

When I checked today, I only saw the links for the feeds. I republished the template again, and still only got the links. I quickly changed back to one of the standard blogger templates and added my links back in. I think it's OK now, but will monitor closely the next couple of days.

Saturday, March 18, 2006

Blogspot Issues



I thought maybe I was the only one having issues getting to the blog yesterday until the geek pointed out he was having problems. Then I noticed my RSS reader didn't have my new post from yesterday.

This morning I got a message from Mr. Parallel Query indicating my links were screwed up. Doug suggested I republish to fix the links and sure enough, that worked.

I checked my stats for yesterday and visits were way down. I hope blogspot's problems are fixed from the last couple days. Then again, what should I expect for free?

Friday, March 17, 2006

Reading "Cost-Based Oracle Fundamentals"

I started reading Cost-Based Oracle Fundamentals by Jonathan Lewis last night. In the very beginning he suggests reading the book in its entirety as a first pass and then reading it again for more detail.

"Bah", I thought to myself, "I need to get on to Tom Kyte's book, I don't have time to read it twice."

On page 2 I pulled out the highlighter.

Page 6, eight highlighted passages.

By page 9 I had abandonded the highlighter and proceeded with casual reading. Tom will have to wait.

Tuesday, March 14, 2006

Interviewing Pet Peeves

Interviewing is one of those areas that you become pretty good at if you're in IT. Over the last couple of years, I have come to interview quite a few candidates for various job. Here are some of my pet peeves of interviewing candidates:

  1. When I schedule a time for you to come in, don't arrive 30 minutes early. Go get a cup of cofee or read the newspaper in your car. I specifically scheduled you at that time because I've got a bunch of things I have to do first thing in the morning and I need to make sure everybody is off and running before I take 2 hours out to talk to you.
  2. If I ask you to come in, don't ask me for directions. Sure, I'll give you the address, but we're all big girls and boys and we know how to use Mapquest, right?
  3. I'll take about 2 minutes to go over the company and what we do, but I expect you to have some background on the company before I talk to you at length. We have this thing called the Internet now.
  4. Don't ask me specifics about benefits. Sure we have a 401(k) and a Medical plan, but I don't have the slightest idea about the specifics. That's why we have HR.
  5. When I schedule a time to call you, don't give me your cell phone number. I'm calling you about a job, not to go out for a beer. If you don't want to talk at work, I'm cool with that. In fact, I'd rather call you at home where we can let the conversation go where it wants to.

Monday, March 13, 2006

Oracle Backup (aka Reliaty)

From what I can gather from this (starting on pg. 2) and this (pg. 20), Oracle has a product called "Oracle Backup" that used to be a backup product called "Reliaty". I can't find a lick of information on it though. If anybody can point me in the right direction, I'd appreciate it.

Update: 3/14/2006
Straight from the source, I like it: (posted in comments)

Oracle Secure Backup is scheduled to release by end of FY06 (May 06). I apologize for the confusion, but stay tuned to OTN main page over the next few weeks for more information on Oracle's newest data protection product.

Thanks!

Posted by Tammy Bednar
From the slides, it looks like it is a complement to RMAN, which would make sense. Now I'm hoping it can fill a void that NFS has left in my backup strategy. (Hopefully, the "low cost" in the slides will take the same path as the recent products from Oracle and come out as "no cost").

Sunday, March 12, 2006

That time of year

It's that time of the year in the US where passionate College Basketball fans glue themselves to the TV all weekend long. The field of 65 has been drawn as we begin scrutinizing the brackets for the office pool. Yes, your employers probably cringe during this time of year as we use company resources to track our picks and scores, but hey, it's March Madness.

I'm a Tarheel at heart, so I can guarantee you they'll be one of my final four picks. All my neighbors will be rooting for the Huskies in my backyard, and the Orangemen are a dangerous team. But lets face it, after the Tarheels, I'm rooting for whoever is playing the Dookies.

Anybody got an E-Size DeskJet to print your brackets out?

Thursday, March 09, 2006

I'm Tops on Google!!!

Search for dbms_stats.alter_database_tab_monitoring on Google, and I come on top!

Interesting ORA-00202

I have been getting an interesting ORA-00202 errors thrown to my alert.log during a backup on one of my databases. It's only happened a couple of times, but I am curious about the cause. The text of the ORA-00202 error is:

$ oerr ora 00202
00202, 00000, "controlfile: '%s'"
// *Cause: This message reports the name file involved in other messages.
// *Action: See associated error messages for a description of the problem.

While the rman level 0 backup is running, my monitoring software encounters the ORA-00202 error when trying to query v$log and dba_data_files. The message in the alert.log is:

Errors in file /oracle/admin/db1/udump/db1_ora_5953.trc:
ORA-00202: controlfile: '/u01/oradata/db1/control01.ctl'

When I look at the trace file, I see:

*** SESSION ID:(11.32) 2006-03-08 17:44:26.211
***
Corrupt block relative dba: 0x00000001 (file 0, block 1)
Fractured block found during control file header read
Data in bad block -
type: 21 format: 2 rdba: 0x00000001
last change scn: 0xffff.0017641a seq: 0x1 flg: 0x04
consistency value in tail: 0x64191501
check value in block header: 0x24a8, computed block checksum: 0x3
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
*** 2006-03-08 17:44:27.245
ksedmp: internal or fatal error
ORA-00202: controlfile: '/u01/oradata/db1/control01.ctl'
Current SQL statement for this session:
SELECT file_name,
tablespace_name
FROM dba_data_files
WHERE autoextensible='YES'
AND maxblocks-blocks <= increment_by AND maxblocks-blocks <> 0

"Ah Ha", I say to myself, "must be a corrupt control file." I shutdown the database immediate and diffed the two control files. Although there were the same size, diff reported that they were different. Sure enough, a corrupt control file. So I copied my good control file over the bad one and restarted my database.

A couple days later, same problem. Same control file. The same control file seems unlikely unless there is an OS related error on that filesystem. I scour the messages file and there's no I/O related errors. I shutdown the database, diffed the control files again, and sure enough, they're different.

I recopied my good control file over my bad one again, started the database, switched logfiles, and shutdown immediate. Then I tested my theory that the control files should be the same and diff reported that they were the same. This time, however, I moved control01.ctl to another filesystem just for kicks to see what would happen.

My theory is that my monitoring software can't get a consistent view of the control file while rman is writing to it. At this point I'm unsure if the control file is corrupt or Oracle is just throwing the wrong error. We'll see what Oracle says...

Check back in 5 days (if I'm lucky)...

Update: I've done some testing and come to some conclusions.

Wednesday, March 08, 2006

A good year

No matter what their record is at the end of the year, as long as Carolina beats Duke at least once a year, it's a good year.

Andy Katz
Game Recap