Tuesday, May 13, 2008

Something to be aware of (10.2.0.4)

My standard install of 10.2.0.3 included 23 patches at last count. When the 10.2.0.4 patchset came out a few weeks ago I checked it and decided that since my 23 patches were included in 10.2.0.4, it would be a good idea to upgrade to that version.

I first got underway in my development environment on x86_64. I have several small dbs with 2G SGAs on a single box. The patch/upgrade process went flawlessly and I encountered no issues at all.

I usually like to run a patch in development for a month or so before I apply it to my production environment. However, I was fighting a nasty bug on one of my 10.2.0.3 production dbs that was fixed in 10.2.0.4, and a backport to 10.2.0.3 wasn't available at the time. So after a week in development, I decided to upgrade that production db to 10.2.0.4. That db usually runs with a 16G SGA and when I tried to start the instance with the 10.2.0.4 software, I got the venerable:

SQL> startup
ORA-27102: out of memory
Linux-x86_64 Error: 28: No space left on device

OK, so I know that usually means that my kernel parameters are off somewhere. It was a Saturday and I didn't have a sysadmin available, so I bumped the SGA down to 12G and the instance started right away and upgraded fine. I figured there was some shared memory parameter that I would research on Monday and we'd be back in business the next week.

I looked at Note 301830.1 thinking maybe my shmall parameter was off, but it turned out I had plenty configured. I submitted a TAR and basically let Oracle Support stew on it for a few days.
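For anyone who wants to repeat the shmall sanity check, here is a minimal sketch in Python. It relies on the Linux convention that kernel.shmall is counted in pages while kernel.shmmax is in bytes. The function names and the /proc reader are my own illustration, not anything from the Oracle note.

```python
import os

def read_kernel_limits(proc_dir="/proc/sys/kernel"):
    """Read shmall (pages), shmmax (bytes) and shmmni (segment count) on Linux."""
    limits = {}
    for name in ("shmall", "shmmax", "shmmni"):
        with open(os.path.join(proc_dir, name)) as f:
            limits[name] = int(f.read().split()[0])
    return limits

def shmall_bytes(shmall_pages, page_size):
    """Convert shmall from pages to bytes."""
    return shmall_pages * page_size

def sga_fits(sga_bytes, shmall_pages, page_size):
    """True if the system-wide shared memory limit can hold the whole SGA."""
    return sga_bytes <= shmall_bytes(shmall_pages, page_size)

def check_host(sga_bytes):
    """Hypothetical helper: compare a desired SGA size with this host's limits."""
    limits = read_kernel_limits()
    page_size = os.sysconf("SC_PAGE_SIZE")
    return sga_fits(sga_bytes, limits["shmall"], page_size)
```

Calling `check_host(16 * 2**30)` on the database host reports whether shmall alone would allow a 16G SGA. In my case a check like this passes, which is why shmall was not the answer.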

In the meantime, another DBA in my group was tasked with setting up a new x86_64 box with 10.2.0.4 so we could move a db. He got the software installed OK, but couldn't start an instance with an SGA of more than about 2G. The kernel parameters were exactly the same as on my other box, but he still couldn't start an instance with any decent amount of memory.

This was just too coincidental. So we played around with the 10.2.0.4 installation a little more and found that 10.2.0.4 was allocating multiple shared memory segments instead of just one big segment. I used ipcs to find the shmids for the SGA and then used pmap to confirm that, to allocate a 2G SGA, 10.2.0.4 used about 260 shared memory segments of between 5M and 15M each.
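Counting segments by eye gets old fast, so here is a rough sketch of how the `ipcs -m` output could be tallied per owner. It assumes the usual Linux column layout (key, shmid, owner, perms, bytes, nattch, status); the function names are made up for illustration.

```python
import subprocess
from collections import defaultdict

def summarize_segments(ipcs_text):
    """Return {owner: (segment_count, total_bytes)} from `ipcs -m` text."""
    summary = defaultdict(lambda: [0, 0])
    for line in ipcs_text.splitlines():
        fields = line.split()
        # Data rows start with a hex key such as 0x00000000;
        # header, separator and blank lines are skipped.
        if len(fields) < 5 or not fields[0].startswith("0x"):
            continue
        owner, size = fields[2], int(fields[4])
        summary[owner][0] += 1
        summary[owner][1] += size
    return {owner: tuple(totals) for owner, totals in summary.items()}

def report():
    """Run `ipcs -m` on this host and print one summary line per owner."""
    out = subprocess.run(["ipcs", "-m"], capture_output=True,
                         text=True, check=True).stdout
    for owner, (count, total) in summarize_segments(out).items():
        print(f"{owner}: {count} segments, {total / 2**20:.0f} MiB")
```

Running `report()` as the oracle user with the instance up shows at a glance whether the SGA landed in one big segment or in a couple of hundred small ones.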

Then we installed 10.2.0.3 and tried to start with the SAME EXACT parameter file, and sure enough, one shared memory segment of 2G. In fact, we could allocate almost all the way up to our shmall in one segment.

I'm bug 7016155 and I know others have run into this problem as well. I'm sure bug 7019967 thinks he's alone in this, but he's not. As usual, my bug has been sitting out there for about 12 days and nobody has looked at it.

Part II

Click the 10.2.0.4 label below to follow all threads on this issue and the eventual solution.