Monday, June 23, 2008

Something to be aware of, Part II

We've passed the one-month anniversary of Something to be aware of and still have no meaningful results. I've been going back and forth with Oracle Support about switching NUMA off. In synopsis:

OS: Switch NUMA off and retest.
Me: But that never changed.
OS: Switch NUMA off and retest.
Me: We take kernel changes very seriously. It will take approximately 2 months to re-certify that configuration. Besides, that NUMA didn't change.
OS: Switch NUMA off and retest.
Me: Please direct me to the note that says requires NUMA changes.
OS: Switch NUMA off and retest.
Me: I've already proven works as expected in this configuration, NUMA settings never changed.
OS: Switch NUMA off and retest.

My next step is talking to a Duty manager. I really don't want to do that.

Click the label below to follow all threads on this issue and the eventual solution.


jason said...

IMHO, you should escalate for all it's worth; this seems like a real screw-up by Oracle development.

I wonder if you can tell me if you are using Hugepages?

Just about to go to myself from, it'll be a bad day if we encounter similar problems!


Frits Hoogland said...

I don't know if it's related, but in 11g memory can be allocated both from shared memory and from tmpfs. That's definitely a change. In 11g, tmpfs allocation of shared memory is needed for the memory_max_target parameter.

Amit said...

I would suggest escalating to the Duty Manager and expressing the concern. At the same time, if you have a test environment where it can be tested, that will help. Anyway, once you set up this system too, Oracle Development would ask for a test case including init.ora, etc. ;) So escalating makes sense.


Kevin Closson said...

If this is a ProLiant box, you can also disable NUMA in the BIOS, which is ever-so-slightly different from booting Linux with numa=off.

What does numactl --hardware show directly before you try to start up the SGA, and what is in your init.ora?

Michael said...


Any word from Oracle development on a fix? Have they agreed it's something on their end and not Red Hat/Linux?

We will be increasing our SGA as our DB grows, so this issue is a serious one. Please keep us posted on what Oracle comes up with.


Jeff Hunter said...

No word yet. However, Kevin Closson has been able to get a 16G SGA started on HP/AMD Opteron, so it's possible, just not obvious what is wrong. More updates as I get them...

Michael said...


Did you notice Mary's ipcs -m result?

$ ipcs -m
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes        nattch     status
0x00000000 0          root       644        72           2
0x00000000 32769      root       644        16384        2
0x00000000 65538      root       644        280          2
0x1420f290 393219     oracle     600        17702060032  12

It's a single shared memory segment for Oracle. I have 4 segments used by Oracle, and in your case you seem to have quite a few more.
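
For anyone wanting to compare segment counts across machines, the ipcs -m listing above can be tallied per owner. This is only a minimal sketch in Python, assuming the column order shown above (key, shmid, owner, perms, bytes, nattch); the names summarize_segments and SAMPLE are made up for illustration and are not part of any Oracle or Linux tool.

```python
from collections import defaultdict


def summarize_segments(ipcs_text):
    """Return {owner: (segment_count, total_bytes)} from `ipcs -m` text.

    Assumes the column order key, shmid, owner, perms, bytes, nattch.
    """
    summary = defaultdict(lambda: [0, 0])
    for line in ipcs_text.splitlines():
        fields = line.split()
        # Data rows start with a hex key such as 0x1420f290;
        # banner and header lines are skipped.
        if len(fields) < 6 or not fields[0].startswith("0x"):
            continue
        owner, size = fields[2], int(fields[4])
        summary[owner][0] += 1
        summary[owner][1] += size
    return {owner: tuple(counts) for owner, counts in summary.items()}


# The listing quoted in this comment, used here as sample input.
SAMPLE = """\
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 0 root 644 72 2
0x00000000 32769 root 644 16384 2
0x00000000 65538 root 644 280 2
0x1420f290 393219 oracle 600 17702060032 12
"""

if __name__ == "__main__":
    for owner, (count, total) in summarize_segments(SAMPLE).items():
        print(f"{owner}: {count} segment(s), {total / 2**30:.2f} GiB")
```

Feeding it the output of ipcs -m from two hosts makes it easy to see whether the oracle user holds one large segment or several smaller ones.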

Wonder how Mary got oracle to use a single shared-memory segment like