Friday, May 11, 2012

Optimizer Funkiness...

I have two simple tables that I want to join in an outer join:

SQL> @bug32052.sql
SQL> set feedback on
SQL> 
SQL> drop table jh1;


Table dropped.


SQL> create table jh1 (x number(10));


Table created.


SQL> 
SQL> drop table jh2;


Table dropped.


SQL> create table jh2 (x number(10));


Table created.


SQL> 
SQL> insert into jh1 values (1);


1 row created.


SQL> commit;


Commit complete.


SQL> 
SQL> select count(*) from jh1;


  COUNT(*)                                                                      
----------                                                                      
         1                                                                      


1 row selected.


SQL> 
SQL> select count(*) from jh2;


  COUNT(*)                                                                      
----------                                                                      
         0                                                                      


1 row selected.


SQL> 

I write a simple OUTER JOIN query...

SQL> select jh1.x, jh2.x from jh1
  2  full outer join jh2 on jh2.x = jh1.x;
...and I expect to get a "1" from jh1 and a null value from jh2:

         X          X                                                           
---------- ----------                                                           
         1                                                                      

1 row selected.

SQL> 

Perfect, that's what I am expecting.  Except in one of my dbs, I get this:

SQL> select jh1.x, jh2.x from jh1
  2  full outer join jh2 on jh2.x = jh1.x;

no rows selected

SQL> 

Hmm, interesting.  After a little detective work, I find on one database my execution plan is:

SQL> select jh1.x, jh2.x from jh1
  2  full outer join jh2 on jh2.x = jh1.x;

         X          X
---------- ----------
         1

1 row selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1813927128

----------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |     1 |    26 |     5  (20)| 00:00:01 |
|   1 |  VIEW                 | VW_FOJ_0 |     1 |    26 |     5  (20)| 00:00:01 |
|*  2 |   HASH JOIN FULL OUTER|          |     1 |    26 |     5  (20)| 00:00:01 |
|   3 |    TABLE ACCESS FULL  | JH2      |       |       |     2   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL  | JH1      |     1 |    13 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("JH2"."X"="JH1"."X")

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
         10  recursive calls
          0  db block gets
         17  consistent gets
          0  physical reads
          0  redo size
        391  bytes sent via SQL*Net to client
        359  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

and on the db where the query comes back with no rows, it's:

SQL> select jh1.x, jh2.x from jh1
  2  full outer join jh2 on jh2.x = jh1.x;

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 2793306587

-----------------------------------------
| Id  | Operation            | Name     |
-----------------------------------------
|   0 | SELECT STATEMENT     |          |
|   1 |  VIEW                | VW_FOJ_0 |
|   2 |   MERGE JOIN         |          |
|   3 |    SORT JOIN         |          |
|   4 |     TABLE ACCESS FULL| JH2      |
|*  5 |    SORT JOIN         |          |
|   6 |     TABLE ACCESS FULL| JH1      |
-----------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("JH2"."X"="JH1"."X")
       filter("JH2"."X"="JH1"."X")

Note
-----
   - rule based optimizer used (consider using cbo)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        298  bytes sent via SQL*Net to client
        353  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          0  rows processed

So it looks like we have an issue with the Rule Based Optimizer.  I confirm the issue by passing the /*+ rule */ hint on both databases and get the wrong results.  I then use the /*+ dynamic_sampling=7 */ hint on both and get the correct results.

While I know I can get around the issue by setting optimizer_dynamic_sampling=7 in my init.ora file, I am reluctant because I don't know what other queries it will mess up.  I have an SR with Oracle to try and explain why it works one way on one db and one way on the other.

Environment:
  • 11.2.0.2 with the same patches applied on both OHs.
  • Linux x86_64
  • Basically same parameters on both dbs (which is probably where I am going wrong).

Monday, March 12, 2012

When will RHEL6 be certified?

Red Hat Enterprise Linux has been out since November of 2010.  Here we are 16 months later and it's not certified for database use by Oracle yet.

Yes, I know the database can be installed on RHEL6.  In fact, I have it running in my development environment right now with no issues related to the OS/DB integration.  Try to get Oracle Support to help you on those platforms for an issue unrelated to the OS and they still tell you they can't help you because the OS isn't certified.

Red Hat has submitted the results of their testing to Oracle almost a year ago!

Granted, Tikanga is fine for most stuff.  But when the rest of your organization has moved on to Santiago, it's hard to justify why the 5% of Oracle machines need to be on the old version of the OS.

Some interesting posts on the topic by Tim Hall:
What if Oracle 11gR2 never gets certified on RHEL6?
Oracle Database 11gR2 on RHEL6: Certified or Not?

Tuesday, January 24, 2012

Resumes & Job Objectives

I've been reviewing a lot of resume's lately.  Please tell me, what is the purpose stating your "Job Objective"?  Isn't it implied that your objective is to find a new job, specifically my job? I assume you're dynamic and technical and your vast expertise will help my company conquer the world.

Also, what is the purpose of summarizing your experience on the first page and then giving me eight more pages of the stuff you just summarized?  Seriously, doesn't resume mean summary?

I've been at this for 20+ years, my resume is 2 pages.  I don't need to know you did export/import at every one of your jobs for the last 12 years.

Back to the stack...

Tuesday, November 01, 2011

Is SQL*Plus Resumable?

I am using resumable space for a lot of my operations. However, it seems as though the SQL*Plus copy command doesn't respect the resumable feature?

Note, I am issuing an "alter session enable resumable timeout 14400" through a login trigger. I know this works because if I do the same type of operation through an INSERT/SELECT through a dblink, my session waits for more space to be added.
user@whouse1.us> select sysdate from dual;

SYSDATE
-------------------
11/01/2011 15:48:45

Elapsed: 00:00:00.03
user@whouse1.us> COPY FROM user1/pw@proddb.us TO user/password@whouse1.us 
append dest_table (f1, f2, f3) using select f1, f2, f3 from source_table;

Array fetch/bind size is 1000. (arraysize is 1000)
Will commit after every 1000 array binds. (copycommit is 1000)
Maximum long size is 80. (long is 80)

ERROR:
ORA-01653: unable to extend table USER.DEST_TABLE by 4096 in tablespace JH_TEST

user@whouse1.us> select sysdate from dual;

SYSDATE
-------------------
11/01/2011 15:49:01

Elapsed: 00:00:00.03
Is RESUMABLE space not available with SQL*Plus copy?

Wednesday, September 28, 2011

Interesting Error

We just upgraded one of our 10.2.0.3 dbs to 11.2.0.2 plus some patches.  Today an interesting ORA-07445 was thrown to the alert.log:

ORA-07445: exception encountered: core dump [__intel_new_memcpy()+382] [SIGILL] [ADDR:0x3FE5CAE] [PC:0x3FE5CAE] [Illegal operand] []

Searching metalink didn't get us anywhere, so we opened a TAR (or iTar, or SR, or ServReq, or whatever the heck they are calling them these days).  While Oracle Support was looking at the issue, I looked at the query generating the error:

SELECT nvl(t.value, null) FROM table t WHERE t.id = 12345;

Huh?  If the value is null, substitute a null?  WTF?

Presto-changeo...

SELECT t.value FROM table t WHERE t.id = 12345;

...and the error goes away.

Is it a bug on Oracle's side? Sure.

Is it a stupid thing to do on our side? You betcha.

Thursday, September 15, 2011

Common Sense

Developer:
I am running this query:
DELETE FROM tab;
It's been running since 2PM yesterday.  Can you make it go faster?

DBA:
You shouldn't delete 80M rows in a single transaction.  You should have used TRUNCATE instead of DELETE.  I have killed your session and it will be a while before it completes.

Developer (5 minutes later):
I tried TRUNCATE but it says my table is locked.

DBA:
I know, the table will be locked up for a while while the transaction rolls back.

Developer:
Can you make it go faster?

DBA:
I have turned on _transmogrify_delete_queries on so that is the best I can do.

Developer:
Thanks!! That should definitely help!!!