Wednesday, April 04, 2007

10gR2/9iR2 Distributed bugs

Our environment is highly distributed on a bunch of nodes partitioned out by business area. When a business area needs information from another business area, we create a Materialized View and pull the data across a database link. Most systems don't need up-to-the-minute data and when they do, there's other ways to get it.

Now, we're in a somewhat unique position because we're migrating Solaris 9iR2 dbs to Linux 10gR2. In fact, we just migrated one this weekend and experienced anywhere from 100% to 500% increase in performance depending on how much database work was actually being done. However, we did encounter some issues with db links that you might want to be aware of when migrating.

The master in this case was a 9.2.0.5 db on Solaris. We upgraded it to 10.2.0.3 and moved to Linux at the same time (good old export/import). There were several "client" dbs that ran the gamut across Oracle versions and OS Platforms.

The first problem was with 9.2.0.8/Solaris trying to get data from the 10.2.0.3/Linux db. We kept running into an error while trying to refresh the Mviews:

ERROR at line 1:
ORA-12008: error in materialized view refresh path
ORA-04052: error occurred when looking up remote object MYUSER.MVIEW_NAME@DBLINK_NAME
ORA-00604: error occurred at recursive SQL level 2
ORA-03106: fatal two-task communication protocol error
ORA-02063: preceding line from DBLINK_NAME
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 820
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 877
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 858

Here, the problem stems from bug 5671074. In this bug, 9.2.0.8 has a problem using dblinks when the "Endianess" doesn't match between sites. Also check out ML Note 415788.1. There is a one-off patch for the 9.2.0.8 software that needs to be applied and it worked fine for us.

The second issue was a little more straight forward, but unfathomable how it didn't get caught in testing. Here, we have two 10.2.0.3 dbs trying to use an unauthenticated database link (ie. no username/password hard-coded into the definition). An OS Authenticated user can connect to both dbs, but when he tries to access an object across a database link, he is presented with ORA-01017: invalid username/password. This functionality has worked since 8i and it escapes me how it slipped through the test cycle. This too resulted in a bug, but in 10.2. Check out bug 4624183 and ML Note 339238.1 for the current status. As of this post, the bug is fixed in 11.0 and has been backported to some platforms (not mine, of course). The only work around that worked for us was to use authenticated db links with the username/password embedded in the db link. Yuck.

2 comments:

gandolf989 said...

Hi Jeff. Does this mean that you are now favoring the Intel/AMD chip over the Sparc? Are you heading towards 64 bit on Linux? We just recently got burned, by buying a T2000. It really didn't fit into our environment, and we need to use a software initiator to connect it to the SAN rather than a hardware initiator. Just curious about why you are heading off into Linux land from Sun.

Jeff Hunter said...

Lets put it this way: Nothing new is going on Sun, the majority of stuff is moving from Sun to Linux X86_64. There will be a few legacy apps on Sun (Oracle Apps, NFS, etc.) but as far as expanding the base goes, it ain't happening.