Monday, January 18, 2010

Corrupt Backup Piece

So, I was doing a project the other day to restore my production 10.2.0.3 database to my QA server.  Since I use Recovery Manager to back up, my normal course of action is to use the DUPLICATE feature of RMAN to restore my backup to my QA server and apply the archived redo logs.

It had been a long time since I last restored this particular database.  I had an old init.ora file lying around, so I figured I might as well use it.  I started the instance in NOMOUNT mode and fired off my duplication.

About halfway through, I received the following error:

ORA-19870: error reading backup piece /nfs/backup/PROD/20100112.3fl38k2l_1_1
ORA-19599: block number 144709 is corrupt in backup piece /nfs/backup/PROD/20100112.3fl38k2l_1_1

Hmm, that's different.

I just figured my backup pieces were corrupt somehow, so I took another backup on the source db.  I started restoring it on the QA server, and BAM!  Same thing.

This was a job for Oracle Support.  I created an SR and while they didn't have an immediate solution for me, they did bring up that there were a couple of unverified bugs with compressed backups that might cause this error.  Since I was using compressed backup pieces, I figured this was worth a shot.

I then took a backup on my production db without compressing the backup pieces.  When I went to duplicate on my QA server, the restore part succeeded!  The only thing that failed was creating the controlfile, which gave me an error saying that a 9.2 controlfile was not compatible with a 10.2 database.  It seems the old init.ora file had a compatible=9.2.0 parameter in it, which was preventing my controlfile from being created.  I changed the compatible setting to 10.2.0.3, recreated my controlfile, and recovered my db with no problem.

My Oracle Support Analyst suggested I retry the duplicate using a compressed backup set.  Lo and behold, I was able to duplicate with the compressed pieces.  That left us with a theory to test: that the compatible parameter set to 9.2 caused Oracle to think my backup pieces were corrupt.  I changed the parameter back to 9.2 and sure enough, I got the same error during the duplicate.
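If you want to catch this before firing off a duplicate, a quick sanity check on the init.ora is easy to script. What follows is only a rough sketch, not anything Oracle provides: the function names are made up for illustration, and comparing just the first two components of the version (10.2 vs. 9.2) is my own simplifying assumption about what counts as a mismatch.

```python
import re

# Hypothetical helper names; this only parses text, it does not talk to Oracle.
COMPATIBLE_RE = re.compile(r"^(?:\*\.|\w+\.)?compatible\s*=\s*(.+)$", re.IGNORECASE)


def read_compatible(pfile_text):
    """Return the compatible value found in init.ora text, or None if absent."""
    value = None
    for raw in pfile_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        match = COMPATIBLE_RE.match(line)
        if match:
            # If the parameter appears more than once, keep the last one seen.
            value = match.group(1).strip().strip("'\"")
    return value


def major_minor(version):
    """Reduce '10.2.0.3' to (10, 2). Assumption: only these two parts matter here."""
    return tuple(int(part) for part in version.split(".")[:2])


def compatible_matches(pfile_text, source_version):
    """True when the pfile sets compatible to the same major.minor as the source."""
    value = read_compatible(pfile_text)
    return value is not None and major_minor(value) == major_minor(source_version)


if __name__ == "__main__":
    old_pfile = "db_name=QA\ncompatible=9.2.0\n"
    if not compatible_matches(old_pfile, "10.2.0.3"):
        print("compatible in init.ora does not match the source database")
```

Run against my old init.ora, this would have flagged the compatible=9.2.0 line before I ever started the restore.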

Just something to be aware of (and no, the lesson is not "don't be stupid and forget to check your init.ora file").

3 comments:

Anonymous said...

Jeff, while I usually learn a lot from your blog, I would think that the lesson you would learn from this would be to include any init files and spfiles with your backup script. In my backup script, not only do I have init/spfiles, I also back up the $TNS_ADMIN folder, crontab and ora password file.

But I have been burned by using the wrong init file.

Thanks,

Gandolf989

Joel Garry said...

There's also the lesson of being a bit skeptical about error messages. They may make sense after you figure out what is going on, but lead you down the garden path if you take them at face value.

word: foxidis

Jeff Hunter said...

All my configuration files (init.ora, listener.ora, tnsnames.ora, etc.) for production are in CVS. I don't have QA init.ora files in CVS, nor do I back them up. While I admit that I should have used the init.ora from the production db, I just thought it was interesting that the error message only made sense after the fact and wasn't really clear when I first got it.

And yes Joel, I have become a little more skeptical about error messages.