I've got to admit, I'm pretty anal about backup and recovery. I've done some things without a proper backup in place and gotten burned because something went wrong. You only have to do that a couple times before you understand why you need a proper backup/recovery methodology. Before I let a new hire on my production systems, I make sure they understand how my backup/recovery works. In addition, I make them prove they know it by having them diagnose and recover a development database that I have crashed. I don't know everything about backup and recovery, but these are some lessons I have learned.
11. Send it off site
If you're going through the trouble of backing your data up to magnetic media, send it off site. In case of a fire (or god forbid another type of disaster) in your main data center you will be able to recover to a point in time. The best method is to duplicate your media and send the duplicate off site. If you can't duplicate your media, send it off site to a company that can return it to you 24x7. Yes, it's a pain duplicating your tapes and sending them off site, but when your power goes out at 16:04 (like in 8/2003 in the US) the banks will be closed and you won't be able to get to your media until the next business day.
10. Export/import is not a primary backup methodology.
Export and Import are fine tools for transferring data between databases as long as you can prevent changes from occurring. People usually setup export as their backup for one of two reasons; they feel "safe" because they have a copy of their data on disk or because it is easy to configure. The drawback is that using export for backup will only give you a single point-in-time to which you can recover. It is very unlikely that your database will crash the moment the export finishes. There are a couple of valid reasons why you want to use export as a secondary backup method, but as your primary backup methodology, forget it. If you do use exp/imp for a secondary backup method, make sure you use the consistent=y option to get a consistent copy of your data and use compress=n to make sure all your data doesn't go into one extent.
9. The needs of the business will dictate the backup methodology and frequency.
The rules under which your business operates will dictate the appropriate backup methodology. If your database must be up 24x7, there is no doubt you will need to run some sort of online backup (hot backup). If you have a short window for your Mean Time To Recovery (MTTR) it may mean a full backup at least once a day.
8. Have at least two backups available at all times.
I have talked to so many people that recycle their backup media every day. They stick the tape in the drive and backup their database. The tape remains in the drive for the next day when the backup overwrites the previous day's backup. When the database crashes during the backup, you have no way to recover.
7. Use RMAN over in-house developed scripts.
In-house developed scripts are error prone. Sure, there are a number of scripts out there that work in certain circumstances. RMAN works in every environment. Learn it, understand it, use it. Yes, it's different. The ability to run incremental backups alone makes the learning worthwhile.
6. Use a recovery catalog over the controlfile.
Both your recovery catalog and your control files store metadata about your database. The amount of history you can keep in the controlfile is limited by the control_file_keep_time init.ora parameter. In addition, the controlfile only stores information about the current incarnation of the database. If you ever need to restore before a resetlogs, you must use a recovery catalog. Last but not least, a recovery catalog lets you store your backup and recovery scripts whereas the controlfile does not.
5. Know how to do basic restore scenarios.
Your restore/recover methodology must include three basic recovery scenarios; recover the entire database, recover a tablespace, and recover a datafile. Most other recoveries will be a modification of these three basic types.
4. Not running in archivelog mode is a disaster waiting to happen.
When your database runs in archivelog mode you have two advantages. First and foremost, you can recover to any point in time. Second, you can recover from user DML errors by using logminer.
3. Cold backups are worthless (on a production system).
Cold (offline) backups offer several disadvantages but no advantages to a backup plan. The database has to be closed for a cold backup. The cache is empty when the database comes back up. All the SQL has to be reparsed. There are times you may want a to use a cold backup for maintenance reasons (major upgrade, moving database to a new host), but they offer no advantages in a daily backup plan.
2. Write it down.
Who knows what kind of disaster you will have. Maybe your building will be the next WTC. Maybe you got hit by a bus at 02:12 while you were coming in to do the restore. Maybe you've been at work for 18 hours and your brain is fried. Your thinking during a recovery should be limited. Write down your recovery scenarios so your manager could perform them.
1. Practice, Practice, Practice.
Practice your restore scenarios. You can be confident in your recovery because you just did it three months ago. If your test recovery doesn't work, it's better to know before you need it.