Thursday, October 15, 2009

Oracle Backup and Recovery on Windows, Part I - Database Backups using RMAN:

1. Introduction:

This article is the first in a series of three, introducing Oracle's RMAN (Recovery MANager) backup and recovery tool to novice DBAs. The articles focus on the Windows operating system, but can be easily adapted to other platforms. RMAN will work in the same manner on all operating systems (OS) that Oracle runs on. However, OS functions such as script scheduling etc. obviously vary from platform to platform. The present article describes the implementation of a simple Oracle backup strategy using RMAN. The next article will discuss some recovery scenarios within this backup framework, and the final article will deal with disaster recovery.

We begin with a statement of assumptions regarding the database environment and business requirements regarding data recoverability. This serves to anchor the subsequent discussion in a definite context. We then briefly discuss the files relevant to operation of Oracle. Finally, we move on to the main topic - a discussion of RMAN and how it can be used to back up a database.

Following are the assumptions pertaining to the database environment and business expectations regarding data recovery:
The database is hosted on a Windows NT / 2000 / 2003 server.
The database software is Oracle 9i.
Users expect the database to be available round the clock, so the database cannot be shut down for backups. This makes an online (hot) backup mandatory.
In case of a disaster, where the server is destroyed, the business expects us to be able to recover all transactions up to the previous working day - i.e. a maximum of 24 hours data loss is acceptable. (We will discuss options for doing better than this in the third article of this series.)
The database is relatively small - say 1 to 30 GB in size.
Specified directories on the host server are backed up to tape every night using an OS backup utility.
We will configure RMAN to back up to disk. These backups will then be copied to tape by the OS backup utility. For completeness we note that RMAN can be configured to back up the database directly to tape, via an interface called MML (Media Management Library) which integrates third party backup products with RMAN. MML can be somewhat fiddly to set up, and the details depend on the third-party product used. We will not discuss RMAN backups to tape in the present article.

The server has three independent disks - named C:, D: and E:. The contents of the drives are as follows:
C: - The operating system and database software (Oracle home is c:\oracle\ora92. The oracle home directory is normally referred to as ORACLE_HOME). We will also keep a copy of the controlfile and a group of online redo logs on this disk. This isn't ideal, but will have to do because of the limited number of disks we have. Database files and their function are described in the next section.
D: - All datafiles, a copy of the controlfile and one group of online redo logs.
E: - A copy of the controlfile, all archived logs and database backups. RMAN will be configured to backup to this drive. Note that the backups could reside on D: instead.
All disks should be mirrored using some form of RAID technology. Note that the above layout isn't ideal - we're limited by the number of disks available. If you have more disks you could, and probably should, configure a better layout.


2. Oracle database files:

In order to perform backups effectively, it is necessary to understand a bit about the various files that comprise a database. This section describes the files that make up an Oracle database. The descriptions given here are very brief. Please check the Oracle Administrator's Guide for further details.

Oracle requires the following files for its operation:
Datafiles: These hold logical units (called blocks) that make up the database. The SYSTEM tablespace datafile, which holds the data dictionary and other system-owned database objects, is required for database operation. The database can operate without non-SYSTEM datafiles as long as no data is requested from or modified in blocks within those files. In our case all datafiles are located on D:.
Data files can be backed up by RMAN.
Online redo logs: These files hold a record of all changes made to the database. This record is necessary for instance recovery in case of a database crash, and for media recovery when datafiles are restored from a backup. Since these files hold critical information, it is wise to maintain two or more identical copies of these files on independent disks (multiplexing). Each set of copies is known as a redo log group. The members of a group should be on separate disks. In our case we maintain three groups with two members per group - each group has one member on D: and the other on E: (instead of E:, one could maintain the second set on C:). When writing to the logs, Oracle cycles through the groups one at a time in a circular manner - i.e. once all groups have been written to, the first one is reused, overwriting its previous contents. Hence the importance of archiving filled logs as the system switches from one log to the next - see the description of archived redo logs below.
Note that online logs should NEVER be backed up in a hot backup. If you do back them up, there is a danger that you may overwrite your current (valid) logs with the backed up ones, which are out of date and therefore invalid. Never ever back up your redo logs in a hot backup scenario.
Control file: This file holds, among other things, the physical structure of the database (location of all files), current database state (via a system change number, or SCN - which is a monotonically increasing number that is incremented on every commit), and information regarding backups taken by RMAN. The control files, being critical to database operation, should also be multiplexed. We will maintain three copies, one each on C:, D: and E: drive.
Controlfiles can be optionally backed up by RMAN.
Archived redo logs: These are archived copies of online redo logs, created after a switch occurs from one online log group to the next. Archiving ensures that a complete transaction history is maintained. Once an online log is archived, it can be overwritten without any loss of transaction history. Archive logs are required for online backups, so they are mandatory in our scenario. We maintain a single copy of these on E: drive. This is a weak point in the present backup strategy (only one copy of archive logs). There are ways to do better - one can maintain up to ten independent sets of archive logs at different physical locations - check the documentation for details.
Note that online logs are archived only when the database operates in ARCHIVELOG mode. The default operation mode is NOARCHIVELOG - where online redo logs are not archived before being overwritten. It would take us too far afield to discuss this any further - please check the Oracle Administrator's Guide for instructions on configuring your database to run in ARCHIVELOG mode.
Archive logs can be optionally backed up by RMAN.
Server parameter file (also known as Stored parameter file): This file contains initialization parameters that are used to configure the Oracle instance on startup. We maintain this file in its default location (ORACLE_HOME\database).
The server parameter file can be optionally backed up by RMAN.
Password file: This file stores the credentials of users who have been granted the sysdba or sysoper privilege, and is used to authenticate them when they connect as privileged users with an Oracle password. Separately, users belonging to the Windows OS group ORA_DBA can log on to Oracle as privileged (sysdba) users through OS authentication, without having to supply an Oracle password. Users with sysdba privileges are allowed to perform database startup, shutdown and backups (among other administrative operations). Please check the Oracle Administrator's Guide for details on sysdba and sysoper privileges. We maintain the password file in its default location (ORACLE_HOME\database).
This file is not backed up by RMAN, and must be backed up via OS commands.
Networking files: These files (tnsnames.ora, listener.ora and sqlnet.ora) are Oracle network configuration files. They are maintained in their standard location - ORACLE_HOME\network\admin.
These files are not backed up by RMAN, and must be backed up via OS commands.
Additionally, it is a good idea to store a copy of all database creation scripts in the backup directory. This will help in determining the correct database file directory structure in case of a disaster. This, however, is not absolutely necessary as file placement information can be retrieved from the backup controlfile. More on this in the third article of this series.
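
As a quick check of the layout described in this section, the file locations and the archiving mode can be read from the database's dynamic performance views. The following is a minimal sketch, assuming a SQL*Plus session connected as sysdba; the paths returned will of course depend on your own installation.

```sql
-- Sketch only: run from SQL*Plus while connected as sysdba.

-- Datafiles (expected on D: in the layout assumed in this article)
select name from v$datafile;

-- Online redo log groups and the members of each group
select group#, member from v$logfile order by group#;

-- Controlfile copies
select name from v$controlfile;

-- Archiving status: LOG_MODE should read ARCHIVELOG for online backups
select log_mode from v$database;

-- Server parameter file in use (an empty value means the instance was started from a pfile)
show parameter spfile
```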

3. OS backup utilities vs. RMAN- a brief discussion:

OS backup utilities copy OS files from disk to tape. By themselves they are not useful for database backups, unless the database is closed. The reason they cannot be used to back up open databases is as follows: If the database is open, it is possible that contents of a datafile block are being modified at the very instant that the block is being copied by the utility. In such a situation the copy of the block would be inconsistent, and hence useless for recovery. The way to avoid this is to put a tablespace into a special "backup mode" before copying the datafiles associated with the tablespace. Such OS level backups, also called user managed backups, are the traditional (non-RMAN) way to back up Oracle databases. When a tablespace is in backup mode, the SCN, which is marked in the header of each datafile in the tablespace, is frozen until the tablespace is taken out of backup mode. Additionally, the first time a data block in the tablespace is modified while in backup mode, the entire block is copied to the online redo log (in contrast to only the changes being logged when the tablespace is not in backup mode). This can cause a large increase in the redo generated, which is a major disadvantage of user managed backups.

One can perform user managed backups of a database using homegrown scripts. Such a script would cycle through all tablespaces in the database, putting each tablespace in backup mode, copying the associated datafiles and finally switching the tablespace out of backup mode. A fragment of a user managed hot backup script for Windows might read as follows:


--put USERS tablespace in backup mode
alter tablespace users begin backup;
--copy files belonging to USERS tablespace
host copy d:\oracle\ora92\orcl\users.dbf e:\backup
--take USERS tablespace out of backup mode
alter tablespace users end backup;
--continue with other tablespaces and then copy other oracle files...


The above would be invoked from sqlplus, via an appropriately scripted batch file.
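
The batch file mentioned above could look something like the following sketch. The file names hot_backup.sql and hot_backup.log are hypothetical, and the sketch assumes that the SQL script ends with an "exit" command so that sqlplus returns control to the batch file.

```bat
REM Sketch of a wrapper batch file for a user managed hot backup.
REM hot_backup.sql (hypothetical name) holds the begin backup / copy / end backup commands
REM and must finish with "exit". Output is captured in hot_backup.log (also hypothetical).
sqlplus -s "/ as sysdba" @hot_backup.sql > hot_backup.log
```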

Most OS backup utility vendors provide add-ons that automate the process of user managed backups. These add-ons, which usually do no more than the script shown above, are sold at extra cost on top of the base backup software.

RMAN is a database backup utility that comes with the Oracle database, at no extra cost. As such, it is aware of the internal structure of Oracle datafiles and controlfiles, and knows how to take consistent copies of data blocks even as they are being written to. Furthermore, it does this without putting tablespaces in backup mode. Therefore RMAN does not cause a massive increase in generated redo. Another advantage of using RMAN is that it backs up only those blocks that have held or currently hold data. Hence RMAN backups of datafiles are generally smaller than the datafiles themselves. In contrast, OS copies of datafiles have the same size as the original datafiles. Finally, with RMAN backups it is possible to recover individual blocks in case of block corruption of datafiles. Considering the above, it makes good sense to use RMAN instead of vendor supplied add-ons or homegrown user managed backup scripts.

4. Configuring RMAN:

RMAN is a command line utility that is installed as a part of a standard database installation. Note that RMAN is only a command interface to the database - the actual backup is performed by a dedicated server process on the Oracle database.

RMAN can be invoked from the command line on the database host machine like so:


C:\>rman target /

Recovery Manager: Release 9.2.0.1.0 - Production

Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.

connected to target database: ORCL (DBID=1036216947)

RMAN>


The first line is the one we type and the remaining ones are feedback from execution of the command. The net result is to leave us connected to the target database - the database we want to back up - with the RMAN> prompt, indicating that RMAN is ready for further work. Here we have invoked RMAN on the server, and have logged on to the server using an account that belongs to the ORA_DBA OS group. As described earlier, this enables us to connect to the target database (as sysdba - this is implicit) without having to supply a password. Note that on Windows, one must also set SQLNET.AUTHENTICATION_SERVICES=(NTS) in the sqlnet.ora file in order to connect using OS authentication as above.

At this point a digression is in order. RMAN can be run in two modes - catalog and nocatalog. In the former, backup information and RMAN scripts are stored in another database known as the RMAN catalog. In the latter, RMAN stores backup information in the target database controlfile. Catalog mode is more flexible, but requires the maintenance of a separate database on another machine (there's no point in creating the RMAN catalog on the database to be backed up!). Nocatalog mode has the advantage of not needing a separate database, but places more responsibility on the controlfile. We will use nocatalog mode in our discussion, as this is a perfectly valid choice for sites with a small number of databases.
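
For illustration, the two modes differ only in how RMAN is started. The sketch below shows both invocations; the catalog owner "rman_owner", password "secret" and Net service name "rcat" are hypothetical placeholders, not values used elsewhere in this article.

```bat
REM Nocatalog mode (used in this article): backup information is kept in the target controlfile
rman target / nocatalog

REM Catalog mode: rman_owner, secret and rcat are hypothetical placeholders for the
REM catalog schema owner, its password and the Net service name of the catalog database
rman target / catalog rman_owner/secret@rcat
```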

RMAN can be configured through various persistent parameters. Note that persistent parameters can be configured only for Oracle versions 9i and later. The current configuration can be seen via the "show all" command:


RMAN> show all;

RMAN configuration parameters are:
CONFIGURE RETENTION POLICY TO REDUNDANCY 2;
CONFIGURE BACKUP OPTIMIZATION OFF; # default
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP ON;
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO 'e:\backup\ctl_sp_bak_%F';
CONFIGURE DEVICE TYPE DISK PARALLELISM 2;
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE CHANNEL 1 DEVICE TYPE DISK FORMAT 'e:\backup\%U.bak' MAXPIECESIZE 4G;
CONFIGURE CHANNEL 2 DEVICE TYPE DISK FORMAT 'e:\backup\%U.bak' MAXPIECESIZE 4G;
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO 'C:\ORACLE\ORA92\DATABASE\SNCFORCL.ORA'; # default

RMAN>


The reader is referred to the RMAN documentation for a detailed explanation of the options attached to each of these parameters. Here we will discuss only those of relevance to our backup requirements.


Retention Policy: This instructs RMAN on the backups that are eligible for deletion. For example: A retention policy with redundancy 2 would mean that two backups - the latest and the one prior to that - should be retained. All other backups are candidates for deletion. Retention policy can also be configured based on time - check the docs for details on this option.
Default Device Type: This can be "disk" or "sbt" (system backup to tape). We will back up to disk and then have our OS backup utility copy the completed backup, and other supporting files, to tape.
Controlfile Autobackup: This can be set to "on" or "off". When set to "on", RMAN takes a backup of the controlfile AND server parameter file each time a backup is performed. Note that "off" is the default.
Controlfile Autobackup Format: This tells RMAN where the controlfile backup is to be stored. The "%F" in the file name instructs RMAN to append the database identifier and backup timestamp to the backup filename. The database identifier, or DBID, is a unique integer identifier for the database. We have configured RMAN to store controlfile backups in the directory e:\backup.
Parallelism: This tells RMAN how many server processes you want dedicated to performing the backups.
Device Type Format: This specifies the location and name of the backup files. We need to specify the format for each channel. The "%U" ensures that Oracle appends a unique identifier to the backup file name. The MAXPIECESIZE attribute sets a maximum file size for each file in the backup set. We have configured RMAN to store backups in the directory e:\backup.
Any of the above parameters can be changed using the commands displayed by the "show all" command. For example, one can turn off controlfile autobackups by issuing:


RMAN> configure controlfile autobackup off;

using target database controlfile instead of recovery catalog
old RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP ON;
new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP OFF;
new RMAN configuration parameters are successfully stored

RMAN>
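
By way of illustration, the non-default settings in the "show all" listing shown earlier could be put in place with commands along the following lines, issued at the RMAN> prompt. This is a sketch that simply mirrors that listing; the recovery window of 7 days in the commented alternative is an arbitrary example value.

```text
# Sketch: commands mirroring the non-default entries in the "show all" listing
configure retention policy to redundancy 2;
configure controlfile autobackup on;
configure controlfile autobackup format for device type disk to 'e:\backup\ctl_sp_bak_%F';
configure device type disk parallelism 2;
configure channel 1 device type disk format 'e:\backup\%U.bak' maxpiecesize 4G;
configure channel 2 device type disk format 'e:\backup\%U.bak' maxpiecesize 4G;

# Alternative, time-based retention policy (7 days is an arbitrary example value)
# configure retention policy to recovery window of 7 days;

# Return a parameter to its default value
# configure retention policy clear;
```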


5. Scripting the backup:

With the background stuff out of the way, we now move on to the actual backup. We will write a simple script that will back up our database, verify that the backup can be restored and then delete all obsolete backups and archive logs (based on a redundancy of 2, as discussed above). The Windows scheduler will be used to run the script at a time of our choosing.

An Aside: Before we move on it is worth stating that RMAN can perform full or incremental backups. Full backups, as their name suggests, are backups of every used data block in the datafiles. In contrast, incremental backups by default back up only those database blocks that have changed since the most recent incremental backup at the same or a lower level (a level 0 backup being the baseline). It would take us too far afield to detail the intricacies of incremental backups - we refer you to the Oracle documentation for more details on this. For the case at hand, we can afford to perform full backups every night as the database is relatively small.
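
For readers curious about what incremental backups look like in practice, here is a minimal sketch of the relevant RMAN commands. It is shown only for orientation and is not part of the backup strategy used in this article.

```text
# Level 0: the baseline for subsequent incremental backups
backup incremental level 0 database;

# Level 1 (differential, the default): blocks changed since the most recent level 1 or level 0 backup
backup incremental level 1 database;

# Level 1 cumulative: blocks changed since the most recent level 0 backup
backup incremental level 1 cumulative database;
```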

The backup script, which we store in a file named "rman_backup.rcv", is very simple:


#contents of rman_backup.rcv. "#" denotes a comment line, and will be ignored by RMAN.
backup database plus archivelog;
restore database validate;
delete noprompt obsolete;
host 'copy C:\oracle\ora92\database\pwdorcl.ora e:\backup';
host 'copy C:\oracle\ora92\network\admin\tnsnames.ora e:\backup';
host 'copy C:\oracle\ora92\network\admin\listener.ora e:\backup';
exit;


The script backs up the database and all archive logs and then checks that the backup can be restored. After that it deletes backups according to the configured retention policy - the "noprompt" in the delete command instructs RMAN not to prompt us before deleting files. Finally it does an OS copy of the password file and the relevant network configuration files. The RMAN "host" command enables us to execute any operating system command (on Linux, for instance, we would use "cp" instead of "copy"). In the above script the database name is ORCL, hence the password file is pwdORCL.ora. You will need to adapt each of the "host 'copy..." commands in the script to your specific directory structure and filenames. As an aside, it is worth pointing out that SQL commands can be executed from RMAN. A couple of examples:


sql 'alter system archive log current';
sql "create pfile=''e:\backup\initORCL.ora'' from spfile";


The "sql" keyword tells RMAN what follows is to be interpreted as an SQL command. The actual SQL should be enclosed in single or double quotes. The latter is useful if the command contains single quotes, as in the second example above. Note: In the second example, the quotes enclosing the pfile path are two single quotes, and the quotes enclosing the entire command are double quotes.

The script, rman_backup.rcv, is invoked by the following one line batch file:


REM contents of rman_backup.bat
rman target / cmdfile rman_backup.rcv log rman_backup.log


The "target /" indicates that the script logs on to Oracle as sysdba via an OS account that belongs to the ORA_DBA group. The "cmdfile" option indicates the name of the command file that RMAN should execute, in this case it is rman_backup.rcv. The "log" option tells RMAN that we want a transcript of the RMAN session to be stored in the file that follows the option - rman_backup.log in this case. Remember to check the log file after each backup for any errors that may have occurred. The log file is overwritten on each execution of the batch file so it may be worth changing the name to include a unique identifier (such as a timestamp). The backup scripts could reside anywhere on the server, but it may be best to keep them in e:\backup so that they are archived off to tape along with the backups.
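
One possible way to give each log file a unique name is sketched below. It assumes a Windows version whose command interpreter provides the %DATE% variable, and that %DATE% expands in the US style "Thu 10/15/2009"; the substring offsets would need adjusting for other regional settings. The variable name LOGSTAMP is our own invention.

```bat
REM contents of rman_backup.bat (variant with a dated log file name)
REM Assumes %DATE% looks like "Thu 10/15/2009"; adjust the offsets for other regional settings.
REM LOGSTAMP is built as year, month, day - e.g. 20091015
set LOGSTAMP=%DATE:~-4%%DATE:~4,2%%DATE:~7,2%
rman target / cmdfile rman_backup.rcv log rman_backup_%LOGSTAMP%.log
```

With this variant, old log files accumulate in the script directory and will need to be cleaned out from time to time.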

The next step is to schedule our batch file (rman_backup.bat) to run at the desired interval. This is done by scheduling the batch file via the Windows Scheduler wizard, which is accessed through Control Panel>Scheduled Tasks>Add Scheduled Task.
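
The job can also be scheduled from the command prompt with the Windows "at" command. The following is a sketch only: the 02:00 start time is an arbitrary example, and the path assumes that the batch file is kept in e:\backup.

```bat
REM Sketch: run the backup every day at 02:00 (an arbitrary example time)
at 02:00 /every:M,T,W,Th,F,S,Su "e:\backup\rman_backup.bat"
```

However the job is scheduled, it must run under an account that is able to connect to the database as sysdba. A scheduled job may also not start in the directory that holds the scripts, so it may be necessary to use full paths for the command file and log file inside rman_backup.bat.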

Finally, it should be ensured that the entire backup directory (e:\backup) is copied to tape nightly, after the database backup has been completed. There is no need to back up any other Oracle related directory. The tapes must be stored offsite so that they aren't destroyed in case the site is struck by disaster. In a disaster situation, we can recreate the database and then restore and recover data files (with up to a 24 hour data loss), using the backups that are on tape. The procedure for recovering from a disaster will be covered in the third article of this series. In case the database fails (say, due to datafile corruption) but the host server remains available, we can recover right up to the instant of failure using the backup on disk together with all archive logs since the backup and the current online redo logs. Some of these scenarios will be covered in the next article of this series.
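
Besides reading the log file, the state of the backups can be checked interactively. The commands below are a small sketch of what one might run at the RMAN> prompt after a backup has completed; they report on the backups recorded in the controlfile, and only the crosscheck updates their recorded status.

```text
# Sketch: run at the RMAN> prompt after a backup has completed
list backup;           # backup sets recorded in the controlfile
report need backup;    # datafiles that need a backup under the configured retention policy
report obsolete;       # backups no longer needed under the retention policy
crosscheck backup;     # check that the recorded backup pieces still exist on disk
```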

6. Summary and Further Reading:

This article provides steps on setting up automated RMAN based backups of Oracle databases on Windows. As with all critical DBA tasks, scripts and commands described above should be customised to your requirements and tested thoroughly before implementation on your production systems.

In the interest of brevity, we have had to rush through some of the detail that is relevant to backup and recovery. The reader is therefore urged to read the pertinent Oracle documentation for complete coverage. The books of interest are:
Oracle 9i Backup and Recovery Concepts - This book discusses basics of Oracle backup and recovery.
Oracle 9i Recovery Manager User's Guide - This book discusses backup and recovery using RMAN.
Oracle 9i Administrator's Guide - discusses backup related issues such as how to put the database in ARCHIVELOG mode.
These books can be downloaded, free of charge, from Oracle's documentation site. You will need to register with the Oracle Technology Network (OTN) to gain access to the documentation. Membership of OTN is free, and well worth it for Oracle professionals.
