Wednesday, October 14, 2009

Recovery from missing or corrupted control file-Controlfile

Case 1: A multiplexed copy of the control file is available.

On startup Oracle must read the control file in order to find out where the datafiles and online logs are located. Oracle expects to find control files at locations specified in the CONTROL_FILE initialisation parameter. The instance will fail to mount the database if any one of the control files are missing or corrupt. A brief error message will be displayed, with further details recorded in the alert log. The exact error message will vary depending on what has gone wrong. Here's an example:


SQL> startup

ORACLE instance started.

Total System Global Area 135338868 bytes
Fixed Size 453492 bytes
Variable Size 109051904 bytes
Database Buffers 25165824 bytes
Redo Buffers 667648 bytes
ORA-00205: error in identifying controlfile, check alert log for more info

SQL>


On checking the alert log, as suggested, we find the following:


ORA-00202: controlfile: 'e:\oracle_dup_dest\controlfile\ORCL\control02.ctl'
ORA-27046: file size is not a multiple of logical block size
OSD-04012: file size mismatch (OS 5447783)


The above corruption was introduced by manually editing the control file when the database was closed.

The solution is simple, provided you have at least one uncorrupted control file - replace the corrupted control file with a copy using operating system commands. Remember to rename the copied file. The database should now start up without any problems.

Case 2: All control files lost

What if you lose all your control files? In that case you have no option but to use a backup control file. The recovery needs to be performed from within RMAN, and requires that all logs (archived and current online logs) since the last backup are available. The logs are required because all datafiles must also be restored from backup. The database will then have to be recovered up to the time the control files went missing. This can only be done if all intervening logs are available. Here's an annotated transcript of a recovery session (as usual, lines in bold are commands to be typed, lines in italics are explanatory comments, other lines are RMAN feedback):


-- Connect to RMAN
C:\>rman target /

Recovery Manager: Release 9.2.0.4.0 - Production

Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.

connected to target database: ORCL (not mounted)

-- set DBID - get this from the name of the controlfile autobackup.
-- For example, if autobackup name is
-- CTL_SP_BAK_C-1507972899-20050124-00 the the DBID is
-- 1507972899. This step will not be required if the instance is
-- started up from RMAN

RMAN> set dbid 1507972899

executing command: SET DBID

--restore controlfile from autobackup. The backup is not at the default
--location so the path must be specified

RMAN> restore controlfile from 'e:\backup\CTL_SP_BAK_C-1507972899-20050124-00';

Starting restore at 26/JAN/05

using channel ORA_DISK_1
channel ORA_DISK_1: restoring controlfile
channel ORA_DISK_1: restore complete
replicating controlfile
input filename=D:\ORACLE_DATA\CONTROLFILE\ORCL\CONTROL01.CTL
output filename=E:\ORACLE_DUP_DEST\CONTROLFILE\ORCL\CONTROL02.CTL
output filename=C:\ORACLE_DUP_DEST\CONTROLFILE\ORCL\CONTROL03.CTL
Finished restore at 26/JAN/05

-- Now that control files have been restored, the instance can mount the
-- database.

RMAN> mount database;

database mounted

-- All datafiles must be restored, since the controlfile is older than the current
-- datafiles. Datafile restore must be followed by recovery up to the current log.

RMAN> restore database;

Starting restore at 26/JAN/05

using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
restoring datafile 00001 to D:\ORACLE_DATA\DATAFILES\ORCL\SYSTEM01.DBF
restoring datafile 00004 to D:\ORACLE_DATA\DATAFILES\ORCL\USERS01.DBF
channel ORA_DISK_1: restored backup piece 1
piece handle=E:\BACKUP\0DGB0I79_1_1.BAK tag=TAG20050124T115832 params=NULL
channel ORA_DISK_1: restore complete
channel ORA_DISK_1: starting datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
restoring datafile 00002 to D:\ORACLE_DATA\DATAFILES\ORCL\UNDOTBS01.DBF
restoring datafile 00003 to D:\ORACLE_DATA\DATAFILES\ORCL\TOOLS01.DBF
channel ORA_DISK_1: restored backup piece 1
piece handle=E:\BACKUP\0CGB0I78_1_1.BAK tag=TAG20050124T115832 params=NULL
channel ORA_DISK_1: restore complete
Finished restore at 26/JAN/05

--Database must be recovered because all datafiles have been restored from
-- backup

RMAN> recover database;

Starting recover at 26/JAN/05
using channel ORA_DISK_1

starting media recovery

archive log thread 1 sequence 2 is already on disk as file E:\ORACLE_ARCHIVE\ORCL\1_2.ARC
archive log thread 1 sequence 4 is already on disk as file D:\ORACLE_DATA\LOGS\ORCL\REDO02A.LOG
archive log thread 1 sequence 5 is already on disk as file D:\ORACLE_DATA\LOGS\ORCL\REDO01A.LOG
archive log thread 1 sequence 6 is already on disk as file D:\ORACLE_DATA\LOGS\ORCL\REDO03A.LOG
archive log filename=E:\ORACLE_ARCHIVE\ORCL\1_2.ARC thread=1 sequence=2
archive log filename=E:\ORACLE_ARCHIVE\ORCL\1_3.ARC thread=1 sequence=3
archive log filename=E:\ORACLE_DATA\LOGS\ORCL\REDO02A.LOG thread=1 sequence=4
archive log filename=E:\ORACLE_DATA\LOGS\ORCL\REDO01A.LOG thread=1 sequence=5
archive log filename=E:\ORACLE_DATA\LOGS\ORCL\REDO03A.LOG thread=1 sequence=6
media recovery complete
Finished recover at 26/JAN/05

-- Recovery completed. The database must be opened with RESETLOGS
-- because a backup control file was used. Can also use
-- "alter database open resetlogs" instead.

RMAN> open resetlogs database;

database opened


Several points are worth emphasising.
Recovery using a backup controlfile should be done only if a current control file is unavailable.
All datafiles must be restored from backup. This means the database will need to be recovered using archived and online redo logs. These MUST be available for recovery until the time of failure.
As with any database recovery involving RESETLOGS, take a fresh backup immediately.
Technically the above is an example of complete recovery - since all committed transactions were recovered. However, some references consider this to be incomplete recovery because the database log sequence had to be reset.
After recovery using a backup controlfile, all temporary files associated with locally-managed tablespaces are no longer available. You can check that this is so by querying the view V$TEMPFILE - no rows will be returned. Therefore tempfiles must be added (or recreated) before the database is made available for general use. In the case at hand, the tempfile already exists so we merely add it to the temporary tablespace. This can be done using SQLPlus or any tool of your choice:


SQL> alter tablespace temp add tempfile
'D:\oracle_data\datafiles\ORCL\TEMP01.DBF';

Tablespace altered.

SQL>


Check that the file is available by querying v$TEMPFILE.

6. Wrap up:

In an article this size it is impossible to cover all possible recovery scenarios that one might encounter in real life. The above examples will, I hope, provide you with some concrete situations to try out on your test box. The best preparation for real-life recovery is practice. Simulate as many variations of the above situations, and others, as you can think up. Then try recovering from them. The exercise will improve your recovery skills, clarify conceptual issues and highlight deficiencies in your backup strategy.

No comments: