Wednesday, January 20, 2010

RMAN - Data Recovery Advisor

This I did some time ago but since someone asked me to tell something about this new Oracle 11g feature, I am representing what I did. This is the test for Data Recovery Advisor.

Oracle has introduced a new feature in Oracle Database 11g that can be used to get assistance in case of loss or corruption of datafiles, redo log files or controlfiles. This tool is called Data Recovery Advisor and is accessible from the command line as well as from the GUI. Command line interface is implemented as new commands for RMAN and the graphical interface is integrated into the Enterprise Manager.

The Data Recovery Advisor is a tool that helps you to diagnose and repair data failures and corruptions. The Data Recovery Advisor analyzes failures based on symptoms and intelligently determines optimal repair strategies. The tool can also automatically repair diagnosed failures. It is very useful in case of Block Corruption.

DBAs need to proactively check for the block corruption in the database on a regular basis. There can be two types of block corruptions: physical and logical. When physical corruption occurs, the Oracle kernel is unable to read or write one or more data blocks, usually due to media corruption. In logical corruption, the data that is read back or written is inconsistent. Usually this is due to various bugs at the Oracle or Media layer, and one may observe ORA-00600 or ORA-01578 errors.

In Oracle Database 11g, RMAN directly informs us about corrupted blocks it has found in the output of the backup command and creates a trace file in the Automatic Diagnostic Repository (ADR). Prior to 11g, DBAs had to look at v$database_block_corruption for this information after the backup was done. This is a great improvement. A check such as this that has returned corrupted block in Oracle Database 11g immediately creates a failure in the database. There are multiple checkers available in 11g that are automatically run by the server as soon as critical error is raised, like, database structure checks, integrity checks for the data dictionary as well as block corruption check.

RMAN has the following set of new commands to deal with failures:

• LIST FAILURE
• ADVISE FAILURE
• CHANGE FAILURE
• REPAIR FAILURE

Depending upon the type of failure and it’s recovery time, in Oracle Database 11g, RMAN can choose to use block copies from the flashback logs instead of a backup for a block media recovery. It is in such situations, that the ADVICE FAILURE command comes in very handy.

Let’s look at an example how we can repair a corrupted block from a datafile. For this test, we will be using TSTDB database.

Step 1. Let’s corrupt a data file

dd of=/u01/oradata/TESTDB/example_01.dbf bs=8192 conv=notrunc seek=12
<<>
CORRUPT corrupt CORRUPT corrupt CORRUPT corrupt CORRUPT corrupt CORRUPT corrupt
EOF
0+3 records in
0+3 records out

Step 2. Check if there is any corruption in the database

SQL> ALTER SYSTEM FLUSH BUFFER_CACHE;
System altered.

SQL> select * from hr.dba;
select * from hr.dba
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 12, block # 12)
ORA-01110: data file 12: '/u01/oradata/TESTDB/example_01.dbf'

Step 3. List the failure as per the DRA

rman target /

connected to target database: TESTDB (DBID=xxxxxxxxxxxx)

RMAN> connect catalog test_rmandb/*****@RMANDB

connected to recovery catalog database

RMAN> list failure;

List of Database Failures
=========================

Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
622 HIGH OPEN 20-OCT-08 Datafile 12: '/u01/oradata/TESTDB/example_01.dbf' contains one or more corrupt blocks



Step 4. Advice the solution for the failure as per the DRA

RMAN> advise failure;

List of Database Failures
=========================

Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
622 HIGH OPEN 20-OCT-08 Datafile 12: '/u01/oradata/TESTDB/example_01.dbf' contains one or more corrupt blocks

analyzing automatic repair options; this may take some time
allocated channel: ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: SID=129 device type=SBT_TAPE
channel ORA_SBT_TAPE_1: Data Protection for Oracle: version 5.5.1.0
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=128 device type=DISK
analyzing automatic repair options complete

Mandatory Manual Actions
========================
no manual actions available

Optional Manual Actions
=======================
no manual actions available

Automated Repair Options
========================
Option Repair Description
------ ------------------
1 Perform block media recovery of block 12 in file 12
Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/diag/rdbms/testdb/TESTDB/hm/reco_181505967.hm



Step 5. Now repair the corrupted block as per the solution provided be the DRA

RMAN> repair failure;

Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/diag/rdbms/testdb/TESTDB/hm/reco_181505967.hm

contents of repair script:
# block media recovery
recover datafile 12 block 12;

Do you really want to execute the above repair (enter YES or NO)? yes
executing repair script

Starting recover at 20-OCT-08
using channel ORA_SBT_TAPE_1
using channel ORA_DISK_1

channel ORA_SBT_TAPE_1: restoring block(s)
channel ORA_SBT_TAPE_1: specifying block(s) to restore from backup set
restoring blocks of datafile 00012
channel ORA_SBT_TAPE_1: reading from backup piece 0sjtk0q7_1_1
channel ORA_SBT_TAPE_1: piece handle=0sjtk0q7_1_1 tag=TAG20081020T095831
channel ORA_SBT_TAPE_1: restored block(s) from backup piece 1
channel ORA_SBT_TAPE_1: block restore complete, elapsed time: 00:01:55

starting media recovery
media recovery complete, elapsed time: 00:00:03

Finished recover at 20-OCT-08
repair failure complete



ADR and DRA is a great help for every DBA. DBA might use ADR to prevent making big mistakes during disaster situations, usually when time plays a crucial role and the chance of making a mistake is rather high.

No comments:

Post a Comment