Saturday, 20 February 2016

Oracle Grid: Oracle Local Registry Recovery

With the OCR and voting disks placed in ASM, these files are accessible only once the cluster processes have started and the ASM instance is up and running. However, without the OCR the clusterware processes cannot start up.
To resolve this, a copy of the Oracle Local Registry (OLR) is now created on each node during the initial installation. It stores node-specific information and allows us to get around the problem.
The OLR stores the clusterware configuration, version information, and GPnP wallets. The OHASD process mostly manages this file. The OCR, in turn, is managed by the CRSD processes.
The OLR is stored in $GRID_HOME/cdata and must be available on the node before other services can be started up. Without this the clusterware will fail to start.
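
To confirm where the OLR lives on a given node, the path can be read from the olr.loc pointer file. The sketch below assumes a Linux node where that file is /etc/oracle/olr.loc and contains an olrconfig_loc= line; the helper name olr_path_from_loc is hypothetical and not part of the Oracle tooling.

```shell
# Print the OLR path recorded in an olr.loc-style file.
# Assumption: the file contains a line of the form
#   olrconfig_loc=/path/to/<node>.olr
# The default location below is the usual one on Linux and may differ elsewhere.
olr_path_from_loc() {
    loc_file="${1:-/etc/oracle/olr.loc}"
    # Strip the key and print only the value of the first matching line
    sed -n 's/^olrconfig_loc=//p' "$loc_file" | head -n 1
}
```

Running olr_path_from_loc with no argument reads the default file. Once the path is known, running ocrcheck -local as root reports on the integrity of the OLR itself.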

Backup of Oracle Grid OLR

During the initial configuration and installation a manual backup of the OLR is created in $GRID_HOME/cdata. This backup is used in case there is loss/corruption of the original OLR.

# pwd
/opt/app/oracle/grid/1120/cdata/lvsfsdbp1
# ls -ltr
total 14554
-rw------- 1 root root 6356992 Oct 1 21:14 backup_20121001_75347.olr
-rw------- 1 root root 6504448 Oct 16 10:06 backup_20121016_02946.olr
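
When several backups have accumulated, it helps to pick the newest one before a restore. The following is a minimal sketch that selects by modification time; the helper name latest_olr_backup is hypothetical, and it assumes the backups keep the backup_*.olr naming shown in the listing above.

```shell
# Print the most recently modified OLR backup in a directory.
# Assumption: backup files match backup_*.olr, as in the listing above.
# Prints nothing if the directory has no matching files.
latest_olr_backup() {
    backup_dir="$1"
    # ls -t sorts newest first by modification time; keep only the first entry
    ls -1t "$backup_dir"/backup_*.olr 2>/dev/null | head -n 1
}
```

For example, latest_olr_backup /opt/app/oracle/grid/1120/cdata/lvsfsdbp1 would print the 16 October file from the listing. A fresh backup can be taken at any time as root with ocrconfig -local -manualbackup, and ocrconfig -local -showbackup lists the backups the clusterware knows about.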

Missing OLR Error

Whenever there is an issue accessing the OLR, the following will be seen in the OHASD log and the messages file.

2012-3-12 11:31:55.211: [ default][3046311632] OHASD Daemon Starting. Command string :restart
2012-3-12 11:31:55.218: [ default][3046311632] Initializing OLR
2012-3-12 11:31:55.672: [ OCROSD][3046311632]utopen:6m':failed in stat OCR file/disk /opt/app/oracle/grid/1120/cdata/lvsfsdbp1.olr, errno=2, os err string=No such file or directory
..
2012-3-12 11:31:55.675: [ OCRRAW][3046311632]proprinit: Could not open raw device
...
2012-3-12 11:31:55.671: [ default][3046311632]Created alert : (:OHAS00106:) : Failed to initialize Oracle Local Registry
2012-3-12 11:31:55.671: [ default][3046311632][PANIC] OHASD exiting; Could not init OLR
2012-3-12 11:31:55.671: [ default][3046311632] Done.
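
A quick way to check a log for this failure is to search for the alert code or the panic text from the excerpt above. This is only a sketch: the helper name olr_init_failed is hypothetical, and the log path is passed in because its location depends on the installation.

```shell
# Return success (0) if the given log shows a failed OLR initialization.
# The two patterns are taken from the OHASD log excerpt above.
olr_init_failed() {
    log_file="$1"
    grep -q -e 'OHAS00106' -e 'Could not init OLR' "$log_file"
}
```

In 11.2 the OHASD log is typically found under $GRID_HOME/log/<hostname>/ohasd/ohasd.log, so a check might look like: olr_init_failed "$GRID_HOME/log/$(hostname -s)/ohasd/ohasd.log" && echo "OLR failed to initialize".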

Restoring OLR

a. Create an empty OLR file and set its permissions:
# touch lvsfsdbp1.olr
# chmod 600 lvsfsdbp1.olr
b. Restore from the backup using the -local option:
# ocrconfig -local -restore /opt/app/oracle/grid/1120/cdata/lvsfsdbp1/backup_20121016_02946.olr
# ls -ltr
-rw------- 1 root root 272356541 Oct 16 15:07 lvsfsdbp1.olr...

c. Restart the CRS processes
# sudo crsctl start crs
View the ohasd log file to confirm the status:
2012-3-12 15:07:16.211: [ default][3046725328] OHASD Daemon Starting. Command string :restart
2012-3-12 15:07:16.215: [ default][3046725328] Initializing OLR
2012-3-12 15:07:16.676: [ OCRRAW][3046725328]proprioo: for disk 0 (/opt/app/oracle/grid/1120/cdata/lvsfsdbp1.olr), id match (1), total id sets, (1) need recover (0), my votes (0), total votes (0), commit_lsn (12), lsn (12)
d. As a sanity check, shut down the cluster and reboot the server.
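
Steps a to c above can be sketched as a dry run that prints the commands instead of executing them, so they can be reviewed before being run as root. The helper name olr_restore_plan is hypothetical; the commands it prints are the ones used in the steps above, and the only check it adds is that the backup file exists and is not empty.

```shell
# Print (do not run) the restore commands for an OLR path and a backup file.
# Returns non-zero if the backup file is missing or empty.
olr_restore_plan() {
    olr_file="$1"
    backup_file="$2"
    # -s is true only for an existing file with a size greater than zero
    if [ ! -s "$backup_file" ]; then
        echo "backup not found or empty: $backup_file" >&2
        return 1
    fi
    printf 'touch %s\n' "$olr_file"
    printf 'chmod 600 %s\n' "$olr_file"
    printf 'ocrconfig -local -restore %s\n' "$backup_file"
    printf 'crsctl start crs\n'
}
```

Printing the plan first keeps the destructive part manual, which seems the safer default for a file the clusterware cannot start without.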
The cluster is healthy and back up and running. Make sure you are backing up the OCR and OLR files in the cdata directory on each node!