Today we are going to embark on the adventure of building an Active DataGuard database (the “standby”) with a two-node RAC as the source (the “primary”).
* There is an existing RAC database (RACDB) on two nodes: racnode1 and racnode2.
* We are using ASM/OMF and the Oracle software and ASM instance on the standby host is already installed.
* The standby server has no existing database.
* The standby is standalone (no RAC).
* The Oracle software version is 11gR2.
* The Primary database is running in ARCHIVELOG mode.
* RAC database name: PrimRac
* RAC Scan adress: primrac-scan
* RAC node1 instance name: PrimRac01
* RAC node2 instance name:PrimRac02
* RAC Unique Name: RACScan
* Standby database name: dg_stby
[Primary node1, node2, standby ] Step 1: Configure Oracle Net Services
dg_stby = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = stby_server)(PORT = 1521)) ) (CONNECT_DATA = (SERVICE_NAME = dg_stby) (UR=A) ) ) PrimRac = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = primrac-scan)(PORT = 1521)) ) (CONNECT_DATA = (SERVICE_NAME = PrimRac) ) ) primrac01 = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = primrac01-vip)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = primrac) (instance_name=primrac01) ) ) primrac02 = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = primrac02-vip)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = primrac) (instance_name=primrac02) ) )
Alter the tnsnames.ora on all the hosts involved so they are “known” to each other. Note the specifics of the NODES of the RAC!
Alter the listener.ora on the STANDBY database node and add the static listener :
SID_LIST_LISTENER= (SID_LIST= (SID_DESC= (GLOBAL_DBNAME=dg_stby) (SID_NAME=dg_stby) (ORACLE_HOME=/oracle/base/db/dbhome1/) ) )
Restart the listener to have the new changes take effect.
$ srvctl stop listener $ srvctl status listener Listener LISTENER is enabled Listener LISTENER is not running $ srvctl start listener $ lsnrctl status LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 10-MAR-2014 12:21:13 Copyright (c) 1991, 2011, Oracle. All rights reserved. Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER))) STATUS of the LISTENER ------------------------ Alias LISTENER Version TNSLSNR for Linux: Version 11.2.0.3.0 - Production Start Date 10-MAR-2014 12:21:10 Uptime 0 days 0 hr. 0 min. 2 sec Trace Level off Security ON: Local OS Authentication SNMP OFF Listener Parameter File /oracle/grid/network/admin/listener.ora Listener Log File /oracle/base/diag/tnslsnr/XXXX/listener/alert/log.xml Listening Endpoints Summary... (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER))) (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xx.xx.xx.xx)(PORT=1521))) (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xx.xx.xx.xx)(PORT=1521))) Services Summary... Service "dg_stby" has 1 instance(s). Instance "dg_stby", status UNKNOWN, has 1 handler(s) for this service... The command completed successfully
The “STATUS UNKNOWN” is what we are looking for, this is the static entry doing it’s business.
[Primary node1, node2, standby] Step 2: OraPWD
This is an VERY important step, since in 11g the rules regarding the orapwd have changed, RECREATE doesn’t work as it used to! It is VERY important to COPY 1 of the files over to ALL other nodes (renaming is allowed, even necessary!) to make this work.
We choose the primary node1 orapwd file as the source. Make sure you know the sys password.
$ cd $ORACLE_HOME/dbs
$ orapwd file=orapwdprimrac01 password=oracle
In my case these files where already created, and I know the sys password. So I just copied these over to the other instances with scp.
One again: copy over the file from one node to the other node and to the standby, renaming the file on the specific node to reflect the instancename.
[Primary] Step 3: Create standby control file
Next we need to create a standby control file. Do not create this in a location which get emptied automatically (like /tmp). It will be included in the daily backup with RMAN.
As user SYS:
SQL> ALTER DATABASE CREATE STANDBY CONTROLFILE AS '/oracle/base/db/dbhome1/dbs/standbyctlprd.ctl';
Database altered.
SQL>
Copy this file to a known location (like $ORACLE_HOME/dbs) on the standby server.
[Standby] Step 4: Create standby instance pfile
One can copy over the pfile from the primary database and alter this to create a similar pfile on the standby node.
*.db_cache_size=469762048 *.java_pool_size=33554432 *.large_pool_size=33554432 *.pga_aggregate_target=939524096 *.sga_target=7449083904 *.shared_pool_size=6710886400 *.streams_pool_size=100663296 *.audit_file_dest='/oracle/base/admin/dg_stby/adump' *.audit_trail='db' *.cluster_database=false *.compatible='11.2.0.0.0' *.control_files='/oracle/base/db/dbHome1/dbs/standbyctlfile.ctl' *.db_block_size=8192 *.db_create_file_dest='+DATA' *.db_domain='' *.db_name='PrimRac' *.db_unique_name='dg_stby' *.db_recovery_file_dest='+FRA' *.db_recovery_file_dest_size=858412548096 *.diagnostic_dest='/oracle/base' *.fal_server='primrac01','primrac02' *.log_archive_config='DG_CONFIG=(primrac,dg_stby)' *.log_archive_dest_2='SERVICE=primrac01 NOAFFIRM ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=primrac' *.log_archive_dest_state_2='ENABLE' *.log_archive_max_processes=30 *.memory_max_target=17179869184 *.memory_target=8388608000 *.open_cursors=3500 *.optimizer_capture_sql_plan_baselines=FALSE *.optimizer_use_sql_plan_baselines=FALSE *.parallel_force_local=TRUE *.plsql_code_type='NATIVE' *.processes=680 *.recyclebin='OFF' *.remote_login_passwordfile='exclusive' *.standby_file_management='AUTO' *.statistics_level='ALL' *.streams_pool_size=83886080 *.undo_retention=1800 *.undo_tablespace='UNDOTBS1' *.thread=1 *.fal_client='dg_stby' *.db_file_name_convert='+DATAPRD/primrac/','+DATA/DG_stby/','+FRAPRD/primrac/','+FRA/dg_stby/' *.log_file_name_convert='+DATAPRD/primrac/','+DATA/DG_stby/','+FRAPRD/primrac/','+FRA/dg_stby/'
Make note of the following DG entries:
*.log_archive_dest_2='SERVICE=primrac01 NOAFFIRM ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=primrac
Here we need to be careful: the service dictates NODE1 and the DB_UNIQUE_NAME is the same as the RAC database.
*.fal_server='primrac01','primrac02'
In a non-rac primary, here the database/instance would be listed, however..in a RAC, we list ALL the RAC NAMES. Not all sites are consistent/conclusive with this. This needs to be done ONLY on the STBY node! Also be carefull wiith the quotes, the individual names need to be set between their own set of quotes, as determined in the STBY tnsnames.orafile.
We use conversion between the ASM instances, since these differ also in our environment by utilizing the following:
*.db_file_name_convert='+DATAPRD/primrac/','+DATA/DG_stby/','+FRAPRD/primrac/','+FRA/dg_stby/' *.log_file_name_convert='+DATAPRD/primrac/','+DATA/DG_stby/','+FRAPRD/primrac/','+FRA/dg_stby/'
Also keep in mind NOT to remove (or include) the following entry:
*.thread=1
Of course DG has to know which the client is:
*.fal_client='dg_stby'
Save the file and start the standby instance:
$ export ORACLE_SID=dg_stby
$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Mon Mar 10 14:39:02 2014
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to an idle instance.
SQL> create spfile from pfile='/oracle/base/db/dbHome1/dbs/initdg-stby.ora';
File created.
SQL> startup nomount
ORACLE instance started.
Total System Global Area 1.7103E+10 bytes
Fixed Size 2245480 bytes
Variable Size 1.6609E+10 bytes
Database Buffers 469762048 bytes
Redo Buffers 21708800 bytes
SQL>
[PRIMARY, node 1] Step 5: Duplicate the database.
Now we need to get the data from the primary on the standby. We use the network duplicate, although this can also be done with an backup to disk from the primary, and restore this backup after copying them over to the remote site. Since I don’t have this disk-space locally, so I opt to use the “over the wire” method.
On the primary:
$rlwrap rman target / auxiliary sys/secret@dg_stby Recovery Manager: Release 11.2.0.3.0 - Production on Mon Mar 10 20:42:34 2014 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. connected to target database: primrac (DBID=1727048999) connected to auxiliary database: primrac (not mounted) RMAN> run { 2> allocate channel prmy1 type disk; 3> allocate channel prmy2 type disk; 4> allocate channel prmy3 type disk; 5> allocate channel prmy4 type disk; 6> allocate auxiliary channel stby type disk; 7> duplicate target database for standby from active database nofilenamecheck dorecover; 8> } using target database control file instead of recovery catalog allocated channel: prmy1 channel prmy1: SID=985 instance=primrac01 device type=DISK allocated channel: prmy2 channel prmy2: SID=1022 instance=primrac01 device type=DISK allocated channel: prmy3 channel prmy3: SID=179 instance=primrac01 device type=DISK allocated channel: prmy4 channel prmy4: SID=402 instance=primrac01 device type=DISK allocated channel: stby channel stby: SID=794 device type=DISK Starting Duplicate Db at 10-MAR-14 contents of Memory Script: { backup as copy reuse targetfile '/oracle/base/db/dbhome1/dbs/orapwprimrac01' auxiliary format '/oracle/base/db/dbHome1/dbs/orapwdg_stby' ; } executing Memory Script Starting backup at 10-MAR-14 Finished backup at 10-MAR-14 contents of Memory Script: { backup as copy current controlfile for standby auxiliary format '/oracle/base/db/dbHome1/dbs/standbyctlprd.ctl'; } executing Memory Script Starting backup at 10-MAR-14 channel prmy1: starting datafile copy copying standby control file output file name=+FRAPRDG/primrac/controlfile/snapcf_primrac.f tag=TAG20140310T220054 RECID=1059 STAMP=841874455 channel prmy1: datafile copy complete, elapsed time: 00:00:03 Finished backup at 10-MAR-14 contents of Memory Script: { sql clone 'alter database mount standby database'; } executing Memory Script sql statement: alter database mount standby database RMAN-05529: WARNING: DB_FILE_NAME_CONVERT resulted in invalid ASM names; names changed to disk group only. contents of Memory Script: { set newname for tempfile 1 to "+dataprdg"; set newname for tempfile 2 to "+dataprdg"; set newname for tempfile 3 to "+dataprdg"; switch clone tempfile all; set newname for datafile 1 to "+dataprdg"; set newname for datafile 2 to "+dataprdg"; <SNIP> set newname for datafile 22 to "+dataprdg"; backup as copy reuse datafile 1 auxiliary format "+dataprdg" datafile 2 auxiliary format "+dataprdg" datafile 3 auxiliary format "+dataprdg" datafile 4 auxiliary format <SNIP> 22 auxiliary format "+dataprdg" ; sql 'alter system archive log current'; } executing Memory Script executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME renamed tempfile 1 to +dataprdg in control file renamed tempfile 2 to +dataprdg in control file renamed tempfile 3 to +dataprdg in control file executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME <SNIP> executing command: SET NEWNAME Starting backup at 10-MAR-14 channel prmy1: starting datafile copy input datafile file number=00011 name=+DATAPRDG/primrac/datafile/car_indx.273.811722355 channel prmy2: starting datafile copy input datafile file number=00002 name=+DATAPRDG/primrac/datafile/sysaux.260.804442149 channel prmy3: starting datafile copy input datafile file number=00010 name=+DATAPRDG/primrac/datafile/car_data.272.811719989 channel prmy4: starting datafile copy input datafile file number=00003 name=+DATAPRDG/primrac/datafile/undotbs1.261.804442151 <SNIP> executing Memory Script executing command: SET until clause Starting recover at 10-MAR-14 starting media recovery archived log for thread 1 with sequence 6698 is already on disk as file +FRAPRDG/dg_stby/archivelog/2014_03_10/thread_1_seq_6698.283.841877139 archived log for thread 1 with sequence 6699 is already on disk as file +FRAPRDG/dg_stby/archivelog/2014_03_10/thread_1_seq_6699.280.841877141 archived log for thread 1 with sequence 6700 is already on disk as file +FRAPRDG/dg_stby/archivelog/2014_03_10/thread_1_seq_6700.276.841877147 archived log for thread 2 with sequence 6001 is already on disk as file +FRAPRDG/dg_stby/archivelog/2014_03_10/thread_2_seq_6001.284.841877139 archived log for thread 2 with sequence 6002 is already on disk as file +FRAPRDG/dg_stby/archivelog/2014_03_10/thread_2_seq_6002.277.841877143 archived log file name=+FRAPRDG/dg_stby/archivelog/2014_03_10/thread_1_seq_6698.283.841877139 thread=1 sequence=6698 archived log file name=+FRAPRDG/dg_stby/archivelog/2014_03_10/thread_2_seq_6001.284.841877139 thread=2 sequence=6001 archived log file name=+FRAPRDG/dg_stby/archivelog/2014_03_10/thread_1_seq_6699.280.841877141 thread=1 sequence=6699 archived log file name=+FRAPRDG/dg_stby/archivelog/2014_03_10/thread_2_seq_6002.277.841877143 thread=2 sequence=6002 archived log file name=+FRAPRDG/dg_stby/archivelog/2014_03_10/thread_1_seq_6700.276.841877147 thread=1 sequence=6700 media recovery complete, elapsed time: 00:00:31 Finished recover at 10-MAR-14 Finished Duplicate Db at 10-MAR-14 released channel: prmy1 released channel: prmy2 released channel: prmy3 released channel: prmy4 released channel: stby RMAN>exit Recovery Manager complete. $
This concludes the longest step (depending on the size and speed of network).
**** Special notice **** Special notice ****
Due to the long runtime (yes, this is a large database, and the network data line to the standby server is quite loaded and stretches a long way) the “normal” backup of the primary started. In itself not an issue, however in this RMAN script, after the backups completes, the archive logs are deleted/gone. I kinda did this on purpose, since I think this is more a real-world example and most importantly: I didn’t want to change to many things (rather none at all) settings on the primary side. This has some serious impact on how the recovery goes (like: Not!) of the duplication of the standby database. This will also be solved in this blog, and contains valuable information on how to resolve so-called “gaps” in the synchronization.
I’ll get back on this later when we run into this issue, but one might want to turn off the backup on the primary. One might refrain from creating a backup on the primary (and deleting the archive logs) until the log synchronization is enabled if possible.
[standby] Step 6: Configure Data Guard settings
Log on to the standby database and shut it down:
SQL> shutdown ORA-01109: database not open Database dismounted. ORACLE instance shut down. SQL> startup nomount; ORACLE instance started. Total System Global Area 1.7103E+10 bytes Fixed Size 2245480 bytes Variable Size 1.6609E+10 bytes Database Buffers 469762048 bytes Redo Buffers 21708800 bytes SQL>exit Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Partitioning, Real Application Clusters, OLAP, Data Mining and Real Application Testing options [oracle@bedr-odb01 ~]$ rman target / Recovery Manager: Release 11.2.0.3.0 - Production on Tue Mar 11 09:29:26 2014 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. connected to target database: PRIMRAC (not mounted) Startup RMAN and restore the standby control file into ASM. RMAN> restore controlfile from '/oracle/base/db/dbHome1/dbs/standbyctlprd.ctl'; Starting restore at 11-MAR-14 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=728 device type=DISK channel ORA_DISK_1: copied control file copy output file name=/oracle/base/db/dbHome1/dbs/standbyctlprd.ctl Finished restore at 11-MAR-14 RMAN> exit;
Now we restart the standby database in NOMOUNT mode:
$ rlwrap sqlplus / as sysdba SQL*Plus: Release 11.2.0.3.0 Production on Tue Mar 11 09:33:07 2014 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Partitioning, Real Application Clusters, OLAP, Data Mining and Real Application Testing options SQL> shutdown immediate; ORA-01109: database not open Database dismounted. ORACLE instance shut down. SQL> startup nomount ORACLE instance started. Total System Global Area 1.7103E+10 bytes Fixed Size 2245480 bytes Variable Size 1.6609E+10 bytes Database Buffers 469762048 bytes Redo Buffers 21708800 bytes SQL> alter database mount; Database altered. SQL>
NOTE: I’m taking small steps here..Just to be sure nothing is error-ing..
[Primary, node 1] Step 7: Configure primary database
Set the log_archive_config to the following settings:
SQL> show parameter LOG_ARCHIVE_CONFIG NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ log_archive_config string SQL> ALTER SYSTEM SET LOG_ARCHIVE_CONFIG='DG_CONFIG=(primrac,dg_stby)'; System altered. SQL> show parameter LOG_ARCHIVE_CONFIG NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ log_archive_config string DG_CONFIG=(primrac,dg_stby)
Check if the (default) log_archive_format is correct:
SQL> show parameters LOG_ARCHIVE_FORMAT NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ log_archive_format string %t_%s_%r.dbf
Change to above if not the same!
Make sure we have enough Log_Archive Processes:
SQL> show parameters LOG_ARCHIVE_MAX_PROCESSES NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ log_archive_max_processes integer 4 SQL> ALTER SYSTEM SET LOG_ARCHIVE_MAX_PROCESSES=30; System altered. SQL> show parameters LOG_ARCHIVE_MAX_PROCESSES NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ log_archive_max_processes integer 30 SQL>
Double check we are using the remote_login_passwordfile.
THIS IS IMPORTANT, and normally this is default, but let’s not take anything for granted. Saves a lot of issues and troubleshooting later on.
SQL>show parameters REMOTE_LOGIN_PASSWORDFILE; NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ remote_login_passwordfile string EXCLUSIVE SQL>
Ok. We’re good.
Turn on FORCE LOGGING:
SQL >SELECT force_logging FROM v$database; FORCE_LOGGING ------------------------------ NO SYSTEM@primary SQL > ALTER DATABASE FORCE LOGGING; Database altered. SQL > SQL >SELECT force_logging FROM v$database; FORCE_LOGGING ------------------------------ YES SQL >
Configure the LOG_ARCHIVE_DEST_2 parameter:
SQL> show parameter LOG_ARCHIVE_DEST_2; NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ log_archive_dest_2 string SQL>ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=dg_stby NOAFFIRM ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=dg_stby'; System altered. SQL> show parameter LOG_ARCHIVE_DEST_2; NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ log_archive_dest_2 string SERVICE=dg_stby NOAFFIRM ASYNC VALID_FOR=(ONLINE_LOGFILES,PR IMARY_ROLE) DB_UNIQUE_NAME=dg_stby SQL>
Enable the LOG_ARCHIVE_DEST_STATE_2: (–> basically this “turns on” the transport of the redolog files to the standby).
SQL> show parameter LOG_ARCHIVE_DEST_STATE_2 NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ log_archive_dest_state_2 string enable SQL>ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=ENABLE; System altered. SQL> show parameter LOG_ARCHIVE_DEST_STATE_2 NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ log_archive_dest_state_2 string ENABLE SQL>
Alter the parameter FAL_SERVER:
SQL> show parameters FAL_SERVER NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ fal_server string SQL> ALTER SYSTEM SET FAL_SERVER=dg_stby; System altered. SQL> show parameters FAL_SERVER NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ fal_server string DG_STBY SQL>
Alter parameter STANDBY_FILE_MANAGEMENT:
SQL> show parameters STANDBY_FILE_MANAGEMENT NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ standby_file_management string MANUAL SQL>
This needs to be auto:
SQL> alter system set STANDBY_FILE_MANAGEMENT=auto; System altered. SQL> show parameters STANDBY_FILE_MANAGEMENT NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ standby_file_management string AUTO SQL>
[Standby] Step 8: Create standby logfiles.
Here we are going to create the standby redo log files on the standby. We use the following recommendation from Oracle:
(maximum # of logfiles +1) * maximum # of threads
These values will be the same on the standby as well as on the primary, since we duplicated this database.
These standby redo log files will not be created on the primary in this post.
Determine maximum # of logfiles/threads:
sql> select * from v$standby_log; GROUP# THREAD# SEQUENCE# BYTES BLOCKSIZE MEMBERS ARC STATUS ---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME ------------- --------- ------------ --------- 1 1 6713 209715200 512 2 NO CURRENT 1118893072 11-MAR-14 2.8147E+14 2 1 6712 209715200 512 2 YES INACTIVE 1118640974 11-MAR-14 1118893072 11-MAR-14 3 2 6015 209715200 512 2 NO CURRENT 1118892968 11-MAR-14 2.8147E+14 4 2 6014 209715200 512 2 YES INACTIVE 1118782650 11-MAR-14 1118892968 11-MAR-14 5 1 6711 209715200 512 2 YES INACTIVE 1118556150 11-MAR-14 1118640974 11-MAR-14 6 2 6013 209715200 512 2 YES INACTIVE 1118640499 11-MAR-14 1118782650 11-MAR-14
With a bit of squinting we can determine: we have 2 threads with each 3 groups
Thread 1: group 1,2,5
Thread 2: group 3,4,6
size of redo log should all be the same: 209715200
So we need on the standby:
(6 +1) * 2= 14 standby redo log files.
We have one more standby redo log file for each thread.
SQL> select * from v$standby_log; no rows selected SQL> alter database add standby logfile thread 1 group 7 '+DATAPRDG' size 209715200; Database altered. SQL>
And the rest for clarity:
alter database add standby logfile thread 1 group 8 '+DATAPRDG' size 209715200; alter database add standby logfile thread 1 group 9 '+DATAPRDG' size 209715200; alter database add standby logfile thread 1 group 10 '+DATAPRDG' size 209715200; alter database add standby logfile thread 2 group 11 '+DATAPRDG' size 209715200; alter database add standby logfile thread 2 group 12 '+DATAPRDG' size 209715200; alter database add standby logfile thread 2 group 13 '+DATAPRDG' size 209715200; alter database add standby logfile thread 2 group 14 '+DATAPRDG' size 209715200;
And now….the moment supreme…Place the standby database in Data Guard mode..
To place the standby database in Automatic Managed Recovery Mode (aka: MRP) issue the following command:
SQL> alter database recover managed standby database disconnect from session; Database altered. SQL> exit;
BUT….
We will run into another issue. When trying to bounce the database, or opening it in read-only mode one can run into the error where datafiles need to be recovered.
This is because the names in the ASM on the standby are different than the names in the primary ASM. The cause of this is OMF. It renamed the files to a different name in the standby, since this is how OMF works. to solve this, follow the steps that are perfectly documented in the Helios Blog.
Be aware however to recreate the standby log files, since these will be lost after this exercise. Once the database files are correct, the recovery issues should be gone.
BUT….
The basics are now in place, but in my situation we have a gap..since between the duplicate and the enable of the logfiles transfer, a backup took place..(Mentioned in the special note section before). This is made clear by this query:
[standby] SQL> select process,status,sequence#,thread# from v$managed_standby; PROCESS STATUS SEQUENCE# THREAD# --------- ------------ ---------- ---------- ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CLOSING 6714 1 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CLOSING 6715 1 ARCH CONNECTED 0 0 ARCH CLOSING 6716 1 PROCESS STATUS SEQUENCE# THREAD# --------- ------------ ---------- ---------- ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 PROCESS STATUS SEQUENCE# THREAD# --------- ------------ ---------- ---------- ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 ARCH CONNECTED 0 0 RFS IDLE 0 0 RFS IDLE 6717 1 RFS IDLE 0 0 PROCESS STATUS SEQUENCE# THREAD# --------- ------------ ---------- ---------- MRP0 WAIT_FOR_GAP 6701 1
To fix this we do a recover of the archive-logs from the primary database after we determine the range:
Determine the “gap-range” by using this query on the standby database:
SELECT high.thread#, "LowGap#", "HighGap#"
FROM
(SELECT thread#, MIN(sequence#)-1 "HighGap#"
FROM
(SELECT a.thread#, a.sequence#
FROM
(SELECT * FROM v$archived_log) a,
(SELECT thread#, MAX(next_change#)gap1
FROM v$log_history
GROUP BY thread#
) b
WHERE a.thread# = b.thread#
AND a.next_change# > gap1
)
GROUP BY thread#
) high,
(SELECT thread#, MIN(sequence#) "LowGap#"
FROM
(SELECT thread#, sequence#
FROM v$log_history, v$datafile
WHERE checkpoint_change# <= next_change#
AND checkpoint_change# >= first_change#
)
GROUP BY thread#
) low
WHERE low.thread# = high.thread#
THREAD# LowGap# HighGap#
---------- ---------- ----------
1 6700 6704
Close the gap:
RMAN> restore archivelog from sequence 6700 until sequence 6704; Starting restore at 11-MAR-14 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=804 instance=primrac01 device type=DISK channel ORA_DISK_1: starting archived log restore to default destination channel ORA_DISK_1: restoring archived log archived log thread=1 sequence=6700 channel ORA_DISK_1: restoring archived log archived log thread=1 sequence=6701 channel ORA_DISK_1: restoring archived log archived log thread=1 sequence=6702 channel ORA_DISK_1: restoring archived log archived log thread=1 sequence=6703 channel ORA_DISK_1: restoring archived log archived log thread=1 sequence=6704 channel ORA_DISK_1: reading from backup piece +FRAPRDG/primrac/backupset/2014_03_11/annnf0_incr_backup_0.608.841889217 channel ORA_DISK_1: piece handle=+FRAPRDG/primrac/backupset/2014_03_11/annnf0_incr_backup_0.608.841889217 tag=INCR_BACKUP channel ORA_DISK_1: restored backup piece 1 channel ORA_DISK_1: restore complete, elapsed time: 00:00:35 Finished restore at 11-MAR-14
Issuing the same “gap-query” as before gives us:
THREAD# LowGap# HighGap# ---------- ---------- ---------- 1 6701 6701
Next issue we had is this:
The second node (primrac2) jumped ship (ORA-12514: TNS:listener does not currently know of service requested in connect descriptor) and only the log files from the node primrac1 where being applied. Luckily this is a common issue, and the procedure to recover from this is fairly straightforward. This process has been documented VERY well by The Arup Nanda Blog
Remember: All this can be prevented by having the archivelogs available until the Data Guard catches up.
[Standby] Step 9: Validating Data Guard process.
Determine the lag in seconds:
[primary or standby]
select abs(nvl(max(ceil( (extract(day from replication_lag)*24*60*60)+ (extract(hour from replication_lag)*60*60)+ (extract(minute from replication_lag)*60)+ (extract(second from replication_lag)) )),0)) lag_in_seconds from (select sysdate-scn_to_timestamp(current_scn) replication_lag from v$database); LAG_IN_SECONDS -------------- 2
Check if all logs are applied:
Verify from v$archived_log if there is a gap in the sequence number. All the logs up to that gap should have APPLIED=YES and all the sequence# after the missing log sequence# are APPLIED=NO. This means that due to the missing log, MRP is not applying the logs on standby but the logs are still being transmitted to the standby and are available.
[primary or standby]
SQL> SELECT SEQUENCE#, APPLIED FROM V$ARCHIVED_LOG;
So for example, if the missing log sequence# is 400, then the above query should show that up to sequence#399, all have APPLIED=YES and starting from 401, all are APPLIED=NO. Use the recovery method described in the The Arup Nanda Blog.
[Standby] Step 9: Validating Data Guard process.
Update: In the day to day working with ADG, I missed some queries to validate the operations of the standby database. To get some idea of what is going on, one can use these queries on the standby:
SQL> select process,status,sequence#,thread# from v$managed_standby; PROCESS STATUS SEQUENCE# THREAD# --------- ------------ ---------- ---------- ARCH CLOSING 6936 1 ARCH CLOSING 6942 1 ARCH CONNECTED 0 0 ARCH CLOSING 6947 1 ARCH CLOSING 6217 2 <snip> ARCH CLOSING 6220 2 ARCH CLOSING 6225 2 RFS IDLE 0 0 RFS IDLE 6228 2 RFS IDLE 0 0 PROCESS STATUS SEQUENCE# THREAD# --------- ------------ ---------- ---------- RFS IDLE 6950 1 RFS IDLE 0 0 RFS IDLE 0 0 MRP0 APPLYING_LOG 6950 1 SQL> select status,instance_name,database_role from v$instance,v$database; STATUS INSTANCE_NAME DATABASE_ROLE ------------ ---------------- ---------------- OPEN dg_stby01 PHYSICAL STANDBY SQL>
Note: This database is open in read-only mode, hence the “open” status.
To determine the last received and applied log file from the primary, run this query on the STANDBY:
SELECT 'Last Applied : ' Logs, TO_CHAR(next_time,'DD-MON-YY:HH24:MI:SS') TIME,thread#,sequence# FROM v$archived_log WHERE sequence# = (SELECT MAX(sequence#) FROM v$archived_log WHERE applied='YES' ) UNION SELECT 'Last Received : ' Logs, TO_CHAR(next_time,'DD-MON-YY:HH24:MI:SS') TIME,thread#,sequence# FROM v$archived_log WHERE sequence# = (SELECT MAX(sequence#) FROM v$archived_log ); LOGS TIME THREAD# SEQUENCE# ---------------- ------------------ ---------- ---------- Last Applied : 20-NOV-14:13:14:54 2 23321 Last Received : 20-NOV-14:13:37:14 2 23322
To determine the “queue” on the PRIMARY database:
SELECT (SELECT name FROM V$DATABASE ) name, (SELECT MAX (sequence#) FROM v$archived_log WHERE dest_id = 1 ) Current_primary_seq, (SELECT MAX (sequence#) FROM v$archived_log WHERE TRUNC(next_time) > SYSDATE - 1 AND dest_id = 2 ) max_stby, (SELECT NVL ( (SELECT MAX (sequence#) - MIN (sequence#) FROM v$archived_log WHERE TRUNC(next_time) > SYSDATE - 1 AND dest_id = 2 AND applied = 'NO' ), 0) FROM DUAL ) "To be applied", ( (SELECT MAX (sequence#) FROM v$archived_log WHERE dest_id = 1 ) - (SELECT MAX (sequence#) FROM v$archived_log WHERE dest_id = 2 )) "To be Shipped" FROM DUAL; NAME CURRENT_PRIMARY_SEQ MAX_STBY To be applied To be Shipped --------- ------------------- ---------- ------------- ------------- PRIMRAC0 23322 23322 0 0
And that’s it! It occurs to me I could just create a blog where everything goes smooth sailing from end-to-end, but in my opinion this is not how the real world works. And probably why this blog is being read. There are many blogs out there where it works for the author, but when one is trying to reproduce the steps, it bails out..and no solution is available on the blog, since they didn’t run into the issue. Or they are just smarter than me
.
[Standby] Step 10: Optionally place standby DB in read only mode.
To enable Active Data Guard, you need to open the physical standby in read-only mode and start redo apply.
The Data Guard should be in one of two states prior to enabling Active Data Guard:
- The standby is mounted and redo is running
- The standby database has been shutdown cleanly and redo was stopped
First stop the redo and open the database as read-only:
sql> recover managed standby database cancel; sql> alter database open read only;
Restart the redo.
sql> recover managed standby database disconnect using current logfile;
Conclusion:
Implementing ADG is no mean feat, and there is a lot more to say on this subject, however I hope that some guidance and relief can be found with this information provided.
No comments:
Post a Comment