Oracle Data Guard setup - Step by step setup

mail

How to configure Data Guard primary/standby replication ?

Preliminary notes :

  • primaryDatacenter : datacenter initially hosting the primary server primaryServer
  • standbyDatacenter : datacenter initially hosting the standby server standbyServer

Pointing out the word initially in definitions above : if primaryServer dies, thanks to Data Guard standbyServer will become a primary server, so things may become harder to explain (and I should probably give more descriptive names to servers and datacenters ). Anyway, this whole article is about the initial situation, where the primary server is in the primary datacenter.

Referring to servers primaryServer, standbyServer or currentServer (see below) is made via any means allowing to contact them over the network : FQDN, DNS record, /etc/hosts entry, ...
For example : standbyServer may have a value such as mystandbyoracleserver.domain.tld, whereas standbyServerShortname would just be mystandbyoracleserver.

Variables presented all uppercase in examples below should be entered uppercase too. Variables shown in camelCase should be all lowercase. This makes no big difference anyway, except in paths.

Some actions/usages/rules/notations described below may be specific to my company and differ from the general usage and documentation.

Configuration of primaryServer :

OraToolBox :

As root : install oratoolbox 2.7. The install script used here delivers + generates custom conf files for Oracle + monitoring + backups (with AutoConf.ksh) :

cd /home/afh/livraison/05.00.00/oracle; ./install_oratoolbox_2.7.sh

Configuration files :

As root : sed -i 's/:N/:Y/g' /etc/oratab

Network parameters :
  1. As oracle : create or edit /home/oracle/network/admin/tnsnames.ora :
    ORACLE_SID =
    	(DESCRIPTION =
    		(ADDRESS = (PROTOCOL = TCP)(HOST = currentServer)(PORT = 1521))
    		(CONNECT_DATA =
    			(SERVER = DEDICATED)
    			(SID = ORACLE_SID)
    		)
    	)
    
    ORACLE_SID_PRIMARYDATACENTER =
    	(DESCRIPTION =
    		(ADDRESS = (PROTOCOL = TCP)(HOST = primaryServer)(PORT = 1521))
    		(CONNECT_DATA =
    			(SERVER = DEDICATED)
    			(SID = ORACLE_SID)
    		)
    	)
    
    ORACLE_SID_STANDBYDATACENTER =
    	(DESCRIPTION =
    		(ADDRESS = (PROTOCOL = TCP)(HOST = standbyServer)(PORT = 1521))
    		(CONNECT_DATA =
    			(SERVER = DEDICATED)
    			(SID = ORACLE_SID)
    		)
    	)

    You'll have to create one version of this file for primaryServer, and another one for standbyServer, the only difference being currentServer.

  2. As oracle : create or edit /home/oracle/network/admin/listener.ora :
    LISTENER_ORACLE_SID =
    	(DESCRIPTION_LIST =
    		(DESCRIPTION =
    			(ADDRESS = (PROTOCOL = TCP)(HOST = currentServer)(PORT = 1521))
    		)
    	)
    
    SID_LIST_LISTENER_ORACLE_SID =
    	(SID_LIST =
    		(SID_DESC =
    			(SID_NAME = ORACLE_SID)
    			(GLOBAL_DBNAME = ORACLE_SID_CURRENTDATACENTER_DGMGRL)
    			(ORACLE_HOME = ORACLE_HOME)
    		)
    	)
    
    DIAG_ADR_ENABLED_LISTENER_ORACLE_SID = ON
    LOGGING_LISTENER_ORACLE_SID = ON
    TRACE_LEVEL_LISTENER_ORACLE_SID = OFF
    TRACE_TIMESTAMP_LISTENER_ORACLE_SID = TRUE
    SUBSCRIBE_FOR_NODE_DOWN_EVENT_listener = OFF
    ADMIN_RESTRICTIONS_LISTENER_ORACLE_SID = ON
    LOCAL_OS_AUTHENTIFICATION_LISTENER_ORACLE_SID = ON
    SECURE_REGISTER_LISTENER_ORACLE_SID = (TCP)

    You'll have to create one version of this file for primaryServer, and another one for standbyServer, the differences being currentServer and CURRENTDATACENTER.

Parameters for primaryServer only :

Enabling FORCE_LOGGING of ARCHIVELOG mode :

As oracle : export ORACLE_SID=ORACLE_SID; sqlplus / as sysdba

Is archivelog mode enabled and logging forced ? (Details : 1, 2, 3)

  1. SET linesize 100;
    COL "FORCE_LOGGING" FORMAT A15;
    SELECT dbid, name, log_mode, force_logging FROM v$database;
          DBID NAME              LOG_MODE              FORCE_LOGGING
    ---------- ----------------- --------------------- ---------------
    1152768762 ORACLE_SID        ARCHIVELOG            NO
  2. It says NO, so let's toggle it : alter database force logging;
    Database altered.
  3. Check : SELECT dbid, name, log_mode, force_logging FROM v$database;
          DBID NAME              LOG_MODE              FORCE_LOGGING
    ---------- ----------------- --------------------- ---------------
    1152768762 ORACLE_SID        ARCHIVELOG            YES
Main parameters :
alter system set db_unique_name='ORACLE_SID_PRIMARYDATACENTER' scope=spfile;
alter system set remote_login_passwordfile=EXCLUSIVE scope=spfile;
alter system set standby_file_management=auto scope=both;
alter system set log_archive_dest_state_2=defer scope=both;
System altered.
local_listener parameter :
  1. alter system set local_listener='ORACLE_SID_PRIMARYDATACENTER' scope=both;
    alter system set log_archive_dest_1='LOCATION=/appli/oracle/ORACLE_SID/oraarch/ VALID_FOR=(ALL_LOGFILES,ALL_ROLES)' scope=spfile;
    System altered.
  2. shutdown immediate
    Database closed.
    Database dismounted.
    ORACLE instance shut down.
  3. startup
    ORACLE instance started.
    
    Total System Global Area 8551575552 bytes
    Fixed Size                  2270360 bytes
    Variable Size            5200939880 bytes
    Database Buffers         3338665984 bytes
    Redo Buffers                9699328 bytes
    Database mounted.
    Database opened.
log_archive destination parameter :
alter system set log_archive_dest_2='SERVICE=ORACLE_SID_STANDBYDATACENTER LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) delay=0 DB_UNIQUE_NAME=ORACLE_SID_STANDBYDATACENTER';
alter system set log_archive_config='dg_config=(ORACLE_SID_PRIMARYDATACENTER,ORACLE_SID_STANDBYDATACENTER)' scope=both;
System altered.
FAL parameters :

FAL settings define where the current server will get missing archivelogs in case of a gap (i.e. when acting as a standby server). The fal_server is the one serving archivelogs, and fal_client is the one receiving them (sources 1, 2).
Values are Oracle Net service names. Such names are usually defined locally in tnsnames.ora, or in a directory server (details).
These settings stay inert on primaryServer until Data Guard switches and primaryServer becomes a standby server.

  1. alter system set fal_server='ORACLE_SID_STANDBYDATACENTER' scope=both;
    alter system set fal_client='ORACLE_SID_PRIMARYDATACENTER' scope=both;
    System altered.
  2. shutdown immediate
    Database closed.
    Database dismounted.
    ORACLE instance shut down.
  3. exit
Move configuration files :
  1. mv $ORACLE_HOME/dbs/{orapw$ORACLE_SID,init$ORACLE_SID.ora,spfile$ORACLE_SID.ora} /home/oracle/admin/$ORACLE_SID/dbs
    ln -sf /home/oracle/admin/$ORACLE_SID/dbs/{orapw$ORACLE_SID,init$ORACLE_SID.ora,spfile$ORACLE_SID.ora} $ORACLE_HOME/dbs/
  2. sqlplus / as sysdba
    SQL*Plus: Release 11.2.0.4.0 Production on Wed Jan 6 10:12:52 2016
    
    Copyright (c) 1982, 2013, Oracle.  All rights reserved.
    
    Connected to an idle instance.
  3. startup
    ORACLE instance started.
    
    Total System Global Area 8551575552 bytes
    Fixed Size                  2270360 bytes
    Variable Size            5200939880 bytes
    Database Buffers         3338665984 bytes
    Redo Buffers                9699328 bytes
    Database mounted.
    Database opened.
Checking parameters :
  1. column type format a10;
    show parameter log_archive_dest_1
    NAME                  TYPE       VALUE
    --------------------- ---------- ------------------------------
    log_archive_dest_1    string     LOCATION=/appli/oracle/ORACLE_SID/oraarch/ VALID_FOR=(ALL_LOGFILES,ALL_ROLES)
  2. show parameter log_archive_dest_2
    NAME                  TYPE       VALUE
    --------------------- ---------- ------------------------------
    log_archive_dest_2    string     SERVICE=ORACLE_SID_STANDBYDATACENTER LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) delay=0 DB_UNIQUE_NAME=ORACLE_SID_STANDBYDATACENTER
  3. show parameter fal
    NAME                  TYPE       VALUE
    --------------------- ---------- ------------------------------
    fal_client            string     ORACLE_SID_PRIMARYDATACENTER
    fal_server            string     ORACLE_SID_STANDBYDATACENTER
  4. show parameter log_archive_config
    NAME                  TYPE       VALUE
    --------------------- ---------- ------------------------------
    log_archive_config    string     dg_config=(ORACLE_SID_PRIMARYDATACENTER,ORACLE_SID_STANDBYDATACENTER)

Configuration of standby redologs :

standby redologs should be the same size than online redologs. To get the size of the online redologs :

Method 1:
As oracle : ls -lh /appli/oracle/$ORACLE_SID/oraredolog01
-rw-r----- 1 oracle oinstall 151M Jan  6 11:15 ORACLE_SID_redo1_1.rdo
-rw-r----- 1 oracle oinstall 151M Jan  6 10:13 ORACLE_SID_redo2_1.rdo
-rw-r----- 1 oracle oinstall 151M Jan  6 10:13 ORACLE_SID_redo3_1.rdo
Method 2:
grep -A6 'RedoLogGroupAttributes' databaseCreationTemplate.dbt
<RedoLogGroupAttributes id="1">
	<reuse>false</reuse>
	<fileSize unit="KB">153600</fileSize>
	<Thread>1</Thread>
	<member ordinal="0" memberName="ORACLE_SID_redo1_1.rdo" filepath="/appli/oracle/ORACLE_SID/oraredolog01/"/>
	<member ordinal="1" memberName="ORACLE_SID_redo1_2.rdo" filepath="/appli/oracle/ORACLE_SID/oraredolog02/"/>
</RedoLogGroupAttributes>

Then, as oracle : sqlplus / as sysdba
ALTER DATABASE ADD STANDBY LOGFILE GROUP 11 ('/appli/oracle/ORACLE_SID/oraredolog01/ORACLE_SID_stdbyredo11_1.rdo','/appli/oracle/ORACLE_SID/oraredolog02/ORACLE_SID_stdbyredo11_2.rdo') SIZE 150M;
ALTER DATABASE ADD STANDBY LOGFILE GROUP 12 ('/appli/oracle/ORACLE_SID/oraredolog01/ORACLE_SID_stdbyredo12_1.rdo','/appli/oracle/ORACLE_SID/oraredolog02/ORACLE_SID_stdbyredo12_2.rdo') SIZE 150M;
ALTER DATABASE ADD STANDBY LOGFILE GROUP 13 ('/appli/oracle/ORACLE_SID/oraredolog01/ORACLE_SID_stdbyredo13_1.rdo','/appli/oracle/ORACLE_SID/oraredolog02/ORACLE_SID_stdbyredo13_2.rdo') SIZE 150M;
Database altered.

Backup of the primary database :

All these steps must be executed as oracle :

  1. Declare the backup parameters :
    [ -z "$ORACLE_SID" ] && echo 'Please define ORACLE_SID then retry.' || echo "$ORACLE_SID rman /sauvebase/$ORACLE_SID 2 3 Y" >> /home/oracle/admgen/SavGen/SavGen.cfg
  2. Make sure the destination directory is writable by oracle :
    testFile="/sauvebase/$ORACLE_SID/test"; touch "$testFile" && { rm "$testFile"; echo 'OK'; } || echo 'KO'
  3. Then launch the RMAN backup :
    /home/oracle/admgen/SavGen/SavGen.ksh "$ORACLE_SID"
    FCM interne.
    2016/01/06-16:17:16 |pcyyycvv|SavOra5.6b   |SavDebut     |0|_INFO_   ||0|||Base ORACLE_SID: sauvegarde RMAN (level 0-), dir=/sauvebase/ORACLE_SID, parall=2, retention=3, StartStop/Compress=Y, dir2=0, IncrementalOnly=
    2016/01/06-16:17:16 |pcyyycvv|SavOra5.6b   |DelFlagErr   |0|_WARNING_||102|||Base ORACLE_SID: Suppression de fanion(s) en erreur: FANION_DIR=/nsr/fanion ID_APPLI=
    2016/01/06-16:18:11 |pcyyycvv|SavOra5.6b   |SetFlagErr   |0|_WARNING_||103|||Base ORACLE_SID: Pose de fanion en erreur: FANION_DIR=/nsr/fanion ID_APPLI=
    2016/01/06-16:18:11 |pcyyycvv|SavOra5.6b   |FinOk        |0|_INFO_   ||0|||Base ORACLE_SID: sauvegarde Ok.
  4. Retrieve the password file :
    cp "$ORACLE_HOME/dbs/orapw$ORACLE_SID" "/sauvebase/$ORACLE_SID"
  5. Archive everything :
    cd "/sauvebase/$ORACLE_SID"; tar cfz "/sauvebase/$ORACLE_SID/backupPrimary_$(date +%F_%H-%M-%S).tgz" * --exclude='lost+found'
  6. Send the archive to /sauvebase/$ORACLE_SID on standbyServer (May I suggest a Python-based HTTP solution ?). Make sure the transferred file still belongs to oracle:oinstall.

Configuration of standbyServer :

Actions common to both servers :

  1. Install OraToolBox
  2. Edit /etc/oratab
  3. Create or edit tnsnames.ora
  4. Create or edit listener.ora
  5. Configure standby redologs

Retrieve data sent from primaryServer :

As oracle :

backupFile='backupPrimary_2016-01-12_11-37-45.tgz'; cd "/sauvebase/$ORACLE_SID" && tar zxf "$backupFile" && rm "$backupFile"

Check status of the listener :

As oracle, from the system shell :
lsnrctl status "listener_$ORACLE_SID"
LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 12-JAN-2016 16:14:56

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=standbyServer)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias                     listener_ORACLE_SID
Version                   TNSLSNR for Linux: Version 11.2.0.4.0 - Production
Start Date                12-JAN-2016 16:08:05
Uptime                    0 days 0 hr. 6 min. 50 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      ON
Listener Parameter File   /home/oracle/network/admin/listener.ora
Listener Log File         /logiciels/oracle/diag/tnslsnr/standbyServerShortname/listener_oracle_sid/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=standbyServer)(PORT=1521)))
Services Summary...
Service "ORACLE_SIDXDB" has 1 instance(s).
  Instance "ORACLE_SID", status READY, has 1 handler(s) for this service...
Service "ORACLE_SID_STANDBYDATACENTER_DGMGRL" has 1 instance(s).
  Instance "ORACLE_SID", status UNKNOWN, has 1 handler(s) for this service...
Service "oracle_sid_standbyDatacenter" has 1 instance(s).
  Instance "ORACLE_SID", status READY, has 1 handler(s) for this service...
The command completed successfully
  • If the listener is stopped, lsnrctl status "listener_$ORACLE_SID" will return :
    LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 12-JAN-2016 16:04:16
    
    Copyright (c) 1991, 2013, Oracle.  All rights reserved.
    
    Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=standbyServer)(PORT=1521)))
    TNS-12541: TNS:no listener
     TNS-12560: TNS:protocol adapter error
      TNS-00511: No listener
       Linux Error: 111: Connection refused
    Start it with : lsnrctl start "listener_$ORACLE_SID"
  • Wait up to a minute after starting a listener to get its status. If you query this earlier, you'll get a partial (weird) display.
  • If one of the services is displayed as UNKNOWN, this means the instance is registered statically (with a SID_LIST) in listener.ora file, rather than dynamically with service registration. There's nothing to worry about here. (source)

Deploy configuration files :

All actions below are executed as oracle.
orapw :
cp "/sauvebase/$ORACLE_SID/orapw$ORACLE_SID" "/home/oracle/admin/$ORACLE_SID/dbs" && ln -sf "/home/oracle/admin/$ORACLE_SID/dbs/orapw$ORACLE_SID" "$ORACLE_HOME/dbs/orapw$ORACLE_SID"
Since we've just copied the password file of primaryServer to standbyServer, the sys account now has the same password on both servers. Its value is the value we configured on primaryServer.
init file :
  1. Deploy it : cp "/sauvebase/$ORACLE_SID/init$ORACLE_SID"* "/home/oracle/admin/$ORACLE_SID/dbs/init$ORACLE_SID.ora" && ln -sf "/home/oracle/admin/$ORACLE_SID/dbs/init$ORACLE_SID.ora" "$ORACLE_HOME/dbs/init$ORACLE_SID.ora"
  2. Adapt it to standbyServer : vim "/home/oracle/admin/$ORACLE_SID/dbs/init$ORACLE_SID.ora"
    *.audit_file_dest='/logiciels/oracle/diag/rdbms/oracle_sid_standbyDatacenter/ORACLE_SID/adump'
    *.db_unique_name='ORACLE_SID_STANDBYDATACENTER'
    *.fal_client='ORACLE_SID_STANDBYDATACENTER'
    *.fal_server='ORACLE_SID_PRIMARYDATACENTER'
    *.local_listener='ORACLE_SID_STANDBYDATACENTER'
    *.log_archive_dest_2='SERVICE=ORACLE_SID_PRIMARYDATACENTER LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) delay=0 DB_UNIQUE_NAME=ORACLE_SID_PRIMARYDATACENTER'		<== this is a continued line, don't edit the *.db_unique_name= line !
Create the spfile :

Step-by-step version :

  1. sqlplus / as sysdba
  2. shutdown immediate
    Database closed.
    Database dismounted.
    ORACLE instance shut down.
  3. startup nomount
    ORACLE instance started.
    
    Total System Global Area 8551575552 bytes
    Fixed Size                  2270360 bytes
    Variable Size            5200939880 bytes
    Database Buffers         3338665984 bytes
    Redo Buffers                9699328 bytes
    This command may fail (because of an already existing spfile with parameters that don't fit) and complain :
    ORA-00119: invalid specification for system parameter LOCAL_LISTENER
    ORA-00132: syntax error or unresolved network name 'LISTENER_ORACLE_SID'
    In such case :
    1. Let's delete the uninvited spfile (the ! in SQL*Plus allows launching shell commands) :
      !mv /home/oracle/admin/$ORACLE_SID/dbs/spfile$ORACLE_SID.ora{,_DELETEME}
    2. Retry the startup command above :
      startup nomount
    3. Otherwise : read this.
  4. create spfile='/home/oracle/admin/@/dbs/spfile@.ora' from pfile='/home/oracle/admin/@/dbs/init@.ora';
    File created.
  5. shutdown immediate
    ORA-01507: database not mounted
    
    ORACLE instance shut down.
  6. exit
  7. ln -sf "/home/oracle/admin/$ORACLE_SID/dbs/spfile$ORACLE_SID.ora" "$ORACLE_HOME/dbs/spfile$ORACLE_SID.ora"

Big-bang version :

sqlplus -s / as sysdba <<EOC
shutdown immediate
startup nomount
create spfile='/home/oracle/admin/@/dbs/spfile@.ora' from pfile='/home/oracle/admin/@/dbs/init@.ora';
shutdown immediate
!ln -sf "/home/oracle/admin/$ORACLE_SID/dbs/spfile$ORACLE_SID.ora" "$ORACLE_HOME/dbs/spfile$ORACLE_SID.ora"
EOC
ORA-01109: database not open


Database dismounted.
ORACLE instance shut down.
ORACLE instance started.

Total System Global Area 2137886720 bytes
Fixed Size                  2254952 bytes
Variable Size            1258293144 bytes
Database Buffers          872415232 bytes
Redo Buffers                4923392 bytes

File created.

ORA-01507: database not mounted


ORACLE instance shut down.

Restore the database :

Still as oracle :
  1. rman target /
    
    connected to target database (not started)
  2. startup nomount;
    Oracle instance started
    
    Total System Global Area    8551575552 bytes
    
    Fixed Size                     2270360 bytes
    Variable Size               5234494312 bytes
    Database Buffers            3305111552 bytes
    Redo Buffers                   9699328 bytes
    If this fails with ORA-00119 + ORA-00132 errors, read this.
  3. restore controlfile from '/sauvebase/@/control_stdby.ctl';
    Starting restore at 13-JAN-16
    using target database control file instead of recovery catalog
    allocated channel: ORA_DISK_1
    channel ORA_DISK_1: SID=1145 device type=DISK
    
    channel ORA_DISK_1: copied control file copy
    output file name=/appli/oracle/ORACLE_SID/oractl01/ORACLE_SID_ctrl01.ctl
    output file name=/appli/oracle/ORACLE_SID/oradata01/ORACLE_SID_ctrl02.ctl
    Finished restore at 13-JAN-16
  4. sql 'alter database mount standby database';
    sql statement: alter database mount standby database
    released channel: ORA_DISK_1
  5. restore database;
    Starting restore at 13-JAN-16
    allocated channel: ORA_DISK_1
    channel ORA_DISK_1: SID=1145 device type=DISK
    
    channel ORA_DISK_1: starting datafile backup set restore
    channel ORA_DISK_1: specifying datafile(s) to restore from backup set
    channel ORA_DISK_1: restoring datafile 00003 to /appli/oracle/ORACLE_SID/oradata02/ORACLE_SID_undo_01.dbf
    channel ORA_DISK_1: restoring datafile 00004 to /appli/oracle/ORACLE_SID/oradata04/ORACLE_SID_tools_01.dbf
    channel ORA_DISK_1: restoring datafile 00007 to /appli/oracle/ORACLE_SID/oradata02/ORACLE_SID_tab_itr1_s_02.dbf
    channel ORA_DISK_1: restoring datafile 00009 to /appli/oracle/ORACLE_SID/oradata01/ORACLE_SID_idx_itr1_s_02.dbf
    channel ORA_DISK_1: restoring datafile 00010 to /appli/oracle/ORACLE_SID/oradata02/ORACLE_SID_jms_itr1_s_01.dbf
    channel ORA_DISK_1: reading from backup piece /sauvebase/ORACLE_SID/db_0__20160106_161716_ORACLE_SID_s2_p1
    channel ORA_DISK_1: piece handle=/sauvebase/ORACLE_SID/db_0__20160106_161716_ORACLE_SID_s2_p1 tag=DB_0_
    channel ORA_DISK_1: restored backup piece 1
    channel ORA_DISK_1: restore complete, elapsed time: 00:00:35
    channel ORA_DISK_1: starting datafile backup set restore
    channel ORA_DISK_1: specifying datafile(s) to restore from backup set
    channel ORA_DISK_1: restoring datafile 00001 to /appli/oracle/ORACLE_SID/oradata01/ORACLE_SID_system_01.dbf
    channel ORA_DISK_1: restoring datafile 00002 to /appli/oracle/ORACLE_SID/oradata01/ORACLE_SID_sysaux_01.dbf
    channel ORA_DISK_1: restoring datafile 00005 to /appli/oracle/ORACLE_SID/oradata03/ORACLE_SID_users_01.dbf
    channel ORA_DISK_1: restoring datafile 00006 to /appli/oracle/ORACLE_SID/oradata02/ORACLE_SID_tab_itr1_s_01.dbf
    channel ORA_DISK_1: restoring datafile 00008 to /appli/oracle/ORACLE_SID/oradata01/ORACLE_SID_idx_itr1_s_01.dbf
    channel ORA_DISK_1: restoring datafile 00011 to /appli/oracle/ORACLE_SID/oradata02/ORACLE_SID_quartz_itr1_s_01.dbf
    channel ORA_DISK_1: reading from backup piece /sauvebase/ORACLE_SID/db_0__20160106_161716_ORACLE_SID_s1_p1
    channel ORA_DISK_1: piece handle=/sauvebase/ORACLE_SID/db_0__20160106_161716_ORACLE_SID_s1_p1 tag=DB_0_
    channel ORA_DISK_1: restored backup piece 1
    channel ORA_DISK_1: restore complete, elapsed time: 00:00:45
    Finished restore at 13-JAN-16
    This _may_ fail saying :
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of restore command at 07/22/2016 11:07:36
    ORA-19870: error while restoring backup piece /sauvebase/$ORACLE_SID/db_0__20160722_104854_${ORACLE_SID}_s11197_p1
    ORA-19573: cannot obtain exclusive enqueue for datafile 10
    "Fixed" it by stopping MRP + retry the restore command.
  6. sql 'alter database recover managed standby database using current logfile disconnect';
    sql statement: alter database recover managed standby database using current logfile disconnect
    This _may_ be grumpy and return instead :
    sql statement: alter database recover managed standby database using current logfile disconnect
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03009: failure of sql command on default channel at 07/21/2016 17:55:30
    RMAN-11003: failure during parse/execution of SQL statement: alter database recover managed standby database using current logfile disconnect
    ORA-01153: an incompatible media recovery is active
    This is because this statement asks MRP to start whereas it is already running (check it : ps -ef | grep -E "[m]rp.*$ORACLE_SID"). In such case (source) :
    1. stop MRP :
      sql 'alter database recover managed standby database cancel';
      sql statement: alter database recover managed standby database cancel
    2. Retry the command above that made RMAN morose.
  7. exit
    Recovery Manager complete.

Cross checks before enabling Data Guard :

Run tnsping from primaryServer and from standbyServer :

tnsping ORACLE_SID_PRIMARYDATACENTER
tnsping ORACLE_SID_STANDBYDATACENTER
All 4 tnspings should return :
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = fqdnOfServerBeingTnsping'd)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SID = ORACLE_SID)))
OK (0 msec)

Check the sys password :

To do so, we'll (try to) connect as sys from primaryServer to standbyServer and vice versa (details) :
  • From primaryServer :
    1. sqlplus /nolog
    2. connect sys@ORACLE_SID_STANDBYDATACENTER as sysdba
      Enter password: sys password
      Connected.
  • From standbyServer :
    1. sqlplus /nolog
    2. connect sys@ORACLE_SID_PRIMARYDATACENTER as sysdba
      Enter password: enter the same password
      Connected.

Failing this test means something went wrong when copying orapwORACLE_SID from primaryServer to standbyServer. You MUST fix this before continuing. To quickly check it, run on both servers as oracle :

md5sum $ORACLE_HOME/dbs/orapw*

Enabling Data Guard :

On primaryServer as oracle :

Step-by-step version :

  1. sqlplus / as sysdba
  2. Enable the transmission of archivelogs from primaryServer to standbyServer :
    alter system set log_archive_dest_state_2=enable scope=both;
    System altered.
  3. Get the sequence number of the current archivelog :
    select thread#, sequence#, archived, applied from v$archived_log where recid in (select max(recid) from v$archived_log);
       THREAD#  SEQUENCE# ARCHIVED  APPLIED
    ---------- ---------- --------- ---------------------------
             1         17 YES       NO
  4. Force Oracle to create a new archivelog :
    alter system switch logfile;
    System altered.
  5. Make sure the sequence number has been incremented :
    select thread#, sequence#, archived, applied from v$archived_log where recid in (select max(recid) from v$archived_log);
       THREAD#  SEQUENCE# ARCHIVED  APPLIED
    ---------- ---------- --------- ---------------------------
             1         20 YES       NO

Big-bang version :

sqlplus -s / as sysdba <<EOSQL
alter system set log_archive_dest_state_2=enable scope=both;
select thread#, sequence#, archived, applied from v\$archived_log where recid in (select max(recid) from v\$archived_log);
alter system switch logfile;
select thread#, sequence#, archived, applied from v\$archived_log where recid in (select max(recid) from v\$archived_log);
EOSQL
System altered.


   THREAD#  SEQUENCE# ARCHIVED     APPLIED
---------- ---------- ------------ ------------------------------------
         1         17 YES          YES


System altered.


   THREAD#  SEQUENCE# ARCHIVED     APPLIED
---------- ---------- ------------ ------------------------------------
         1         20 YES          YES

On standbyServer as oracle :

Step-by-step version :

  1. sqlplus / as sysdba
  2. Enable the transmission of archivelogs from standbyServer to primaryServer (this setting is inert so far and will become active after a switchover/failover) :
    alter system set log_archive_dest_state_2=enable scope=both;
    System altered.
  3. Get the sequence number of the current archivelog :
    select thread#, sequence#, archived, applied from v$archived_log where recid in (select max(recid) from v$archived_log);
       THREAD#  SEQUENCE# ARCHIVED  APPLIED
    ---------- ---------- --------- ---------------------------
             1         20 YES       YES

    This is OK if we get the same number on both servers (here 20).

  4. Make sure there is no gap in archivelogs :
    select * from v$archive_gap;
    no rows selected

Big-bang version :

sqlplus -s / as sysdba <<EOSQL
alter system set log_archive_dest_state_2=enable scope=both;
select thread#, sequence#, archived, applied from v\$archived_log where recid in (select max(recid) from v\$archived_log);
select * from v\$archive_gap;
EOSQL
System altered.


   THREAD#  SEQUENCE# ARCHIVED     APPLIED
---------- ---------- ------------ ------------------------------------
         1         20 YES          YES


no rows selected

Configuration of the Data Guard broker :

The Data Guard broker is used to create, manage and monitor a Data Guard configuration (adding a new standby database, configuring the standby control file, ...). It also simplifies switchover and failover operations, and many more things...
As usual, all actions below are run as oracle.

Create the broker configuration on both servers :

Run, as oracle, from the system shell :
sqlplus '/ as sysdba' << EOSQL
alter system set dg_broker_config_file1='$ORACLE_HOME/dbs/dr1${ORACLE_SID}_PRIMARYDATACENTER.dat' scope=both;
alter system set dg_broker_config_file2='$ORACLE_HOME/dbs/dr2${ORACLE_SID}_STANDBYDATACENTER.dat' scope=both;
alter system set dg_broker_start=true scope=both;
EOSQL
This should output :
SQL>
System altered.

SQL>
System altered.

SQL>
System altered.

Don't forget to do this on both servers !

Then on primaryServer as oracle :

Step-by-step version :

  1. dgmgrl
  2. connect /
    Connected.
  3. create configuration 'ORACLE_SIDDR' as primary database is 'ORACLE_SID_PRIMARYDATACENTER' connect identifier is 'ORACLE_SID_PRIMARYDATACENTER';
    Configuration "ORACLE_SIDDR" created with primary database "ORACLE_SID_PRIMARYDATACENTER"
  4. add database 'ORACLE_SID_STANDBYDATACENTER' as connect identifier is 'ORACLE_SID_STANDBYDATACENTER';
    Database "ORACLE_SID_STANDBYDATACENTER" added
  5. enable configuration;
    Enabled.
  6. edit database 'ORACLE_SID_PRIMARYDATACENTER' set property StaticConnectIdentifier='ORACLE_SID_PRIMARYDATACENTER';
    Property "staticconnectidentifier" updated
  7. edit database 'ORACLE_SID_STANDBYDATACENTER' set property StaticConnectIdentifier='ORACLE_SID_STANDBYDATACENTER';
    Property "staticconnectidentifier" updated
  8. show configuration verbose;
    The expected output is :
    Configuration - ORACLE_SIDDR
    
      Protection Mode: MaxPerformance
      Databases:
        ORACLE_SID_PRIMARYDATACENTER - Primary database
        ORACLE_SID_STANDBYDATACENTER - Physical standby database
    
      Properties:
        FastStartFailoverThreshold      = '30'
        OperationTimeout                = '30'
        FastStartFailoverLagLimit       = '30'
        CommunicationTimeout            = '180'
        ObserverReconnect               = '0'
        FastStartFailoverAutoReinstate  = 'TRUE'
        FastStartFailoverPmyShutdown    = 'TRUE'
        BystandersFollowRoleChange      = 'ALL'
        ObserverOverride                = 'FALSE'
        ExternalDestination1            = ''
        ExternalDestination2            = ''
        PrimaryLostWriteAction          = 'CONTINUE'
    
    Fast-Start Failover: DISABLED
    
    Configuration Status:
    SUCCESS

Big-bang version :

dgmgrl / << EOC
create configuration '${ORACLE_SID}DR' as primary database is '${ORACLE_SID}_PRIMARYDATACENTER' connect identifier is '${ORACLE_SID}_PRIMARYDATACENTER';
add database '${ORACLE_SID}_STANDBYDATACENTER' as connect identifier is '${ORACLE_SID}_STANDBYDATACENTER';
enable configuration;
edit database '${ORACLE_SID}_PRIMARYDATACENTER' set property StaticConnectIdentifier='${ORACLE_SID}_PRIMARYDATACENTER';
edit database '${ORACLE_SID}_STANDBYDATACENTER' set property StaticConnectIdentifier='${ORACLE_SID}_STANDBYDATACENTER';
show configuration verbose;
EOC
Welcome to DGMGRL, type "help" for information.
Connected.
DGMGRL> Configuration "ORACLE_SIDDR" created with primary database "ORACLE_SID_PRIMARYDATACENTER"
DGMGRL> Database "ORACLE_SID_STANDBYDATACENTER" added
DGMGRL> Enabled.
DGMGRL> Property "staticconnectidentifier" updated
DGMGRL> Property "staticconnectidentifier" updated
DGMGRL>
Configuration - ORACLE_SIDDR

  Protection Mode: MaxPerformance
  Databases:
    ORACLE_SID_PRIMARYDATACENTER - Primary database
    ORACLE_SID_STANDBYDATACENTER - Physical standby database

  Properties:
    FastStartFailoverThreshold      = '30'
    OperationTimeout                = '30'
    FastStartFailoverLagLimit       = '30'
    CommunicationTimeout            = '180'
    ObserverReconnect               = '0'
    FastStartFailoverAutoReinstate  = 'TRUE'
    FastStartFailoverPmyShutdown    = 'TRUE'
    BystandersFollowRoleChange      = 'ALL'
    ObserverOverride                = 'FALSE'
    ExternalDestination1            = ''
    ExternalDestination2            = ''
    PrimaryLostWriteAction          = 'CONTINUE'

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

The show configuration verbose could also return :

Configuration - ORACLE_SIDDR

  Protection Mode: MaxPerformance
  Databases:
    ORACLE_SID_PRIMARYDATACENTER - Primary database
    ORACLE_SID_STANDBYDATACENTER - Physical standby database
      Warning: ORA-16857: standby disconnected from redo source for longer than specified threshold

  Properties:
    FastStartFailoverThreshold      = '30'
    OperationTimeout                = '30'
    FastStartFailoverLagLimit       = '30'
    CommunicationTimeout            = '180'
    ObserverReconnect               = '0'
    FastStartFailoverAutoReinstate  = 'TRUE'
    FastStartFailoverPmyShutdown    = 'TRUE'
    BystandersFollowRoleChange      = 'ALL'
    ObserverOverride                = 'FALSE'
    ExternalDestination1            = ''
    ExternalDestination2            = ''
    PrimaryLostWriteAction          = 'CONTINUE'

Fast-Start Failover: DISABLED

Configuration Status:
WARNING
The ORA-16857 error shown here is generally due to network latency and may not occur trying the same command again (details).

Move configuration files :

on primaryServer :
As oracle, still in dgmgrl :
  1. disable configuration;
    Disabled.
  2. exit
on both servers :
As oracle :
  1. sqlplus '/ as sysdba' << EOSQL
    alter system set dg_broker_start=false;
    set linesize 2000
    show parameter broker
    EOSQL
    System altered.
    
    
    NAME                      TYPE       VALUE
    ------------------------- ---------- ------------------------------------------------------------------------------
    dg_broker_config_file1    string     /logiciels/oracle/oradb/11.2.0/ora/dbs/dr1ORACLE_SID_PRIMARYDATACENTER.dat
    dg_broker_config_file2    string     /logiciels/oracle/oradb/11.2.0/ora/dbs/dr2ORACLE_SID_STANDBYDATACENTER.dat
    dg_broker_start           boolean    FALSE
  2. configFile1="/home/oracle/admin/$ORACLE_SID/dbs/dr1${ORACLE_SID}_PRIMARYDATACENTER.dat"; configFile2="/home/oracle/admin/$ORACLE_SID/dbs/dr2${ORACLE_SID}_STANDBYDATACENTER.dat"; link1="dr1${ORACLE_SID}_PRIMARYDATACENTER.dat"; link2="dr2${ORACLE_SID}_STANDBYDATACENTER.dat"; cd "$ORACLE_HOME/dbs"; mv "$link1" "$configFile1"; mv "$link2" "$configFile2"; ln -sf "$configFile1" "$link1"; ln -sf "$configFile2" "$link2"
  3. sqlplus '/ as sysdba' << EOSQL
    alter system set dg_broker_start=true;
    set linesize 2000
    show parameter broker
    EOSQL
    System altered.
    
    
    NAME                      TYPE       VALUE
    ------------------------- ---------- ------------------------------------------------------------------------------
    dg_broker_config_file1    string     /logiciels/oracle/oradb/11.2.0/ora/dbs/dr1ORACLE_SID_PRIMARYDATACENTER.dat
    dg_broker_config_file2    string     /logiciels/oracle/oradb/11.2.0/ora/dbs/dr2ORACLE_SID_STANDBYDATACENTER.dat
    dg_broker_start           boolean    TRUE
  4. Don't forget to repeat this on both servers !
on primaryServer, still as oracle :

Step-by-step version :

  1. dgmgrl
  2. connect /
    Connected.
  3. enable configuration;
    Enabled.
  4. show configuration verbose;
    Configuration - ORACLE_SIDDR
    
      Protection Mode: MaxPerformance
      Databases:
        ORACLE_SID_PRIMARYDATACENTER - Primary database
        ORACLE_SID_STANDBYDATACENTER - Physical standby database
    
      Properties:
        FastStartFailoverThreshold      = '30'
        OperationTimeout                = '30'
        FastStartFailoverLagLimit       = '30'
        CommunicationTimeout            = '180'
        ObserverReconnect               = '0'
        FastStartFailoverAutoReinstate  = 'TRUE'
        FastStartFailoverPmyShutdown    = 'TRUE'
        BystandersFollowRoleChange      = 'ALL'
        ObserverOverride                = 'FALSE'
        ExternalDestination1            = ''
        ExternalDestination2            = ''
        PrimaryLostWriteAction          = 'CONTINUE'
    
    Fast-Start Failover: DISABLED
    
    Configuration Status:
    SUCCESS
  5. exit

Big-bang version :

dgmgrl / << EOC
enable configuration;
show configuration verbose;
EOC
DGMGRL for Linux: Version 11.2.0.4.0 - 64bit Production

Copyright (c) 2000, 2009, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected.
DGMGRL> Enabled.
DGMGRL>
Configuration - ORACLE_SIDDR

  Protection Mode: MaxPerformance
  Databases:
    ORACLE_SID_PRIMARYDATACENTER - Primary database
    ORACLE_SID_STANDBYDATACENTER - Physical standby database

  Properties:
    FastStartFailoverThreshold      = '30'
    OperationTimeout                = '30'
    FastStartFailoverLagLimit       = '30'
    CommunicationTimeout            = '180'
    ObserverReconnect               = '0'
    FastStartFailoverAutoReinstate  = 'TRUE'
    FastStartFailoverPmyShutdown    = 'TRUE'
    BystandersFollowRoleChange      = 'ALL'
    ObserverOverride                = 'FALSE'
    ExternalDestination1            = ''
    ExternalDestination2            = ''
    PrimaryLostWriteAction          = 'CONTINUE'

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

Same remark as earlier regarding the possible Warning: ORA-16857: standby disconnected from redo source for longer than specified threshold error.

Final check of the Data Guard configuration :

Sync check :

Read this.

Switchover test :

Let's switchover, baby :
Run these actions on any server you like (either primaryServer or standbyServer) as oracle :

Step-by-step version :

  1. dgmgrl
  2. connect sys@ORACLE_SID_STANDBYDATACENTER
    Password: sys password
    Connected.
  3. switchover to 'ORACLE_SID_STANDBYDATACENTER';
    Performing switchover NOW, please wait...
    New primary database "ORACLE_SID_STANDBYDATACENTER" is opening...
    Operation requires startup of instance "ORACLE_SID" on database "ORACLE_SID_PRIMARYDATACENTER"
    Starting instance "ORACLE_SID"...
    ORACLE instance started.
    Database mounted.
    Switchover succeeded, new primary is "ORACLE_SID_STANDBYDATACENTER"
    I have no idea why it says Operation requires startup of instance "ORACLE_SID" on database "ORACLE_SID_PRIMARYDATACENTER", I expected it to be Operation requires startup of instance "ORACLE_SID" on database "ORACLE_SID_STANDBYDATACENTER" ... (bug ?)
  4. show configuration
    Configuration - ORACLE_SIDDR
    
      Protection Mode: MaxPerformance
      Databases:
        ORACLE_SID_STANDBYDATACENTER - Primary database
        ORACLE_SID_PRIMARYDATACENTER - Physical standby database
    
    Fast-Start Failover: DISABLED
    
    Configuration Status:
    SUCCESS

Big-bang version :

oracleSid='ORACLE_SID'; currentStandbyDatacenter='STANDBYDATACENTER'; sysPassword='password'; export ORACLE_SID="$oracleSid"; dgmgrl sys/"$sysPassword"@${oracleSid}_${currentStandbyDatacenter} << EOC
switchover to '${oracleSid}_${currentStandbyDatacenter}';
show configuration
EOC
Connected.
DGMGRL> Performing switchover NOW, please wait...
New primary database "ORACLE_SID_STANDBYDATACENTER" is opening...
Operation requires startup of instance "ORACLE_SID" on database "ORACLE_SID_PRIMARYDATACENTER"
Starting instance "ORACLE_SID"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "ORACLE_SID_STANDBYDATACENTER"
DGMGRL>
Configuration - ORACLE_SIDDR

  Protection Mode: MaxPerformance
  Databases:
    ORACLE_SID_STANDBYDATACENTER - Primary database
    ORACLE_SID_PRIMARYDATACENTER - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

Now the primary server is the one in standbyDatacenter and vice versa.

Are both servers still in sync ?
Follow these steps.
Back to normal
As oracle on standbyServer (which is now a primary server) :

Step-by-step version :

  1. dgmgrl
  2. connect sys@ORACLE_SID_PRIMARYDATACENTER
    Password: sys password
    Connected.
  3. switchover to 'ORACLE_SID_PRIMARYDATACENTER';
    Performing switchover NOW, please wait...
    New primary database "ORACLE_SID_PRIMARYDATACENTER" is opening...
    Operation requires startup of instance "ORACLE_SID" on database "ORACLE_SID_STANDBYDATACENTER"
    Starting instance "ORACLE_SID"...
    ORACLE instance started.
    Database mounted.
    Switchover succeeded, new primary is "ORACLE_SID_PRIMARYDATACENTER"
    Same remark as above about : Operation requires startup of instance "ORACLE_SID" on database ....
  4. show configuration
    Configuration - ORACLE_SIDDR
    
      Protection Mode: MaxPerformance
      Databases:
        ORACLE_SID_PRIMARYDATACENTER - Primary database
        ORACLE_SID_STANDBYDATACENTER - Physical standby database
    
    Fast-Start Failover: DISABLED
    
    Configuration Status:
    SUCCESS

Big-bang version :

oracleSid='ORACLE_SID'; currentStandbyDatacenter='PRIMARYDATACENTER'; sysPassword='password'; export ORACLE_SID="$oracleSid"; dgmgrl sys/"$sysPassword"@${oracleSid}_${currentStandbyDatacenter} << EOC
switchover to '${oracleSid}_${currentStandbyDatacenter}';
show configuration
EOC
show command output

Things are back to the initial situation.

Are both servers still in sync ?
Follow these steps.
If servers are effectively still synchronized : congratulations, this Data Guard configuration is finished !!!