Thursday, November 4, 2010

Configuring Oracle Data Guard – Physical Standby with Fast-Start Fail Over using SuSE 11 and Oracle 11g.

Home | Profile 


For this setup I used two systems with SuSE 11.2 and Oracle 11g installed on each. Installation binaries/versions and directories are identical on both servers.

Summary of Steps:

  •          Configure the Primary Database.
  •          Configure Listener/TNS on both Primary and Secondary.
  •          Configure the Standby Database.
  •          Configure Oracle Data Guard and Data Guard broker (DGMGRL)
  •          Perform Failover/Switch Over tests.

I'm running the primary database on host "silicon" and standby on host "graphics".  I also took out the prompt "SQL>" preceding the SQL statements, so it's easier for you to cut/paste. The database SID is RBPDB01.

1.    Configure the Primary Database.

1.1             Enable FLASHBACK, ARCHIVELOG and FORCE_LOGGING for the Database:
oracle@silicon:~> sqlplus / as sysdba;

startup mount;
alter database archivelog ;
alter database flashback on;
alter database open;
shutdown immediate;
startup;
alter  database force logging;

Check the status of each by running the below.

SQL> select log_mode, flashback_on, force_logging from v$database;
LOG_MODE     FLASHBACK_ON       FOR
------------ ------------------ ---
ARCHIVELOG   YES                YES

1.2            Configure Standby Redo Logs (SRL’s) on Primary Database:
As LGWR reads log buffer and writes to redo logs, a new process called LNS (Log Network Server) ships these change vectors to the standby database through Oracle Net Services which is picked up by RFS (Remote File Server) writing this to Standby Redo Logs.
If you wonder why we need this on Primary DB, the reason is when primary changes its role to standby during a failover/switchover, these are then used by RFS.

       ALTER DATABASE ADD STANDBY LOGFILE GROUP 4 ('/u01/oracle/oradata/RBPDB01/onlinelog/RBPDB01_redo04.log') SIZE 50M;
       ALTER DATABASE ADD STANDBY LOGFILE GROUP 5 ('/u01/oracle/oradata/RBPDB01/onlinelog/RBPDB01_redo05.log') SIZE 50M;
       ALTER DATABASE ADD STANDBY LOGFILE GROUP 6 ('/u01/oracle/oradata/RBPDB01/onlinelog/RBPDB01_redo06.log') SIZE 50M;

1.3            Create Password File:

oracle@silicon:~> cd /u01/oracle/product/11.1.0/db_1/dbs/
oracle@silicon:~> orapwd file=orapwRBPDB01 password=oracle entries=5                   

1.4            Create pfile if only spfile exists.

create pfile from spfile;

If you get an ORA-01565 or ORA-27037, this indicates that you have already started the             database with a pfile, ignore the error for now and proceed.

1.5            Edit the pfile  (  /u01/oracle/product/11.1.0/db_1/dbs/initRBPDB01.ora )  to include Data Guard Specific Parameters.

*.archive_lag_target = 1800
*.db_unique_name='silicon_RBPDB01'
*.log_archive_config='DG_CONFIG=(silicon_RBPDB01,graphics_RBPDB01)'
*.log_archive_dest_1='LOCATION=/u01/oracle/admin/RBPDB01/arch/ mandatory VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=silicon_RBPDB01'
*.log_archive_dest_2='service="graphcis.RBPDB01"',' LGWR SYNC AFFIRM delay=0 OPTIONAL compression=DISABLE max_failure=0 max_connections=1 reopen=60 db_unique_name="graphics_RBPDB01" net_timeout=10 valid_for=(online_logfile,primary_role)'

*.fal_client='silicon.RBPDB01'
*.fal_server='graphics.RBPDB01'

*.service_names='RBPDB01','silicon.RBPDB01'
*.dg_broker_config_file1='/u01/oracle/product/11.1.0/db_1/dbs/broker_silicon_RBPDB01_1.dat'
*.dg_broker_config_file2='/u01/oracle/product/11.1.0/db_1/dbs/broker_silicon_RBPDB01_2.dat'
*.dg_broker_start=true

**NB : Once you update init.ora with above parms, make sure there are no duplicate parameters. For e.g. If you had used DBCA to create the database and copied above to edit init.ora db_unique_name will now be a duplicate parameter!

You can lookup the definition of each parameters in here.

1.6        Bounce the database with the new pfile and update the spfile.
oracle@silicon:~> sqlplus / as sysdba;

startup force nomount pfile=$ORACLE_HOME/dbs/initRBPDB01.ora;
create spfile from  pfile;
shutdown immediate;
startup;

1.7        Create a STANDBY control file, update spfile again.    

shutdown immediate;
startup mount;
ALTER DATABASE CREATE STANDBY CONTROLFILE AS '/u01/oracle/oradata/RBPDB01/controlfile/RBPDB01_control01_stdby.ctl';
alter database open;
create pfile from spfile;
shutdown immediate;

1.8        Copy files from Primary to Standby: You can create the standby database either using RMAN DUPLICATE (new feature) or by doing  standard ftp or scp.  

cd /u01/oracle/oradata
scp -P 9999 -r RBPDB01/ oracle@192.168.5.101:/u01/oracle/oradata/
           
cd /u01/oracle/admin
scp -P 9999 -r RBPDB01/ oracle@192.168.5.101:/u01/oracle/admin/
           
cd /u01/oracle/product/11.1.0/db_1/dbs
scp -P 9999 orapwRBPDB01 oracle@192.168.5.101:/u01/oracle/product/11.1.0/db_1/dbs/

** I’m running ssh on a different port which’s why you see scp –P 9999 and also used my secondary set of NIC’s configured with static IP’s bypassing router for faster transfer. So essentially (192.168.5.101 is “graphics”). You can use ftp’s, obviously faster than scp’s instead. Anyways please pay attention to the directories and files you need to copy to standby.

2.    Configure TNSNAMES, SQLNET on both primary and secondary servers.

2.1             For the observer and also the dgmgrl to restart and communicate with instances after the fast-start failover and restarts of instances during broker operations, a static registration with the local listener is required for each instance’s. It is for the same we have to add the below to the listener file on each servers.

**Define GLOBAL_DBNAME as db_unique_name_DGMGRL.domain_name.

*************PRIMARY (silicon ) ********

--listener.ora

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = silicon_RBPDB01_DGMGRL.chakravyoha.com)
      (ORACLE_HOME = /u01/oracle/product/11.1.0/db_1)
      (SID_NAME = RBPDB01)
    )
  )

--tnsnames.ora

graphics.RBPDB01 =( DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=graphics)(PORT=1521))(CONNECT_DATA=(SID=RBPDB01)))

--sqlnet.ora

sqlnet.expire_time=15


*************STANDBY (graphics) ********

--listener.ora

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = graphics_RBPDB01_DGMGRL.chakravyoha.com)
      (ORACLE_HOME = /u01/oracle/product/11.1.0/db_1)
      (SID_NAME = RBPDB01)
    )
  )


--tnsnames.ora

silicon.RBPDB01 =( DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=silicon)(PORT=1521))(CONNECT_DATA=(SID=RBPDB01)))


--sqlnet.ora

sqlnet.expire_time=15


2.2             Start listeners on both primary and standby servers.

oracle@silicon:~>  lsnrctl start
Service "silicon_RBPDB01_DGMGRL.chakravyoha.com" has 1 instance(s).  Instance "RBPDB01", status UNKNOWN, has 1 handler(s) for this service...

oracle@graphics > lsnrctl start
Service "graphics_RBPDB01_DGMGRL.chakravyoha.com" has 1 instance(s). Instance  
"RBPDB01", status UNKNOWN, has 1 handler(s) for this service...          

3     Setup Standby Database on host “graphics”

3.1              Overlay the control file with standby control file. Make sure to copy this over to the flash recovery area as well.

       cd /u01/oracle/oradata/RBPDB01/controlfile
       cp RBPDB01_control01_stdby.ctl RBPDB01_control01.ctl

       cd /u01/oracle/flash_recovery_area/RBPDB01/controlfile
       cp /u01/oracle/oradata/RBPDB01/controlfile/RBPDB01_control01_stdby.ctl RBPDB01_control02.ctl

                        3.2        Prepare the init.ora file for standby database.

*.archive_lag_target = 1800
*.db_unique_name='graphics_RBPDB01'
*.log_archive_config='DG_CONFIG=(silicon_RBPDB01,graphics_RBPDB01)'
*.log_archive_dest_1='LOCATION=/u01/oracle/admin/RBPDB01/arch/ mandatory VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=graphics_RBPDB01'
*.log_archive_dest_2='service="silicon.RBPDB01"',' LGWR SYNC AFFIRM delay=0 OPTIONAL compression=DISABLE max_failure=0 max_connections=1 reopen=60 db_unique_name="silion_RBPDB01" net_timeout=10 valid_for=(online_logfile,primary_role)'
*.fal_client='graphics.RBPDB01'
*.fal_server='silicon.RBPDB01'
*.service_names='RBPDB01','graphics.RBPDB01'
*.dg_broker_config_file1='/u01/oracle/product/11.1.0/db_1/dbs/broker_graphics_RBPDB01_1.dat'
*.dg_broker_config_file2='/u01/oracle/product/11.1.0/db_1/dbs/broker_graphics_RBPDB01_2.dat'
*.dg_broker_start=true           


** Notice the difference in service name's between log_archive_dest_1 and log_archive_dest_2 for the primary and standby init.ora files. Once again, make sure the init.ora contains no duplicate entries with the addition of the above. For e.g db_unique_name etc.

                        3.3       Startup primary (@silicon) Database once the files are copied over.

oracle@silicon:~>  sqlplus / as sysdba;
startup;

3.4        Startup Standby Database with the edited init.ora file , refresh spfile with the            
            updated DG parameters.

oracle@graphics >  sqlplus / as sysdba;
startup mount pfile=$ORACLE_HOME/dbs/initRBPDB01.ora;
alter database flashback on;
create spfile from pfile;
shutdown immediate;

                        3.5      Startup Standby with the newly created spfile and put it in recovery mode.

oracle@graphics >  sqlplus / as sysdba;
startup mount;  

SQL> select log_mode, flashback_on, force_logging from v$database;
LOG_MODE     FLASHBACK_ON       FOR
------------ ------------------ ---
ARCHIVELOG   YES                YES

alter database recover managed standby database disconnect from session;

4           Configure DGMGRL for broker communication.

            4.1       On the primary server from unix/linux command prompt run dgmgrl.

oracle@silicon:~>     dgmgrl

In the DGMGRLprompt run the below commands.

connect sys/oracle
create configuration RBPDB01_DG as primary database is silicon_RBPDB01 connect identifier is silicon.RBPDB01;
add database graphics_RBPDB01 as connect identifier is graphics.RBPDB01 maintained as physical ;

See the output of show configuration

DGMGRL> show configuration;

Configuration - rbpdb01_dg

  Protection Mode: MaxPerformance
  Databases:
    silicon_rbpdb01  - Primary database
    graphics_rbpdb01 - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
DISABLED

enable configuration;

DGMGRL> show configuration;

Configuration - rbpdb01_dg

  Protection Mode: MaxPerformance
  Databases:
    silicon_rbpdb01  - Primary database
    graphics_rbpdb01 - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

            4.2       Enable Fast-Start failover and configure observer.
                         
              oracle@silicon:~>     dgmgrl
                         
                        In the DGMGRL prompt run the below commands.

connect sys/oracle
edit configuration set protection mode as maxavailability;
edit database silicon_RBPDB01 set property LogXptMode='SYNC' ;
edit database graphics_RBPDB01 set property LogXptMode='SYNC' ;
edit database silicon_RBPDB01 set property NetTimeout=30 ;
edit database graphics_RBPDB01 set property NetTimeout=30 ;

DGMGRL> show configuration;

Configuration - rbpdb01_dg

  Protection Mode: MaxAvailability
  Databases:
    silicon_rbpdb01  - Primary database
    graphics_rbpdb01 - Physical standby database

Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS

4.3          Start the Observer from a different dgmgrl session.

oracle@silicon:~>     dgmgrl
In the DGMGRL prompt run the below commands.
      
connect sys/oracle
start observer
                      You will not be able to get to the command prompt again from this session and this is 
                      normal. From another linux session start dgmgrl prompt for the rest of operations.

4.4          Enable FAST_START FAILOVER.

oracle@silicon:~>     dgmgrl

enable fast_start failover;


DGMGRL> enable fast_start failover;
Enabled.
DGMGRL> show configuration;

Configuration - rbpdb01_dg

  Protection Mode: MaxAvailability
  Databases:
    silicon_rbpdb01  - Primary database
    graphics_rbpdb01 - (*) Physical standby database

Fast-Start Failover: ENABLED

Configuration Status:
SUCCESS

** FYI - If you were to do a tail –f on the alert.log you can see quite a few interesting messages/communications passed between the primary and standby databases.

5           Perform SwitchOver and FailOver.

5.1          Switchover is usually a planned outage so there's no data loss.

oracle@silicon:~>     dgmgrl
      
connect sys/oracle
       show configuration;
       switchover to graphics_RBPDB01;

** I had to force-kill (kill -9) a sqlplus session once to get the switchover to continue, as it was hung on “CLOSING sessions” or “Killing Sessions”. You can do a tail –f on alert log on both instances to monitor the progress of swithover.

Once done, connect to dgmgrl again and do a show configuration. As you can see silicon_rbpdb01 has now become the standby database and graphics_rbpdb01 is now the primary database.

oracle@silicon:~>   dgmgrl
show configuration;
      
DGMGRL> connect sys/oracle
Connected.
DGMGRL> show configuration;

Configuration - rbpdb01_dg

  Protection Mode: MaxAvailability
  Databases:
    graphics_rbpdb01 - Primary database
    silicon_rbpdb01  - (*) Physical standby database

Fast-Start Failover: ENABLED

Configuration Status:
SUCCESS


5.2          Failover the primary database.

After the switch over, the primary database is now running in “graphics”.
Now kill the pmon process of the primary database running in graphics, this is to simulate a database crash. Observer will initiate a failover of the primary database to secondary automatically.

Database on graphics is down and the one on silicon is taken over as
primary. Remember the prompt where you left the observer running, if you go back to that session you will be able to see the actions performed. As you can see silicon_rpbdb01 is now the primary database.

Also go to the DGMGRL command prompt and do a show configuration.


DGMGRL> start observer
Observer started

21:16:40.86  Thursday, November 04, 2010
Initiating Fast-Start Failover to database "silicon_rbpdb01"...
Performing failover NOW, please wait...
Failover succeeded, new primary is "silicon_rbpdb01"
21:16:42.43  Thursday, November 04, 2010

oracle@silicon:~>   dgmgrl
show configuration;
        
DGMGRL> show configuration;

Configuration - rbpdb01_dg

  Protection Mode: MaxAvailability
  Databases:
    silicon_rbpdb01  - Primary database
      Warning: ORA-16817: unsynchronized fast-start failover configuration

    graphics_rbpdb01 - (*) Physical standby database (disabled)
      ORA-16661: the standby database needs to be reinstated

Fast-Start Failover: ENABLED

Configuration Status:
WARNING
WARNING is because there’s no secondary database to synchronize the now primary silicon_rbpdb01.

5.3          Go to silicon (Primary) host and mount the database now.

Assuming your database crashed in the middle of night, you can see the FAST-START fail over configuration has put you back in business. Everyone is happy and now it’s time to fix the error and bring back the standby database. 

oracle@graphics:~> sqlplus / as sysdba;
sqlplus / as sysdba;
       startup mount;


oracle@silicon:~>  dgmgrl

       connect sys/oracle
       reinstate database 'graphics_rbpdb01'

If you were to go back to the observer session, you can see the actions performed by database while re-instating the standby on graphics. Based on the messages/actions I believe there’s no need to mount the database (just I performed) above as the dataguard performed a shutdown.  

21:27:35.99  Thursday, November 04, 2010
Initiating reinstatement for database "graphics_rbpdb01"...
Reinstating database "graphics_rbpdb01", please wait...
Operation requires shutdown of instance "RBPDB01" on database "graphics_rbpdb01"
Shutting down instance "RBPDB01"...
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "RBPDB01" on database "graphics_rbpdb01"
Starting instance "RBPDB01"...
ORACLE instance started.
Database mounted.

Go to the dgmgrl prompt and run the show configuration.    

       show configuration;

DGMGRL> show configuration;

Configuration - rbpdb01_dg

  Protection Mode: MaxAvailability
  Databases:
    silicon_rbpdb01  - Primary database
    graphics_rbpdb01 - (*) Physical standby database

Fast-Start Failover: ENABLED

Configuration Status:
SUCCESS

Hope this helps. Regards, Raj.

26 comments:

  1. Awesome . Nice document.

    ReplyDelete
  2. I run command at 4.2:
    edit configuration set protection mode as maxavailability;

    throws error Error: ORA-16627: operation disallowed since no standby databases would remain to support protection mode

    although all archivelogs are apllied in standby DB
    SQL> select sequence#, applied from v$archived_log order by sequence#;

    SEQUENCE# APPLIED
    ---------- ---------
    13 YES
    14 YES
    15 YES
    16 YES
    17 YES
    18 YES
    19 YES
    20 YES
    21 YES

    Thanks,
    Dang

    ReplyDelete
  3. Hi,
    After failover, my log sequence is different from v$archived_log and archive log list. How can I make it the same?

    SQL> archive log list
    Database log mode Archive Mode
    Automatic archival Enabled
    Archive destination E:\oracle\flash_recovery_area\MMUSIC\ARCHIVELOG
    Oldest online log sequence 3
    Next log sequence to archive 5
    Current log sequence 5

    SQL> select sequence#, applied from v$archived_log order by sequence#;
    .....
    ......
    SEQUENCE# APPLIED
    ---------- ---------
    94 YES
    94 YES


    Thanks,
    Dang

    ReplyDelete
  4. what is fail fast? what is fail safe? what is difference between fail fast and fail safe? this is very important interview question. you can find the details on http://www.itsoftpoint.com/?page_id=2736

    ReplyDelete
  5. Very Nice. Really Helps alot.. :)

    ReplyDelete