Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Never Name a Database "Standby" (and nine other...

Never Name a Database "Standby" (and nine other things they don't teach in Data Guard school)

Avatar for Sean Scott

Sean Scott PRO

October 27, 2025
Tweet

More Decks by Sean Scott

Other Decks in Technology

Transcript

  1. Never Name a Database "Standby" (...and nine other things they

    don't teach in Data Guard school) CASOUG 25 - Tashkent 27 October 2025 Seàn Scott Oracle ACE Director Managing Principal Consultant Viscosity North America
  2. Database Reliability Engineering MAA ⁘ RAC ⁘ RMAN Data Guard

    ⁘ Sharding ⁘ Partitioning Information Lifecycle Management Exadata & Engineered Systems Database Modernization Upgrades ⁘ Patching ⁘ Migrations Cloud ⁘ Hybrid Automation DevOps ⁘ IaC ⁘ Containers ⁘ Terraform Vagrant ⁘ Ansible Observability AHF ⁘ TFA ⁘ CHA ⁘ CHM
  3. Tech Superstars Unite Get worldwide recognition as an Oracle ACE

    Oracle.com profile page Exclusive content Your own Oracle cloud account Swag, certification exam credit & event passes Networking events Travel support Learn more at: ace.oracle.com @oracleace Linkedin.com/groups/72183 @oracleace.bsky.social
  4. 10 Conferences 15 Presentations 5 Panels 🌐 15 Time Zones

    33 Days 54:28 Hours 15 Nations • 16 Airports • 20 Flights 43,826 km 23,664 nm and More Trains Trains Trains
  5. www.viscosityna.com @ViscosityNA Choose a "Good" DB Unique Name As a

    general rule, use the SID + a location identifier ORACLE_SID + LOCATION
  6. www.viscosityna.com @ViscosityNA Choose a "Good" DB Unique Name As a

    general rule, airport codes make good location identifiers. PRODDWSAN - San Diego PRODDWSFO - San Francisco
  7. www.viscosityna.com @ViscosityNA Choose a "Good" DB Unique Name As a

    general rule, airport codes make good location identifiers. These are easily identified at a glance (different last letters) PRODDWSAN - San Diego PRODDWSFO - San Francisco
  8. www.viscosityna.com @ViscosityNA Choose a "Good" DB Unique Name Underscores will

    help separate prefix from suffix This is helpful in mixed/overlapping environments PRODDWSAN PRODDWSFO PRODTXSAN PRODTXSFO PRODDW_SAN PRODDW_SFO PRODTX_SAN PRODTX_SFO San Diego Data Warehouse San Francisco Data Warehouse San Diego OLTP San Francisco OLTP
  9. www.viscosityna.com @ViscosityNA Choose a "Good" DB Unique Name Still, an

    airport code isn't always the best choice. PRODTX_PHX - Phoenix PRODTX_PDX - Portland
  10. www.viscosityna.com @ViscosityNA Choose a "Good" DB Unique Name OMF wasn't

    in use An RMAN duplicate attempted to restore the control files DBA thought SID2 was standby; deleted the "offending" controlfiles ORA-01503: CREATE CONTROLFILE failed ORA-00200: control file could not be created ORA-00202: control file: '/u03/SID2/controlfile/control01.ctl' ORA-27038: created file already exists Additional information: 1
  11. www.viscosityna.com @ViscosityNA Make Documentation "Copy/Paste-Proof" # This: switchover to $target_db

    switchover to <target_db> # Not: switchover to prod_stdby # Harder to spot values that must be changed!
  12. www.viscosityna.com @ViscosityNA Set an Archive Log Deletion Policy Set a

    policy that includes both: • An applied on/shipped to clause • A backed up to device condition configure archivelog deletion policy to [applied on|shipped to] all standby backed up 2 times to <device>;
  13. www.viscosityna.com @ViscosityNA shipped to alone is not a guarantee! •

    Daily log backups on primary • Hourly log purge with 'sysdate - 1/24' on primary, standby • All logs less than an hour old were deleted • Standby apply was suspended for several hours • Logs were shipped but not applied • The standby couldn't recover locally • Log backups weren't available on the primary Set an Archive Log Deletion Policy
  14. www.viscosityna.com @ViscosityNA Standby redo logs are not online redo logs!

    • They can be pulled from the primary • (Data Guard does this automatically) • Assuming there's an appropriate archive log deletion policy! • Multiplexing standby redo logs makes Data Guard slower! Don't Multiplex Standby Redo Logs
  15. www.viscosityna.com @ViscosityNA Most common recommendation from DB review: Increase redo

    log size • Strive for no more than four log switches per hour • Every log switch creates a corresponding action on the standby • For RAC environments using a single standby apply instance: • Frequent switching @ primary can saturate the standby Size Redo Logs Properly
  16. www.viscosityna.com @ViscosityNA • Checkpoints can overload a standby • If

    the alert log frequently "Checkpoint not complete" messages • Add redo log groups • Check storage performance Checkpoints and fast_start_mttr_target
  17. www.viscosityna.com @ViscosityNA Low values for fast_start_mttr_target increase checkpoint activity •

    Sets the desired mean time to recover • Limits roll-forward after instance failure by issuing checkpoints • Forces DBWR to flush dirty blocks to disk • If set to 0, like issuing a checkpoint after every commit! • Oracle recommendations (from MOS Note 1095774.1) • Minimum of 300 • 3600 or the desired Recovery Time Objective Checkpoints and fast_start_mttr_target
  18. www.viscosityna.com @ViscosityNA • When primary & standby lose contact, the

    primary still advances • Initiating a failover will incur data loss (redo not shipped) • Reinstating the failed primary requires rolling back lost transactions • ...by using Flashback • ...by completely rebuilding the former primary • Flashback is a Data Guard requirement! Turn on Flashback!
  19. www.viscosityna.com @ViscosityNA • Without Flashback, a partial restore may be

    possible if: • Primary and standby were at the same SCN at failover • No changes were made at the primary after failover Recovering from a Failover without Flashback
  20. www.viscosityna.com @ViscosityNA Recovering from a Failover without Flashback Identify the

    failover SCN for performing a partial incremental backup Without a format directive, the backup will use ORACLE_HOME/dbs backup device type disk incremental from scn XXXXXX database format '/tmp/incr_standby_%U';
  21. www.viscosityna.com @ViscosityNA Recovering from a Failover without Flashback Copy and

    catalog the backupset to the failed database catalog start with '/some/directory/';
  22. www.viscosityna.com @ViscosityNA Recovering from a Failover without Flashback Recover the

    failed database using the noredo option RMAN automatically uses the special incremental backup recover database noredo;
  23. www.viscosityna.com @ViscosityNA Comment Parameter Changes Helps people understand the parameter

    is Data Guard related Future you thanks you! alter system set parameter=value comment='For Data Guard' scope=both sid='*';
  24. www.viscosityna.com @ViscosityNA Use the Force Broker, Luke! edit database DB_UNQNAME

    set state='APPLY-OFF'; alter database recover managed standby database cancel;
  25. www.viscosityna.com @ViscosityNA • Changes to a configuration aren't immediately active

    • They must be applied via enable configuration Enable the Configuration Following Changes
  26. www.viscosityna.com @ViscosityNA Export, Edit, & Import Configurations as XML DGMGRL>

    export configuration to my_config.txt DGMGRL> import configuration my_config.txt
  27. www.viscosityna.com @ViscosityNA Export, Edit, & Import Configurations as XML File

    location is the Diagnostic trace directory (same as alert log). You cannot change this! DGMGRL> export configuration to '/home/oracle/dgcdb.txt'; ORA-16540: invalid argument ORA-06512: at "SYS.DBMS_DRS", line 1947 ORA-06512: at line 1
  28. www.viscosityna.com @ViscosityNA <?xml version="1.0" encoding="UTF-8"?> <DRC Version="19.0.0.0.0" CurrentPath="True" Name="dgcdb"> <DefaultState>ONLINE</DefaultState>

    <DRC_UNIQUE_ID>705151955</DRC_UNIQUE_ID> ... <Member MemberID="1" CurrentPath="True" Enabled="True" MultiInstanced="True" Name="dgcdb_san"> <IntendedState>PRIMARY</IntendedState> <DefaultState>PRIMARY</DefaultState> <Status> <Severity>Success</Severity> <Error>0</Error> <Timestamp>1750881160</Timestamp> </Status> <StandbyType>PhysicalStandby</StandbyType> <DGConnectIdentifier>DGSAN.orapub.com:1521/dgcdb_san</DGConnectIdentifier> <DbDomain>orapub.com</DbDomain> <ResourceType>Database</ResourceType> <Instance InstanceID="1" CurrentPath="True" Enabled="True" MultiInstanced="True" DefaultWriteOnce="True" Name="dgcdb"> <PlannedState/> <HostName Default="True">DGSAN.orapub.com</HostName> <StaticConnectIdentifier Default="True">(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=DGSAN.orapub.com)(PORT=1521)) (CONNECT_DATA=(SERVICE_NAME=dgcdb_san_DGMGRL.orapub.com)(INSTANCE_NAME=dgcdb)(SERVER=DEDICATED))) </StaticConnectIdentifier>
  29. www.viscosityna.com @ViscosityNA Rules for the DGConnectIdentifier • All members must

    be able to reach every other member • Members must be able to reach themselves • Allow connections to all instances of RAC databases • Uses a service that dynamically registers with listeners • Allows connect-time failover on RAC • Must not be defined or managed by Clusterware • Failover attributes allow Redo Transport to ship to any RAC instance DGConnectIdentifier
  30. www.viscosityna.com @ViscosityNA Instance-speci fi c setting Data Guard uses to

    start databases • Used for switchover, convert, and reinstate operations • This connection is exclusively for the Data Guard Broker! • Oracle sets StaticConnectIdentifier using local_listener • ...appends _DGMGRL to the service • Oracle automatically manages this value unless users change: • The Broker's StaticConnectIdentifier • The local_listener parameter StaticConnectIdentifier
  31. www.viscosityna.com @ViscosityNA Risks associated with using TNS aliases in Data

    Guard environments • Using TNS aliases adds a dependency on the local tnsnames.ora • Users may not realize/remember the aliases used by Data Guard • tnsnames.ora may be updated frequencly, not under change control • Transport may survive changes; role transitions may not • iFiles add another layer of risk • local_listener changes prevent Broker from maintaining StaticConnectIdentifier Data Guard and EZConnect
  32. www.viscosityna.com @ViscosityNA Data Guard and EZConnect EZConnect strings disconnect Broker

    configurations from TNS DGMGRL> show database verbose dgcdb_san Database - dgcdb_san Role: PRIMARY Intended State: TRANSPORT-ON Instance(s): dgcdb Properties: DGConnectIdentifier = 'DGSAN.orapub.com:1521/dgcdb_san' ...
  33. www.viscosityna.com @ViscosityNA • NEVER use the StaticConnectIdentifier for other purposes,

    including: • The DGConnectIdentifier • RMAN connections (eg, duplicate) • OEM or monitoring • NEVER define static entries for the Broker's _DGB service! • NEVER create TNS entries for the _DGB or _DGMGRL services! • NEVER register Data Guard connections with a SCAN listener! Data Guard Connection No-Nos
  34. www.viscosityna.com @ViscosityNA Shipping and apply continue working; role transitions fail

    • Using the wrong Connect Identifiers • TNS changes to aliases used for Connect Identifiers • Incorrect static listener definitions • Creating TNS aliases for reserved Data Guard services • Manually configuring or changing log_archive_dest_n parameters • Adding Data Guard services to Clusterware • Changes to the local_listener parameter How to Quietly Break Data Guard
  35. www.viscosityna.com @ViscosityNA Useful show commands show configuration verbose; show configuration

    lag verbose; show configuration when primary is <DB_UNQNAME>; show database verbose <DB_UNQNAME>; show database <DB_UNQNAME> logxptstatus; show database <DB_UNQNAME> InconsistentProperties; show database <DB_UNQNAME> InconsistentLogXptProps; show instance verbose '<INSTANCE>' on database <DB_UNQNAME>;
  36. www.viscosityna.com @ViscosityNA show configuration vs. show configuration lag DGMGRL> show

    configuration verbose; Configuration - dgcdb Protection Mode: MaxPerformance Members: dgcdb_san - Primary database dgcdb_lax - Physical standby database DGMGRL> show configuration lag verbose; Configuration - dgcdb Protection Mode: MaxPerformance Members: dgcdb_san - Primary database dgcdb_lax - Physical standby database Transport Lag: 0 seconds (computed 0 seconds ago) Apply Lag: 0 seconds (computed 0 seconds ago)
  37. www.viscosityna.com @ViscosityNA show configuration when primary is ... DGMGRL> show

    configuration when primary is dgcdb_san Configuration when dgcdb_san is primary - dgcdb Members: dgcdb_san - Primary database dgcdb_lax - Physical standby database DGMGRL> show configuration when primary is dgcdb_lax Configuration when dgcdb_lax is primary - dgcdb Members: dgcdb_lax - Primary database dgcdb_san - Physical standby database
  38. www.viscosityna.com @ViscosityNA Useful validate commands validate database verbose <PRIMARY>; validate

    database verbose <STANDBY>; validate database verbose <STANDBY> spfile; validate network configuration for all; validate static connect identifier for all; validate dgconnectidentifier <CONNECT_IDENTIFIER>;
  39. www.viscosityna.com @ViscosityNA validate database verbose <primary> DGMGRL> validate database verbose

    dgcdb_san Database Role: Primary database Ready for Switchover: Yes Flashback Database Status: dgcdb_san: On Capacity Information: Database Instances Threads dgcdb_san 1 1 Managed by Clusterware: dgcdb_san: NO Validating static connect identifier for the primary database dgcdb_san... Connecting to instance "dgcdb_san" on database "dgcdb_san" ... Connected to "dgcdb_san" Succeeded.
  40. www.viscosityna.com @ViscosityNA validate database verbose <standby> DGMGRL> validate database verbose

    dgcdb_lax Database Role: Physical standby database Primary Database: dgcdb_san Ready for Switchover: Yes Ready for Failover: Yes (Primary Running) Flashback Database Status: dgcdb_san: On dgcdb_lax: On ... Standby Apply-Related Information: Apply State: Running Apply Lag: 0 seconds (computed 1 second ago) Apply Delay: 0 minutes
  41. www.viscosityna.com @ViscosityNA validate database verbose <standby> (cont) Transport-Related Information: Transport

    On: Yes Gap Status: No Gap Transport Lag: 0 seconds (computed 1 second ago) Transport Status: Success ... Current Log File Groups Configuration: Thread # Online Redo Log Groups Standby Redo Log Groups Status (dgcdb_san) (dgcdb_lax) 1 7 8 Sufficient SRLs Future Log File Groups Configuration: Thread # Online Redo Log Groups Standby Redo Log Groups Status (dgcdb_lax) (dgcdb_san) 1 7 8 Sufficient SRLs
  42. www.viscosityna.com @ViscosityNA validate database verbose <standby> spfile DGMGRL> validate database

    verbose dgcdb_lax spfile; Command requires a connection that uses database or external credentials. DGMGRL> connect [email protected]:1521/dgcdb_lax Password: Connected to "dgcdb_lax" Connected as SYSDG.
  43. www.viscosityna.com @ViscosityNA validate database verbose <standby> spfile DGMGRL> validate database

    verbose dgcdb_lax spfile; Connecting to "dgcdb_san". Connected to "dgcdb_san" Connecting to "dgcdb_lax". Connected to "dgcdb_lax" Parameter Settings: audit_file_dest: dgcdb_san (PRIMARY) : /u01/app/oracle/admin/dgcdb_san/adump dgcdb_lax : /u01/app/oracle/admin/dgcdb_lax/adump audit_sys_operations: dgcdb_san (PRIMARY) : false dgcdb_lax : false ...
  44. www.viscosityna.com @ViscosityNA • Data Guard is broken if show/validate reports

    anything but SUCCESS • There is no situation where a status of WARNING or ERROR is OK! • SUCCESS from individual components ≠ SUCCESS from all • A configuration can succeed even though a database doesn't • You must check status in the Broker for ALL members! • SUCCESS in the Broker still doesn't guarantee everything is OK! Accept Nothing Less Than SUCCESS
  45. www.viscosityna.com @ViscosityNA When SUCCESS = FAILURE DGMGRL> show configuration Configuration

    - dgcdb Protection Mode: MaxPerformance Members: dgcdb_san - Primary database dgcdb_lax - Physical standby database Fast-Start Failover: Disabled Configuration Status: SUCCESS (status updated 47 seconds ago)
  46. www.viscosityna.com @ViscosityNA When SUCCESS = FAILURE DGMGRL> show database dgcdb_lax

    Database - dgcdb_lax Role: PHYSICAL STANDBY Intended State: APPLY-ON Transport Lag: 0 seconds (computed 0 seconds ago) Apply Lag: 0 seconds (computed 0 seconds ago) Average Apply Rate: 125.00 KByte/s Real Time Query: OFF Instance(s): dgcdb Database Status: SUCCESS
  47. www.viscosityna.com @ViscosityNA When SUCCESS = FAILURE DGMGRL> show database verbose

    dgcdb_san Database - dgcdb_san ... Log file locations: Alert log : /u01/app/oracle/diag/rdbms/dgcdb_san/dgcdb/trace/alert_dgcdb.log Data Guard Broker log : /u01/app/oracle/diag/rdbms/dgcdb_san/dgcdb/trace/drcdgcdb.log Database Status: SUCCESS
  48. www.viscosityna.com @ViscosityNA When SUCCESS = FAILURE Yet the Broker log

    shows: Data Guard Broker Status Summary: Type Name Severity Status Configuration dgcdb Warning ORA-16608: one or more members have warnings Primary Database dgcdb_san Success ORA-0: normal, successful completion Physical Standby Database dgcdb_lax Warning ORA-16809: multiple warnings detected for the member
  49. www.viscosityna.com @ViscosityNA • Redo shipping and apply continues normally, without

    lag • Usually occurs following: • Host migration • Database upgrade • Changes to: • global_name, domain_name • log_archive_dest_* show = SUCCESS, log = FAILURE?
  50. www.viscosityna.com @ViscosityNA Solution • validate database verbose <standby> spfile •

    Check and correct inconsistencies • Export the configuration • Check for bad entries in the XML, import fixes, enable the config • Drop and recreate the configuration • Dropping the config shouldn't remove redo routes; ship/apply will continue show = SUCCESS, log = FAILURE?
  51. www.viscosityna.com @ViscosityNA An example of why you can't trust everything

    on the web! • Customer running EBS with Active Data Guard for eight years • They performed semi-annual switchover test • Switched to standby and immediately returned • Confirmed connections with simple tests • No data validation performed in the new database Don't Reinvent the Wheel!
  52. www.viscosityna.com @ViscosityNA Don't Reinvent the Wheel! -- From the standby

    alert log: Fri Mar 15 15:43:13 2019 ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION Attempt to start background Managed Standby Recovery process (xxx) Fri Mar 15 15:43:13 2019 MRP0 started with pid=72, OS id=9446 MRP0: Background Managed Standby Recovery process started (xxx) started logmerger process Fri Mar 15 15:43:18 2019 Managed Standby Recovery not using Real Time Apply Warning: Datafile 1 (/u02/app/oracle/oradata/xxx/XXX/datafile/o1_mf_system_u3thdhe7_.dbf) is infinitely media recovery fuzzy <snip> Media Recovery Log /u03/app/oracle/fast_recovery_area/XXXXXX/archivelog/2019_03_15/o1_mf_1_24590_g8r20y3w_.arc Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION
  53. www.viscosityna.com @ViscosityNA Don't Reinvent the Wheel! Tue Jun 25 13:40:46

    2019 alter database open Data Guard Broker initializing... Data Guard - stopping apply to allow Active Data Guard enabled database to open ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL Tue Jun 25 13:40:46 2019 MRP0: Background Media Recovery cancelled with status 16037 <snip> Some recovered datafiles maybe left media fuzzy Media recovery may continue but open resetlogs may fail Tue Jun 25 13:40:55 2019 MRP0: Background Media Recovery process shutdown (xxx) Tue Jun 25 13:40:56 2019 Managed Standby Recovery Canceled (xxx) Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL Data Guard Broker initialization complete
  54. www.viscosityna.com @ViscosityNA Don't Reinvent the Wheel! Beginning Standby Crash Recovery.

    Serial Media Recovery started Managed Standby Recovery starting Real Time Apply Warning: Datafile 1 (/u02/app/oracle/oradata/xxx/XXX/datafile/o1_mf_system_u3thdhe7_.dbf) is infinitely media recovery fuzzy Standby database will not open with this datafile online! Standby Crash Recovery aborted due to error 10554. Errors in file /logs/ora/diag/rdbms/xxx/xxx/trace/xxx_ora_725.trc: ORA-10554: Media recovery failed to bring datafile 1 to a consistent point ORA-01110: data file 1: '/u02/app/oracle/oradata/xxx/XXX/datafile/o1_mf_system_u3thdhe7_.dbf' Completed Standby Crash Recovery. Errors in file /logs/ora/diag/rdbms/xxx/xxx/trace/xxx_ora_725.trc: ORA-10458: standby database requires recovery ORA-01196: file 1 is inconsistent due to a failed media recovery session ORA-01110: data file 1: '/u02/app/oracle/oradata/xxx/XXX/datafile/o1_mf_system_u3thdhe7_.dbf' ORA-10458 signalled during: alter database open ... Tue Jun 25 13:41:07 2019 alter database open resetlogs Data Guard Broker initializing... Data Guard Broker initialization complete ORA-1666 signalled during: alter database open resetlogs...
  55. www.viscosityna.com @ViscosityNA An example of why you can't trust everything

    on the web! • Switchover and connection tests "worked" • TNS errors in the alert log were in a monitoring ignore list • The "fuzzy datafile" messages aren't "ORA-XXXXX" errors • The Broker log wasn't monitored or checked • DBAs weren't using the Broker's built-in lag monitoring • Instead, they monitored lag using a query from a blog post • See: https://oraclesean.com/blog/how-not-to-find-data-guard-gaps Don't Reinvent the Wheel!