本博文出自 博客客居天涯博主,有任何问题请进入博主页面互动讨论! 博文地址:http://tiany.blog.51cto.com/513694/1617646
Oracle DG(Dataguard)是目前比较常见的数据库HA配置策略。通过实现Physical Standby和Logical Standby,可以实现数据冗余容错机制。防止在主库出现严重故障,不能支持服务的时候,没有快速的后备支持环境。
在DG中,switchover和failover是两个重要的概念,也是DG实现的核心。两者共同点都是Primary和Standby角色切换,差异在于Planned和UnPlanned之分。Switchover关键点在于Planned,这个切换动作是在运维机构规划范围内的动作。比如,进行定期系统软硬件升级、设备维修等动作。而Failover是真正出现严重系统故障,如数据库宕机、软硬件故障导致的Primary不能支持服务,从而进行的切换动作。
根据不同的DG配置,switchover和failover也是有差异的。理论上,Switchover是不会造成数据丢失的,Primary在切换之后也是在DG配置环境中,作为Standby存在的。但是Failover则不同,除了运行在***保护(Maximum Protection)模式下,Primary突发的故障可能引起一部分Redo Log不能及时的传递到Standby端,切换之后很可能有数据损失的情况。更重要的是,Primary端在发生Failover之后,是不能够直接加入回DG配置的!也就是说,Failover之后,Primary实际上就是被“抛出”了DG环境。
那么,有什么方法实现Primary回到原有的环境呢?这个问题的困难在于保持Primary和Standby一致。在正常情况下,Primary和Standby之间是关联同步的,即使发生了Switchover,也在可控情况下。Failover过程中有数据的缺失,还有Primary修复问题。在目前流行版本(11g)中,有三个方法:
- 环境重建:一种最简单的方法就是直接删除原来的Primary库,引用DG重建方法,重新搭建Standby端;
- RMAN备份恢复:如果Primary端保留过一份Failover之前的备份,则可以强制原来的Primary端恢复到进行Failover的时间点,之后作为Standby接收当前Primary的redo log传递,应用后可以跟上进度;
- Flashback Database恢复:Flashback技术是作为传统备份还原技术的补充,提供了更加便捷的恢复策略。使用flashback,可以将数据库恢复到failover之前的时间点。之后的过程和RMAN备份恢复策略相同;
案例分析:
一、在主库端模拟数据库意外宕机
7scott@bjdb>conn/assysdba Connected. sys@bjdb>altersystemswitchlogfile; Systemaltered. sys@bjdb>shutdownabort ORACLEinstanceshutdown.
二、在备库端
1、查看切换信息
5sys@shdb>selectname,database_role,switchover_statusfromv$database; NAMEDATABASE_ROLESWITCHOVER_STATUS --------------------------------------------- TESTDB12PHYSICALSTANDBYNOTALLOWED 可以看到此时备库处于无法切换状态
2、直接切换
sys@shdb>alterdatabasecommittoswitchovertoprimary; alert_log:(告警日志) FatalNIconnecterror12514,connectingto: (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=shsrv)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=shdb)(CID=(PROGRAM=oracle)(HOST=bjsrv)(USER=oracle)))) VERSIONINFORMATION: TNSforLinux:Version11.2.0.3.0-Production TCP/IPNTProtocolAdapterforLinux:Version11.2.0.3.0-Production Time:04-MAR-201521:25:13 Tracingnotturnedon. Tnserrorstruct: nsmainerrcode:12564 TNS-12564:TNS:connectionrefused nssecondaryerrcode:0 ntmainerrcode:0 ntsecondaryerrcode:0 ntOSerrcode:0 Error12514receivedloggingontothestandby FAL[client,MRP0]:Error12514connectingtoshdbforfetchinggapsequence WedMar0421:26:002015 alterdatabasecommittoswitchovertoprimary ALTERDATABASESWITCHOVERTOPRIMARY(TestDB12) Maximumwaitforroletransitionis15minutes. Switchover:Mediarecoveryisstillactive Databasenotavailableforswitchover End-Of-REDOarchivedlogfilehasnotbeenrecovered Databasenotavailableforswitchover End-Of-REDOarchivedlogfilehasnotbeenrecovered Databasenotavailableforswitchover
3、关闭standby MPR进程
35sys@shdb>ALTERDATABASERECOVERmanagedstandbydatabasefinish; ALTERDATABASERECOVERmanagedstandbydatabasefinish TerminalRecovery:requestposted(TestDB12) WedMar0421:34:342015 Begin:StandbyRedoLogfilearchival End:StandbyRedoLogfilearchival TerminalRecoverytimestampis'03/04/201521:34:34' TerminalRecovery:applyingstandbyredologs. TerminalRecovery:thread1seq#34redorequired MediaRecoveryWaitingforthread1sequence34 TerminalRecovery:End-Of-Redologallocation TerminalRecovery:standbyredologfile4created'/dsk4/arch_bj/arch_1_0_820054583.log' Thisstandbyredologfileisbeingcreatedaspartofthe failoveroperation.Thisstandbyredologfileshouldbe deletedaftertheswitchovertoprimaryoperationcompletes. MediaRecoveryLog/dsk4/arch_bj/arch_1_0_820054583.log TerminalRecovery:log4reservedforthread1sequence34 RecoveryofOnlineRedoLog:Thread1Group4Seq34Readingmem0 Mem#0:/dsk4/arch_bj/arch_1_0_820054583.log IdentifiedEnd-Of-Redo(failover)forthread1sequence34atSCN0xffff.ffffffff IncompleteRecoveryapplieduntilchange1234252time03/04/201521:23:43 MRP0:MediaRecoveryComplete(TestDB12) TerminalRecovery:successfulcompletion WedMar0421:34:352015 ARCH:Archivalstopped,erroroccurred.Willcontinueretrying ORACLEInstanceTestDB12-ArchivalError ORA-16014:log4sequence#34notarchived,noavailabledestinations ORA-00312:onlinelog4thread1:'/dsk4/arch_bj/arch_1_0_820054583.log' ForcingARSCNtoIRSCNforTR0:1234252 Attempttosetlimboarscn0:1234252irscn0:1234252 ResettingstandbyactivationID2865247982(0xaac836ee) MRP0:BackgroundMediaRecoveryprocessshutdown(TestDB12) TerminalRecovery:completiondetected(TestDB12) Completed:ALTERDATABASERECOVERmanagedstandbydatabasefinish
4、切换数据库到Primary
sys@shdb>selectstatusfromv$instance; STATUS ------------ OPEN sys@shdb>selectname,database_role,switchover_statusfromv$database; NAMEDATABASE_ROLESWITCHOVER_STATUS --------------------------------------------- TESTDB12PHYSICALSTANDBYTOPRIMARY sys@shdb>alterdatabasecommittoswitchovertoprimary; Databasealtered. sys@shdb>alterdatabaseopen; Databasealtered. 告警日志: alterdatabasecommittoswitchovertoprimary ALTERDATABASESWITCHOVERTOPRIMARY(TestDB12) Maximumwaitforroletransitionis15minutes. Alldispatchersandsharedserversshutdown CLOSE:killingserversessions. CLOSE:allsessionsshutdownsuccessfully. WedMar0421:35:472015 SMON:disablingcacherecovery Backupcontrolfilewrittentotracefile/u01/app/oracle/diag/rdbms/bjdb/TestDB12/trace/TestDB12_ora_3146.trc StandbyterminalrecoverystartSCN:1234251 RESETLOGSafterincompleterecoveryUNTILCHANGE1234252 Onlinelog/dsk2/oradata/bjdb/redo01b.log:Thread1Group1waspreviouslycleared Onlinelog/dsk1/oradata/bjdb/redo01a.log:Thread1Group1waspreviouslycleared Onlinelog/dsk2/oradata/bjdb/redo02b.log:Thread1Group2waspreviouslycleared Onlinelog/dsk1/oradata/bjdb/redo02a.log:Thread1Group2waspreviouslycleared Onlinelog/dsk2/oradata/bjdb/redo03b.log:Thread1Group3waspreviouslycleared Onlinelog/dsk1/oradata/bjdb/redo03a.log:Thread1Group3waspreviouslycleared StandbybecameprimarySCN:1234250 WedMar0421:35:472015 Settingrecoverytargetincarnationto3 AUDIT_TRAILinitializationparameterischangedbacktoitsoriginalvalueasspecifiedintheparameterfile. Switchover:Complete-Databasemountedasprimary Completed:alterdatabasecommittoswitchovertoprimary
三、原主库修复后,开机
sys@bjdb>startup ORACLEinstancestarted. TotalSystemGlobalArea442601472bytes FixedSize2229184bytes VariableSize281021504bytes DatabaseBuffers155189248bytes RedoBuffers4161536bytes Databasemounted. Databaseopened. sys@bjdb>selectname,database_role,switchover_statusfromv$database; NAMEDATABASE_ROLESWITCHOVER_STATUS --------------------------------------------- TESTDB12PRIMARYFAILEDDESTINATION
现在原来的主库被修复后,整个DataGuara架构已经被破坏了,所以必须把原来的主库构建成新的备库,重新恢复DataGuard的环境。
四、重新构建DataGuard
1sys@bjdb>selectname,database_rolefromv$database;
NAME DATABASE_ROLE
————————————————– —————-
TESTDB12 PHYSICAL STANDBY