Oracle Database 12c RAC: an experiment in recovering a corrupted OCR and votedisk
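Today I ran an experiment on Oracle 12c RAC: deliberately corrupting the OCR and votedisk, then recovering them. The procedure is essentially the same as on 11g RAC; the steps are shared below.
Test environment: 2-node Oracle Database 12c RAC on Linux 6 (OEL 6.4)
Check the voting disk and OCR information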
[root@12crac1 ~]# cd /u01/app/12.1.0/grid/bin/
[root@12crac1 bin]# ./crsctl query css votedisk
File Universal Id                File Name Disk group
-----------------                --------- ---------
d883c23a7bfc4fdcbf418c9f631bd0af (/dev/asm-crs) [RACCRS]
Located 1 voting disk(s).
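When this check is scripted, the count on the summary line is usually the part that matters. The following is a minimal sketch, assuming the output format shown above; the function name `count_voting_disks` is made up for illustration and is not part of any Oracle tool.

```shell
# count_voting_disks (hypothetical helper): reads the output of
# `crsctl query css votedisk` on stdin and prints the number taken from the
# "Located N voting disk(s)." summary line. Prints nothing if that line is absent.
count_voting_disks() {
  sed -n 's/^Located \([0-9][0-9]*\) voting disk(s)\..*/\1/p'
}
```

It could be used as `./crsctl query css votedisk | count_voting_disks`, so that a script can compare the result against the number of voting disks it expects.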
[root@12crac1 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
Total space (kbytes)
Used space (kbytes)
Available space (kbytes) :
Device/File Name
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Check the current OCR backups
[root@12crac1 bin]# ./ocrconfig -showbackup
/u01/app/12.1.0/grid/cdata/scan12c/backup00.ocr
/u01/app/12.1.0/grid/cdata/scan12c/backup01.ocr
/u01/app/12.1.0/grid/cdata/scan12c/backup02.ocr
/u01/app/12.1.0/grid/cdata/scan12c/day.ocr
/u01/app/12.1.0/grid/cdata/scan12c/week.ocr
/u01/app/12.1.0/grid/cdata/scan12c/backup_856.ocr
/u01/app/12.1.0/grid/cdata/scan12c/backup_940.ocr
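To pick a file for a later restore, it can help to reduce this listing to bare paths. A small sketch, assuming each backup path ends in `.ocr` and contains no spaces; `list_ocr_backups` is a hypothetical name, and `grep -o` is assumed to be available (it is on GNU and BSD grep).

```shell
# list_ocr_backups (hypothetical helper): reads `ocrconfig -showbackup`
# output on stdin and prints only the backup file paths, one per line.
# Any other columns or messages on the same lines are dropped.
list_ocr_backups() {
  grep -o '/[^[:space:]]*\.ocr'
}
```

For example, `./ocrconfig -showbackup | list_ocr_backups` would feed a loop that checks each file still exists before it is needed.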
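A manual backup can be taken as follows (with -local, the command backs up the node-local OLR, which is why the files listed are .olr)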
[root@12crac1 bin]# ./ocrconfig -local -manualbackup
/u01/app/12.1.0/grid/cdata/12crac1/backup_510.olr
/u01/app/12.1.0/grid/cdata/12crac1/backup_939.olr
Check the status of the RAC resources
[grid@12crac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
ora.RACCRS.dg
ora.RACDATA.dg
ora.RACFRA.dg
Started,STABLE
Started,STABLE
ora.net1.network
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.12crac1.vip
ora.12crac2.vip
ora.LISTENER_SCAN1.lsnr
ora.LISTENER_SCAN2.lsnr
ora.LISTENER_SCAN3.lsnr
ora.MGMTLSNR
169.254.88.173 192.1
68.80.150,STABLE
ora.luocs12c.db
Open,STABLE
Open,STABLE
ora.mgmtdb
Open,STABLE
ora.scan1.vip
ora.scan2.vip
ora.scan3.vip
--------------------------------------------------------------------------------
Back up the disk group metadata with the ASMCMD md_backup command, which also shows what is stored in the disk group.
[root@12crac2 ~]# su - grid
[grid@12crac2 ~]$ asmcmd -p
ASMCMD [+] > md_backup /home/grid/ocrvote.bak -G RACCRS
Disk group metadata to be backed up: RACCRS
Current alias directory path: scan12c
Current alias directory path: ASM
Current alias directory path: _MGMTDB/CONTROLFILE
Current alias directory path: _MGMTDB/TEMPFILE
Current alias directory path: ASM/PASSWORD
Current alias directory path: _MGMTDB/ONLINELOG
Current alias directory path: _MGMTDB
Current alias directory path: scan12c/OCRFILE
Current alias directory path: _MGMTDB/DATAFILE
Current alias directory path: scan12c/ASMPARAMETERFILE
Current alias directory path: _MGMTDB/PARAMETERFILE
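-- This shows that in Oracle 12c RAC the disk group holding the OCR contains considerably more files, including the _MGMTDB files and the ASM password file.
For comparison, here is the content of the OCR disk group in 11g RAC
ASMCMD [+] > md_backup /home/grid/ocrvote.bak -G hk_crs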
Disk group metadata to be backed up: HK_CRS
Current alias directory path: racscan/OCRFILE
Current alias directory path: racscan
Current alias directory path: racscan/ASMPARAMETERFILE
The OCR contents can also be exported
[root@12crac1 bin]# ./ocrconfig -export /home/grid/ocr.bak
None of the following attempts can delete the OCR while it is in use
ASMCMD [+] > rm -rf /raccrs/scan12c/ocrfile
ORA-29261: bad argument
ORA-15178: directory 'ocrfile' cannot drop this directory
ORA-15028: ASM file '+RACCRS.255.' currently being accessed
ORA-06512: at line 4 (DBD ERROR: OCIStmtExecute)
ASMCMD [+] > cd /raccrs/scan12c/ocrfile
ASMCMD [+raccrs/scan12c/ocrfile] > ls
REGISTRY.255.
ASMCMD [+raccrs/scan12c/ocrfile] > rm -rf REGISTRY.255.
ORA-15032: not all alterations performed
ORA-15028: ASM file '+raccrs/scan12c/ocrfile/REGISTRY.255.' currently being accessed (DBD ERROR: OCIStmtExecute)
So instead we corrupt the device file that holds the OCR
[root@12crac1 bin]# dd if=/dev/zero of=/dev/sdg bs=1024k count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0. s, 312 MB/s
Then stop the cluster:
[root@12crac1 bin]# ./crsctl stop has
[root@12crac2 bin]# ./crsctl stop has -f
crsctl stop crs [-f] works as well
Try to start clusterware; it fails to come up properly
[root@12crac1 bin]# ./crsctl start has
CRS-4123: Oracle High Availability Services has been started.
[grid@12crac1 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
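The difference between a healthy and a broken stack can be checked mechanically. The sketch below assumes that a healthy `crsctl check crs` prints the four "is online" messages CRS-4638, CRS-4537, CRS-4529 and CRS-4533, as it does later in this experiment; the function name `crs_stack_online` is hypothetical.

```shell
# crs_stack_online (hypothetical helper): reads `crsctl check crs` output on
# stdin and succeeds only if all four "is online" message codes are present.
# Any missing code (for example CRS-4535 printed instead of CRS-4537)
# makes it return non-zero.
crs_stack_online() {
  crs_out=$(cat)
  for code in CRS-4638 CRS-4537 CRS-4529 CRS-4533; do
    printf '%s\n' "$crs_out" | grep -q "^$code:" || return 1
  done
}
```

A call such as `crsctl check crs | crs_stack_online || echo "stack not fully online"` would flag the state shown above, where only Oracle High Availability Services is online.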
Check the cluster alert log:
23:46:01.413:
[ohasd(18692)]CRS-0714:Oracle Clusterware Release 12.1.0.1.0 - Production Copyright
Oracle. All rights reserved.
23:46:01.451:
[ohasd(18692)]CRS-2112:The OLR service started on node 12crac1.
23:46:01.494:
[ohasd(18692)]CRS-1301:Oracle High Availability Service started on node 12crac1.
23:46:01.498:
[ohasd(18692)]CRS-8017:location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
23:46:10.768:
[gpnpd(19041)]CRS-2328:GPNPD started on node 12crac1.
23:46:42.712:
[cssd(19212)]CRS-1713:CSSD daemon is started in hub mode
23:46:43.221:
[cssd(19212)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/12.1.0/grid/log/12crac1/cssd/ocssd.log
23:46:44.142:
[ohasd(18692)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
23:46:44.143:
[ohasd(18692)]CRS-2769:Unable to failover resource 'ora.diskmon'.
23:46:58.280:
[cssd(19212)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/12.1.0/grid/log/12crac1/cssd/ocssd.log
Check the /u01/app/12.1.0/grid/log/12crac1/cssd/ocssd.log log
23:48:13.450: [ GPNP][]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2207] get-profile call to url "ipc://GPNPD_12crac1" disco "" [f=0 claimed- host: cname: cguid: cli:gpnp p:19212 role: seq: ep: auth: diag:[]]
23:48:13.476: [ GPNP][]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2360] Result: (0) CLSGPNP_OK. Successful get-profile CALL to remote "ipc://GPNPD_12crac1" disco ""
23:48:13.477: [ CSSD][]clssnmReadDiscoveryProfile: voting file discovery string(/dev/asm*)
23:48:13.478: [ CSSD][]clssnmvDDiscThread: using discovery string /dev/asm* for initial discovery
23:48:13.478: [ SKGFD][]Discovery with str:/dev/asm*:
23:48:13.478: [ SKGFD][]UFS discovery with :/dev/asm*:
23:48:13.491: [ SKGFD][]Fetching UFS disk :/dev/asm-data:
23:48:13.491: [ SKGFD][]Fetching UFS disk :/dev/asm-fra:
23:48:13.492: [ SKGFD][]Fetching UFS disk :/dev/asm-crs:
23:48:13.492: [ SKGFD][]Fetching UFS disk :/dev/asm-extcrs:
23:48:13.492: [ SKGFD][]Fetching UFS disk :/dev/asm:
23:48:13.492: [ SKGFD][]OSS discovery with :/dev/asm*:
23:48:13.495: [ SKGFD][]Handle 0x7f8c from lib :UFS:: for disk :/dev/asm-data:
23:48:13.498: [ SKGFD][]Handle 0x7f8c from lib :UFS:: for disk :/dev/asm-fra:
23:48:13.500: [ SKGFD][]Handle 0x7f8c from lib :UFS:: for disk :/dev/asm-crs:
23:48:13.501: [ SKGFD][]Handle 0x7f8c from lib :UFS:: for disk :/dev/asm-extcrs:
23:48:13.501: [ SKGFD][]Lib :UFS:: closing handle 0x7f8c for disk :/dev/asm-data:
23:48:13.501: [ SKGFD][]Lib :UFS:: closing handle 0x7f8c for disk :/dev/asm-fra:
23:48:13.501: [ SKGFD][]Lib :UFS:: closing handle 0x7f8c for disk :/dev/asm-crs:
23:48:13.502: [ SKGFD][]Lib :UFS:: closing handle 0x7f8c for disk :/dev/asm-extcrs:
23:48:13.503: [ CSSD][]clssnmvDiskVerify: Successful discovery of 0 disks
23:48:13.503: [ CSSD][]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
23:48:13.503: [ CSSD][]clssnmvFindInitialConfigs: No voting files found
23:48:13.503: [ CSSD][](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds
-- The log shows errors indicating that no voting disk could be found.
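Rather than reading the whole log by eye, the two telltale messages can be searched for directly. This is a rough sketch that assumes the message texts quoted above (`CRS-1714` in the alert log, `clssnmvFindInitialConfigs: No voting files found` in ocssd.log); the function name `voting_discovery_failed` is hypothetical.

```shell
# voting_discovery_failed (hypothetical helper): succeeds if the log text on
# stdin contains either of the "no voting files" markers seen in this
# experiment, and fails otherwise.
voting_discovery_failed() {
  grep -q -e 'clssnmvFindInitialConfigs: No voting files found' \
          -e 'CRS-1714'
}
```

For instance, `voting_discovery_failed < /u01/app/12.1.0/grid/log/12crac1/cssd/ocssd.log && echo "voting files not discovered"` would confirm the diagnosis on this node.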
Next, shut the cluster down and attempt the recovery.
[root@12crac1 bin]# ./crsctl stop has -f
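The steps for recovering a corrupted OCR and voting disk are roughly as follows:
1) Stop clusterware on all nodes
# crsctl stop crs
# crsctl stop crs -f
2) As root, start clusterware in exclusive mode on one node
# crsctl start crs -excl -nocrs
Note: if crsd turns out to be running, stop it with the following command.
# crsctl stop resource ora.crsd -init
3) Create a new disk group for the OCR and voting disk, with the same name as the original (to use a different location, edit /etc/oracle/ocr.loc)
Note: if the disk group cannot be created, one troubleshooting approach is to drop the old disk group first
SQL> drop diskgroup disk_group_name force including contents;
4) Restore the OCR and check it
# ocrconfig -restore file_name
# ocrcheck
5) Restore the voting disk and check it
# crsctl replace votedisk +asm_disk_group
# crsctl query css votedisk
6) Stop the clusterware running in exclusive mode
# crsctl stop crs -f
7) Start clusterware normally on all nodes
# crsctl start crs
8) Verify OCR integrity on all RAC nodes with CVU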
$ cluvfy comp ocr -n all -verbose
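The steps above can be collected into a checklist generator that prints the commands without running anything, which is safer than a script that executes them blindly. This is only a sketch: `print_recovery_plan` and its two arguments are hypothetical, step 3 stays a manual reminder because it is done in sqlplus, and the plan uses `ocrconfig -restore` with a physical backup rather than an import.

```shell
# print_recovery_plan (hypothetical helper): prints, in order, the commands
# for the recovery steps listed above. It executes none of them.
#   $1 = OCR backup file to restore
#   $2 = ASM disk group name, without the leading "+"
print_recovery_plan() {
  backup=$1
  dg=$2
  cat <<EOF
# step 1 (root, every node)
crsctl stop crs -f
# step 2 (root, one node only)
crsctl start crs -excl -nocrs
# step 3 (grid, sqlplus / as sysasm): recreate disk group $dg if it is missing
# step 4 (root)
ocrconfig -restore $backup
ocrcheck
# step 5 (root)
crsctl replace votedisk +$dg
crsctl query css votedisk
# step 6 (root, same node as step 2)
crsctl stop crs -f
# step 7 (root, every node)
crsctl start crs
# step 8 (grid)
cluvfy comp ocr -n all -verbose
EOF
}
```

Calling `print_recovery_plan /u01/app/12.1.0/grid/cdata/scan12c/backup00.ocr RACCRS` produces a plan matching this environment, which can then be reviewed and run one line at a time.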
Now for the actual run. Start clusterware in exclusive mode
[root@12crac1 bin]# ./crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2673: Attempting to stop 'ora.drivers.acfs' on '12crac2'
CRS-2677: Stop of 'ora.drivers.acfs' on '12crac2' succeeded
CRS-2672: Attempting to start 'ora.evmd' on '12crac2'
CRS-2672: Attempting to start 'ora.mdnsd' on '12crac2'
CRS-2676: Start of 'ora.evmd' on '12crac2' succeeded
CRS-2676: Start of 'ora.mdnsd' on '12crac2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on '12crac2'
CRS-2676: Start of 'ora.gpnpd' on '12crac2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on '12crac2'
CRS-2672: Attempting to start 'ora.gipcd' on '12crac2'
CRS-2676: Start of 'ora.cssdmonitor' on '12crac2' succeeded
CRS-2676: Start of 'ora.gipcd' on '12crac2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on '12crac2'
CRS-2672: Attempting to start 'ora.diskmon' on '12crac2'
CRS-2676: Start of 'ora.diskmon' on '12crac2' succeeded
CRS-2676: Start of 'ora.cssd' on '12crac2' succeeded
CRS-2672: Attempting to start 'ora.drivers.acfs' on '12crac2'
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on '12crac2'
CRS-2672: Attempting to start 'ora.ctssd' on '12crac2'
CRS-2676: Start of 'ora.drivers.acfs' on '12crac2' succeeded
CRS-2676: Start of 'ora.ctssd' on '12crac2' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on '12crac2' succeeded
CRS-2672: Attempting to start 'ora.asm' on '12crac2'
CRS-2676: Start of 'ora.asm' on '12crac2' succeeded
Log in to sqlplus as the grid user and create the ASM disk group
[grid@12crac2 ~]$ sqlplus / as sysasm
SQL*Plus: Release 12.1.0.1.0 Production on Sun Jul 21 00:11:46 2013
Copyright (c) , Oracle.
All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> create diskgroup raccrs external redundancy disk '/dev/asm-extcrs' attribute 'compatible.asm' = '12.1.0.0.0';
Diskgroup created.
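Restore the OCR with ocrconfig (the transcript shows both an import of the earlier export and a restore from an automatic backup)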
[root@12crac1 bin]# ./ocrconfig -import /home/grid/ocr.bak
[root@12crac1 bin]# ./ocrconfig -restore /u01/app/12.1.0/grid/cdata/scan12c/backup00.ocr
Check the voting disk; none is found at this point
[root@12crac1 bin]# ./crsctl query css votedisk
Located 0 voting disk(s).
Restore the voting disk; the following problem may occur
[root@12crac1 bin]# ./crsctl replace votedisk +RACCRS
CRS-4602: Failed 27 to add voting file bf4f07bf313dc5a8f4c58a.
Failed to replace voting disk group with +RACCRS.
CRS-4000: Command Replace failed, or completed with errors.
Fixing this requires resetting the ASM parameter and restarting ASM.
[grid@12crac1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 12.1.0.1.0 Production on Sun Jul 21 00:40:01 2013
Copyright (c) , Oracle.
All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> alter system set asm_diskstring='/dev/asm*';
System altered.
SQL> creat
File created.
ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance
ASM instance started
Total System Global Area
Fixed Size
2297344 bytes
Variable Size
ASM diskgroups mounted
ASM diskgroups volume enabled
Restore the voting disk again
[root@12crac1 bin]# ./crsctl replace votedisk +RACCRS
Successful addition of voting disk 1499cddff03a4f86bffebcb1.
Successfully replaced voting disk group with +RACCRS.
CRS-4266: Voting file(s) successfully replaced
[root@12crac1 bin]# ./crsctl query css votedisk
File Universal Id                File Name Disk group
-----------------                --------- ---------
1499cddff03a4f86bffebcb1 (/dev/asm-extcrs) [RACCRS]
Located 1 voting disk(s).
Leave exclusive mode:
[root@12crac1 bin]# ./crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on '12crac1'
CRS-2673: Attempting to stop 'ora.ctssd' on '12crac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on '12crac1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on '12crac1'
CRS-2673: Attempting to stop 'ora.gpnpd' on '12crac1'
CRS-2677: Stop of 'ora.drivers.acfs' on '12crac1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on '12crac1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on '12crac1' succeeded
CRS-2677: Stop of 'ora.ctssd' on '12crac1' succeeded
CRS-2673: Attempting to stop 'ora.evmd' on '12crac1'
CRS-2673: Attempting to stop 'ora.asm' on '12crac1'
CRS-2677: Stop of 'ora.evmd' on '12crac1' succeeded
CRS-2677: Stop of 'ora.asm' on '12crac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on '12crac1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on '12crac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on '12crac1'
CRS-2677: Stop of 'ora.cssd' on '12crac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on '12crac1'
CRS-2677: Stop of 'ora.gipcd' on '12crac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on '12crac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Start all nodes normally:
[root@12crac1 bin]# ./crsctl start has
CRS-4123: Oracle High Availability Services has been started.
[root@12crac2 bin]# ./crsctl start has
CRS-4123: Oracle High Availability Services has been started.
Check the clusterware status
[grid@12crac1 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
[grid@12crac2 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
Check the status of all resources
[grid@12crac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
ora.RACCRS.dg
ora.RACDATA.dg
ora.RACFRA.dg
Started,STABLE
Started,STABLE
ora.net1.network
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.12crac1.vip
ora.12crac2.vip
ora.LISTENER_SCAN1.lsnr
ora.LISTENER_SCAN2.lsnr
ora.LISTENER_SCAN3.lsnr
ora.MGMTLSNR
169.254.171.71 192.1
68.80.154,STABLE
ora.luocs12c.db
Open,STABLE
Open,STABLE
ora.mgmtdb
Instance Shutdown,ST
ora.scan1.vip
ora.scan2.vip
ora.scan3.vip
--------------------------------------------------------------------------------
Here mgmtdb has not started properly; trying to start it manually fails.
[grid@12crac1 ~]$ srvctl start mgmtdb
PRCR-1079 : Failed to start resource ora.mgmtdb
CRS-5017: The resource action "ora.mgmtdb start" encountered the following error:
ORA-01078: failure in processing system parameters
LRM-00109: could not open parameter file '/u01/app/12.1.0/grid/dbs/init-MGMTDB.ora'
. For details refer to "(:CLSN00107:)" in "/u01/app/12.1.0/grid/log/12crac2/agent/crsd/oraagent_grid/oraagent_grid.log".
CRS-2674: Start of 'ora.mgmtdb' on '12crac2' failed
CRS-5017: The resource action "ora.mgmtdb start" encountered the following error:
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+RACCRS/_mgmtdb/spfile-MGMTDB.ora'
ORA-17503: ksfdopn:2 Failed to open file +RACCRS/_mgmtdb/spfile-MGMTDB.ora
ORA-15056: additional error message
ORA-17503: ksfdopn:2 Failed to open file +RACCRS/_mgmtdb/spfile-mgmtdb.ora
ORA-15173: entry '_mgmtdb' does not exist in directory '/'
ORA-06512: at line 4
. For details refer to "(:CLSN00107:)" in "/u01/app/12.1.0/grid/log/12crac1/agent/crsd/oraagent_grid/oraagent_grid.log".
CRS-2674: Start of 'ora.mgmtdb' on '12crac1' failed
CRS-2632: There are no more servers to try to place resource 'ora.mgmtdb' on that would satisfy its placement policy
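-- The cause is that after we ran dd against the ASM disk group holding the OCR, the _MGMTDB files stored in it were damaged and lost as well. The error messages show that the parameter file cannot be found.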
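With an error stack this long, listing the distinct ORA codes makes the root cause easier to spot. A small sketch, assuming the error text is available on stdin; `ora_codes` is a hypothetical name and `grep -o` is assumed to be available.

```shell
# ora_codes (hypothetical helper): prints each distinct ORA-nnnnn code found
# in the text on stdin, once, in order of first appearance.
ora_codes() {
  grep -o 'ORA-[0-9][0-9]*' | awk '!seen[$0]++'
}
```

Fed the error text above, it reduces the stack to six codes, of which ORA-15173 (entry '_mgmtdb' does not exist in directory '/') is the one that points at the missing files in the recreated disk group.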
Check the mgmtdb configuration
[grid@12crac1 ~]$ srvctl config mgmtdb -all -verbose
Database unique name: _mgmtdb
Database name:
Oracle home: /u01/app/12.1.0/grid
Oracle user: grid
Spfile: +RACCRS/_mgmtdb/spfile-MGMTDB.ora
Password file:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Database instance: -MGMTDB
Type: Management
Database is enabled
I do not yet know how to repair mgmtdb, so I simply removed it
[grid@12crac1 ~]$ srvctl remove mgmtdb
Remove the database _mgmtdb? (y/[n]) y
[grid@12crac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
ora.RACCRS.dg
ora.RACDATA.dg
ora.RACFRA.dg
Started,STABLE
Started,STABLE
ora.net1.network
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.12crac1.vip
ora.12crac2.vip
ora.LISTENER_SCAN1.lsnr
ora.LISTENER_SCAN2.lsnr
ora.LISTENER_SCAN3.lsnr
ora.MGMTLSNR
169.254.171.71 192.1
68.80.154,STABLE
ora.luocs12c.db
Open,STABLE
Open,STABLE
ora.scan1.vip
ora.scan2.vip
ora.scan3.vip
--------------------------------------------------------------------------------
Backing up, restoring, and repairing mgmtdb still needs further study; this experiment ends here.