Let’s fuck up the cluster by zeroing the first 256 MB of the GRID disk:
[root@rico1 ~]# dd if=/dev/zero of=/dev/asm_grid bs=1M count=256
256+0 records in
256+0 records out
268435456 bytes (268 MB) copied, 0.521837 s, 514 MB/s
After this operation, the resources managed by OHAS end up looking something like this:
[root@rico1 ~]# crsctl stat res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE                               Instance Shutdown,STABLE
ora.cluster_interconnect.haip
      1        ONLINE  OFFLINE                               STABLE
ora.crf
      1        ONLINE  OFFLINE                               STABLE
ora.crsd
      1        ONLINE  OFFLINE                               STABLE
ora.cssd
      1        ONLINE  OFFLINE      rico1                    STARTING
ora.cssdmonitor
      1        ONLINE  ONLINE       rico1                    STABLE
ora.ctssd
      1        ONLINE  OFFLINE                               STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.evmd
      1        ONLINE  INTERMEDIATE rico1                    STABLE
ora.gipcd
      1        ONLINE  ONLINE       rico1                    STABLE
ora.gpnpd
      1        ONLINE  ONLINE       rico1                    STABLE
ora.mdnsd
      1        ONLINE  ONLINE       rico1                    STABLE
ora.storage
      1        ONLINE  OFFLINE                               STABLE
--------------------------------------------------------------------------------
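To pick out the broken resources from a listing like this without reading every row, a small filter can help. The following is only a sketch: not_online_resources is a hypothetical helper name, and it assumes the usual multi-line layout of crsctl stat res -t -init, where the resource name sits on its own line and each instance row starts with a number followed by Target and State.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any Oracle tool): reads the output of
# `crsctl stat res -t -init` on stdin and prints every resource whose
# Target is ONLINE while its State is anything other than ONLINE.
not_online_resources() {
  awk '
    # A resource name sits alone on its line and starts with "ora."
    /^ora\./ { name = $1; next }
    # Instance rows look like: <number> <Target> <State> [Server] <details>
    $1 ~ /^[0-9]+$/ && $2 == "ONLINE" && $3 != "ONLINE" { print name }
  '
}
```

Assuming the layout matches, it could be used as: crsctl stat res -t -init | not_online_resources. Resources with an OFFLINE target, such as ora.diskmon above, are deliberately skipped.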
The cssd service will not be able to start, because there are no voting disks:
[root@rico1 ~]# tail -10 /u01/app/oracle/diag/crs/rico1/crs/trace/ocssd.trc
2016-06-10 10:28:32.331227 : CSSD:990865152: clssnmvDiskVerify: Successful discovery of 0 disks
2016-06-10 10:28:32.331229 : CSSD:990865152: clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2016-06-10 10:28:32.331231 : CSSD:990865152: clssnmvFindInitialConfigs: No voting files found
2016-06-10 10:28:32.331302 : CSSD:990865152: (:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds
2016-06-10 10:28:33.270863 : CSSD:1279616768: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
2016-06-10 10:28:33.270876 : CSSD:1279616768: clsssc_CLSFAInit_CB: clsfa fencing not ready yet
2016-06-10 10:28:34.271252 : CSSD:1279616768: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
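Since cssd retries discovery every 15 seconds, the trace fills up quickly and the interesting line is easy to miss. As a rough sketch, the helper below (last_voting_file_error is a made-up name) prints the most recent line carrying the CSSNM00070 tag seen in the trace above; the trace path is whatever your installation uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last line in the given ocssd trace file that
# carries the CSSNM00070 tag ("Voting file not found"), or nothing if the
# tag never appears in the file.
last_voting_file_error() {
  grep -F 'CSSNM00070' "$1" | tail -n 1
}
```

For the trace shown here the call would be: last_voting_file_error /u01/app/oracle/diag/crs/rico1/crs/trace/ocssd.trc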
OK, so let’s try to stop the cluster services:
[root@rico1 ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rico1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rico1'
CRS-2677: Stop of 'ora.mdnsd' on 'rico1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rico1'
CRS-2673: Attempting to stop 'ora.evmd' on 'rico1'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rico1'
CRS-2677: Stop of 'ora.gipcd' on 'rico1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rico1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'rico1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rico1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Now we have to start CRS in exclusive mode, without CRSD (-nocrs), and start fixing things:
[root@rico1 ~]# crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.evmd' on 'rico1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'rico1'
CRS-2676: Start of 'ora.mdnsd' on 'rico1' succeeded
CRS-2676: Start of 'ora.evmd' on 'rico1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rico1'
CRS-2676: Start of 'ora.gpnpd' on 'rico1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rico1'
CRS-2672: Attempting to start 'ora.gipcd' on 'rico1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rico1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'rico1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rico1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rico1'
CRS-2676: Start of 'ora.diskmon' on 'rico1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rico1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rico1'
CRS-2672: Attempting to start 'ora.ctssd' on 'rico1'
CRS-2676: Start of 'ora.ctssd' on 'rico1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rico1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rico1'
CRS-2676: Start of 'ora.asm' on 'rico1' succeeded
Of course, in this situation KFED will not be helpful, because the backup copy of the disk header that kfed repair relies on was overwritten as well 🙂
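[root@rico1 ~]# kfed repair /dev/asm_grid
KFED-00320: Invalid block num1 = [0], num2 = [1], error = [endian_kfbh]
So now we have to recreate the GRID diskgroup, the ASM spfile and the password file. First, set the ASM discovery string and diskgroup list, then check how ASM sees the disks: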
SQL> alter system
  2  set asm_diskstring='/dev/asm*';
SQL> ed
Wrote file afiedt.buf
  1  alter system
  2* set asm_diskgroups='GRID','DATA'
SQL> /
System altered.
SQL> ;
  1* select path, header_status from v$asm_disk
SQL> /
PATH                           HEADER_STATU
------------------------------ ------------
/dev/asm_data                  MEMBER
/dev/asm_grid                  CANDIDATE
Let’s recreate our GRID diskgroup:
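Before recreating anything it is worth double-checking that only the wiped disk shows up as CANDIDATE and that the DATA disk is still a MEMBER. The sketch below filters the path/header_status rows of a query result like the one above; disks_with_status is a hypothetical helper, and it assumes one disk per line with the path in the first column and the status in the second.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "PATH HEADER_STATUS" rows on stdin (as printed by
# the v$asm_disk query) and prints the paths whose status equals $1.
# Header and separator lines are ignored because they do not start with "/".
disks_with_status() {
  local status=$1
  awk -v want="$status" '$1 ~ /^\// && $2 == want { print $1 }'
}
```

For the result shown here, feeding the rows into disks_with_status CANDIDATE would be expected to print only /dev/asm_grid.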
SQL> ed
Wrote file afiedt.buf
  1  create diskgroup grid
  2  external redundancy
  3  disk '/dev/asm_grid'
  4  attribute 'compatible.asm'='12.1.0.2',
  5* 'compatible.rdbms'='12.1.0.2'
SQL> /
Diskgroup created.
Now we have to recreate the SPFILE for ASM. The first step is to create a simple pfile:
[oracle@rico1 ~]$ cd $ORACLE_HOME/dbs
[oracle@rico1 dbs]$ vim init+ASM1.ora
[oracle@rico1 dbs]$ cat !$
cat init+ASM1.ora
*.asm_diskgroups='GRID'
*.asm_diskgroups='DATA'
*.asm_diskstring='/dev/asm*'
Next we can create the spfile:
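If you would rather not type the pfile by hand in the middle of an outage, the same three lines can be generated. This is just a sketch: write_asm_pfile is a hypothetical helper, and it writes one asm_diskgroups line per diskgroup followed by asm_diskstring, mirroring the file shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a minimal ASM pfile.
#   $1      target file (for example init+ASM1.ora)
#   $2      value for asm_diskstring
#   $3...   diskgroup names, one asm_diskgroups line each
write_asm_pfile() {
  local file=$1 diskstring=$2
  shift 2
  local dg
  {
    for dg in "$@"; do
      printf "*.asm_diskgroups='%s'\n" "$dg"
    done
    printf "*.asm_diskstring='%s'\n" "$diskstring"
  } > "$file"
}
```

Called as write_asm_pfile init+ASM1.ora '/dev/asm*' GRID DATA from $ORACLE_HOME/dbs, it should produce the same content as the hand-written file.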
SQL> create spfile='+GRID' from pfile;
File created.
SQL> !rm init+ASM1.ora
And the password file:
[oracle@rico1 dbs]$ orapwd file=+GRID password=oracle asm=yes
We are now ready to restore the OCR. Remember to restore from the newest backup:
[root@rico1 ~]# ocrconfig -restore /u01/app/12.1.0/grid/cdata/rico-cluster/backup_20160610_101746.ocr
[root@rico1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :       1460
         Available space (kbytes) :     408108
         ID                       :  115130541
         Device/File Name         :      +GRID
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded
Now we can create a new voting disk:
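Picking the newest backup by eye is easy to get wrong when the directory holds many files. The sketch below assumes the backups follow the backup_YYYYMMDD_HHMMSS.ocr naming seen in the restore command above, so that the lexicographically greatest name is also the newest; newest_ocr_backup is a hypothetical helper, and backups with other names are not considered.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the newest backup_YYYYMMDD_HHMMSS.ocr file in the
# directory given as $1. Relies on the timestamp in the file name sorting
# lexicographically. Returns non-zero if no matching file exists.
newest_ocr_backup() {
  local dir=$1 f newest=""
  for f in "$dir"/backup_*.ocr; do
    [ -e "$f" ] || continue   # the glob matched nothing
    if [[ -z $newest || ${f##*/} > ${newest##*/} ]]; then
      newest=$f
    fi
  done
  [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

With the directory from this post, newest_ocr_backup /u01/app/12.1.0/grid/cdata/rico-cluster would print the path to pass to ocrconfig -restore. Review the result before restoring; the file name is only a proxy for the backup's age.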
[root@rico1 ~]# crsctl query css votedisk
Located 0 voting disk(s).
[root@rico1 ~]# crsctl replace votedisk +GRID
Successful addition of voting disk 62a6bea00e4e4f01bf3ed09c345eedba.
Successfully replaced voting disk group with +GRID.
CRS-4266: Voting file(s) successfully replaced
[root@rico1 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   62a6bea00e4e4f01bf3ed09c345eedba (/dev/asm_grid) [GRID]
Located 1 voting disk(s).
Everything looks fine, so it’s time to stop CRS:
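If this procedure ever ends up in a script, the "Located N voting disk(s)." summary line is the simplest thing to check before moving on. A minimal sketch, assuming that line appears on its own as in the real command output; voting_disk_count is a made-up helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `crsctl query css votedisk` output on stdin and
# prints the number from the "Located N voting disk(s)." summary line.
# Prints nothing if no such line is present.
voting_disk_count() {
  sed -n 's/^Located \([0-9][0-9]*\) voting disk(s)\.[[:space:]]*$/\1/p'
}
```

A guard could then look like: [ "$(crsctl query css votedisk | voting_disk_count)" -ge 1 ] before stopping CRS.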
[root@rico1 ~]# crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rico1'
CRS-2673: Attempting to stop 'ora.evmd' on 'rico1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rico1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rico1'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rico1'
CRS-2677: Stop of 'ora.evmd' on 'rico1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rico1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'rico1'
CRS-2677: Stop of 'ora.mdnsd' on 'rico1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'rico1' succeeded
CRS-2677: Stop of 'ora.asm' on 'rico1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rico1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rico1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rico1'
CRS-2677: Stop of 'ora.cssd' on 'rico1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rico1'
CRS-2677: Stop of 'ora.gipcd' on 'rico1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rico1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
And start it in normal mode:
[root@rico1 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
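Note that crsctl start crs returns as soon as OHAS is up, while the rest of the stack keeps starting in the background. A generic polling loop can wait for it. The sketch below is an assumption-laden illustration: wait_for_output is a hypothetical helper that reruns any command until its output matches a pattern, and the pattern to wait for (for example the message crsctl check crs prints once Cluster Ready Services is online) should be verified on your own version.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rerun a command until its output matches a pattern.
#   $1      extended regular expression to look for
#   $2      maximum number of attempts
#   $3      seconds to sleep between attempts
#   $4...   the command to run
# Returns 0 on the first match, 1 if every attempt fails.
wait_for_output() {
  local pattern=$1 tries=$2 delay=$3
  shift 3
  local i out
  for ((i = 0; i < tries; i++)); do
    out=$("$@" 2>&1)
    if grep -Eq -- "$pattern" <<<"$out"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}
```

One possible use, assuming the CRS-4537 message code applies to your release: wait_for_output 'CRS-4537' 60 10 crsctl check crs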
And we’re done 🙂
[root@rico1 ~]# crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       rico1                    STABLE
               ONLINE  ONLINE       rico2                    STABLE
ora.GRID.dg
               ONLINE  ONLINE       rico1                    STABLE
               ONLINE  ONLINE       rico2                    STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       rico1                    STABLE
               ONLINE  ONLINE       rico2                    STABLE
ora.asm
               ONLINE  ONLINE       rico1                    Started,STABLE
               ONLINE  ONLINE       rico2                    Started,STABLE
ora.net1.network
               ONLINE  ONLINE       rico1                    STABLE
               ONLINE  ONLINE       rico2                    STABLE
ora.ons
               ONLINE  ONLINE       rico1                    STABLE
               ONLINE  ONLINE       rico2                    STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rico2                    STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       rico1                    STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       rico1                    STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       rico2                    169.254.26.98 10.0.0.12,STABLE
ora.cvu
      1        ONLINE  ONLINE       rico1                    STABLE
ora.dupa.db
      1        ONLINE  ONLINE       rico1                    Open,STABLE
      2        ONLINE  ONLINE       rico2                    Open,STABLE
ora.mgmtdb
      1        ONLINE  OFFLINE                               STABLE
ora.oc4j
      1        ONLINE  ONLINE       rico1                    STABLE
ora.rico1.vip
      1        ONLINE  ONLINE       rico1                    STABLE
ora.rico2.vip
      1        ONLINE  ONLINE       rico2                    STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       rico2                    STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       rico1                    STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       rico1                    STABLE
--------------------------------------------------------------------------------
The last step is to recreate -MGMTDB, which still shows as OFFLINE above:
[root@rico1 ~]# srvctl remove mgmtdb
Remove the database _mgmtdb? (y/[n]) y
[root@rico1 ~]# su - oracle
[oracle@rico1 ~]$ . oraenv
ORACLE_SID = [oracle] ? +ASM1
The Oracle base has been set to /u01/app/oracle
(reverse-i-search)`': ^C
[oracle@rico1 ~]$ export GI_HOME=$ORACLE_HOME
[oracle@rico1 ~]$ dbca -silent -createDatabase -sid -MGMTDB -createAsContainerDatabase true -templateName MGMTSeed_Database.dbc -gdbName _mgmtdb -storageType ASM -diskGroupName +grid -datafileJarLocation $GI_HOME/assistants/dbca/templates -characterset AL32UTF8 -autoGeneratePasswords -skipUserTemplateCheck
Registering database with Oracle Grid Infrastructure
5% complete
Copying database files
7% complete
9% complete
16% complete
23% complete
30% complete
41% complete
Creating and starting Oracle instance
43% complete
48% complete
49% complete
50% complete
55% complete
60% complete
61% complete
64% complete
Completing Database Creation
68% complete
79% complete
89% complete
100% complete
Look at the log file "/u01/app/oracle/cfgtoollogs/dbca/_mgmtdb/_mgmtdb3.log" for further details.
I hope you won’t have to use this procedure in real life 🙂