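Recovering from the loss of Oracle online redo log files
Today a development database would not start. The error message I was sent showed that a log file was damaged (see the figure below), and then I was told the background: the server lost power before the Chinese New Year and the database had not come up since. Only today, when someone needed it, did anyone think to deal with it.
First, the general approach. If the damaged redo log is INACTIVE, meaning instance crash recovery does not need it, the fix is fairly easy: rebuild the log group with alter database clear logfile group #; or, if the group has not been archived yet, alter database clear unarchived logfile group #;. After rebuilding the log group, take a full backup of the database, especially after a forced clear of an unarchived log, because that leaves a gap in the archived log sequence. If the damaged redo log is ACTIVE or CURRENT, meaning instance crash recovery needs it, things are much harder: losing such a redo log means losing data.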
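The three redo log states:
- INACTIVE: the changes recorded in the log have already been written to disk, so the log is not needed for crash recovery
- ACTIVE: the changes recorded in the log have not yet been written to disk, so the log is still needed for crash recovery
- CURRENT: the log file the instance is currently writing to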
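The decision described above (clear an INACTIVE group, fall back to incomplete recovery for an ACTIVE or CURRENT one) can be written down as a small helper. This is only a sketch: the function name plan_for_lost_log_group is made up for illustration, and it assumes you have already read the STATUS and ARCHIVED columns of v$log for the damaged group yourself. It only returns the statement to consider; it does not connect to a database.

```python
def plan_for_lost_log_group(group: int, status: str, archived: bool) -> str:
    """Suggest the next step for a lost online redo log group.

    `status` and `archived` are meant to mirror the STATUS and ARCHIVED
    columns of v$log (hypothetical helper, not part of any Oracle tool).
    """
    status = status.strip().upper()
    if status == "INACTIVE":
        # Not needed for crash recovery, so the group can be re-created.
        if archived:
            return f"alter database clear logfile group {group};"
        # Not archived yet: clearing it leaves a gap in the archived logs,
        # so a full backup afterwards is advisable.
        return f"alter database clear unarchived logfile group {group};"
    if status in ("ACTIVE", "CURRENT"):
        # Needed for crash recovery: a plain clear fails with ORA-01624.
        return "incomplete recovery, then open resetlogs (expect data loss)"
    raise ValueError(f"unhandled log status: {status!r}")


if __name__ == "__main__":
    print(plan_for_lost_log_group(1, "INACTIVE", True))
    print(plan_for_lost_log_group(2, "ACTIVE", False))
```

Other v$log states (for example UNUSED) are deliberately rejected rather than guessed at.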
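Because this development database had all sorts of problems, its recovery ran into various complications. So here I use a database on a virtual machine to demonstrate how to recover when a CURRENT or ACTIVE log file is damaged.
1. Set up the scenario
Delete the rows of a table without committing, then shutdown abort the database from another session. After that, delete all the redo log files.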
# session 1
sys@ORCL> delete from zx;
2858 rows deleted.

# session 2
sys@ORCL> select group#,status from v$log;
    GROUP# STATUS
---------- ------------------------------------------------
         1 INACTIVE
         2 ACTIVE
         3 CURRENT
sys@ORCL> shutdown abort;
ORACLE instance shut down.

# delete the redo log files
[oracle@rhel6 ~]$ cd /u02/app/oracle/oradata/orcl/
[oracle@rhel6 orcl]$ ls -l
total 1944992
-rw-r----- 1 oracle oinstall   9748480 Feb 24 23:56 control01.ctl
-rw-r----- 1 oracle oinstall   9748480 Feb 24 23:56 control02.ctl
-rw-r----- 1 oracle oinstall 328343552 Feb 24 23:54 example01.dbf
-rw-r----- 1 oracle oinstall  52429312 Feb 24 23:54 redo01.log
-rw-r----- 1 oracle oinstall  52429312 Feb 24 23:55 redo02.log
-rw-r----- 1 oracle oinstall  52429312 Feb 24 23:55 redo03.log
-rw-r----- 1 oracle oinstall 545267712 Feb 24 23:54 sysaux01.dbf
-rw-r----- 1 oracle oinstall 796925952 Feb 24 23:54 system01.dbf
-rw-r----- 1 oracle oinstall  30416896 Feb 24 13:58 temp01.dbf
-rw-r----- 1 oracle oinstall 110108672 Feb 24 23:54 undotbs01.dbf
-rw-r----- 1 oracle oinstall   5251072 Feb 24 23:54 users01.dbf
[oracle@rhel6 orcl]$ rm redo*log
[oracle@rhel6 orcl]$ ls -l
total 1791212
-rw-r----- 1 oracle oinstall   9748480 Feb 24 23:56 control01.ctl
-rw-r----- 1 oracle oinstall   9748480 Feb 24 23:56 control02.ctl
-rw-r----- 1 oracle oinstall 328343552 Feb 24 23:54 example01.dbf
-rw-r----- 1 oracle oinstall 545267712 Feb 24 23:54 sysaux01.dbf
-rw-r----- 1 oracle oinstall 796925952 Feb 24 23:54 system01.dbf
-rw-r----- 1 oracle oinstall  30416896 Feb 24 13:58 temp01.dbf
-rw-r----- 1 oracle oinstall 110108672 Feb 24 23:54 undotbs01.dbf
-rw-r----- 1 oracle oinstall   5251072 Feb 24 23:54 users01.dbf
2. Starting the database fails with an error
idle>startup
ORACLE instance started.
Total System Global Area 1603411968 bytes
Fixed Size 2253664 bytes
Variable Size 1476398240 bytes
Database Buffers 117440512 bytes
Redo Buffers 7319552 bytes
Database mounted.
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: '/u02/app/oracle/oradata/orcl/redo02.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
3. Trying to rebuild the log group with clear also fails
idle>alter database clear logfile group 2;
alter database clear logfile group 2
*
ERROR at line 1:
ORA-01624: log 2 needed for crash recovery of instance orcl (thread 1)
ORA-00312: online log 2 thread 1: '/u02/app/oracle/oradata/orcl/redo02.log'

idle>alter database clear unarchived logfile group 2;
alter database clear unarchived logfile group 2
*
ERROR at line 1:
ORA-01624: log 2 needed for crash recovery of instance orcl (thread 1)
ORA-00312: online log 2 thread 1: '/u02/app/oracle/oradata/orcl/redo02.log'
The error message shows that log 2 is needed for instance crash recovery, so it cannot simply be rebuilt.
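When the same check has to be made against a saved SQL*Plus session or an alert log excerpt, the error stack can be picked apart mechanically. The sketch below is an illustration only: summarize_error_stack is a hypothetical helper, and the regular expressions assume the message layout shown in the output above (ORA-00312 followed by the quoted member path).

```python
import re

# Five-digit ORA error codes, in the order they appear.
ORA_CODE = re.compile(r"ORA-(\d{5})")
# ORA-00312 names the log group, the thread and the member file.
LOG_MEMBER = re.compile(r"ORA-00312: online log (\d+) thread (\d+):\s*'([^']+)'")


def summarize_error_stack(text: str) -> dict:
    """Extract the ORA codes and the affected redo log member from an error stack."""
    codes = ORA_CODE.findall(text)
    member = LOG_MEMBER.search(text)
    return {
        "codes": codes,
        # ORA-01624 means the group is still needed for crash recovery.
        "needed_for_crash_recovery": "01624" in codes,
        "group": int(member.group(1)) if member else None,
        "member": member.group(3) if member else None,
    }


sample = """\
ORA-01624: log 2 needed for crash recovery of instance orcl (thread 1)
ORA-00312: online log 2 thread 1: '/u02/app/oracle/oradata/orcl/redo02.log'
"""

if __name__ == "__main__":
    print(summarize_error_stack(sample))
```

If needed_for_crash_recovery comes back true, neither form of clear logfile will work for that group, which is the situation handled in the next step.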
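4. In this situation, use the hidden parameter _allow_resetlogs_corruption. Create a pfile and add *._allow_resetlogs_corruption=TRUE to it, then mount the database with that pfile, force an incomplete recovery, and open resetlogs.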
idle>create pfile='/home/oracle/initorcl.ora' from spfile;
File created.

[oracle@rhel6 orcl]$ vi /home/oracle/initorcl.ora

idle>shutdown immediate;
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
idle>startup pfile='/home/oracle/initorcl.ora' mount;
ORACLE instance started.
Total System Global Area 1603411968 bytes
Fixed Size 2253664 bytes
Variable Size 1476398240 bytes
Database Buffers 117440512 bytes
Redo Buffers 7319552 bytes
Database mounted.
idle>show parameter _allow_
NAME                                 TYPE                              VALUE
------------------------------------ --------------------------------- ------------------------------
_allow_resetlogs_corruption          boolean                           TRUE
idle>recover database until cancel;
ORA-00279: change 1023441 generated at 02/24/2017 23:54:54 needed for thread 1
ORA-00289: suggestion : /u02/app/oracle/product/11.2.4/db1/dbs/arch1_2_936817668.dbf
ORA-00280: change 1023441 for thread 1 is in sequence #2
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u02/app/oracle/oradata/orcl/system01.dbf'
ORA-01112: media recovery not started
idle>alter database open resetlogs;
Database altered.
idle>select open_mode from v$database;
OPEN_MODE
------------------------------------------------------------
READ WRITE
As you can see, the database is now open.
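The pfile edit in this step was done by hand in vi. If it needs to be repeatable, it could be scripted along the following lines. This is a sketch under assumptions: add_hidden_parameter is a made-up helper that works on the pfile text only, and it assumes the usual name=value layout that create pfile from spfile produces.

```python
PARAM_LINE = "*._allow_resetlogs_corruption=TRUE"


def add_hidden_parameter(pfile_text: str, line: str = PARAM_LINE) -> str:
    """Append `line` to the pfile text unless that parameter is already set."""
    # "*._allow_resetlogs_corruption=TRUE" -> "_allow_resetlogs_corruption"
    name = line.split("=", 1)[0].split(".", 1)[-1].strip().lower()
    for existing in pfile_text.splitlines():
        stripped = existing.strip()
        if stripped.startswith("#"):
            continue  # skip comment lines
        key = stripped.split("=", 1)[0].strip().lower()
        if key.split(".", 1)[-1] == name:
            return pfile_text  # already present, leave the text untouched
    if pfile_text and not pfile_text.endswith("\n"):
        pfile_text += "\n"
    return pfile_text + line + "\n"


if __name__ == "__main__":
    original = "*.db_name='orcl'\n"
    print(add_hidden_parameter(original), end="")
```

Since this is a hidden parameter meant only for getting past a broken open, it is probably wise to remove the line from the pfile again once the data has been rescued.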
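5. Query the table whose rows were deleted in step 1. The rows are still there, so the delete recorded in the lost logs is gone. That delete was never committed and would have been rolled back anyway, but committed changes recorded only in a lost CURRENT or ACTIVE log file are lost in the same way.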
idle>select count(*) from zx;
  COUNT(*)
----------
      2858
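That was the recovery process in the virtual machine test. Recovering the development database mentioned at the beginning was nowhere near as simple: fixing one error just brought up the next.
When running the incomplete recovery with _allow_resetlogs_corruption set, open resetlogs failed with ORA-01248: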
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01248: file 5 was created in the future of incomplete recovery
So I first took that file offline with offline drop:
SQL> alter database datafile 5 offline drop;
The next open resetlogs then failed with ORA-00704 and ORA-01555:
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 5 with name "_SYSSMU5_4116806824$" too small
Process ID: 3396
Session ID: 573 Serial number: 51
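With my current level of skill, and after searching online, I could not resolve this chain of problems. In the end I had no choice but to rebuild the database and reload the data.
If anyone has run into a similar problem and solved it, please share your experience.
In fact, while simulating this problem in the morning, open resetlogs also hit the classic error ORA-600 [2662]. For that error, see eygle's blog: http://www.eygle.com/archives/2005/12/oracle_diagnostics_howto_deal_2662_error.html
References: http://iquicksandi.blog.163.com/blog/static/13228526220107642655204/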
http://www.linuxidc.com/Linux/2012-02/53426.htm
http://www.killdb.com/2014/06/19/%E6%95%B0%E6%8D%AE%E5%BA%93open%E6%8A%A5%E9%94%99ora-01555-snapshot-too-old.html
http://www.askmaclean.com/archives/%E3%80%90oracle%E6%81%A2%E5%A4%8D%E3%80%91ora-704.html
This article is reposted from hbxztc's 51CTO blog. Original link: http://blog.51cto.com/hbxztc/1901100. To repost it, please contact the original author.