常规ORA-01555报错 / 数据库表空间数据文件损坏,无法启动

ORA-01555

[oracle@testosa:/home/oracle]$ oerr ora 01555
01555, 00000, "snapshot too old: rollback segment number %s with name \"%s\" too small"
// *Cause: rollback records needed by a reader for consistent read are
//         overwritten by other writers
// *Action: If in Automatic Undo Management mode, increase undo_retention
//          setting. Otherwise, use larger rollback segments

常规ORA-01555报错

处理方法:

ORA-01555快照过旧的问题,通常可以先去检查undo表空间的大小和undo_retention参数的设置,并进行增大调整的方法进行处理;

  1. 增大undo表空间大小
  2. 增加undo_retention的时间

1、增加undo表空间大小

-- 先查看表空间使用情况
SELECT a.tablespace_name "UNDOTBS1",
a.bytes / 1024 / 1024 "表空间大小(M)",
(a.bytes - b.bytes) / 1024 / 1024 "已使用空间(M)",
b.bytes / 1024 / 1024 "空闲空间(M)",
round(((a.bytes - b.bytes) / a.bytes) * 100, 2) "使用比"
FROM (SELECT tablespace_name, sum(bytes) bytes
FROM dba_data_files
GROUP BY tablespace_name) a,
(SELECT tablespace_name, sum(bytes) bytes, max(bytes) largest
FROM dba_free_space
GROUP BY tablespace_name) b
WHERE a.tablespace_name = b.tablespace_name
ORDER BY ((a.bytes - b.bytes) / a.bytes) DESC;

UNDOTBS1                       表空间大小(M) 已使用空间(M) 空闲空间(M)     使用比
------------------------------ ------------- ------------- ----------- ----------
SYSTEM                                   740        739.25         .75       99.9
SYSAUX                                   530      499.4375     30.5625      94.23
USERS                                      5        1.3125      3.6875      26.25
UNDOTBS1                                  75       16.8125     58.1875      22.42
TEST01                                    50         1.125      48.875       2.25



-- 增加undo表空间
alter tablespace undotbs1 add datafile '/oracle/oradata/testdba/undotbs02.dbf' size 500M autoextend on next 512M maxsize 24G;


-- 再次查询表空间使用情况
UNDOTBS1                       表空间大小(M) 已使用空间(M) 空闲空间(M)     使用比
------------------------------ ------------- ------------- ----------- ----------
SYSTEM                                   740        739.25         .75       99.9
SYSAUX                                   530      499.9375     30.0625      94.33
USERS                                      5        1.3125      3.6875      26.25
UNDOTBS1                                 575        17.625     557.375       3.07
TEST01                                    50         1.125      48.875       2.25

2、增加undo_retention的时间

-- 先查询现有服务器的undo_retention时间
SQL> show parameter undo_retention;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
undo_retention                       integer     10800    -- 10800秒是3小时
SQL>


-- 修改到4小时
SQL> alter system set undo_retention=14400 scope=both;

System altered.



SQL> show parameter undo_retention;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
undo_retention                       integer     14400
SQL>

关于undo_retention官方文档有如下两点说明:

1:For an undo tablespace with the AUTOEXTEND option enabled, the database attempts to honor the minimum retention period specifiedby UNDO_RETENTION. When space is low, instead of overwriting unexpired undo information, the tablespace auto-extends. If the MAXSIZE clause is specified for an auto-extending undo tablespace, when the maximum size is reached, thedatabase may begin to overwrite unexpired undo information.

数据库可以在undo_retention设置的时间里保留undo数据,当空间不足时,如果undo表空间是可以自动扩展的,那是可以自动扩展空间来保留更多的undo数据,而不是覆盖未到期(未到undo_retention设置的时间)的undo数据

2:If the undo tablespace is configured with the AUTOEXTEND option, the database dynamically tunes the undo retention periodto be somewhat longer than the longest-running active query on the system. However,this retention period may be insufficient to accommodate Oracle Flashback operations. Oracle Flashback operations resulting in snapshot too old errors are the indicator that you must intervene to ensure that sufficient undo data isretained to support these operations. To better accommodate Oracle Flashback features, you can either set the undo_retention parameter to a value equal to the longest expected Oracle Flashback operation, or you can change the undo tablespace to fixed size.

如果undo表空间的自动扩展的,那数据库可以动态的调整undo retention的保留时间来支持长查询。Retention保留时间可能不足以支持闪回操作时,会返回快照过旧的错误,必须要保留足够的undo数据来支持闪回查询。最好的建议是设置UNDO_RETENTION参数到一个支持最长查询的时间。

导出lob字段表报ORA-01555和ORA-22924

容易在导出时遇到(无论exp或者expdp)

遇到ORA-01555,如果这个表有lob字段,就应该有有一定警觉

  1. 在导lob字段时,因为无法并行,导出会很慢。

  2. lob字段使用的undo是local undo,是使用一定百分比的lob段空间作为undo使用。默认pctversion 10,即10%,可以字段级动态修改。

    alter table tab modify lob(col_lob) (pctversion 20);
    
    -- 检查 lob undo时间
    select table_name,owner,column_name,pctversion,retention from dba_lobs where table_name= 'TEST_T' and owner='TEST';
    -- 更改字段的retention
    alter table test_t modify lob(col_lob) (retention);
    
  3. lob字段有损坏行的场景。这种情况下也会报错ORA-01555和ORA-22924。需要使用如下sql来查询,然后清理掉lob数据或导出时屏蔽处理。

-- 创建一个记录用表
create table corrupt_lobs (corrupt_rowid rowid, err_num number);

-- 知道表上的lob列
DESC <TABLE_NAME>
Name         Null?     Type
----------   --------- ------------
<COL1>       NOT NULL  NUMBER
<LOB_COLUMN>           BLOB

-- 运行plsql块
-- 表名那里可以写成tab partition(part1)这种形式,以分区规模去查,减少浪费。本质是逐行fetch,然后catch错误打印到记录表。
-- 如果表上有2个lob字段,则要逐个字段来遍历

declare
  error_1578 exception;
  error_1555 exception;
  error_22922 exception;
  pragma exception_init(error_1578,-1578);
  pragma exception_init(error_1555,-1555);
  pragma exception_init(error_22922,-22922);
  n number;
begin
  for cursor_lob in (select rowid r, <LOB_COLUMN> from <TABLE_NAME>) loop
  begin
    n:=dbms_lob.instr(cursor_lob.<LOB_COLUMN>,hextoraw('889911'));
  exception
    when error_1578 then
      insert into corrupt_lobs values (cursor_lob.r, 1578);
      commit;
    when error_1555 then
      insert into corrupt_lobs values (cursor_lob.r, 1555);
      commit;
    when error_22922 then
      insert into corrupt_lobs values (cursor_lob.r, 22922);
      commit;
    end;
  end loop;
end;
/

处理方法1-将有问题的lob行置为空

SQL> update <TABLE_NAME> set <LOB_COLUMN> = empty_blob()
     where rowid in (select corrupt_rowid from corrupt_lobs);
 
( for BLOB and BFILE columns use EMPTY_BLOB; for CLOB and NCLOB columns use EMPTY_CLOB )

处理方法2-导出时直接屏蔽掉该行数据

expdp system/<PASSWORD> DIRECTORY=my_dir DUMPFILE=<dump_name>.dmp LOGFILE=<logfile_name>.log TABLES=<SCHEMA_NAME>.<TABLE_NAME> QUERY=\"WHERE rowid NOT IN \(\'<corrupt_rowid>\'\)\"

参考资料

https://blog.csdn.net/weixin_45618691/article/details/128132821

https://blog.csdn.net/q195136130/article/details/133307347

https://blog.51cto.com/u_13482808/6512776

ORA-1555 Error During Export on LOB Data (文档 ID 1950937.1)
LOBs and ORA-01555 troubleshooting (文档 ID 846079.1)

778004.1

2、cannot identify/lock data file %s - see DBWR trace file

[oracle@testos:/home/oracle]$oerr ora 1157
01157, 00000, "cannot identify/lock data file %s - see DBWR trace file"
// *Cause:  The background process was either unable to find one of the data 
//         files or failed to lock it because the file was already in use.
//         The database will prohibit access to this file but other files will
//         be unaffected. However the first instance to open the database will
//         need to access all online data files. Accompanying error from the
//         operating system describes why the file could not be identified.
// *Action: Have operating system make file available to database. Then either
//         open the database or do ALTER SYSTEM CHECK DATAFILES.
[oracle@testos:/home/oracle]$

场景1

Oracle RAC环境,添加数据文件的时候将数据文件添加到了节点1的本地,没有加到ASM磁盘组里,导致节点2访问这个数据文件的数据时报错,

处理:

迁移到ASM磁盘组即可

场景2

数据库表空间数据文件损坏,无法启动。进行数据库恢复。

现象

#前端报错
ORA-03113: end-of-file on communication channel
Process ID: 10192
Session ID: 19 Serial number: 3301


#alter.log
ORA-63999: 数据文件出现介质故障
ORA-01114: 将块写入文件 156 时出现 IO 错误 (# 2)
ORA-01110: 数据文件 156: 'F:\ORACLE\DATA\TB_SCOTT01.DBF'

处理过程

#1、启动数据库
SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-01157: 无法标识/锁定数据文件 156 - 请参阅 DBWR 跟踪文件
ORA-01110: 数据文件 156: 'F:\ORACLE\DATA\TB_SCOTT01.DBF'


#2、删除故障的数据文件和表空间
SQL> startup nomount;
ORACLE 例程已经启动。

Total System Global Area 3.8655E+10 bytes
Fixed Size                 12295552 bytes
Variable Size            4966056576 bytes
Database Buffers         3.3554E+10 bytes
Redo Buffers              121921536 bytes
SQL> alter database mount;

数据库已更改。

#再删除故障的数据文件,按文件号删除:
SQL> alter database datafile 156 offline drop;

数据库已更改。

#open 数据库,安全起见先把相关的表空间offline ,然后删除。

SQL> alter database open;

数据库已更改。

SQL>
SQL> alter tablespace tb_scott offline immediate ;

表空间已更改。

SQL> drop tablespace tb_scott including contents and datafiles ;

表空间已删除。

为了确保数据库正常,再次关闭数据库,启动数据库,数据库正常。

最近更新

  1. docker php8.1+nginx base 镜像 dockerfile 配置

    2024-07-23 10:08:01       172 阅读
  2. Could not load dynamic library ‘cudart64_100.dll‘

    2024-07-23 10:08:01       190 阅读
  3. 在Django里面运行非项目文件

    2024-07-23 10:08:01       158 阅读
  4. Python语言-面向对象

    2024-07-23 10:08:01       171 阅读

热门阅读

  1. iphone11 如何打开开发者模式?

    2024-07-23 10:08:01       36 阅读
  2. centos7 yum更换国内源【超简洁步骤】

    2024-07-23 10:08:01       33 阅读
  3. Kotlin 继承

    2024-07-23 10:08:01       32 阅读
  4. LeetCode718. 最长重复子数组

    2024-07-23 10:08:01       29 阅读
  5. MySQL的查询优化思路

    2024-07-23 10:08:01       30 阅读
  6. 数据库分表实践

    2024-07-23 10:08:01       37 阅读