
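Overview
Our production environment recently showed the following behavior: an order-scheduling run normally returns a result within 2 seconds, but when several users schedule at the same time it hangs, with no result after more than 15 minutes. Sometimes the run fails outright, which leaves the data inaccurate.

The notes below record how the stall was investigated in production.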
1. Get the ASH report
- SQL> @?/rdbms/admin/ashrpt.sql
- --To specify absolute begin time:
- --[MM/DD[/YY]] HH24:MI[:SS]
- --08/09/19 08:40:00
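
The same report can also be produced without the interactive prompts. The following is a minimal sketch using `DBMS_WORKLOAD_REPOSITORY.ASH_REPORT_TEXT`. The begin time is the one entered above; the end time (15 minutes later) is an assumption chosen to cover the stall and should be adjusted to the real incident window.

```sql
-- Sketch: generate the ASH text report for a fixed window.
-- Begin time comes from the prompt above; the end time is an assumed value.
SELECT r.output
FROM   v$database d,
       v$instance i,
       TABLE(DBMS_WORKLOAD_REPOSITORY.ASH_REPORT_TEXT(
               d.dbid,
               i.instance_number,
               TO_DATE('08/09/19 08:40:00', 'MM/DD/YY HH24:MI:SS'),
               TO_DATE('08/09/19 08:55:00', 'MM/DD/YY HH24:MI:SS'))) r;
```

In SQL*Plus, spooling the output to a file gives the same text report that `ashrpt.sql` writes.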




2. ASH analysis
1. Top User Events

2. Related SQL
Top SQL with Top Events

SQL details

3. Stored procedure

4. Top Sessions
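The analysis above shows two wait events that stand out: `wait for stopper event to be increased` and `wait for a undo record`. The likely cause is that the batch scheduling jobs generated many large transactions, some of which rolled back, and the rollback work consumed a large share of system resources.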
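
The report sections above can be cross-checked by querying the ASH samples directly. This is a sketch only: the time window is the assumed incident window, and `FETCH FIRST` requires Oracle 12c or later (on older releases, wrap the query and filter on `ROWNUM`).

```sql
-- Sketch: count ASH samples per wait event and SQL in the assumed window.
-- Each in-memory ASH sample stands for roughly one second of active time.
SELECT   NVL(event, 'ON CPU') AS event,   -- event is NULL while on CPU
         sql_id,
         COUNT(*)             AS samples
FROM     v$active_session_history
WHERE    sample_time BETWEEN TO_TIMESTAMP('08/09/19 08:40:00', 'MM/DD/YY HH24:MI:SS')
                         AND TO_TIMESTAMP('08/09/19 08:55:00', 'MM/DD/YY HH24:MI:SS')
GROUP BY NVL(event, 'ON CPU'), sql_id
ORDER BY samples DESC
FETCH FIRST 10 ROWS ONLY;
```

If the window has already aged out of memory, the same grouping can be run against `dba_hist_active_sess_history`, where only a subset of samples is retained.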
3. Handling parallel rollback of large transactions
The `wait for stopper event to be increased` wait event usually appears together with the `wait for a undo record` wait event.
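
To check whether a transaction rollback is actually in progress while these waits appear, the fast-start recovery views can be inspected. This is a sketch under the assumption that the rollback is being handled by transaction recovery (SMON and its recovery servers); it shows progress only and changes nothing.

```sql
-- Sketch: progress of transactions currently being recovered (rolled back).
SELECT usn,
       state,
       undoblocksdone,
       undoblockstotal,
       ROUND(100 * undoblocksdone / NULLIF(undoblockstotal, 0), 1) AS pct_done
FROM   v$fast_start_transactions;

-- Sketch: the recovery server processes doing the work.
SELECT state, undoblocksdone, pid
FROM   v$fast_start_servers;
```

Running the first query twice a few minutes apart shows how fast `undoblocksdone` is advancing, which gives a rough idea of how long the rollback will take.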
MetaLink (now My Oracle Support) has a note on the `wait for stopper event to be increased` wait event:
- 464246.1
- Sometimes Parallel Rollback of Large Transaction may become very slow. After killing a large running transaction
- (either by killing the shadow process or aborting the database) then database seems to hang, or smon and parallel query servers
- taking all the available cpu.
- In fast-start parallel rollback, the background process Smon acts as a coordinator and rolls back a set of transactions in parallel
- using multiple server processes. Fast start parallel rollback is mainly useful when a system has transactions that run a long time
- before committing, especially parallel Inserts, Updates, Deletes operations. When Smon discovers that the amount of recovery work is
- above a certain threshold, it automatically begins parallel rollback by dispersing the work among several parallel processes.
- There are cases where parallel transaction recovery is not as fast as serial transaction recovery, because the pq slaves are interfering
- with each other. It looks like the changes made by this transaction cannot be recovered in parallel without causing a performance problem.
- The parallel rollback slave processes are most likely contending for the same resource, which results in even worse rollback performance
- compared to a serial rollback.
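
The excerpt above does not state a fix, but its reasoning (parallel rollback can be slower than serial rollback) points at the `fast_start_parallel_rollback` initialization parameter, which accepts `FALSE`, `LOW`, or `HIGH`. The following sketch only inspects the current setting; the commented-out change to serial rollback is an assumption about a possible workaround, not something confirmed by the text above, and should be evaluated before use in production.

```sql
-- Sketch: check how parallel transaction rollback is currently configured.
SELECT name, value
FROM   v$parameter
WHERE  name = 'fast_start_parallel_rollback';

-- Possible workaround (assumption, not taken from the excerpt above):
-- switch to serial rollback so the recovery servers stop contending.
-- ALTER SYSTEM SET fast_start_parallel_rollback = FALSE;
```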