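The whole table is only 147 MB, yet the statement saturated one CPU, buffer gets exceeded 1G, and enabling parallelism did not help.
Today a developer colleague asked why the rows in a table could not be deleted. I took a look: the table held fewer than 100,000 rows and was referenced by foreign keys, and the delete had been hanging for 5 minutes. Being short on time, I disabled the foreign keys, and the delete then completed instantly.
Now let's reproduce the problem.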
sys@ANBOB>select count(*) from bjhr.doctor_exam_member;
COUNT(*)
----------
92102
sys@ANBOB>delete bjhr.doctor_exam_member;
-- hang
-- still not finished after 10 minutes; check the foreign keys that reference the table
bjhr@ANBOB> SELECT /*+RULE*/ D.CONSTRAINT_NAME pk_name,
  -- d.table_name,
  D.TABLE_NAME || '.' || D.COLUMN_NAME pk_column,
  A.CONSTRAINT_TYPE,
  B.CONSTRAINT_NAME fk_name,
  B.TABLE_NAME || '.' || B.COLUMN_NAME fk_column
  FROM user_constraints a
  JOIN user_cons_columns b
    ON a.constraint_name = b.constraint_name AND a.owner = b.owner
  JOIN user_constraints c
    ON A.R_CONSTRAINT_NAME = C.CONSTRAINT_NAME AND A.R_OWNER = c.owner
  JOIN user_cons_columns d
    ON c.constraint_name = d.constraint_name AND c.owner = d.owner
  WHERE D.table_name = 'DOCTOR_EXAM_MEMBER'
bjhr@ANBOB>/

PK_NAME              PK_COLUMN                                C FK_NAME                        FK_COLUMN
-------------------- ---------------------------------------- - ------------------------------ -------------------------------------------------------
PK_DOCTOR_EXAM_MEMBE DOCTOR_EXAM_MEMBER.DOCTOR_EXAM_MEMBER_ID R FK_RESULT_N_REFERENCE_DOCTOR   RESULT_NOTIFICATION_RECORD.DOCTOR_EXAM_MEMBER_ID
PK_DOCTOR_EXAM_MEMBE DOCTOR_EXAM_MEMBER.DOCTOR_EXAM_MEMBER_ID R RESULT_RE_DOCTOR_MEMBER        DOCTOR_EXAM_RESULT.DOCTOR_EXAM_MEMBER_ID
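-- Two foreign keys reference the table. The child tables had already been emptied beforehand; otherwise the delete would fail with ORA-02292 (child record found)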
delete RESULT_NOTIFICATION_RECORD;
delete DOCTOR_EXAM_RESULT;
commit;
-- Now start the analysis from a new session
sys@ANBOB>select xidsqn,xidusn,object_id,session_id,locked_mode from v$locked_object;

    XIDSQN     XIDUSN  OBJECT_ID SESSION_ID LOCKED_MODE
---------- ---------- ---------- ---------- -----------
      2102        203    1639631       2290           3
      2102        203    1639572       2290           3

sys@ANBOB>select object_name,object_type from dba_objects where object_id in(1639631,1639572);

OBJECT_NAME                    OBJECT_TYPE
------------------------------ -------------------
DOCTOR_EXAM_MEMBER             TABLE
DOCTOR_EXAM_RESULT             TABLE

sys@ANBOB>select event,p1,p2,p1text,p2text,seconds_in_wait,state from v$session_wait where sid=2290;

EVENT                                  P1    P2 P1TEXT               P2TEXT               SECONDS_IN_WAIT STATE
------------------------------ ---------- ----- -------------------- -------------------- --------------- -------------------
latch: shared pool             1611704464   307 address              number                           213 WAITED SHORT TIME

--trace hanganalyze and systemstate
alter session set events 'immediate trace name systemstate level 266';
alter session set events 'immediate trace name hanganalyze level 3';

--hanganalyze trace
===============================================================================
Chains most likely to have caused the hang:
 [a] Chain 1 Signature: <not in a wait>
     Chain 1 Signature Hash: 0x673a0128
 [b] Chain 2 Signature: 'Streams AQ: waiting for messages in the queue'
     Chain 2 Signature Hash: 0xa00e2e87
===============================================================================
Sessions in an involuntary wait or not in a wait:
-------------------------------------------------------------------------------
Chain 1:
-------------------------------------------------------------------------------
    Oracle session identified by:
    {
                instance: 1 (ANBOB.ANBOB)
                   os id: 27158
              process id: 94, oracle@dev-db (TNS V1-V3)
              session id: 2290
        session serial #: 7981
    }
    is not in a wait:
    {
               last wait: 11 min 0 sec ago
                blocking: 0 sessions
            wait history:
              1. event: 'latch: shared pool'
                 time waited: 0.000114 sec
                 wait id: 183
                 p1: 'address'=0x6010a890
                 p2: 'number'=0x133
                 p3: 'tries'=0x0
              * time between wait #1 and #2: 1.586255 sec
              2. event: 'latch: shared pool'
                 time waited: 0.000032 sec
                 wait id: 182
                 p1: 'address'=0x6010a890
                 p2: 'number'=0x133
                 p3: 'tries'=0x0
              * time between wait #2 and #3: 0.133830 sec
              3. event: 'latch: shared pool'
                 time waited: 0.000114 sec
                 wait id: 181
                 p1: 'address'=0x6010a890
                 p2: 'number'=0x133
                 p3: 'tries'=0x0
    }

Chain 1 Signature: <not in a wait>
Chain 1 Signature Hash: 0x673a0128
-- Analyzing the systemstate dump turned up nothing suspicious
[oracle@dev-db ~]$ awk -f ass109.awk /oracle/diag/rdbms/ANBOB/ANBOB/trace/ANBOB_ora_23020.trc
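-- Strange: why would the session be on latch: shared pool? That event relates to SQL parsing and the shared pool. I cancelled the delete and reran it under a 10046 trace to find out.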
sys@ANBOB>oradebug setmypid;
Statement processed.
sys@ANBOB>oradebug event 10046 trace name context forever,level 12
Statement processed.
sys@ANBOB>delete bjhr.doctor_exam_member;
92102 rows deleted.
sys@ANBOB>oradebug tracefile_name
/u01/app/oracle/diag/rdbms/anbob/anbob/trace/anbob_ora_7784.trc
sys@ANBOB>oradebug event 10046 trace name context off;
Statement processed.
-- After formatting the trace, the answer finally appeared.
delete bjhr.doctor_exam_member

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1     47.30      48.39        201        222     657611       92102
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2     47.31      48.39        201        222     657611       92102

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                        2        0.00          0.00
  db file scattered read                         26        0.00          0.00
  db file sequential read                        24        0.00          0.00
  SQL*Net message to client                       1        0.00          0.00
  SQL*Net message from client                     1        0.00          0.00
********************************************************************************

-- check deferred objects
select pctfree_stg, pctused_stg, size_stg,initial_stg, next_stg, minext_stg,
  maxext_stg, maxsiz_stg, lobret_stg,mintim_stg, pctinc_stg, initra_stg,
  maxtra_stg, optimal_stg, maxins_stg,frlins_stg, flags_stg, bfp_stg, enc_stg,
  cmpflag_stg, cmplvl_stg
from
 deferred_stg$ where obj# =:1
********************************************************************************

select /*+ all_rows */ count(1)
from
 "BJHR"."RESULT_NOTIFICATION_RECORD" where "DOCTOR_EXAM_MEMBER_ID" = :1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute  92102     11.31      11.34          0          0          0           0
Fetch    92102      0.63       0.64          0          0          0       92102
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   184205     11.95      11.99          0          0          0       92102

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS   (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=0 pr=0 pw=0 time=43 us)
         0          0          0   TABLE ACCESS FULL RESULT_NOTIFICATION_RECORD (cr=0 pr=0 pw=0 time=12 us cost=3 size=5 card=1)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  latch: shared pool                              2        0.00          0.00
********************************************************************************

select /*+ all_rows */ count(1)
from
 "BJHR"."DOCTOR_EXAM_RESULT" where "DOCTOR_EXAM_MEMBER_ID" = :1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute  92102      6.97       7.11          0          0          0           0
Fetch    92102   1012.96    1016.14          0  566243096      92102       92102
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   184205   1019.93    1023.25          0  566243096      92102       92102

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=6148 pr=0 pw=0 time=30196 us)
         0          0          0   TABLE ACCESS FULL DOCTOR_EXAM_RESULT (cr=6148 pr=0 pw=0 time=30184 us cost=1647 size=5 card=1)
********************************************************************************

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       39      0.00       0.01          0          0          0           0
Execute 184248     18.29      18.46          0          0          0           0
Fetch   184276   1013.60    1016.79          0  566243218      92102      184238
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   368563   1031.90    1035.27          0  566243218      92102      184238

1129 elapsed seconds in trace file.
TIP:
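When rows are deleted from doctor_exam_member, Oracle checks every table that references it (the child tables): for each parent row deleted, it queries each child table to see whether a matching row exists. Here the foreign key columns of the child tables have no index, so every deleted row triggers a full table scan (FTS) of the child table. This is also why latch: cache buffers chains (CBC, hot block) waits occasionally showed up when querying v$session_event.
You may wonder why these lookups take so long when all the rows in the child tables have already been deleted. Here is a small test.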
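The per-row cost can be cross-checked against the trace figures. This is only a sanity check using numbers already shown in the tkprof output (92102 executions, cr=6148 per full scan of DOCTOR_EXAM_RESULT); the column alias is made up for illustration.

```sql
-- 92102 deleted parent rows, each triggering one full scan of the child table
-- at 6148 consistent reads per scan (cr=6148 in the row source statistics)
select 92102 * 6148 as expected_query_gets from dual;
-- 566243096, identical to the "query" column of the Fetch line
-- for the DOCTOR_EXAM_RESULT lookup in the trace
```

The product matches the traced value exactly, which suggests that essentially all of the roughly 1016 seconds of fetch time went into rescanning the same empty blocks once per deleted row.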
sys@ORA10GR2>select count(*) from bjhr_dev.DOCTOR_EXAM_RESULT;
COUNT(*)
----------
0
sys@ORA10GR2>select bytes,blocks from dba_segments where segment_name='DOCTOR_EXAM_RESULT' and owner='BJHR_DEV';
     BYTES     BLOCKS
---------- ----------
  50331648       6144
sys@ORA10GR2>set autot trace stat
sys@ORA10GR2>select count(*) from bjhr_dev.DOCTOR_EXAM_RESULT where DOCTOR_EXAM_MEMBER_ID=1;
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6040 consistent gets
0 physical reads
0 redo size
514 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
sys@ORA10GR2>alter table bjhr_dev.doctor_exam_result enable row movement;
Table altered.
sys@ORA10GR2>alter table bjhr_dev.DOCTOR_EXAM_RESULT shrink space;
Table altered.
sys@ORA10GR2>alter table bjhr_dev.doctor_exam_result disable row movement;
Table altered.
sys@ORA10GR2>select bytes,blocks from dba_segments where segment_name='DOCTOR_EXAM_RESULT' and owner='BJHR_DEV';
     BYTES     BLOCKS
---------- ----------
    196608         24
sys@ORA10GR2>select count(*) from bjhr_dev.DOCTOR_EXAM_RESULT where DOCTOR_EXAM_MEMBER_ID=1;
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
514 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
TIP:
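A full table scan reads every formatted data block in the table segment, that is, every block below the high-water mark, whether or not the block still contains rows. DELETE does not lower the high-water mark, so an emptied table is scanned at its old size until it is shrunk or truncated.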
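One way to spot a table in this state, sketched under the assumption that optimizer statistics can be regathered on it: compare NUM_ROWS with BLOCKS in DBA_TABLES. The owner and table name are the ones from the test above.

```sql
-- Refresh statistics so DBA_TABLES reflects the current state of the segment
begin
  dbms_stats.gather_table_stats(ownname => 'BJHR_DEV',
                                tabname => 'DOCTOR_EXAM_RESULT');
end;
/

-- Few or zero rows together with thousands of blocks means the
-- high-water mark is still where it was before the DELETE
select num_rows, blocks
  from dba_tables
 where owner = 'BJHR_DEV'
   and table_name = 'DOCTOR_EXAM_RESULT';
```

Run before the shrink in the test above, this would be expected to report zero rows against several thousand blocks; after the shrink, BLOCKS should drop sharply.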
Summary:
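Create an index on the foreign key columns of child tables. To empty a child table, use TRUNCATE; if foreign keys prevent TRUNCATE, use DELETE and then run SHRINK SPACE on the table. Alternatively, disable the child tables' foreign key constraints before deleting from the parent table.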
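The recommendations above can be sketched as follows. The constraint and table names come from the foreign key listing earlier in this post; the index names are hypothetical, and the first query is a simplified check that is reliable for single-column foreign keys but only approximate for multi-column ones.

```sql
-- 1. Find foreign key columns in schema BJHR that have no index column
--    at the same position (simplified check, see note above)
select c.table_name, c.constraint_name, cc.column_name
  from dba_constraints c
  join dba_cons_columns cc
    on cc.owner = c.owner
   and cc.constraint_name = c.constraint_name
 where c.constraint_type = 'R'
   and c.owner = 'BJHR'
   and not exists (select 1
                     from dba_ind_columns ic
                    where ic.table_owner = c.owner
                      and ic.table_name = c.table_name
                      and ic.column_name = cc.column_name
                      and ic.column_position = cc.position);

-- 2. Index the foreign key columns (index names are hypothetical)
create index bjhr.idx_rnr_member_id
  on bjhr.result_notification_record (doctor_exam_member_id);
create index bjhr.idx_der_member_id
  on bjhr.doctor_exam_result (doctor_exam_member_id);

-- 3. Alternative for a one-off bulk delete: disable the child constraints,
--    delete, then re-enable them
alter table bjhr.result_notification_record
  disable constraint FK_RESULT_N_REFERENCE_DOCTOR;
alter table bjhr.doctor_exam_result
  disable constraint RESULT_RE_DOCTOR_MEMBER;

delete bjhr.doctor_exam_member;
commit;

alter table bjhr.result_notification_record
  enable constraint FK_RESULT_N_REFERENCE_DOCTOR;
alter table bjhr.doctor_exam_result
  enable constraint RESULT_RE_DOCTOR_MEMBER;
```

Re-enabling a constraint validates the existing child rows by default, so it will fail if orphaned rows were left behind. Indexing is the lasting fix; disabling constraints only suits a controlled maintenance window.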