An Optimization Pass on a GROUP BY Statement

Author: fuyuncat
Source: www.HelloDBA.com

A statement in our production environment turned out to be very slow. On inspection it is actually just a simple GROUP BY statement.
The table CCMMT is fairly large, with more than 5M rows.

1.
SQL> select CDE, CID
  2  from CCMMT
  3  GROUP BY CDE, CID
  4  having max(ADT) < sysdate - 180;

707924 rows selected.

Elapsed: 00:06:17.49

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=414 Card=238583 Bytes=4771660)
   1    0   FILTER
   2    1     SORT (GROUP BY NOSORT) (Cost=414 Card=238583 Bytes=4771660)
   3    2       TABLE ACCESS (BY INDEX ROWID) OF 'CCMMT' (Cost=414 Card=57969096 Bytes=1159381920)
   4    3         INDEX (FULL SCAN) OF 'CCMMT_TEMP_IDX' (NON-UNIQUE) (Cost=26 Card=57969096)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
    2769177  consistent gets
    1089991  physical reads
          0  redo size
   23926954  bytes sent via SQL*Net to client
     519785  bytes received via SQL*Net from client
      47196  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     707924  rows processed

It takes more than 6 minutes to return. Try rewriting the statement, using MINUS in place of GROUP BY:

2.
SQL> select DISTINCT CDE, CID
  2  from CCMMT
  3  where ADT < sysdate - 180
  4  minus
  5  select DISTINCT CDE, CID
  6  from CCMMT
  7  where ADT >= sysdate - 180;

707924 rows selected.

Elapsed: 00:00:21.53

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=190624 Card=2794940 Bytes=111797600)
   1    0   MINUS
   2    1     SORT* (UNIQUE) (Cost=95312 Card=2794940 Bytes=55898800)    :Q13049001
   3    2       INDEX* (FAST FULL SCAN) OF 'CCMMT_UQ1' (UNIQUE) (Cost=77305 Card=2898455 Bytes=57969100)    :Q13049000
   4    1     SORT* (UNIQUE) (Cost=95312 Card=2794940 Bytes=55898800)    :Q13050001
   5    4       INDEX* (FAST FULL SCAN) OF 'CCMMT_UQ1' (UNIQUE) (Cost=77305 Card=2898455 Bytes=57969100)    :Q13050000

   2 PARALLEL_TO_SERIAL            SELECT DISTINCT C0 C0,C1 C1 FROM :Q13049000 ORDER BY C0,C1
   3 PARALLEL_TO_PARALLEL          SELECT /*+ INDEX_RRS(A1 "CCMMT_UQ1")*/ A1."CDE" C0,A1."CA
   4 PARALLEL_TO_SERIAL            SELECT DISTINCT C0 C0,C1 C1 FROM :Q13050000 ORDER BY C0,C1
   5 PARALLEL_TO_PARALLEL          SELECT /*+ INDEX_RRS(A1 "CCMMT_UQ1")*/ A1."CDE" C0,A1."CA

Statistics
----------------------------------------------------------
          0  recursive calls
         33  db block gets
     126566  consistent gets
     129243  physical reads
          0  redo size
   18461368  bytes sent via SQL*Net to client
     519785  bytes received via SQL*Net from client
      47196  SQL*Net roundtrips to/from client
          4  sorts (memory)
          2  sorts (disk)
     707924  rows processed

The result is good: both consistent gets and physical reads dropped, and the statement returns in only 21s. However, the execution plan shows that parallel query was used, so it will consume more CPU.
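
A note on why the rewrite returns the same rows: a (CDE, CID) group satisfies max(ADT) < sysdate - 180 exactly when it has at least one row older than the cutoff and no row at or after it, which is what the MINUS form computes. MINUS also removes duplicates by itself, so the DISTINCT keywords in run 2 are not strictly needed. The query below is only a sketch for checking the two forms against each other. The names g, m and mismatches are made up for illustration, and it assumes a release that supports the WITH clause. A result of 0 means both forms return the same set.

```sql
-- Sketch: count rows that appear in one form but not the other.
-- g, m and mismatches are illustrative names, not from the original post.
with g as (
  -- original GROUP BY form (run 1)
  select CDE, CID
  from CCMMT
  group by CDE, CID
  having max(ADT) < sysdate - 180
),
m as (
  -- MINUS form (run 2), without the redundant DISTINCT
  select CDE, CID from CCMMT where ADT < sysdate - 180
  minus
  select CDE, CID from CCMMT where ADT >= sysdate - 180
)
select count(*) as mismatches
from (
  (select CDE, CID from g minus select CDE, CID from m)
  union all
  (select CDE, CID from m minus select CDE, CID from g)
);
```

This check runs the slow GROUP BY form again, so it is probably better tried on a test copy than on the production system.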
Create an index on (ADT, CDE, CID) and run it again:

3.
SQL> select DISTINCT CDE, CID
  2  from CCMMT
  3  where ADT < sysdate - 180
  4  minus
  5  select DISTINCT CDE, CID
  6  from CCMMT
  7  where ADT >= sysdate - 180;

707924 rows selected.

Elapsed: 00:00:26.94

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=36018 Card=2794940 Bytes=111797600)
   1    0   MINUS
   2    1     SORT (UNIQUE) (Cost=18009 Card=2794940 Bytes=55898800)
   3    2       INDEX (RANGE SCAN) OF 'CCMMT_IDX3' (NON-UNIQUE) (Cost=2 Card=2898455 Bytes=57969100)
   4    1     SORT (UNIQUE) (Cost=18009 Card=2794940 Bytes=55898800)
   5    4       INDEX (RANGE SCAN) OF 'CCMMT_IDX3' (NON-UNIQUE) (Cost=2 Card=2898455 Bytes=57969100)

Statistics
----------------------------------------------------------
          0  recursive calls
        118  db block gets
      22565  consistent gets
      31604  physical reads
          0  redo size
   18461368  bytes sent via SQL*Net to client
     519785  bytes received via SQL*Net from client
      47196  SQL*Net roundtrips to/from client
          1  sorts (memory)
          1  sorts (disk)
     707924  rows processed
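
This is also quite satisfactory: consistent gets and physical reads drop sharply again. The elapsed time is about the same as above, in the same order of magnitude, but parallel query is no longer used.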
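
The post does not show the DDL for the new index. A minimal sketch follows, assuming the index is the CCMMT_IDX3 that appears in the plan of run 3 and leaving out any storage or tablespace clauses, which are not given:

```sql
-- Assumption: the name CCMMT_IDX3 is taken from the execution plan of run 3;
-- the actual statement used by the author is not shown in the post.
create index CCMMT_IDX3 on CCMMT (ADT, CDE, CID);
```

With ADT as the leading column, each branch of the MINUS can be answered by a range scan on ADT, and because CDE and CID are also in the index, no table access is needed. This matches the plan above, which shows only INDEX (RANGE SCAN) steps under each SORT (UNIQUE).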
Replace MINUS with NOT EXISTS:

4.
SQL> select DISTINCT CDE, CID
  2  from CCMMT a
  3  where ADT < sysdate - 180
  4  AND NOT EXISTS
  5  (SELECT CDE, CID FROM
  6  (select DISTINCT CDE, CID
  7  from CCMMT
  8  where ADT >= sysdate - 180) b
  9  WHERE a.CDE = b.CDE
 10  AND a.CID = b.CID);

707924 rows selected.

Elapsed: 00:10:35.70

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=600 Card=144923 Bytes=2898460)
   1    0   SORT (UNIQUE) (Cost=600 Card=144923 Bytes=2898460)
   2    1     INDEX (RANGE SCAN) OF 'CCMMT_IDX3' (NON-UNIQUE) (Cost=2 Card=144923 Bytes=2898460)
   3    2       TABLE ACCESS (BY INDEX ROWID) OF 'CCMMT' (Cost=2 Card=1 Bytes=20)
   4    3         INDEX (RANGE SCAN) OF 'CCMMT_TEMP_IDX' (NON-UNIQUE) (Cost=1 Card=9)

Statistics
----------------------------------------------------------
          5  recursive calls
        118  db block gets
   40535587  consistent gets
    3157604  physical reads
          0  redo size
   18461368  bytes sent via SQL*Net to client
     519785  bytes received via SQL*Net from client
      47196  SQL*Net roundtrips to/from client
          2  sorts (memory)
          1  sorts (disk)
     707924  rows processed

Ouch! Consistent gets and physical reads explode, and it takes 10 minutes to return the result!
Swap NOT EXISTS for NOT IN:

5.
SQL> select DISTINCT CDE, CID
  2  from CCMMT a
  3  where ADT < sysdate - 180
  4  AND (CDE, CID) NOT IN
  5  (select DISTINCT CDE, CID
  6  from CCMMT
  7  where ADT >= sysdate - 180);

707924 rows selected.

Elapsed: 00:01:00.70

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=36425 Card=1 Bytes=40)
   1    0   SORT (UNIQUE NOSORT) (Cost=36425 Card=1 Bytes=40)
   2    1     MERGE JOIN (ANTI) (Cost=36423 Card=1 Bytes=40)
   3    2       SORT (JOIN) (Cost=18212 Card=2898455 Bytes=57969100)
   4    3         INDEX (RANGE SCAN) OF 'CCMMT_IDX3' (NON-UNIQUE) (Cost=2 Card=2898455 Bytes=57969100)
   5    2       SORT (UNIQUE) (Cost=18212 Card=2898455 Bytes=57969100)
   6    5         INDEX (RANGE SCAN) OF 'CCMMT_IDX3' (NON-UNIQUE) (Cost=2 Card=2898455 Bytes=57969100)

Statistics
----------------------------------------------------------
          0  recursive calls
        419  db block gets
      22565  consistent gets
      98692  physical reads
          0  redo size
   18461368  bytes sent via SQL*Net to client
     519785  bytes received via SQL*Net from client
      47196  SQL*Net roundtrips to/from client
          1  sorts (memory)
          1  sorts (disk)
     707924  rows processed
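
Well, consistent gets is the same as for the MINUS form with the index in place (22565), but physical reads is much higher (98692 against 31604) and the elapsed time is too long, about 1 minute. The index created earlier is used here as well. (Heh, so NOT EXISTS is not better than NOT IN in every situation.)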
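
One caveat when comparing runs 4 and 5: NOT IN and NOT EXISTS are only interchangeable when the compared columns contain no NULLs, since the two forms can return different rows once NULLs are involved. The post does not say whether CDE and CID are declared NOT NULL. A quick way to check, as a sketch that assumes CCMMT is owned by the current user (null_keys is an illustrative alias):

```sql
-- Cheap check: are the columns declared NOT NULL? (NULLABLE = 'N')
select column_name, nullable
from user_tab_columns
where table_name = 'CCMMT'
  and column_name in ('CDE', 'CID');

-- If they are nullable, count the rows that actually hold a NULL key.
select count(*) as null_keys
from CCMMT
where CDE is null or CID is null;
```

If the columns are NOT NULL, or the count is 0, the two forms return the same rows, which is consistent with both runs returning 707924 rows.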
Next, try LEFT JOIN + IS NULL in place of NOT IN:

6.
SQL> SELECT a.CDE, a.CID
  2  FROM
  3  (select DISTINCT CDE, CID
  4  from CCMMT
  5  where ADT < sysdate - 180) a,
  6  (select DISTINCT CDE, CID
  7  from CCMMT
  8  where ADT >= sysdate - 180) b
  9  WHERE a.CDE = b.CDE(+)
 10  AND a.CID = b.CID(+)
 11  AND b.CDE IS NULL;

707924 rows selected.

Elapsed: 00:00:25.46

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=54675 Card=2794940 Bytes=117387480)
   1    0   FILTER
   2    1     MERGE JOIN (OUTER)
   3    2       VIEW (Cost=18009 Card=2794940 Bytes=58693740)
   4    3         SORT (UNIQUE) (Cost=18009 Card=2794940 Bytes=55898800)
   5    4           INDEX (RANGE SCAN) OF 'CCMMT_IDX3' (NON-UNIQUE) (Cost=2 Card=2898455 Bytes=57969100)
   6    2       SORT (JOIN) (Cost=36667 Card=2794940 Bytes=58693740)
   7    6         VIEW (Cost=18009 Card=2794940 Bytes=58693740)
   8    7           SORT (UNIQUE) (Cost=18009 Card=2794940 Bytes=55898800)
   9    8             INDEX (RANGE SCAN) OF 'CCMMT_IDX3' (NON-UNIQUE) (Cost=2 Card=2898455 Bytes=57969100)

Statistics
----------------------------------------------------------
         10  recursive calls
        118  db block gets
      22569  consistent gets
      31300  physical reads
          0  redo size
   18461368  bytes sent via SQL*Net to client
     519785  bytes received via SQL*Net from client
      47196  SQL*Net roundtrips to/from client
          6  sorts (memory)
          1  sorts (disk)
     707924  rows processed

The result is good, in the same order of magnitude as MINUS with the index in place.

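
Run 6 uses Oracle's old (+) outer join notation. For readers more used to ANSI syntax, the same anti-join could be written as follows. This is a sketch, assuming a release that supports ANSI join syntax (9i or later); the post does not state the version, and the optimizer is not guaranteed to pick the same plan for this form.

```sql
-- ANSI-style equivalent of run 6: keep the "old" keys that find no match
-- among the "recent" keys. Not taken from the original post.
select a.CDE, a.CID
from (select distinct CDE, CID
      from CCMMT
      where ADT < sysdate - 180) a
left join
     (select distinct CDE, CID
      from CCMMT
      where ADT >= sysdate - 180) b
  on a.CDE = b.CDE
 and a.CID = b.CID
where b.CDE is null;
```
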
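To sum up: of the approaches above, the 3rd and the 6th should be the best. Buffer gets, disk I/O and CPU consumption are all low and the elapsed time drops sharply, but they need a new index, which takes more disk space and carries the risk of changing the execution plans of other statements. The 2nd approach should be the next best. Its elapsed time is about the same as the other two and it needs no new index, but it consumes more memory, disk and CPU resources.
All things considered, the 2nd approach was the one used to optimize the production database.
(Object names in the examples above have been replaced; everything else is original.)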