资讯专栏INFORMATION COLUMN

PARTITION RANGE SUBQUERY 导致SQL执行效率降低的处理案例

IT那活儿 / 3015人阅读
PARTITION RANGE SUBQUERY 导致SQL执行效率降低的处理案例

在一个风和日丽阳光明媚的早上,刚到公司屁股还没有坐热,就接到报告:XX系统一条SQL执行效率很低,赶紧上去经过一番犀利操作,啪啪啪敲下熟悉的命令,获取SQL信息如下:


PLAN_TABLE_OUTPUT


------------------------------------------------------------------------------------------------------------------------------------------


SQL_ID  dkusf44y9g1yv, child number 0


-------------------------------------


SELECT A.NO_ID,A.BILL_ID1 FROM MM_USER_INFO A, MM_ATOBSTESTSTATE_INFO B


WHERE A.NO_ID=B.NO_ID AND B.DONETIME < ADD_MONTHS(SYSDATE,-6) AND


B.RUN_IDNO=W AND A.KKK_SERVICE_ID=1111


Plan hash value: 3801554394

------------------------------------------------------------------------------------------------------------------------------


| Id  | Operation                           | Name                   | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |


------------------------------------------------------------------------------------------------------------------------------


|   0 | SELECT STATEMENT                    |                        |       |       |       | 64561 (100)|          |       |       |


|*  1 |  HASH JOIN                          |                        |   255K|    11M|  9744K| 64561   (2)| 00:12:55 |       |       |


|   2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| MM_USER_INFO           |   255K|  6741K|       | 26342   (1)| 00:05:17 | ROWID | ROWID |


|*  3 |    INDEX RANGE SCAN                 | IDX_MMUSER_SERVICEID   |   255K|       |       |   649   (1)| 00:00:08 |       |       |


|   4 |   PARTITION RANGE ALL               |                        |   475K|  8821K|       | 37038   (2)| 00:07:25 |     1 |    17 |


|*  5 |    TABLE ACCESS FULL                | MM_ATOBSTESTSTATE_INFO |   475K|  8821K|       | 37038   (2)|
00:07:25 |     1 |    17 |


--------------------------------------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):


---------------------------------------------------


   1 - access("A"."NO_ID"="B"."NO_ID")


   3 - access("A"."KKK_SERVICE_ID"=1111)


   5 - filter(("B"."RUN_IDNO"=W AND "B"."DONETIME" 26 rows selected.





 从执行计划上大体可以看出,这条语句走HASHJOIN,主要慢在ID=2和ID=5。值得注意一点:这里的执行计划是指标都是估算的,估算的东西就是可能不准确,特别是执行计划有问题的时候,正是因为各种不准的指标才导致执行计划走的不好,那么我们要分析问题的根源,最好可以通过PredicateInformation里的条件去计算真实的Rows,或用gather_plan_statistics或altersession setstatistics_level=all去看A-Rows,A-Time等信息,从而准确判断慢在哪一步,最终找出慢的根源。

通过上述理论指导,下面赶紧看下真实执行计划(这里有真实的Starts,A-Rows等):

Planhash value: 3801554394

------------------------------------------------------------------------------------------------------------------------------------------------------------

|Id  | Operation                           | Name                   |Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem | 1Mem | Used-Mem |

------------------------------------------------------------------------------------------------------------------------------------------------------------

|  0 | SELECT STATEMENT                    |                        |     1 |        |    842 |00:01:18.55 |     670K|    570K|       |      |          |

|* 1 |  HASH JOIN                          |                        |     1 |    255K|    842 |00:01:18.55 |     670K|    570K|   130M|   13M|  126M (0)|

|  2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| UR_USER_INFO           |     1 |    255K|   2113K|00:00:41.52 |     299K|    200K|       |      |          |

|* 3 |    INDEX RANGE SCAN                 | IDX_USERINFO_SERV_ID   |     1 |    255K|   2113K|00:00:07.14 |   12844 |  12843 |       |      |          |

|  4 |   PARTITION RANGE ALL               |                        |     1 |    475K|    842 |00:00:34.24 |     370K|    370K|       |      |          |

|* 5 |    TABLE ACCESS FULL                | UR_CRMTOBOSSSTATE_INFO |    17 |    475K|    842 |00:00:34.23 |     370K|    370K|       |      |          |

------------------------------------------------------------------------------------------------------------------------------------------------------------

PredicateInformation (identified by operation id):

---------------------------------------------------

  1- access("A"."ID_NO"="B"."ID_NO")

  3- access("A"."MASTER_SERV_ID"=2063)

  5- filter(("B"."RUN_CODE"=W AND"B"."OP_TIME"


通过查看真实执行计划,一眼定位出的确慢在ID=2和ID=5两个步骤,其中ID=2回表慢,从索引查找7s回表变为41s(也就是回表操作需要34s),而ID=5是全表慢。其中ID=5的估算行数是47.5w行,实际上只有842行(这就是估算的行和真实行差别超大,一般统计信息不准),因此适合建立索引,通过下面分析适合建立组合索引。


SQL> select count(*) from MM_ATOBSTESTSTATE_INFO


  2  where DONETIME < ADD_MONTHS(SYSDATE, -6);


  COUNT(*)


----------


  14539090


SQL> select/*+parallel(16)*/ count(*) from MM_ATOBSTESTSTATE_INFO


  2  where RUN_IDNO = W;


  COUNT(*)


----------


     59675



SQL> select/*+parallel(16)*/ count(*) from MM_ATOBSTESTSTATE_INFO


  2  where RUN_IDNO = W and DONETIME < ADD_MONTHS(SYSDATE,-6);


  COUNT(*)


----------


     842



索引创建如下:

CREATE INDEX "OPERTIADM"."IDX1_MM_ATOBSTESTSTATE_INFO" ON "OPERTIADM"."MM_ATOBSTESTSTATE_INFO" ("RUN_IDNO","DONETIME")


 LOCAL TABLESPACE "TBS_IDX_TKO" parallel 16;


 alter index "OPERTIADM"."IDX1_MM_ATOBSTESTSTATE_INFO" noparallel;




然后令人郁闷的事情发生了,执行计划,它竟然没有变:


Execution Plan


----------------------------------------------------------


Plan hash value: 3801554394


--------------------------------------------------------------------------------------------------------------------------------------


| Id  | Operation                           | Name                   | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |


--------------------------------------------------------------------------------------------------------------------------------------


|   0 | SELECT STATEMENT                    |                        |   255K|    11M|       | 64561   (2)| 00:12:55 |       |       |


|*  1 |  HASH JOIN                          |                        |   255K|    11M|  9744K| 64561   (2)| 00:12:55 |       |       |


|   2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| MM_USER_INFO           |   255K|  6741K|       | 26342   (1)| 00:05:17 | ROWID | ROWID |


|*  3 |    INDEX RANGE SCAN                 | IDX_MMUSER_SERVICEID   |   255K|       |       |   649   (1)| 00:00:08 |       |       |


|   4 |   PARTITION RANGE ALL               |                        |   475K|  8821K|       | 37038   (2)| 00:07:25 |     1 |    17 |


|*  5 |    TABLE ACCESS FULL                | MM_ATOBSTESTSTATE_INFO |   475K|  8821K|       | 37038   (2)| 00:07:25 |     1 |    17 |


--------------------------------------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):


---------------------------------------------------


   1 - access("A"."NO_ID"="B"."NO_ID")


   3 - access("A"."KKK_SERVICE_ID"=1111)


   5 - filter("B"."RUN_IDNO"=W AND "B"."DONETIME"<="" add_months(sysdate@!,-6))



 
 回头一想也正常,执行计划未变,主要是ID=5的cardinality估算不准确。既然cardinality不准确,那么就收集统计信息,收集完统计信息的执行计划如下:

Plan hash value: 1403561594

---------------------------------------------------------------------------------------------------------------------------------------


| Id  | Operation                            | Name                        | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |


---------------------------------------------------------------------------------------------------------------------------------------


|   0 | SELECT STATEMENT                     |                             |      0 |        |      0 |00:00:00.01 |       0 |      0 |


|*  1 |  TABLE ACCESS BY LOCAL INDEX ROWID   | MM_USER_INFO                |      1 |      1 |    842 |00:00:10.81 |     302K|      5 |


|   2 |   NESTED LOOPS                       |                             |      1 |   1349K|   1685 |00:00:10.81 |     301K|      5 |


|   3 |    PARTITION RANGE SUBQUERY          |                             |      1 |   1349K|    842 |00:00:10.80 |     300K|      5 |


|   4 |     TABLE ACCESS BY LOCAL INDEX ROWID| MM_ATOBSTESTSTATE_INFO      |      6 |   1349K|    842 |00:00:00.01 |     842 |      0 |


|*  5 |      INDEX RANGE SCAN                | IDX1_MM_ATOBSTESTSTATE_INFO |      6 |   1349K|    842 |00:00:00.01 |      77 |      0 |


|   6 |    PARTITION RANGE ITERATOR          |                             |    842 |      1 |    842 |00:00:00.01 |    1741 |      0 |


|*  7 |     INDEX UNIQUE SCAN                | PK_MM_USER_INFO             |    842 |      1 |    842 |00:00:00.01 |    1741 |      0 |


---------------------------------------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):


---------------------------------------------------


   1 - filter("A"."KKK_SERVICE_ID"=1111)


   5 - access("B"."RUN_IDNO"=W AND "B"."DONETIME"    7 - access("A"."NO_ID"="B"."NO_ID")



 现在走NESTEDLOOPS了,而且使用了刚建立的索引,并且原来慢的ID=2变成被驱动了,走UNIQUESCAN,但是还是需要10s多,效率没有啥提升,通过执行计划可以看出主要慢在 PARTITIONRANGE SUBQUERY上,这个查询转换(QueryTransformation)的玩意,没有啥好办法了,只能关闭这个功能对应的参数,当然,最好的是使用SQLPROFILE绑定到这条语句,在语句级进行修改,类似加了下面的HINTS:

SELECT/*+leading(b) use_nl(a) index(b IDX1_MM_ATOBSTESTSTATE_INFO)


opt_param(_subquery_pruning_enabled false)*/ A.NO_ID, A.BILL_ID1


FROM MM_USER_INFO A, MM_ATOBSTESTSTATE_INFO B  WHERE A.NO_ID = B.NO_ID


  AND B.DONETIME < ADD_MONTHS(SYSDATE, -6)    AND B.RUN_IDNO = W


AND A.KKK_SERVICE_ID = 1111



然后用上述加了HINTS的SQL执行计划,绑定到原SQL,最终SQL执行计划如下

Plan hash value: 4290111086

------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                             |      0 |        |      0 |00:00:00.01 |       0 |
|*  1 |  TABLE ACCESS BY LOCAL INDEX ROWID   | MM_USER_INFO                |      1 |      1 |    842 |00:00:00.02 |    3436 |
|   2 |   NESTED LOOPS                       |                             |      1 |   1349K|   1685 |00:00:00.02 |    2594 |
|   3 |    PARTITION RANGE ALL               |                             |      1 |   1349K|    842 |00:00:00.01 |     853 |
|   4 |     TABLE ACCESS BY LOCAL INDEX ROWID| MM_ATOBSTESTSTATE_INFO      |     17 |   1349K|    842 |00:00:00.01 |     853 |
|*  5 |      INDEX RANGE SCAN                | IDX1_MM_ATOBSTESTSTATE_INFO |     17 |   1349K|    842 |00:00:00.01 |      88 |
|   6 |    PARTITION RANGE ITERATOR          |                             |    842 |      1 |    842 |00:00:00.01 |    1741 |
|*  7 |     INDEX UNIQUE SCAN                | PK_MM_USER_INFO             |    842 |      1 |    842 |00:00:00.01 |    1741 |
------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"."KKK_SERVICE_ID"=1111)
   5 - access("B"."RUN_IDNO"=W AND "B"."DONETIME"    7 - access("A"."NO_ID"="B"."NO_ID")

 非常完美,使用到我们建立的索引,并且消除了PARTITIONRANGESUBQUERY,执行效率由原来的10s多降低为0.01s,效率提升上千倍。SQL优化的方式N种,唯有准确找到问题根源才能快速解决,这里我通过分析真实的执行计划快速找到问题的ROOTCAUSE,从而解决之。


总结下这个案例:先通过分析得知需要建立索引,但是建立索引后执行计划未变,发现是cardinality估算不准,那么收集统计信息,收集完毕后,走索引和NL,但是却出现了PARTITIONRANGE SUBQUERY影响效率,通过收集统计信息等方式已经解决不了,那么只能在语句级先关闭这个参数,通过SQLPROFILE绑定,从而达到解决问题的目的。
附:关于PARITITION的详细内容可以参考

VLDBand Partitioning Guide :

https://docs.oracle.com/en/database/oracle/oracle-database/19/vldbg/index.html

里面的AdvancedPartition Pruning Techniques有关于PARTITIONRANGE SUBQUERY的内容。

文章版权归作者所有,未经允许请勿转载,若此文章存在违规行为,您可以联系管理员删除。

转载请注明本文地址:https://www.ucloud.cn/yun/130067.html

相关文章

发表评论

0条评论

IT那活儿

|高级讲师

TA的文章

阅读更多
最新活动
阅读需要支付1元查看
<