SQL Tuning
Analytical Ranking Functions and Windows Sort Pushed Rank access
User Rating: / 0
- Details
-
Parent Category: Articles
-
Created on Friday, 16 November 2012 16:53
-
Last Updated on Tuesday, 10 December 2019 14:50
-
Published on Friday, 16 November 2012 17:37
-
Written by Guy Lambregts
-
Hits: 10652
Analytical Ranking Functions and Windows Sort Pushed Rank access
We use the sample tuning schema
The table T_ORDER_OR contains purchase orders, for different currencies and different suppliers,
Which are the top 5 cheapest orders per currency ?
select * from
( select su_fk,or_totalamount,cr_fk,
rank() over (partition by cr_fk order by or_totalamount) "RANKING"
from t_order_or
) where RANKING <= 5;
SU_FK OR_TOTALAMOUNT CR_ RANKING
---------------- ---------------- --- ----------------
17 50.41 EUR 1
98 50.98 EUR 2
38 51.03 EUR 3
43 51.75 EUR 4
99 51.87 EUR 5
99 50.36 GBP 1
61 51.84 GBP 2
56 52.05 GBP 3
60 52.27 GBP 4
3 52.27 GBP 4
83 50.13 USD 1
40 50.26 USD 2
46 50.98 USD 3
89 51.37 USD 4
85 51.53 USD 5
97 51.01 YEN 1
73 51.65 YEN 2
40 51.93 YEN 3
62 52.03 YEN 4
79 52.23 YEN 5
Which are the top 5 most expensive orders per currency ?
What is the explain plan for this query ( we assume there will be a sort operation )
select * from
( select su_fk,or_totalamount,cr_fk,
rank() over (partition by cr_fk order by or_totalamount DESC) "RANKING"
from t_order_or
) where RANKING <= 5;
SU_FK OR_TOTALAMOUNT CR_ RANKING
---------------- ---------------- --- ----------------
68 99999.8 EUR 1
20 99999.65 EUR 2
61 99998.55 EUR 3
71 99998.47 EUR 4
85 99997.83 EUR 5
91 99999.74 GBP 1
73 99999.6 GBP 2
85 99999.51 GBP 3
31 99998.02 GBP 4
22 99997.93 GBP 5
77 99999.93 USD 1
17 99999.3 USD 2
60 99999.01 USD 3
4 99998.68 USD 4
10 99998.27 USD 5
39 99999.73 YEN 1
7 99999.55 YEN 2
15 99999.15 YEN 3
43 99998.38 YEN 4
82 99996.38 YEN 5
20 rows selected.
SQL> select * from table(dbms_xplan.display_cursor);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID bu91tkf4bxr2a, child number 0
-------------------------------------
select * from ( select su_fk,or_totalamount,cr_fk, rank()
over (partition by cr_fk order by or_totalamount DESC) "RANKING"
from t_order_or ) Where RANKING <= 5
Plan hash value: 3809009122
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | | 11675 (100)| |
|* 1 | VIEW | | 1000K| 44M| | 11675 (1)| 00:02:21 |
|* 2 | WINDOW SORT PUSHED RANK| | 1000K| 12M| 22M| 11675 (1)| 00:02:21 |
| 3 | TABLE ACCESS FULL | T_ORDER_OR | 1000K| 12M| | 6905 (1)| 00:01:23 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RANKING"<=5)
2 - filter(RANK() OVER ( PARTITION BY "CR_FK" ORDER BY
INTERNAL_FUNCTION("OR_TOTALAMOUNT") DESC )<=5)
We face a full table scan of the rather hhuge t_order_or, the table contains 1.000.000 rows, If we would create an index on the columns CR_FK and OR_TOTALAMOUNT, would that index be used ?
create index OR_IDX4 on T_ORDER_OR (CR_FK, OR_TOTALAMOUNT) compress compute statistics;
Index created.
select * from
( select su_fk,or_totalamount,cr_fk,
rank() over (partition by cr_fk order by or_totalamount DESC) "RANKING"
from t_order_or
) Where RANKING <= 5;
Plan hash value: 3809009122
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | | 11675 (100)| |
|* 1 | VIEW | | 1000K| 44M| | 11675 (1)| 00:02:21 |
|* 2 | WINDOW SORT PUSHED RANK| | 1000K| 12M| 22M| 11675 (1)| 00:02:21 |
| 3 | TABLE ACCESS FULL | T_ORDER_OR | 1000K| 12M| | 6905 (1)| 00:01:23 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RANKING"<=5)
2 - filter(RANK() OVER ( PARTITION BY "CR_FK" ORDER BY
INTERNAL_FUNCTION("OR_TOTALAMOUNT") DESC )<=5)
That index is in our test case not used, however if we hint the query with the first rows hint, the index is used ( the first_rows hint can safely be used here )
select /*+ FIRST_ROWS */ * from
( select su_fk,or_totalamount,cr_fk,
rank() over (partition by cr_fk order by or_totalamount DESC) "RANKING"
from t_order_or
) Where RANKING <= 5;
select * from table(dbms_xplan.display_cursor);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 7n9wd6cf4sggf, child number 0
-------------------------------------
select /*+ FIRST_ROWS */ * from ( select
su_fk,or_totalamount,cr_fk, rank() over (partition by cr_fk
order by or_totalamount DESC) "RANKING" from t_order_or )
Where RANKING <= 5
Plan hash value: 4211660427
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | | 13191 (100)| |
|* 1 | VIEW | | 1000K| 44M| | 13191 (1)| 00:02:39 |
|* 2 | WINDOW SORT PUSHED RANK| | 1000K| 12M| 22M| 13191 (1)| 00:02:39 |
| 3 | VIEW | index$_join$_002 | 1000K| 12M| | 8420 (1)| 00:01:42 |
|* 4 | HASH JOIN | | | | | | |
| 5 | INDEX FAST FULL SCAN | OR_IDX4 | 1000K| 12M| | 4675 (1)| 00:00:57 |
| 6 | INDEX FAST FULL SCAN | OR_IDX1 | 1000K| 12M| | 2458 (1)| 00:00:30 |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RANKING"<=5)
2 - filter(RANK() OVER ( PARTITION BY "CR_FK" ORDER BY
INTERNAL_FUNCTION("OR_TOTALAMOUNT") DESC )<=5)
4 - access(ROWID=ROWID)
if we execute the queries with and without the first_rows hint we get the below execution differences
In this example the optimizer only picks up the index in a given context, in some executions context, depending on the amount of data, costof the sort operation, additional fields, average row length, ... it might be interesting to have such an index though.
set autorace traceonly statistics
select /*+ FIRST_ROWS */ * from ( select
su_fk,or_totalamount,cr_fk, rank() over (partition by cr_fk
order by or_totalamount DESC) "RANKING" from t_order_or )
Where RANKING <= 5
20 rows selected.
Statistics
----------------------------------------------------------
46 recursive calls
15 db block gets
5717 consistent gets
2469 physical reads
1644 redo size
1316 bytes sent via SQL*Net to client
534 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
1 sorts (disk)
20 rows processed
select * from ( select
su_fk,or_totalamount,cr_fk, rank() over (partition by cr_fk
order by or_totalamount DESC) "RANKING" from t_order_or )
Where RANKING <= 5
;
Statistics
----------------------------------------------------------
25 recursive calls
11 db block gets
25345 consistent gets
25150 physical reads
1596 redo size
1316 bytes sent via SQL*Net to client
534 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
1 sorts (disk)
20 rows processed
The dense_rank analytical function is only different from the rank function if multiple id ( in this example cr_fk ) have the same value (ranking) in this example or_totalamount )
select * from
( select su_fk,or_totalamount,cr_fk,
dense_rank() over (partition by cr_fk order by or_totalamount DESC) "RANKING"
from t_order_or
) Where RANKING <= 5;
The row_number analytical function is different from the rank rank and dense_rank function if multiple id have the same value, row_number is not to be used when there are identical sorted values.
select * from
( select su_fk,or_totalamount,cr_fk,
ROW_NUMBER () over (partition by cr_fk order by or_totalamount DESC) "RANKING"
from t_order_or
) Where RANKING <= 5;
Note that the rnaking functions can also be used to list the records for positions between y and z ( in this example between 6 and 10 )
select * from ( select
su_fk,or_totalamount,cr_fk, rank() over (partition by cr_fk
order by or_totalamount DESC) "RANKING" from t_order_or )
where RANKING between 6 and 10
;