fast refresh of join-only MVs: _mv_refresh_use_stats and locking log stats

A devastating performance degradation of materialized view fast refreshes can happen in versions after 9i – and can be healed rather easily by simply setting the hidden parameter _mv_refresh_use_stats or, a bit surprisingly, by locking statistics on the logs. The problem can manifest at least in the currently-latest patchsets of 10g, 11gR1 and 11gR2 (10.2.0.4, 11.1.0.7 and 11.2.0.1), seems to hit a lot of people, and its root cause are the utilization of wrong hints by the Oracle refresh engine.

We will investigate the join-only MV case only, since this is the case I have investigated after a question by Christo Kutrovsky, factoring in some observations by Taral Desai and some Support notes; I have some clues that something similar may happen for other types of MVs.

The test case sets up this very common scenario for fast refreshes:

1 – two big base tables joined together by the MV;
2 – only a small fraction of rows modified (actually one deleted, two updated, one inserted);
3 – all tables and indexes with fresh statistics collected;
4 – MV logs with no statistic collected AND with not-locked statistics;
5 – indexes present on the joined columns;
6 – indexes present on the rowid columns of the MV.

Points 1 and 2 make for the ideal scenario for incremental (“fast”) refreshes to be effective; 3 is very common as well, since you normally have many other statements issued on the tables; the relevance of 4 will be clear later, but it happens very often in real life, since people might perhaps consider collecting stats on the log, but locking their statistics is usually not made, at least in my experience.

To understand the importance of points 5 and 6, please check this post of mine; note how those indexes are a necessary prerequisite for the sanity of the DEL and INS steps of the MV process. Without them, the refresh cannot be incremental since it has no physical way to read and propagate only the modified rows and those related to them, but it must scan (uselessly) most of the base tables and MV. But in other for the refresh to be incremental (“fast”), those indexes have to be actually used…

the issue

Let’s illustrate the issue focusing on the DEL step (the easier to discuss about). In the above mentioned post, we have seen that the DEL step uses a single SQL statement whose text, leaving out minor technical details and hints, is:
[sql]
/* MV_REFRESH (DEL) */
delete from test_mv
where test_t1_rowid in
(
select * from
(
select chartorowid (m_row$$)
from mlog$_test_t1
where snaptime$$ > :1
) as of snapshot (:2)
)
[/sql]
In 9.2.0.8, we get this very healthy plan:
[text]
————————————————-
|Id|Operation |Name |
————————————————-
| 0|DELETE STATEMENT | |
| 1| DELETE | |
| 2| NESTED LOOPS | |
| 3| VIEW | |
| 4| SORT UNIQUE | |
| 5| TABLE ACCESS FULL|MLOG$_TEST_T1 |
| 6| INDEX RANGE SCAN |TEST_MV_TEST_T1_ROWID|
————————————————-
[/text]
That is: get the rowid of all modified rows from the log, and use the rowid-based index to delete the “old image” of them from the MV (inserting their “new image” is the job of the INS step). This is truly incremental, since the resource usage and elapsed time are proportional to the number of rows logged in the MV log, not to the dimension of the tables.

That is due to the engine injecting an HASH_SJ hint in the outermost nested subquery:
[text]
… WHERE “TEST_T1_ROWID” IN (SELECT /*+ NO_MERGE HASH_SJ */ …
[/text]
This is recognized as a bug in many scenarios (start from Oracle Support note 578720.1 and follow the references to explore some of them) even if I have not found a clear and exhaustive note that documents the behaviour.

remedy one: set “_mv_refresh_use_stats”

To get back to the healthy plan, simply set “_mv_refresh_use_stats” to “true” (ask Oracle Support first of course for permission); this makes for a set of hint much more adequate for a fast refresh:
[text]
… WHERE “TEST_T1_ROWID” IN (SELECT /*+ NO_MERGE NO_SEMIJOIN */ …
[/text]

Note: The root cause for this bug is probably due to a change hinted in note 875532.1 – in 10.2.0.3 the meaning of _mv_refresh_use_stats was reversed, but not the default, hence (by mistake?) activating a different piece of the engine code.

Even in this case, this is due to the nasty HASH_SJ hint:
[text]
… FROM “TEST_T1” “MAS$” WHERE ROWID IN (SELECT /*+ HASH_SJ */ …
[/text]

If you set _mv_refresh_use_stats, you get back the 9.2.0.8 plan – and thus you are back to incremental for both the DEL and INS steps. As a side note, a cardinality hint is used, where the cardinality is set to the correct value (6 in my test case):
[text]
… FROM “TEST_T1” “MAS$” WHERE ROWID IN (SELECT /*+ CARDINALITY(MAS$ 6) NO_SEMIJOIN …
[/text]

remedy two: collect and lock statistics on the logs

Very interestingly, instead of setting the hidden parameter, you have another way to get back to the healthy plan: gather statistics on the MV logs when they are empty AND lock them (as suggested in note 578720.1, albeit not in this scenario and even if setting the parameter is not necessary; thanks to Taral Desai for pointing me to the note). In this case, no hint at all is injected beside a NO_MERGE for the DEL step:

[text]
… WHERE “TEST_T1_ROWID” IN (SELECT /*+ NO_MERGE */ …
… FROM “TEST_T1” “MAS$” WHERE ROWID IN (SELECT …
[/text]

So, the engine is confident that the CBO will come out with a good plan, and it does not inject any “intelligent” hint. Possibly, and intriguing, this is because by locking the statistics, I am assuring the engine that these statistics are representative of the data anytime. So, locking the statistics is not meant only as a way to prevent dbms_stats from changing them … it is deeper than that. At least in this case, you are taking responsibility for them, and Oracle will take that in consideration.

19 Responses to "fast refresh of join-only MVs: _mv_refresh_use_stats and locking log stats"

Blogroll Report 05/02/2010 – 12/03/2010 « Coskan’s Approach to Oracle says:

April 23, 2010 at 03:08

[…] 19-How to sort performance degradation of materialized view fast refreshes on versions after 9i? Alberto D’ellera-fast refresh of join-only MVs: _mv_refresh_use_stats and locking log stats […]

Reply
Nathan Marston says:

January 5, 2011 at 09:06

I ran a few followup tests to your article, as I was seeing similar issues with join-only fast refresh MVs containing outer joins.

(FYI, my tests were run on a Linux 2-node RAC running Oracle 11g).

It turns out that once you start using outer joins, only one of the two solutions provided in your article will remove the dreaded HASH_SJ hints from all queries generated by Oracle to perform the MV refresh.

The reason outer joins complicates things is the DML issued by Oracle to do the refresh changes when there’s an outer join in the picture – in my tests, after some base table changes it actually issued an UPDATE against the MV which set the fields from the “outer” table to NULL. I couldn’t tell from tkprof’s output, but I’d guess that query would be issued in response to deleting a row from the table on the outer table.

Even with locked stats on the MV logs (which is the solution I wish worked all the time), that query still contained a HASH_SJ hint.

When I switched to setting “_mv_refresh_use_stats” instead, I found performance improved and the HASH_SJ hint no longer appeared in the refresh queries.

I can’t see a good reason for *any* of the hints in those generated queries. Adding join and table access hints are almost always a bad idea – you’re essentially saying you know something about the data that the CBO doesn’t.

In the case of MV logs that might be true – the number of rows in the MV log from refresh to refresh could vary a lot. If stats say MV log is small but it’s actually big, the CBO won’t generate the execution plan we want.

I can think of two responses to that argument:
1. Is a fast refresh MV really the appropriate tool if you have that much base table data changing? Why not refresh more frequently (to reduce the number of changes per refresh), or use full refreshes, or use something other than a MV?
2. If you really want to cater for both small and large MV logs, why not perform dynamic sampling on the MV logs and let the CBO do its job?

Reply
FAST REFRESH w widokach zmaterializowanych – nie taki szybki ?! « Okiem bazodanowca says:

May 7, 2011 at 15:36

[…] Rozwiązanie problemu jest bardzo ładnie przedstawione na blogu Alberto Dell’Ery : problem wolnego FAST REFRESH rozwiązany przez Alberto Pierwsza rzecz, która rzuca się w oczy – problem dotyczy tylko niektórych wersji Oracla. […]

Reply
Don Seiler says:

August 14, 2012 at 15:02

Hi Alberto,

What about gathering real stats on the mview log immediately before the fast refresh?

I have an interesting case from this weekend with an mv using a join. MV log for 1 table always has 0 rows. When the 2nd table had over 229K rows, fast refresh finished in less than 9 minutes. The next fast refresh saw the 2nd table with only 5K rows, but it took over an hour to complete. Looking for a definite root cause and workaround so it doesn’t happen again :p

Reply
1. Alberto Dell'Era says:
  
  August 21, 2012 at 13:53
  
  Hi Don,
  
  sorry that I haven’t answered sooner, but I was on vacation …
  
  Possibly the performance degraded because it had to propagate the deletion of 224K(=229K-5K) rows to the MV in the DEL step – in which case a complete refresh would have been better …
  
  Yes in general, gathering stats on the MV log (and master tables and their indexes) – should make the CBO happier – as well as locking the stats on the MV log if you see the problem described in this post.
  
  After analyzing the MV logs, you could check the stats to decide whether to refresh using the FAST or COMPLETE option – it is something I’m actually considering for one project of mine 🙂
  
  Reply
daniesh shaikh says:

January 8, 2013 at 13:06

Hi Alberto,

Thanks for this post. this is very helpful for me as i am working on MV performance since 1 month. I created a MV by joining 4 tables out of which one table(gl_balances) size is more than 20GB.GL_balances and GL_code_combinations tables are partition by range. I ran the fast refresh which takes more than 1 hr. I can see that the del query is hash join semi. In my case the parameter _mv_refresh_use_stats is set to true.

SQL> show parameter _mv_refresh_use_stats

NAME TYPE VALUE
———————————— ———– ——————————
_mv_refresh_use_stats boolean TRUE

Plan Hash: 4021631375
DELETE FROM “APPS”.”GLBAL_MV1″ SNA$
WHERE
“C2” IN (SELECT /*+ NO_MERGE */ * FROM (SELECT
CHARTOROWID(“MAS$”.”M_ROW$$”) RID$ FROM “GL”.”MLOG$_GL_BALANCES” “MAS$”
WHERE “MAS$”.SNAPTIME$$ > :B_ST2 ) AS OF SNAPSHOT(:B_SCN) MAS$)

call count cpu elapsed disk query current rows
——- —— ——– ———- ———- ———- ———- ———-
Parse 1 0.00 0.01 0 0 0 0
Execute 1 420.66 1428.79 10564633 10634970 758532 635919
Fetch 0 0.00 0.00 0 0 0 0
——- —— ——– ———- ———- ———- ———- ———-
total 2 420.66 1428.80 10564633 10634970 758532 635919

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 175 (APPS) (recursive depth: 1)

Rows Row Source Operation
——- —————————————————
0 DELETE HDFC_REP_GL_TRIAL_BAL_DAN_MV1 (cr=10635012 pr=10564636 pw=0 time=0 us)
635919 HASH JOIN RIGHT SEMI (cr=10634968 pr=10564428 pw=0 time=1400861312 us cost=9744568 size=148 card=1)
674212 TABLE ACCESS BY INDEX ROWID MLOG$_GL_BALANCES (cr=10462 pr=2469 pw=0 time=736435 us cost=1 size=138 card=1)
674212 INDEX RANGE SCAN MLOGSNAP (cr=5610 pr=2469 pw=0 time=273282 us cost=2 size=0 card=1)(object id 2112025)
404672870 MAT_VIEW ACCESS GL_BAL_MV1 (cr=10624506 pr=10561959 pw=0 time=980955008 us cost=9744566 size=4047014600 card=404701460)

I checked with mv_capabilities that the MV is eligible for fast refresh.

Please suggest why the mv is still using hash join after setting the parameter to true. My db version is 11.2.0.1. Please suggest as i am trying to improve the performance of MV last more than 30 days.

Thanks

Reply
1. Alberto Dell'Era says:
  
  January 9, 2013 at 12:11
  
  Hi Daniesh,
  
  does an index on table GL_BAL_MV1(C2) column exist, and are its index statistics fresh ?
  
  The index must have C2 as its first column.
  
  Also check the columns stats of C2 … must be fresh as well.
  
  Reply
daniesh says:

January 10, 2013 at 20:24

Hi Alberto,

This is very helpful. I learned a lot from you.

After creating the index on C2 col now the Mv refresh del query is running fine i can see the performance has improved. Now i see the problem with ins query. Index is impacting the performance of insert query. Is there any way so than Mv refresh should not use index during ins statement.

Please suggest.

Thanks

Reply
1. Alberto Dell'Era says:
  
  January 11, 2013 at 20:09
  
  Daniesh,
  
  no, you can’t disable the index during the INS phase, since it happens straight after DEL under the control of the refresh procedure, and you can’t get control back in the middle.
  
  It seems strange, anyway, that a simple index might affect INS performances that much; have you indexed the joined columns in the BASE tables as well ?
  
  Also, I have noticed that rows from your MV log are fetched using a custom index MLOGSNAP; this is USUALLY inefficient and could be avoided. That kind of index is useful only if you have a lot of empty blocks in the MV log table, because a previous “outlier” refresh grew it to a very large HWM (say, 30 times as the normal size at least) – if that’s not the case, and hence the MV log is more or less the same size for every refresh, the index is just overhead and it is better dropped.
  
  First thing you might try is to collect fresh stats on the MV log before the refresh and check whether the CBO skips the index, improving performance at least for the INS step.
  
  Reply
daniesh says:

January 12, 2013 at 08:04

Hi Alberto,

Thanks,

We can see that the del query which took 1428.80 seconds last time.It took 556.15 seconds after creating the index on C2.
call count cpu elapsed disk query current rows
——- —— ——– ———- ———- ———- ———- ———-
Parse 1 0.00 0.01 0 0 0 0
Execute 1 65.92 556.14 372581 1480763 5879720 636184
Fetch 0 0.00 0.00 0 0 0 0
——- —— ——– ———- ———- ———- ———- ———-
total 2 65.92 556.15 372581 1480763 5879720 636184

MLOGSNAP index is created on SNAPTIME$$ of MLOG$_GL_BALANCES tables.

call count cpu elapsed disk query current rows
——- —— ——– ———- ———- ———- ———- ———-
Parse 1 0.01 0.02 0 0 0 0
Execute 1 62.33 451.38 386122 3829213 2166450 636184
Fetch 0 0.00 0.00 0 0 0 0
——- —— ——– ———- ———- ———- ———- ———-
total 2 62.34 451.41 386122 3829213 2166450 636184

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 175 (APPS) (recursive depth: 1)

Rows Row Source Operation
——- —————————————————
0 LOAD TABLE CONVENTIONAL (cr=3829603 pr=386128 pw=0 time=0 us)
636184 NESTED LOOPS (cr=3719922 pr=43313 pw=0 time=100880216 us)
636184 NESTED LOOPS (cr=3083737 pr=21824 pw=0 time=48839680 us cost=6 size=325 card=1)
636184 NESTED LOOPS (cr=2653648 pr=15136 pw=0 time=27271778 us cost=4 size=223 card=1)
636184 NESTED LOOPS (cr=2016739 pr=15131 pw=0 time=21884716 us cost=3 size=175 card=1)
636184 VIEW (cr=1379832 pr=15130 pw=0 time=16619372 us cost=2 size=145 card=1)
636184 NESTED LOOPS (cr=1379832 pr=15130 pw=0 time=15831465 us cost=2 size=198 card=1)
636184 SORT UNIQUE (cr=1357666 pr=5686 pw=0 time=751081 us cost=0 size=138 card=1)
674932 TABLE ACCESS BY INDEX ROWID MLOG$_GL_BALANCES (cr=1357666 pr=5686 pw=0 time=9713998 us cost=0 size=138 card=1)
674932 INDEX RANGE SCAN MLOGSNAP (cr=677872 pr=4489 pw=0 time=6187752 us cost=0 size=0 card=1)(object id 2112025)
636184 TABLE ACCESS BY USER ROWID GL_BALANCES PARTITION: ROW LOCATION ROW LOCATION (cr=22166 pr=9444 pw=0 time=0 us cost=1 size=60 card=1)
636184 TABLE ACCESS BY INDEX ROWID GL_SETS_OF_BOOKS (cr=636907 pr=1 pw=0 time=0 us cost=1 size=30 card=1)
636184 INDEX UNIQUE SCAN GL_SETS_OF_BOOKS_U2 (cr=723 pr=0 pw=0 time=0 us cost=0 size=0 card=1)(object id 33648)
636184 TABLE ACCESS BY INDEX ROWID GL_PERIODS (cr=636909 pr=5 pw=0 time=0 us cost=1 size=48 card=1)
636184 INDEX UNIQUE SCAN GL_PERIODS_U2 (cr=725 pr=3 pw=0 time=0 us cost=0 size=0 card=1)(object id 33612)
636184 INDEX UNIQUE SCAN GL_CODE_COMBINATIONS_U11 (cr=430089 pr=6688 pw=0 time=0 us cost=1 size=0 card=1)(object id 2110687)
636184 TABLE ACCESS BY GLOBAL INDEX ROWID GL_CODE_COMBINATIONS PARTITION: ROW LOCATION ROW LOCATION (cr=636185 pr=21489 pw=0 time=0 us cost=2 size=102 card=1)

last time ins query took 2 mins. this time it took around 451 seconds.
I will refresh the statistics on MV log and run one more load and update you.

Thanks

Reply
Gaetano says:

January 18, 2013 at 13:15

Hi Alberto,

I ask information about how run the refresh materialized views to type fast on commit.

I have a DB Spatial composed of 25 regional themes that presents many common columns, which often must be interrogated by region and theme.
I created a table with a spatial column geometric t_element_common_det with range partition (id_themes_fk, id_region_fk) with PK Id_seqn.
Each themes then presents specific attributes, really few columns.I created a table for each thematic t_themes_n with (WITH n from 1 to 25) with PK Id_seqn,
t_themes_n.Id_seqn with FK to t_element_common_det.id_seqn.

Now to get the complete alphanumeric data of each themes I created a join-only MV for each theme,
partitioned by region with fast refresh on commit.
Each MV is created as follows:

[sql]
CREATE MATERIALIZED VIEW MV_themes_14
partition by range (id_region_fk)
(
..
)
BUILD IMMEDIATE
REFRESH FAST ON COMMIT with rowid
AS
SELECT e.column_1,
       e.column_2,
       ….
       t.column_1,
       t.column_2,
       t.rowid rowid_theme,
       e.rowid rowid_ele
  FROM T_themes_14 T,
T_element_common_det
WHERE T.ID_SEQN = E.ID_SEQN
  AND E.ID_themes_FK = 14 — id_themes_fk
[/sql]

Now the problem is in performance.In select there is no problem, but I have problems in insert or Update.
For example, if I update a record on t_themes_14 okay (time of 3 of seconds), while if the upgrade t_element_common_det for a record with
id_themes_fk = 14 I have performance problems because if I update / delete a record it takes between 60/90 seconds.
I analized the trace file I changed only a recordupdate a record on t_themes_14 Oracle upgrade on all MV_themes_n (I have 25 themes….)
Performs tasks such as:
[sql]
/ * MV_REFRESH (DEL) * /
DELETE FROM "REGISTER". "MV_themes_8" SNA $
WHERE "ROWID_ELE" IN
       (SELECT / * + NO_MERGE HASH_SJ * /
         *
          FROM (SELECT CHARTOROWID ("MAS $". "M_ROW $ $") $ RID
                  FROM "REGISTER". "MLOG $ _t_element_common_det" "MAS $"
                 WHERE "MAS $". XID = $ $: 1) $ MAS)

/ * MV_REFRESH (INS) * /
INSERT INTO "REGISTER". "MV_themes_8"
  SELECT / * + NO_MERGE ("JV $") * /
   "JV $". "ID_SEQN"
   "JV $". "ID_themes_FK"
   …….
   "MAS $ 1". ROWID,
   "JV $". "$ RID"
    FROM (SELECT "MAS $". "ROWID" "RID $", "MAS $". *
            FROM "REGISTER". "t_element_common_det" "MAS $"
           WHERE ROWID IN (SELECT / * + HASH_SJ * /
                            CHARTOROWID ("MAS $". "M_ROW $ $") $ RID
                             FROM "REGISTER". "MLOG $ _t_element_common_det" "MAS $"
                            WHERE "MAS $". XID $ $ =: 1)) "JV $"
         "T_themes_8" AS SNAPSHOT OF (: B_SCN) "MAS $ 1"
   WHERE "MAS $ 1". "ID_SEQN" = "JV $". "ID_SEQN"
     AND "JV $". "ID_ELEMENTO_FK" = 8
[/sql]

Now my question is: Is there a way to set to refresh the materialized views regarding t_element_common_det, if you upadte a record
the t_element_common_det with ID_themes_FK = 14 to refresh only on MV_themes_14 and not all Materialized view?
Any other suggestions are welcome.

Thanks in advance.

Reply
1. Alberto Dell'Era says:
  
  January 18, 2013 at 20:26
  
  Hi Gaetano,
  
  I haven’t really understood how the two table rows are related – do you have a 1:1 match between the themes_N and the t_element_common_det, or a 1:N, or a N:1 ?
  
  Partitioning is irrelevant by the way; probably even the fact that the table has a spatial-type column is irrelevant.
  
  Do you have these four indexes defined ?
  t_themes_N(rowid_theme)
  t_element_common_det(rowid_ele)
  t_themes_N(id_seqn)
  t_element_common_det(id_seqn,id_themes_fk) (or the opposite order)
  Note: probably some of them are created by the PK constraints; just check that they exist and are valid
  
  Reply
The mess that is fast-refresh join-only Materialized Views | OraStory says:

November 28, 2014 at 15:28

[…] http://www.adellera.it/blog/2010/03/11/fast-refresh-of-join-only-mvs-_mv_refresh_use_stats-and-locki… […]

Reply
Yogendra says:

April 4, 2015 at 04:14

Hi Alberto, Thanks for sharing this info. we faced the same nasty HASH_SJ hint issue with MV FAST REFRESH (with OUTER JOIN though) on 11.2.0.3

Unfortunately, both of remedies did NO work for us on 11.2.0.3

Oracle Doc ID 1949537.1 suggests that setting the hidden _mv_refresh_use_hash_sj parameter to FALSE should prevent it using that hint.

[sql]alter session set "_mv_refresh_use_hash_sj"=FALSE; [/sql]

That stopped CBO using the HASH_SJ hint.

Posting it here in the interests of others.

Reply
1. Alberto Dell'Era says:
  
  April 4, 2015 at 10:19
  
  Yogendra,
  many thanks!
  Nice to know that there is an underscore parameter specifically introduced to solve this issue, probably since 11.2.0.2 according to the note 😉
  
  Reply
Jasmin says:

August 31, 2016 at 14:43

Hi Alberto

We have a scenario in my project, the same scenario you have explained in another post
“For example, consider the typical case of a parent table (say, CUSTOMER), with a child (say, ORDER) and a grandchild (say, ORDER_LINE); if you update a column of a row of the parent (say, ORDERS_TOTAL_AMOUNT), the parent row and its whole progeny (the “slice”) will be deleted and then recreated.”

The child table is just to get one column, where we are not expecting much updates, but due to the delete based on the Child table key, lot many tables in the MV is getting deleted and then recreated. Which is going for hours.
Is there any way we can restrict the mv refresh (del) from the child tables MV log?

Reply
1. Alberto Dell'Era says:
  
  September 2, 2016 at 12:49
  
  Hi Jasmin,
  
  none that I’m aware of … it’s built in the algorithm.
  
  You might try 12c, I haven’t studied that version yet…
  
  ciao
  Alberto
  
  Reply
DANISH says:

June 20, 2017 at 11:03

I have two Materialized View based on join tables. S_CONTACT table is used in both Materialized view. If I am refreshing first Materialized view. Will it refresh S_CONTACT Data of second materialized view as well without refreshing second MV?

Reply
1. Alberto Dell'Era says:
  
  June 30, 2017 at 11:19
  
  Damish,
  
  of course not; if you want to refresh all the MVs that depend on S_CONTACT, you might try invoking dbms_mview.refresh_dependent() on it – something I don’t like doing since I prefer to control exactly what I’m working on.
  
  Reply

Leave a Reply Cancel reply