Comments on: 11gR2: new algorithm for fast refresh of on-commit materialized views

By: The mess that is fast-refresh join-only Materialized Views | OraStory

The mess that is fast-refresh join-only Materialized Views | OraStory — Thu, 27 Nov 2014 15:24:13 +0000

[…] http://www.adellera.it/blog/2009/11/22/11gr2-new-algorithm-for-fast-refresh-of-on-commit-materialize… […]

By: Alberto Dell'Era

Alberto Dell'Era — Thu, 07 Jan 2010 09:25:26 +0000

In reply to Taral Desai. @Taral thanks for coming back and letting me know about your solution.

By: Taral Desai

Taral Desai — Thu, 07 Jan 2010 06:03:38 +0000

Thank you, Sir, for the update and for explaining both points. As usual, I learned a lot from you today. Coming back to the issue: this is actually a bug. I tested the method provided in document 578720.1, and the refresh now uses this path:

[sql]
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.01       0.00          0          8          0           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.02       0.01          0          8          0           0

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  NESTED LOOPS (cr=8 pr=0 pw=0 time=432 us)
      1   NESTED LOOPS (cr=7 pr=0 pw=0 time=402 us)
      1    VIEW (cr=4 pr=0 pw=0 time=269 us)
      1     NESTED LOOPS (cr=4 pr=0 pw=0 time=265 us)
      1      SORT UNIQUE (cr=3 pr=0 pw=0 time=235 us)
      2       TABLE ACCESS FULL MLOG$_xxxx (cr=3 pr=0 pw=0 time=143 us)
      1      TABLE ACCESS BY USER ROWID S_XXXX (cr=1 pr=0 pw=0 time=23 us)
      1    TABLE ACCESS BY INDEX ROWID S_XXXX (cr=3 pr=0 pw=0 time=127 us)
      1     INDEX UNIQUE SCAN SXXXXX_PK (cr=2 pr=0 pw=0 time=55 us)(object id 52121)
      0   INDEX UNIQUE SCAN PK_S_XXXXX (cr=1 pr=0 pw=0 time=25 us)(object id 87853)
[/sql]

By: Alberto Dell'Era

Alberto Dell'Era — Wed, 06 Jan 2010 21:16:43 +0000

In reply to Taral Desai.

@Taral

I concur that the performance problem is the hash join; that is where your statement spends the vast majority of its time (22980548 microseconds), and that time is almost completely CPU or unaccounted-for time, since the two child full table scans (FTS) account for only 135 + 8272772 microseconds.
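To make the arithmetic explicit, here is a small sketch that subtracts the children's times from the hash join's time. The figures are copied from the row source lines of the trace; the variable names are illustrative, and the sketch assumes that each `time=` value is cumulative, that is, that an operation's time includes the time of its children.

```python
# time= values in microseconds, copied from the row source lines of the trace.
# Assumption: each value is cumulative (an operation includes its children).
hash_join_right_semi_us = 22_980_548  # HASH JOIN RIGHT SEMI
fts_mlog_us = 135                     # TABLE ACCESS FULL MLOG$_S_R
fts_table_us = 8_272_772              # TABLE ACCESS FULL S_R

children_us = fts_mlog_us + fts_table_us
join_only_us = hash_join_right_semi_us - children_us
join_share = join_only_us / hash_join_right_semi_us

print(f"children:    {children_us} us")
print(f"join itself: {join_only_us} us ({join_share:.0%} of the operation)")
```

Under that assumption, the remainder is the time attributable to the join step itself rather than to reading its two inputs.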

As for your question about the semi-join: you have a SQL semi-join because a fragment of the statement used for the INS phase of the fast refresh is

select .. from test_t1 where rowid in ( select … from mlog$_test_t1 )

There are (conceptually) two ways to compute this fragment using a hash table:

a) load test_t1 as a hash table “in memory”, then read mlog$_test_t1 and mark the rows in the hash table that match; then return the marked rows

b) load mlog$_test_t1 as a hash table “in memory”, then read test_t1; whenever a match occurs, return the row, mark the hash table row as “already returned”, and never return it again even if another match occurs [or simply remove the row from the hash table].

I’m almost sure that (a) is how a “HASH JOIN SEMI” works and (b) is how a “HASH JOIN RIGHT SEMI” works, though I am only 95% sure right now. If that’s true, then in the common scenario where the log contains only a few rows and the table contains many (as seems to be your case), (b) would appear to be the more efficient way (small hash table).
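The two conceptual strategies can be sketched in a few lines of Python. This is only an illustration of descriptions (a) and (b), not of Oracle's actual implementation: the function names, the `(rowid, payload)` row shape, and the sample data are all hypothetical, and the sketch assumes that a rowid identifies exactly one row of the table.

```python
from typing import Dict, Iterable, List, Tuple

Row = Tuple[str, str]  # (rowid, payload): a hypothetical, simplified row


def semi_join_build_on_table(table: Iterable[Row], log_rowids: Iterable[str]) -> List[Row]:
    """Strategy (a): hash the table, probe with the log, return the marked rows."""
    hash_table: Dict[str, Row] = {rowid: (rowid, payload) for rowid, payload in table}
    marked = set()
    for rowid in log_rowids:          # read the log
        if rowid in hash_table:
            marked.add(rowid)         # mark the matching table row
    return [row for rowid, row in hash_table.items() if rowid in marked]


def semi_join_build_on_log(table: Iterable[Row], log_rowids: Iterable[str]) -> List[Row]:
    """Strategy (b): hash the log, probe with the table, return each match once."""
    pending = set(log_rowids)         # small hash table; duplicate log entries collapse
    result: List[Row] = []
    for rowid, payload in table:      # read the (large) table
        if rowid in pending:
            result.append((rowid, payload))
            pending.remove(rowid)     # "already returned": drop it from the hash table
    return result


# Hypothetical data: three table rows, a log that mentions one rowid twice.
table = [("AAA1", "a"), ("AAA2", "b"), ("AAA3", "c")]
log = ["AAA2", "AAA2", "AAA3"]

rows_a = semi_join_build_on_table(table, log)
rows_b = semi_join_build_on_log(table, log)
print(rows_a == rows_b, rows_b)
```

Both functions return the same rows; the difference is which input is held in memory. In (a) the hash table grows with the table, in (b) only with the log.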

It would be interesting to compare two tkprofs of the fragment, one using (a) and the other using (b) (you can use the swap_join_inputs and no_swap_join_inputs hints to force one or the other), and see whether it makes any difference.

Of course, one could argue that the most efficient way to compute that fragment would be to fetch the table rows “pointed to” by the log rowids, thus avoiding a very expensive FTS, rather than to use an expensive hash join. The hash join was chosen in my tests because the tables were tiny; in general, I do not expect that path to be chosen very often.
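As a rough illustration of that alternative, the sketch below fetches rows directly by the distinct rowids found in the log, so the work grows with the size of the log rather than the size of the table. The dictionary is only a stand-in for direct access by rowid; the function name and the sample data are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Row = Tuple[str, str]  # (rowid, payload): a hypothetical, simplified row


def fetch_rows_by_log_rowids(table_by_rowid: Dict[str, Row],
                             log_rowids: Iterable[str]) -> List[Row]:
    """Visit each distinct logged rowid once and fetch that row directly."""
    distinct_rowids = sorted(set(log_rowids))  # de-duplicate the log entries
    # The dict lookup stands in for direct access by rowid (an assumption of this sketch).
    return [table_by_rowid[r] for r in distinct_rowids if r in table_by_rowid]


# Hypothetical data: a large table and a log that mentions two rows, one of them twice.
table_by_rowid = {f"R{i:07d}": (f"R{i:07d}", f"payload {i}") for i in range(100_000)}
log_rowids = ["R0000042", "R0000007", "R0000042"]

changed_rows = fetch_rows_by_log_rowids(table_by_rowid, log_rowids)
print(changed_rows)
```

Only two lookups are performed here, however many rows the table holds; the table itself is never scanned.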

Could you check that the CBO was working with up-to-date information, by verifying that
a) table S_R had up-to-date statistics collected
and
b) the log MLOG$_S_R had either up-to-date statistics or no statistics at all (both scenarios are possible).

By: Taral Desai

Taral Desai — Wed, 06 Jan 2010 05:14:45 +0000

Hello Alberto,

I have a problem with an MV on 10.2.0.4. I looked at your trace file, and in it the refresh uses

[sql]
Rows     Row Source Operation
-------  ---------------------------------------------------
      2  TABLE ACCESS BY INDEX ROWID TEST_T1 (cr=21 pr=0 pw=0 time=4362 us)
      5   NESTED LOOPS (cr=20 pr=0 pw=0 time=4307 us)
      2    NESTED LOOPS (cr=18 pr=0 pw=0 time=4217 us)
      3     VIEW (cr=14 pr=0 pw=0 time=3887 us)
      3      HASH JOIN SEMI (cr=14 pr=0 pw=0 time=3865 us)
    100       TABLE ACCESS FULL TEST_T3 (cr=7 pr=0 pw=0 time=308 us)
      6       TABLE ACCESS FULL MLOG$_TEST_T3 (cr=7 pr=0 pw=0 time=105 us)
      2     TABLE ACCESS BY INDEX ROWID TEST_T2 (cr=4 pr=0 pw=0 time=120 us)
      2      INDEX RANGE SCAN TEST_T2_J2_3_IDX (cr=2 pr=0 pw=0 time=67 us)(object id 60236)
      2    INDEX RANGE SCAN TEST_T1_J1_2_IDX (cr=2 pr=0 pw=0 time=43 us)(object id 60230)

[/sql]

Your plan uses a hash join semi, whereas the statement I have a problem with uses a hash join right semi. What is the difference between them? I think this is what causes the performance issue in my case:

[sql]
Rows     Row Source Operation
-------  ---------------------------------------------------
      1  NESTED LOOPS (cr=55325 pr=53317 pw=0 time=22980681 us)
      1   NESTED LOOPS (cr=55324 pr=53317 pw=0 time=22980657 us)
      1    VIEW (cr=55321 pr=53317 pw=0 time=22980562 us)
      1     HASH JOIN RIGHT SEMI (cr=55321 pr=53317 pw=0 time=22980548 us)
      2      TABLE ACCESS FULL MLOG$_S_R (cr=3 pr=0 pw=0 time=135 us)
8264128      TABLE ACCESS FULL S_R (cr=55318 pr=53317 pw=0 time=8272772 us)
      1    TABLE ACCESS BY INDEX ROWID SERVICE_REQUESTS (cr=3 pr=0 pw=0 time=86 us)
      1     INDEX UNIQUE SCAN xxxx_PK (cr=2 pr=0 pw=0 time=40 us)(object id 52121)
      1   INDEX UNIQUE SCAN PK_Sxxxxx (cr=1 pr=0 pw=0 time=12 us)(object id 87853)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        5      0.01       0.01          0          0          0           0
Execute      5     50.73      61.04     266390     276650         80           5
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       10     50.74      61.06     266390     276650         80           5

[/sql]

Table and index names have been changed due to policy. I think most of the time is spent on the hash join right semi. Could you please advise me on how to improve this?

By: Alberto Dell'Era

Alberto Dell'Era — Thu, 17 Dec 2009 18:27:24 +0000

In reply to Joaquin.

@Joaquin

For join-only MVs, that is, the kind of MV investigated in this post, sure.

Anyway, I always check the actual algorithm using the actual SQL statement and the exact database version I am using for important MVs; better safe than sorry… you can easily adapt my test case for that purpose.

I don’t think that setting pctfree to zero is going to improve the performance too much – at most it might reduce the resource consumption of full table scans by 10%, which is not that much.
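A back-of-the-envelope sketch of where a figure of about 10% could come from, assuming the MV was created with Oracle's default PCTFREE of 10, an 8 KB block size, rows that pack perfectly, and no per-block overhead. The function name and the data volume are hypothetical; a real segment will differ.

```python
import math


def blocks_needed(data_bytes: int, block_size: int = 8192, pctfree: int = 10) -> int:
    """Blocks required if inserts fill each block up to (100 - pctfree)% of its size.

    Simplifying assumptions: no block header overhead, rows pack perfectly.
    """
    usable_bytes = block_size * (100 - pctfree) // 100
    return math.ceil(data_bytes / usable_bytes)


data_bytes = 1_000_000_000  # hypothetical volume of row data in the MV

blocks_default = blocks_needed(data_bytes, pctfree=10)
blocks_packed = blocks_needed(data_bytes, pctfree=0)
saving = 1 - blocks_packed / blocks_default

print(f"pctfree 10: {blocks_default} blocks")
print(f"pctfree  0: {blocks_packed} blocks")
print(f"fewer blocks to scan: {saving:.1%}")
```

In this simplified model a full table scan reads roughly one tenth fewer blocks with pctfree set to zero, which is the upper bound mentioned above.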

By: Joaquin

Joaquin — Thu, 17 Dec 2009 17:03:24 +0000

Hi Alberto,

It seems that materialized views are refreshed using only deletes and inserts, am I right? If that’s true, then there’s no reason for the pctfree of the materialized view to be not equal to zero, right?

Thank you!

Joaquin Gonzalez

By: Blogroll Report 20/11/2009-27/11/2009 « Coskan’s Approach to Oracle

Blogroll Report 20/11/2009-27/11/2009 « Coskan’s Approach to Oracle — Sat, 12 Dec 2009 19:04:56 +0000

[…] 4-How does fast refresh of on-commit materialized views works on 11GR2? Alberto Dell’era-11gR2: new algorithm for fast refresh of on-commit materialized views […]

By: Igor

Igor — Fri, 04 Dec 2009 14:44:27 +0000

Hi Alberto,

I agree with your point of view. (Just don’t tell tkyte about it. Just kidding… 🙂)

Thank you for these insights.