Fast refresh of aggregate-only materialized views – introduction

dmltype$$, sequence$$, old_new$$, snaptime$$

from mlog$_test_master

where snaptime$$ > :b_st0

) as of snapshot(:b_scn) log

)

where ( (old_new$$ in ('O', 'U') ) and (sequence$$ = min_sequence$$) )

or ( (old_new$$ = 'N' ) and (sequence$$ = max_sequence$$) )

)

The “log” in-line view is the usual one selecting the marked log rows; the outer query blocks, for each rowid, keeps only the first logged value but only if it is old, and the last logged one, but only if it is new.

So for example (check tmpdlt_pair_removal_examples.sql), TMPDLT filters out the rows marked with (*) from this triple update of the same row:

SEQUENCE$$ OLD_NEW$$ GBY DAT WHE

---------- --------- ------ ------ ------

10142 U 0 1 0

10143 N 0 1000 0 *

10144 U 0 1000 0 *

10145 N 0 2000 0 *

10146 U 0 2000 0 *

10147 N 0 3000 0

and it removes completely this pair (obtained by inserting a row and then deleting it):

SEQUENCE$$ OLD_NEW$$ GBY DAT WHE

---------- --------- ------ ------ ------

10162 N -1 -1 -1 *

10163 O -1 -1 -1 *

As we will see, the result of TMPDLT is (almost always) the actual input to the refreshing algorithm, instead of the “raw” log rows. Note that this prefiltering is relatively expensive, and while it might be somehow beneficial to remove some redundant values, it is useful especially when the log contains a mix of new and old values and TMPDLT is able to turn it into a stream of new-only(insert-only) or old-only(delete-only) one. When it happens, the more specialized versions of the algorithm can be used, thus saving resources – even if the savings could not repay the cost of TMPDLT, in general.

Even better, this prefiltering shows its greatest advantage when you have to refresh a so-called “insert-only MV”, that is, a MV that refuses to fast-refresh if updates or deletes where performed, but it fast-refreshes happily when only inserts were performed: you might be able to fast refresh and avoid a complete refresh if TMPDLT is able to filter out all the old values. This happens for example if you insert some rows first, and then modify (or delete) only the newly inserted rows before refreshing – as demonstrated by tmpdlt_enables_fast_refresh_of_insert_only_mv.sql using the classic insert-only MV, a MV containing MAX and a where clause.

TMPDLT caching

The first operation performed by the refresh engine is to classify the log content as new-only(insert-only), old-only(delete-only) or mixed, both to decide which refresh algorithm to use (insert-only, delete-only, general) and to raise, possibly, “ORA-32314 REFRESH FAST unsupported after deletes/updates” for insert-only MVs.

To classify the log, it issues a SQL statement on TMPDLT, that lists for every possible DML (Insert,Update,Delete) the max value of snaptime$$ contained in the log. In passing, this might enable some optimizations such as multi-step refreshes, but I have not investigated this.

Immediately after this, the chosen refreshing algorithm version might reference TMPDLT again – this time (possibly) saving resources since TMPDLT is result-cached, thanks to the hint “result_cache(lifetime=session)”.

The caching is a potentially relevant optimization since analytic functions can use a lot or resources for big (long and/or wide) MV logs. It means also that one must check also the result cache size (and utilization) when tuning for performance – and check that the analytic functions in the first place, of course, have enough resources to operate efficiently.

Side note: the undocumented modifier “lifetime=session” simply means that the result is flushed (at least) when the session ends (check result_cache_session_lifetime.sql), which is a nice optimization since TMPDLT is flashed back in time and hence is NOT flushed when the log is modified. It is anayway explicitly flushed as soon as the refresh ends, hence this is only a safe net just in case the refresh fails for some reason (e.g. disk full).

TMPDLT disabling

What if you don’t benefit from TMPDLT since your log does not contain (enough) redundant values, and you don’t want to pay the cost of its processing and/or caching ?

You can disable it by removing the sequence from the MV log, that actually, as far as I know, seems to be used only by this prefilter. If this is done, all the refreshing statements read directly from the log; script tmpdlt_disabling_all.sql proves this (you will better appreciate how it works and its output after the next two posts that illustrate the actual refreshing algorithms of SUM and MAX, but you can already see that TMPDLT disappears from the refreshing statements).

Page 2 of 3 | Previous page | Next page