Fast refresh of aggregate-only materialized views with MAX – algorithm

In this post I will illustrate the algorithm used by Oracle (in 11.2.0.3) to fast refresh a materialized view (MV) containing only the MAX aggregate function:

create materialized view test_mv

build immediate

refresh fast on demand

with rowid

as

select gby as mv_gby,

count(*) as mv_cnt_star,

max (dat) as mv_max_dat

from test_master

--where whe = 0

group by gby

;

The where clause is commented to enable fast refresh whatever type of DML occurs on the master table, in order to investigate all possible scenarios; the case having the where-clause is anywhere a sub-case of the former and we will illustrate it as well below.

As usual, the MV log is configured to “log everything”:

create materialized view log on test_master

with rowid ( whe, gby, dat ), sequence

including new values;

In the general introduction to aggregate-only MVs we have seen how the refresh engine first marks the log rows, then inspects TMPDLT (loading its rows into the result cache at the same time) to classify its content as insert-only (if it contains only new values), delete-only (if it contains only old values) or general (if it contains a mix of new/old values). In the MAX scenario, a specialized (and much more performant) algorithm exists only for the insert-only case, and every other case falls back to the general algorithm.

Let’s illustrate, with the help of the usual supporting test case, and building on the shoulders of the already illustrated SUM case.

refresh for insert-only TMPDLT

The refresh is made using this single merge statement:

/* MV_REFRESH (MRG) */

/* MV_REFRESH (MRG) */

merge into test_mv

using (

with tmpdlt$_test_master as (

-- check introduction post for statement

)

select gby,

sum( 1 ) as cnt_star,

max( dat ) as max_dat

from (select rid$, gby, dat, dml$$

from tmpdlt$_test_master

) as of snapshot(:b_scn)

-- where whe = 0 (if the where clause is specified in the MV)

group by gby

) deltas

on (sys_op_map_nonnull(test_mv.mv_gby) = sys_op_map_nonnull(deltas.gby))

when matched then

update set

test_mv.mv_cnt_star = test_mv.mv_cnt_star + deltas.cnt_star

test_mv.mv_max_dat =

decode( test_mv.mv_max_dat,

null, deltas.max_dat,

decode( deltas.max_dat,

null, test_mv.mv_max_dat,

greatest( test_mv.mv_max_dat, deltas.max_dat )

)

)

when not matched then

insert ( test_mv.mv_gby, test_mv.mv_max_dat, test_mv.mv_cnt_star )

values ( deltas.gby, deltas.max_dat, deltas.cnt_star )

Similarly to what it is done for the SUM case, it simply calculates the delta values to be propagated by grouping-and-maximazing the new values contained in TMPDLT (essentially, it executes the MV statement on the filtered mv log), and then looks for matches over the grouped-by expression (using the null-aware function sys_op_map_nonnull, already illustrated in the post about SUM). It then applies the deltas to the MV, or simply inserts them if no match is found.

The delta application algorithm is very simple: since only inserts have been performed, the MV max(dat) value cannot decrease, but only increase if max(dat) calculated by the deltas is greater. Hence it is simply a matter to set the new value to the greatest of the old and the max of the deltas, with a few decodes to handle nulls in the obvious way.

Note that count(dat) is not used, and even not present in the MV definition.

Note especially, as in the SUM case, that the master table, test_master, is not accessed at all – unfortunately that cannot be done for the general case, as we will see shortly.

This algorithm is used also when the where clause is specified in the MV (adding this clause makes the MV an “insert-only MV”, as per Oracle definition, which means that can be fast refreshed only after inserts and not after other DML types); the only difference is the obvious addition of the where clause in the deltas calculation as well (as commented in the statement above).

It’s also very interesting to remember that this algorithm can be used when only inserts(new values) are present in TMPDLT, not in the log, and hence it can be used even when deletes or inserts are present in the log, provided they are redundant (as seen in the general introduction post). This is especially useful for where-clause MVs, since it widens the possibility to refresh beyond insert-only, as already demonstrated in script tmpdlt_enables_fast_refresh_of_insert_only_mv.sql of the introduction post.

refresh for mixed-DML TMPDLT

The refresh is accomplished using two statements, a delete that removes every gby value referenced in the log:

/* MV_REFRESH (DEL) */

delete from test_mv

where sys_op_map_nonnull(mv_gby) in (

select sys_op_map_nonnull(gby)

from (select gby

from mlog$_test_master

Page 1 of 2 | Next page