11gR2: new algorithm for fast refresh of on-commit materialized views

This post investigates the improvements that have been made in 11gR2 to the fast refresh engine of materialized views (MVs) that are set to be automatically refreshed at commit time. We speak about join-only materialized views only in this post, as always with the help of a test case.

As noted in the post of mine “11gR2: materialized view logs changes“, in 11gR2 a new column, xid$$, is now part of materialized view logs; this column records the id of the transaction that logged the changes of the base table which the log is defined on. It is important to stress that this column is added regardless of the type of the MV log, that is, to both the brand-new “commit SCN-based” logs and the old fashioned “timestamp-based” ones. That means that both types of MV logs can take advantage of the new improvements – albeit I haven’t tested whether MVs (logs) migrated from a previous version are automatically upgraded by the migration scripts and get the new xid$$ column added.

algorithm before 11gR2

In versions before 11gR2, the refresh algorithm for on-commit MVs was the same as the one for on-demand ones, with only minor variants. That is, the algorithm was almost completely the same, just triggered by the commit event instead of by the user.

For an in-depth analysis of the algorithm, I will refer the reader to the discussion about the on-demand algorithm in the post “fast refresh of join-only materialized views – algorithm summary“; in passing, the test case for this post is in fact the very same three-table join MV, just redefined as “on commit” instead of “on demand”. To recap, the “old” algorithm (until 11.1.0.7) was:

1) new log rows are inserted with snaptime$$=4000 A.D;

2) at refresh time (commit time), a snapshot of the new rows is taken, that is, all new rows are marked with snaptime$$= “commit time”, using the statement

update MLOG$_TEST_T1

set snaptime$$ = :1

where snaptime$$ > to_date('2100-01-01:00:00:00','YYYY-MM-DD:HH24:MI:SS')

3) all modifications whose snaptime$$ is between the date of the last refresh (excluded) and the commit date(included) are propagated to the MV. The propagation consists of two steps.

First a DEL step:

/* MV_REFRESH (DEL) */

delete from test_mv

where test_t1_rowid in

(

select * from

(

select chartorowid (m_row$$)

from mlog$_test_t1

where snaptime$$ > :1

) -- no "as of snapshot (:2)" clause

)

Then an INS one:

/* MV_REFRESH (INS) */

insert into test_mv

select jv.j1_2, jv.x1, jv.pk1, jv.rid$,

mas2.j2_1, mas2.j2_3, mas2.x2, mas2.pk2, mas2.rowid,

mas3.j3_2, mas3.x3, mas3.pk3, mas3.rowid

from (

select log.rowid rid$, log.*

from test_t1 log

where rowid in

(

select chartorowid(log.m_row$$)

from mlog$_test_t1

where snaptime$$ > :1

)

) jv, -- no "as of snapshot (:2) jv" clause

test_t2 as of snapshot (:2) mas2,

test_t3 as of snapshot (:2) mas3

where jv.j1_2 = mas2.j2_1

and mas2.j2_3 = mas3.j3_2

Note that the only small difference from the on-demand case is the absence of the “as of snapshot” clause, but the statements are otherwise identical. Note also that the rows in the MV log are identified in both statements by snaptime, using the subquery

select chartorowid(log.m_row$$)

from mlog$_test_t1

where snaptime$$ > :1

4) all obsolete log rows are deleted, that is, all rows whose snaptime$$ is less than or equal the lowest of all refresh times are removed from the log, using the the statement

delete from mlog$_test_t1

where snaptime$$ <= :1

algorithm starting from 11gR2

In 11gR2, the on-commit algorithm is still almost the same as the on-demand one; the “only” change is how modified rows to be propagated are identified, and in general, how logs are managed. Not surprisingly, log rows are now directly identified by the transaction id, which is logged in xid$$. In detail:

1) new log rows are inserted with xid$$ = transaction id;

2) at refresh time (commit time), no snapshot is taken, that is, the MV log is not updated at all;

3) all modifications made by the committing transaction are propagated to the MV, still using the same two steps.

The DEL step is now:

/* MV_REFRESH (DEL) */

delete from test_mv

where test_t1_rowid in

(

select * from

(

select chartorowid (m_row$$)

from mlog$_test_t1

where xid$$ = :1

)

)

The INS one is:

/* MV_REFRESH (INS) */

insert into test_mv

select jv.j1_2, jv.x1, jv.pk1, jv.rid$,

mas2.j2_1, mas2.j2_3, mas2.x2, mas2.pk2, mas2.rowid,

mas3.j3_2, mas3.x3, mas3.pk3, mas3.rowid

from (

select log.rowid rid$, log.*

from test_t1 log

where rowid in

(

select chartorowid(log.m_row$$)

from mlog$_test_t1

where xid$$ = :1

)

Page 1 of 2 | Next page