Tuning Oracle for Siebel – SQL template

The time has come to write down some of the most relevant discoveries I’ve made so far while being part of a team that is tuning a huge Siebel installation for a leading Italian company (“huge” especially because of the size of its user base and, secondarily, because of the hardware deployed: a three-node RAC on pretty powerful SMP machines).

This blog entry is about the structure of the Siebel queries and the effect of the settings of some CBO-related parameters – settings made by the Siebel clients by altering the session at connect time, or required as mandatory in the Siebel installation notes. Other postings may follow in case I discover something worth writing about.

But first of all, let me thank, for their support, all the fellow members of the OakTable Network (especially Tim Gorman, who has exchanged long emails with me) and Andy Cowling (introduced to me by Doug Burns), who kindly provided me with a lot of detailed information drawn from their vast hands-on experience with Siebel on Oracle.

For the record, the environment is Siebel 8.0, using a three-node 10.2.0.4 RAC cluster on identical SMP machines with 16 CPUs each.

Most of the Siebel queries follow this template:

select ...
  from base, inner1, ..., innerN
 where base.j1 = inner1.j1(+)
   and base.j2 = inner2.j2(+)
   ...
   and base.jN = innerN.jN(+)
   and "filter conditions on base"
 order by base.order1 [asc|desc], base.order2 [asc|desc], ..., base.orderK [asc|desc];

which must be discussed keeping in mind another critical and subtle piece of information: the Siebel clients read only a subset of the returned rows (probably the ones that fit on the screen – sometimes only 2 or 3 on average), an intention that the Siebel client wisely communicates to the CBO by altering the session and setting optimizer_mode = first_rows_10.
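To reproduce the optimizer environment when testing these queries outside Siebel, session-level settings of the kind discussed in this post can be applied by hand. The two underscore parameters are the ones mentioned further down; they are hidden, undocumented parameters, so this is a sketch for a test session only:

```sql
-- A sketch of the session-level settings discussed in this post,
-- as a Siebel client applies them at connect time.
alter session set optimizer_mode = first_rows_10;
-- Hidden parameters (note the quotes): use only in a test session
-- to reproduce the Siebel optimizer environment.
alter session set "_hash_join_enabled" = false;
alter session set "_optimizer_sortmerge_join_enabled" = false;
```

With these in place, explain plans taken in SQL*Plus should match what the CBO produces for the Siebel client.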

Side note: there are variants to this template; sometimes there are two or three base tables that are joined and filtered together, and, more frequently, some of the outer join conditions reference one of the innerN tables rather than the base table (e.g. innerM.jN = innerN.jN), but that is inessential for what we are going to discuss here.

The Siebel client’s intention is clear: get some rows from the base table, order them by the columns order1, order2, …, orderK, get only the first rows, and then get additional information (columns) by following the outer join relations. Note the order of the operations.

Visiting the outer-joined tables almost always turns out to be the most expensive part. The reason is two-fold: first, the sheer number of outer-joined tables (ten is the norm, up to about 50, since the Siebel relational model is highly normalized and hence the information is highly dispersed); second, and most importantly, the join method the CBO is forced to use.

In fact, Siebel blocks HASH JOINS (_hash_join_enabled=false) and SORT MERGE JOINS (_optimizer_sortmerge_join_enabled=false), which leaves NESTED LOOPS as the only surviving join method. When nested looping, Oracle must obviously start from the outer table listed in the SQL statement (base) and then visit the SQL inner tables (inner1 … innerN) using the join conditions, something that can be efficiently done only by visiting an index that has innerN.jN as the leading column of its definition. Each of these visits requires blevel+1 consistent gets for the index and some additional consistent gets for every row that is found – usually one or two, sometimes zero, at least in the scenarios I’ve seen.
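Since each nested-loop probe costs blevel+1 consistent gets on the index side, it can be worth checking the height of the indexes that support the join conditions. The blevel is available in the data dictionary; the table name below is hypothetical:

```sql
-- Per nested-loop probe, the index visit costs blevel+1 consistent
-- gets, plus some more for each matching row found. The blevel of
-- the join-supporting indexes can be read from the dictionary
-- ('INNER1' is a hypothetical inner-table name):
select index_name, blevel, leaf_blocks
  from user_indexes
 where table_name = 'INNER1';
```

A blevel of 1 or 2 is typical; anything higher makes every probe, and hence every row fetched from the base table, proportionally more expensive.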

So it is clear that every row fetched from the base table may easily produce tens and tens of consistent gets coming from the outer joins – and that can easily lead to a disaster if the number of rows from the base table is not very, very small. In one of our queries, for example, about one million rows came out of the base table, and since the CBO chose to outer join first and then order, each execution caused a whopping 15,000,000 consistent gets – only to have the Siebel client fetch the first 3 rows or so. Needless to say, such a huge number of gets is completely unacceptable; the impact on response time and CPU consumption is obvious, but scalability suffers as well (gets require latch or mutex acquisitions, which are the most frequent cause of contention) – which means that even a slight increase in the workload may cause severe degradation of response time (especially on RAC).
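The arithmetic behind that figure is easy to check; assuming an average of about 15 consistent gets per base-table row across all the outer joins (a plausible figure given the numbers above, not a measured one):

```sql
-- Back-of-envelope check: ~1,000,000 base rows times an assumed
-- average of 15 consistent gets per row across the outer joins
-- reproduces the observed order of magnitude.
select 1000000 * 15 as estimated_consistent_gets from dual;
-- 15000000
```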
