October 2009

CBO: NewDensity for Frequency Histograms,11g-10.2.0.4 (densities part IV)

As we have seen in the previous posts of this series, in 11g a new figure named “NewDensity” has been introduced as a replacement for the “density” column statistic for columns whose histogram has been collected; this change has been backported in 10.2.0.4 also.

In the previous post we discussed how NewDensity influences the CBO cardinality estimate for Height-Balanced histograms; in this one we are going to investigate the same for Frequency Histograms.…

CBO: “NewDensity” replaces “density” in 11g, 10.2.0.4 (densities part III)

In this post we are going to explore and explain the rationale for the formula used by the CBO to compute the “NewDensity” figure that replaces, from 10.2.0.4 onwards, the “density” column statistic in the cardinality estimation formulae for columns with height-balanced (HB) histograms defined.

In a previous post, we already discussed the pre-10.2.0.4 scenario: we saw how and when the “density” column statistic is used in the cardinality formula for equality filter predicates, we explained its statistical rationale and defining formula, introduced the concept of the NPS (Not Popular Subtable), and built a test case.…

CBO: the formula for the “density” column statistic (densities part II)

In this post we are going to explore and explain the rationale for the formula used by dbms_stats to compute the “density” column statistic, used by the CBO in versions less than 10.2.0.4 to estimate the cardinality of a class of SQL statements. In the next post, we will speak about its replacement, named “NewDensity” in 10053 trace files.

We will consider only the non-trivial case of Height-Balanced histograms, since for Frequency Histograms density is a constant (0.5 / num_rows) and for columns without histogram, it is simply 1/num_distinct.…

CBO: about the statistical definition of “cardinality” (densities part I)

Let’s explore the concept of cardinality from the point of view of the statistician; this is both to get a clearer vision of the matter (i.e. for fun) and to path the way for understanding the rationale for the “density” statistics as calculated by dbms_stats (the topic of an upcoming post).

Let’s consider a statement with input parameters (bind variables), and consider the most fundamental of them all, the one with a filter predicate:
[sql]
select …
from t
where x = :x;
[/sql]
the cardinality “card” of the set of rows retrieved depends on the table possible values and the actual inputs provided by the client as bind variable values.…