Starting from Jonathan Lewis' test case "Discrete Dangers" on page 126 of his book "Cost Based Oracle",

we have investigated (and illustrated with graphs) the CBO cardinality estimation algorithm for range-based predicates:

select x from t where x > low_x and x < high_x (open , open ) select x from t where x >= low_x and x <= high_x (closed, closed) select x from t where x > low_x and x <= high_x (open, closed) select x from t where x >= low_x and x < high_x (closed, open) template: select x from t where x #low low_x and x #high high_x #low in (">", ">=") #high in ("<", "<=")when column x has no associated histogram.

Letting

min_x = min (x) over all rows max_x = max (x) over all rows B = (max_x - min_x) / num_distinct (Band width)and defining

left band : min_x < x < min_x + B central region : min_x + B < x < max_x - B right band : max_x - B < x < max_xwe have seen that

- For ranges
__completely__contained in one of the two bandsmin_x <= low_x < high_x <= min_x + B or max_x - B <= low_x < high_x <= max_x

the cardinality can depend on both low_x and high_x, only low_x, only high_x, or none (= constant), depending on the value of #low and #high, in a somewhat counterintuitive way. We have found the relevant formulae. - In
**all the other cases**, a slightly modified (modifications underlined) version of Jonathan Lewis' "standard formula" applies:cardinality = num_rows * (

This fitness test verifies the formula against the CBO, using its pl/sql implementation.__high_x_effective__-__low_x_effective__) / (max_x - min_x) + num_rows * correction_for_or_equal_operators__- num_rows * correction_for_special_case__if #low = ">="**and**low_x inside the left band then low_x_effective = min_x + B else low_x_effective = low_x end if #high = "<="**and**high_x inside the right band high_x_effective = max_x - B else high_x_effective = high_x end correction_for_or_equal_operators = 0 + 1/num_distinct if #low = ">=" + 1/num_distinct if #high = "<=" correction_for_special_case = 0 + 1/num_distinct if low_x = min_x**and**#low = ">" + 1/num_distinct if high_x = max_x**and**#high = "<"

The discussion starts here.

And here's my speculation about the rationale behind this strange (but quite sound in hindsight) behaviour.

alberto.dellera@gmail.com