The SQL Optimizers
Whenever you execute a SQL statement, a component of the database known as the optimizer must decide how best to access the data operated on by that statement. Oracle supports two optimizers: the rule-based optimizer (which was the original), and the cost-based optimizer.
To figure out the optimal execution path for a statement, the optimizers consider the following:
- The syntax you've specified for the statement
- Any conditions that the data must satisfy (the WHERE clauses)
- The database tables your statement will need to access
- All possible indexes that can be used in retrieving data from the table
- The Oracle RDBMS version
- The current optimizer mode
- SQL statement hints
- All available object statistics (generated via the ANALYZE command)
- The physical table location (distributed SQL)
- INIT.ORA settings (parallel query, async I/O, etc.)
Oracle gives you a choice of two optimizing alternatives: the predictable rule-based optimizer and the more intelligent cost-based optimizer.
Understanding the Rule-Based Optimizer
The rule-based optimizer (RBO) uses a predefined set of precedence rules to figure out which path it will use to access the database. The RDBMS kernel defaults to the rule-based optimizer under a number of conditions, including:
- OPTIMIZER_MODE = RULE is specified in your INIT.ORA file
- OPTIMIZER_MODE = CHOOSE is specified in your INIT.ORA file, and no statistics exist for any table involved in the statement
- An ALTER SESSION SET OPTIMIZER_MODE = RULE command has been issued
- An ALTER SESSION SET OPTIMIZER_MODE = CHOOSE command has been issued, and no statistics exist for any table involved in the statement
- The rule hint (e.g., SELECT /*+ RULE */. . .) has been used in the statement
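The conditions above can be tried from a SQL*Plus session. The following is a minimal sketch, assuming an `emp` table like the one used in the examples later in this section; the column list is illustrative only.

```sql
-- Switch the current session to the rule-based optimizer.
ALTER SESSION SET OPTIMIZER_MODE = RULE;

-- Alternatively, request the rule-based optimizer for one statement only.
SELECT /*+ RULE */ emp_no, emp_name
  FROM emp
 WHERE dept_no = 12;

-- Under OPTIMIZER_MODE = CHOOSE, the rule-based optimizer is used when no
-- statistics exist. A NULL LAST_ANALYZED suggests the table has none.
SELECT table_name, num_rows, last_analyzed
  FROM user_tables
 WHERE table_name = 'EMP';

-- Remove statistics gathered earlier with the ANALYZE command.
ANALYZE TABLE emp DELETE STATISTICS;
```

Setting the mode at the session level affects every statement that follows, while the hint limits the change to a single statement.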
The rule-based optimizer is driven primarily by 20 condition rankings, or "golden rules." These rules instruct the optimizer how to determine the execution path for a statement, when to choose one index over another, and when to perform a full table scan. These rules, shown in Table 1, are fixed, predetermined, and, in contrast with the cost-based optimizer, not influenced by outside sources (table volumes, index distributions, etc.).
Table 1: Rule-based optimizer condition rankings

Rank | Condition |
---|---|
1 | ROWID = constant |
2 | Cluster join with unique or primary key = constant |
3 | Hash cluster key with unique or primary key = constant |
4 | Entire unique concatenated index = constant |
5 | Unique indexed column = constant |
6 | Entire cluster key = corresponding cluster key of another table in the same cluster |
7 | Hash cluster key = constant |
8 | Entire cluster key = constant |
9 | Entire non-unique concatenated index = constant |
10 | Non-unique index merge |
11 | Entire concatenated index = lower bound |
12 | Most leading column(s) of concatenated index = constant |
13 | Indexed column between low value and high value or indexed column LIKE 'ABC%' (bounded range) |
14 | Non-unique indexed column between low value and high value or indexed column LIKE 'ABC%' (bounded range) |
15 | Unique indexed column or constant (unbounded range) |
16 | Non-unique indexed column or constant (unbounded range) |
17 | Equality on non-indexed = column or constant (sort/merge join) |
18 | MAX or MIN of single indexed columns |
19 | ORDER BY entire index |
20 | Full table scans |
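To make a few of these rankings concrete, here is a hedged sketch of predicates that would fall under ranks 1, 5, 14, and 20. It assumes, purely for illustration, a unique index on `emp_no`, a non-unique index on `emp_name`, and no index on a hypothetical `salary` column; `:rid` is a bind variable standing in for a ROWID value.

```sql
-- Rank 1: ROWID = constant
SELECT emp_name FROM emp WHERE ROWID = :rid;

-- Rank 5: unique indexed column = constant
-- (assumes a unique index on emp_no)
SELECT emp_name FROM emp WHERE emp_no = 127;

-- Rank 14: non-unique indexed column LIKE 'ABC%' (bounded range)
-- (assumes a non-unique index on emp_name)
SELECT emp_no FROM emp WHERE emp_name LIKE 'GUR%';

-- Rank 20: full table scan
-- (assumes the hypothetical salary column is not indexed)
SELECT emp_no FROM emp WHERE salary > 50000;
```

When several conditions in one WHERE clause qualify, the rule-based optimizer favors the access path with the lowest rank number.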
Knowing the rules is helpful, but the rules alone do not tell you enough about how to tune for the rule-based optimizer. To fill that gap, the following sections describe some behavior that the rules don't tell you about.
What the RBO rules don't tell you #1
Only single-column indexes are ever merged. Consider the following SQL and indexes:
SELECT col1, ...
FROM emp
WHERE emp_name = 'GURRY'
AND emp_no = 127
AND dept_no = 12
Index1 (dept_no)
Index2 (emp_no, emp_name)
The SELECT statement looks at all three indexed columns. Many people believe that Oracle will merge the two indexes, which involve those three columns, to return the requested data. In fact, only the two-column index is used; the single-column index is not used. While Oracle will merge two single-column indexes, it will not merge a multi-column index with another index.
There is one exception to be aware of in this scenario. If the single-column index is a unique or primary key index, it takes precedence over the multi-column index. Compare rank 5 with rank 9 in Table 1.
NOTE: Oracle8i introduced a new hint, INDEX_JOIN, that allows you to join multi-column indexes.
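One way to check which index is actually used in the scenario above is to inspect the execution plan. This is a sketch only: the index names `emp_idx1` and `emp_idx2` and the statement ID are hypothetical, and it assumes a `PLAN_TABLE` already exists in your schema (it is typically created with the `utlxplan.sql` script).

```sql
-- Hypothetical names for Index1 and Index2 from the example above.
CREATE INDEX emp_idx1 ON emp (dept_no);
CREATE INDEX emp_idx2 ON emp (emp_no, emp_name);

-- Record the plan chosen by the rule-based optimizer.
EXPLAIN PLAN SET STATEMENT_ID = 'rbo_test1' FOR
SELECT /*+ RULE */ emp_no, emp_name, dept_no
  FROM emp
 WHERE emp_name = 'GURRY'
   AND emp_no   = 127
   AND dept_no  = 12;

-- List the plan steps; OBJECT_NAME shows which index each step touches.
SELECT id, parent_id, operation, options, object_name
  FROM plan_table
 WHERE statement_id = 'rbo_test1'
 ORDER BY id;
```

If the behavior described above holds, the plan should show an index step on `emp_idx2` and none on `emp_idx1`.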
What the RBO rules don't tell you #2
If all columns in an index are specified in the WHERE clause, that index will be used in preference to other indexes for which some columns are referenced. For example:
SELECT col1, ...
FROM emp
WHERE emp_name = 'GURRY'
AND emp_no = 127
AND dept_no = 12
Index1 (emp_name)
Index2 (emp_no, dept_no, cost_center)
In this example, only Index1 is used, because the WHERE clause includes all columns for that index, but does not include all columns for Index2.
What the RBO rules don't tell you #3
If multiple indexes can be applied to a WHERE clause, and they all have an equal number of columns specified, only the index created last will be used. For example:
SELECT col1, ...
FROM emp
WHERE emp_name = 'GURRY'
AND emp_no = 127
AND dept_no = 12
AND emp_category = 'CLERK'
Index1 (emp_name, emp_category) Created 4pm Feb 11th 2002
Index2 (emp_no, dept_no) Created 5pm Feb 11th 2002
In this example, only Index2 is used, because it was created at 5 p.m. and the other index was created at 4 p.m. This behavior can pose a problem, because if you rebuild indexes in a different order than they were first created, a different index may suddenly be used for your queries. To deal with this problem, many sites have a naming standard requiring that indexes are named in alphabetical order as they are created. Then, if a table is rebuilt, the indexes can be rebuilt in alphabetical order, preserving the correct creation order. You could, for example, number your indexes. Each new index added to a table would then be given the next number.
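Before rebuilding, it can help to record the order in which a table's indexes were created. The following sketch reads that order from the data dictionary and then shows one possible numbered naming scheme; the names `emp_idx01` and `emp_idx02` are hypothetical examples, not a prescribed convention.

```sql
-- List the indexes on EMP in creation order, oldest first.
SELECT i.index_name, o.created
  FROM user_indexes i, user_objects o
 WHERE o.object_name = i.index_name
   AND o.object_type = 'INDEX'
   AND i.table_name  = 'EMP'
 ORDER BY o.created;

-- Numbered names make the intended creation order explicit, so the
-- indexes can be recreated in the same sequence after a rebuild.
CREATE INDEX emp_idx01 ON emp (emp_name, emp_category);
CREATE INDEX emp_idx02 ON emp (emp_no, dept_no);
```

Recreating the indexes in name order then reproduces the original creation order, which is what the tie-break between equally ranked indexes depends on.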
What the RBO rules don't tell you #4
If multiple columns of an index are being accessed with an = operator, that will override other operators such as LIKE or BETWEEN. An index accessed with two = conditions will be chosen over an index accessed with two = conditions and a LIKE. For example:
SELECT col1, ...
FROM emp
WHERE emp_name LIKE 'GUR%'
AND emp_no = 127
AND dept_no = 12
AND emp_category = 'CLERK'
AND emp_class = 'C1'
Index1 (emp_category, emp_class, emp_name)
Index2 (emp_no, dept_no)
In this example, only Index2 is used, even though three columns of Index1 are accessed and only two columns of Index2 are accessed....