Oracle8i interMedia Text Reference Release 8.1.5 A67843-01 |
|
This appendix discusses how to use the extensible query optimizer to optimize queries with CONTAINS predicates.
The following topics are covered:
Query optimization with statistics involves using the collected statistics on the tables and indexes involved in a query to select an execution plan that can process the query in the most efficient manner. The optimizer attempts to choose the best execution plan based on the following parameters:
The following sections describe how to use statistics with the extensible query optimizer. Optimizing with statistics allows for a more accurate estimation of the selectivity and costs of the CONTAINS predicate and thus a better execution plan.
By default, the extensible query optimizer is enabled. To use the extensible optimizer, you must calculate the statistics on the table you query. To do so, issue the following statement:
ANALYZE TABLE <table_name> COMPUTE STATISTICS;
This statement collects statistics on all the objects associated with table_name including the table columns and any indexes (b-tree, bitmap or Text domain) associated with the table. You can issue the above ANALYZE command as many times as necessary to re-collect the statistics on a table.
See Also:
For more information on the ANALYZE command, see Oracle8 SQL Reference and Oracle8 Tuning. |
By collecting statistics on the Text domain index, the extensible query optimizer is able to do the following:
Knowing the selectivity of a CONTAINS predicate is useful for queries that contain more than one predicate, such as in structured queries. This way the extensible query optimizer can better decide whether to use the domain index to evaluate CONTAINS or to apply the CONTAINS predicate as a post filter.
Consider the following structured query:
select score(1) from tab where contains(txt, 'freedom', 1) > 0 and author = 'King' and year > 1960;
Assume the author column is of type VARCHAR2 and the year column is of type NUMBER. Assume that there is a b-tree index on the author column.
Also assume that the structured author predicate is highly selective with respect to the CONTAINS predicate and the year predicate; that is, the structured predicate (author = 'King') returns a much smaller number of rows with respect to the year and CONTAINS predicates individually, say 5 rows versus 1000 and 1500 rows respectively.
In this situation, Oracle can execute this query more efficiently by first doing a b-tree index range scan on the structured predicate (author = 'King'), followed by a table access by rowid, and then applying the other two predicates to the rows returned from the b-tree table access.
Without associating a statistics type with indextype context, the extensible query optimizer will always choose to process the CONTAINS() predicate using the text domain index.
Note: When the statistics are not collected for a Text index, the behavior is the same as not enabling the extensible query optimizer. |
You can re-collect statistics on a single index by issuing:
ANALYZE INDEX <index_name> COMPUTE STATISTICS;
You can delete the statistics associated with a table by issuing:
ANALYZE TABLE <table_name> DELETE STATISTICS;
You can delete statistics on one index by issuing the following statement:
ANALYZE INDEX <index_name> DELETE STATISTICS;
By default the extensible query optimizer is enabled. To disable the extensible query optimizer, issue the following statements:
DISASSOCIATE STATISTICS FROM INDEXTYPES ConText; DISASSOCIATE STATISTICS FROM PACKAGES ctx_contains;
After disabling the extensible query optimizer, you can re-enable it. To do so, issue the following SQL statements as CTXSYS:
ASSOCIATE STATISTICS WITH INDEXTYPES ConText USING textoptstats; ASSOCIATE STATISTICS WITH PACKAGES ctx_contains USING textoptstats;
By default, Oracle optimizes queries for throughput. This results in queries returning all rows in shortest time possible.
However, in many cases, especially in a web-application scenario, queries must be optimized for response time, when you are only interested in obtaining the first n hits of a potentially large hitlist in the shortest time possible.
The following sections describe how to optimize Text queries for response time. You can do so in two ways:
Note: Although both methods optimize for response time, the execution plans of the two methods obtained with EXPLAIN PLAN might be different for a given query. |
You can change the default query optimizer mode to optimize for response time using the FIRST_ROWS hint. When queries are optimized for response time, Oracle returns the first n rows in the shortest time possible.
For example, consider the following PL/SQL block that uses a cursor to retrieve the first 20 hits of a query and uses the FIRST_ROWS hint to optimize the response time:
declare cursor c is select /*+ FIRST_ROWS */ pk, score(1), col from ctx_tab where contains(txt_col, 'test', 1) > 0 order by score(1) desc; begin for i in c loop insert into t_s values(i.pk, i.col); exit when c%rowcount > 21; end loop; end; /
The cursor c is a SELECT statement that returns the rowids that contain the word test in sorted order. The code loops through the cursor to extract the first 20 rows. These rows are stored in the temporary table t_s.
With the FIRST_ROWS hint, Oracle instructs the Text index to return rowids in score-sorted order, if possible.
Without the hint, Oracle sorts the rowids after the Text index has returned all the rows in unsorted order that satisfy the CONTAINS predicate. Retrieving the entire result set as such takes time.
Since only the first 20 hits are needed in this query, using the hint results in better performance.
Besides instructing the Text index to return hits in score-sorted order, the FIRST_ROWS hint also tries to avoid blocking operations when optimizing queries for response time. Blocking operations include merge joins, hash joins and bitmap operations.
As a result, using the FIRST_ROWS hint to optimize for response time might result in a different execution plan than using CHOOSE with DOMAIN_INDEX_SORT, which also optimizes for response time.
You can examine query execution plans using the EXPLAIN PLAN command in SQL.
See Also:
For more information about the query optimizer and using hints such as FIRST_ROWS and CHOOSE, see Oracle8i Concepts and Oracle8i Tuning. For more information about the EXPLAIN PLAN command, see Oracle8i SQL Reference |
When you use the CHOOSE or ALL_ROWS optimizer hints, the query is optimized for throughput. This is the default optimizer mode. In this mode, Oracle does not instruct the Text domain index to return score-sorted rows, choosing instead to sort all the rows fetched from the Text index.
To optimize for fast response time under CHOOSE or ALL_ROWS modes, you can use the DOMAIN_INDEX_SORT hint as follows:
declare cursor c is select /*+ CHOOSE DOMAIN_INDEX_SORT */ pk, score(1), col from ctx_tab where contains(txt_col, 'test', 1) > 0 order by score(1) desc; begin for i in c loop insert into t_s values(i.pk, i.col); exit when c%rowcount > 21; end loop; end; /
See Also:
For more information about the query optimizer and using hints such as FIRST_ROWS and CHOOSE, see Oracle8i Concepts and Oracle8i Tuning. For more information about the EXPLAIN PLAN command, see Oracle8i SQL Reference |
By default, queries are optimized for throughput under the CHOOSE and ALL_ROWS modes. When queries are optimized for throughput, Oracle returns all rows in the shortest time possible.
In FIRST_ROWS mode, the extensible query optimizer optimizes for fast response time by having the Text domain index return score-sorted rows, if possible. This is the default behavior when you use the FIRST_ROWS hint.
If you want to optimize for better throughput under FIRST_ROWS, you can use the DOMAIN_INDEX_NO_SORT hint. Better throughput means you are interested in getting all the rows to a query in the shortest time.
The following example achieves better throughput by not using the Text domain index to return score-sorted rows. Instead, Oracle sorts the rows after all the rows that satisfy the CONTAINS predicate are retrieved from the index:
select /*+ FIRST_ROWS DOMAIN_INDEX_NO_SORT */ pk, score(1), col from ctx_tab where contains(txt_col, 'test', 1) > 0 order by score(1) desc;
See Also:
For more information about the query optimizer and using hints such as FIRST_ROWS and CHOOSE, see Oracle8i Concepts and Oracle8i Tuning. |