31
Query Rewrite

This chapter contains:

Overview of Query Rewrite

One of the major benefits of creating and maintaining materialized views is the ability to take advantage of query rewrite, which transforms a SQL statement expressed in terms of tables or views into a statement accessing one or more materialized views that are defined on the detail tables. The transformation is transparent to the end user or application, requiring no intervention and no reference to the materialized view in the SQL statement. Because query rewrite is transparent, materialized views can be added or dropped just like indexes without invalidating the SQL in the application code.

Before the query is rewritten, it is subjected to several checks to determine whether it is a candidate for query rewrite. If the query fails any of the checks, then the query is applied to the detail tables rather than the materialized view. This can be costly in terms of response time and processing power.

The Oracle optimizer uses two different methods to recognize when to rewrite a query in terms of one or more materialized views. The first method is based on matching the SQL text of the query with the SQL text of the materialized view definition. If the first method fails, the optimizer uses the more general method in which it compares join conditions, data columns, grouping columns, and aggregate functions between the query and a materialized view.

Query rewrite operates on queries and subqueries in the following types of SQL statements:

SELECT
CREATE TABLE ... AS SELECT
INSERT INTO ... SELECT

and on subqueries in the set operators UNION, UNION ALL, INTERSECT, and MINUS.

Several factors affect whether or not a given query is rewritten to use one or more materialized views:

Enabling/disabling query rewrite
- by the CREATE or ALTER statement for individual materialized views
- by the initialization parameter QUERY_REWRITE_ENABLED
- by the REWRITE and NOREWRITE hints in SQL statements
Rewrite integrity levels
Dimensions and constraints

Cost-Based Rewrite

Query rewrite is available with cost-based optimization. Oracle optimizes the input query with and without rewrite and selects the least costly alternative. The optimizer rewrites a query by rewriting one or more query blocks, one at a time.

If the rewrite logic has a choice between multiple materialized views to rewrite a query block, it will select one to optimize the ratio of the sum of the cardinality of the tables in the rewritten query block to that in the original query block. Therefore, the materialized view selected would be the one which can result in reading in the least amount of data.

After a materialized view has been picked for a rewrite, the optimizer performs the
rewrite, and then tests whether the rewritten query can be rewritten further with another materialized view. This process continues until no further rewrites are possible. Then the rewritten query is optimized and the original query is optimized. The optimizer compares these two optimizations and selects the least costly alternative.

Since optimization is based on cost, it is important to collect statistics both on tables involved in the query and on the tables representing materialized views. Statistics are fundamental measures, such as the number of rows in a table, that are used to calculate the cost of a (rewritten) query. They are created with the ANALYZE statement or by using the DBMS_STATISTICS package.

Queries that contain in-line or named views are also candidates for query rewrite. When a query contains a named view, the view name is used to do the matching between a materialized view and the query. That is, the set of named views in a materialized view definition should match exactly with the set of views in the query. When a query contains an inline view, the inline view may be merged into the query before matching between a materialized view and the query occurs.

Enabling Query Rewrite

Several steps must be followed to enable query rewrite:

Individual materialized views must have the ENABLE QUERY REWRITE clause.
The initialization parameter QUERY_REWRITE_ENABLED must be set to TRUE.
Cost-based optimization must be used either by setting the initialization parameter OPTIMIZER_MODE to "ALL_ROWS" or "FIRST_ROWS", or by analyzing the tables and setting OPTIMIZER_MODE to "CHOOSE".

If step 1 has not been completed, a materialized view will never be eligible for query rewrite. ENABLE QUERY REWRITE can be specified either when the materialized view is created, as illustrated below, or via the ALTER MATERIALIZED VIEW statement.

  CREATE MATERIALIZED VIEW store_sales_mv
   ENABLE QUERY REWRITE 
   AS  
   SELECT s.store_name,  
     SUM(dollar_sales) AS sum_dollar_sales
   FROM store s,  fact f 
   WHERE f.store_key = s.store_key 
   GROUP BY s.store_name;

You can use the initialization parameter QUERY_REWRITE_ENABLED to disable query rewrite for all materialized views, or to enable it again for all materialized views that are individually enabled. However, the QUERY_REWRITE_ENABLED parameter cannot enable query rewrite for materialized views that have disabled it with the CREATE or ALTER statement.

The NOREWRITE hint disables query rewrite in a SQL statement, overriding the QUERY_REWRITE_ENABLED parameter, and the REWRITE (mview_name, ...) hint restricts the eligible materialized views to those named in the hint.

Initialization Parameters for Query Rewrite

Query rewrite requires the following initialization parameter settings:

OPTIMIZER_MODE = "ALL_ROWS", "FIRST_ROWS", or "CHOOSE"
QUERY_REWRITE_ENABLED = TRUE
COMPATIBLE = 8.1.0 (or greater)

The QUERY_REWRITE_INTEGRITY parameter is optional, but must be set to STALE_TOLERATED, TRUSTED, or ENFORCED if it is specified (see "Accuracy of Query Rewrite"). It will default to ENFORCED if it is undefined.

Privileges for Enabling Query Rewrite

A materialized view is used based not on privileges the user has on that materialized view, but based on privileges the user has on detail tables or views in the query.

The system privilege GRANT REWRITE allows you to enable materialized views in your own schema for query rewrite only if all tables directly referenced by the materialized view are in that schema. The GRANT GLOBAL REWRITE privilege allows you to enable materialized views for query rewrite even if the materialized view references objects in other schemas.

The privileges for using materialized views for query rewrite are similar to those for definer-rights procedures. See Oracle8i Concepts for further information.

When Does Oracle Rewrite a Query?

A query gets rewritten only when a certain number of conditions are met:

Query rewrite must be enabled for the session.
A materialized view must be enabled for query rewrite.
The rewrite integrity level should allow the use of the materialized view. For example, if a materialized view is not fresh and query rewrite integrity is set to ENFORCED, then the materialized view will not be used.
Either all or part of the results requested by the query must be obtainable from the precomputed result stored in the materialized view.

To determine this, the optimizer may depend on some of the data relationships declared by the user via constraints and dimensions. Such data relationships include hierarchies, referential integrity, and uniqueness of key data, and so on.

The following sections use an example schema and a few materialized views to illustrate how the data relationships are used by the optimizer to rewrite queries. A retail database consists of these tables:

STORE   (store_key, store_name, store_city, store_state, store_country)  
PRODUCT (prod_key, prod_name, prod_brand)   
TIME    (time_key, time_day, time_week, time_month)
FACT    (store_key, prod_key, time_key, dollar_sales)

Two materialized views created on these tables contain only joins:

CREATE MATERIALIZED VIEW join_fact_store_time  
  ENABLE QUERY REWRITE   
  AS    
  SELECT s.store_key, s.store_name, f.dollar_sales, t.time_key, t.time_day,  
         f.prod_key, f.rowid, t.rowid    
  FROM   fact f, store s, time t    
  WHERE  f.time_key = t.time_key AND f.store_key = s.store_key;    
  
CREATE MATERIALIZED VIEW join_fact_store_time_oj  
  ENABLE QUERY REWRITE   
  AS    
  SELECT s.store_key, s.store_name, f.dollar_sales, t.time_key,   
         f.rowid, t.rowid    
  FROM   fact f, store s, time t    
  WHERE  f.time_key = t.time_key(+) AND f.store_key = s.store_key(+);

and two materialized views contain joins and aggregates:

CREATE MATERIALIZED VIEW sum_fact_store_time_prod  
  ENABLE QUERY REWRITE   
  AS    
  SELECT  s.store_name, time_week, p.prod_key,   
          SUM(f.dollar_sales) AS sum_sales,  
          COUNT(f.dollar_sales) AS count_sales   
  FROM   fact f, store s, time t, product p    
  WHERE  f.time_key = t.time_key  AND  f.store_key = s.store_key AND  
         f.prod_key = p.prod_key  
  GROUP BY s.store_name, time_week, p.prod_key;    
  
CREATE MATERIALIZED VIEW sum_fact_store_prod  
  ENABLE QUERY REWRITE   
  AS    
  SELECT  s.store_city, p.prod_name    
          SUM(f.dollar_sales) AS sum_sales,  
          COUNT(f.dollar_sales) AS count_sales   
  FROM   fact f, store s, product p  
  WHERE   f.store_key = s.store_key  AND f.prod_key = p.prod_key
  GROUP BY store_city, p.prod_name;

You must compute statistics on the materialized views so that the optimizer can determine based on cost whether to rewrite the queries.

ANALYZE TABLE join_fact_store_time COMPUTE STATISTICS;
ANALYZE TABLE join_fact_store_time_oj COMPUTE STATISTICS;
ANALYZE TABLE sum_fact_store_time_prod COMPUTE STATISTICS;
ANALYZE TABLE sum_fact_store_prod COMPUTE STATISTICS;

Query Rewrite Methods

The optimizer uses a number of different methods to rewrite a query. The first, most important step is to determine if all or part of the results requested by the query can be obtained from the precomputed results stored in a materialized view.

The simplest case occurs when the result stored in a materialized view exactly matches what is requested by a query. The Oracle optimizer makes this type of determination by comparing the SQL text of the query with the SQL text of the materialized view definition. This method is most straightforward and also very limiting.

When the SQL text comparison test fails, the Oracle optimizer performs a series of generalized checks based on the joins, grouping, aggregates, and column data fetched. This is accomplished by individually comparing various clauses (SELECT, FROM, WHERE, GROUP BY) of a query with those of a materialized view.

SQL Text Match Rewrite Methods

Two methods are used by the optimizer:

Full SQL text match
Partial SQL text match

In full SQL text match, the entire SQL text of a query is compared against the entire SQL text of a materialized view definition (that is, the entire SELECT expression), ignoring the white space during SQL text comparison. The following query

SELECT s.store_name, time_week, p.prod_key,   
       SUM(f.dollar_sales) AS sum_sales,  
       COUNT(f.dollar_sales) AS count_sales
FROM   fact f, store s, time t, product p    
WHERE  f.time_key = t.time_key  AND    
       f.store_key = s.store_key AND  
       f.prod_key = p.prod_key  
GROUP BY s.store_name, time_week, p.prod_key;

which matches sum_fact_store_time_prod (white space excluded) will be rewritten as:

SELECT store_name, time_week, product_key, sum_sales, count_sales
FROM   sum_fact_store_time_prod;

When full SQL text match fails, the optimizer then attempts a partial SQL text match. In this method, the SQL text starting from the FROM clause of a query is compared against the SQL text starting from the FROM clause of a materialized view definition. Therefore, this query:

SELECT s.store_name, time_week, p.prod_key,   
       AVG(f.dollar_sales) AS avg_sales  
FROM   fact f, store s, time t, product p    
WHERE  f.time_key = t.time_key  AND    
       f.store_key = s.store_key AND  
       f.prod_key = p.prod_key  
GROUP BY s.store_name, time_week, p.prod_key;

will be rewritten as:

SELECT store_name, time_week, prod_key, sum_sales/count_sales AS avg_sales
FROM   sum_fact_store_time_prod;

Note that under the partial SQL text match rewrite method, the average of sales aggregate required by the query is computed using sum of sales and count of sales aggregates stored in the materialized view.

When neither SQL text match succeeds, the optimizer uses a general query rewrite method.

General Query Rewrite Methods

The general query rewrite methods are much more powerful than SQL text match methods because they can enable the use of a materialized view even if it contains only part of the data requested by a query, or it contains more data than what is requested by a query, or it contains data in a different form which can be converted into a form required by a query. To achieve this, the optimizer compares the SQL clauses (SELECT, FROM, WHERE, GROUP BY) individually between a query and a materialized view.

The Oracle optimizer employs four different checks called:

Join Compatibility
Data Sufficiency
Grouping Compatibility
Aggregate Computability

Depending on the type of a materialized view, some or all four checks are made to determine if the materialized view can be used to rewrite a query as illustrated in the table below.

Table 31-1 Materialized View Types and General Query Rewrite Methods

	MV with Joins Only	MV with Joins and Aggregates	MV with Aggregates on a Single Table
Join Compatibility	X	X	-
Data Sufficiency	X	X	X
Grouping Compatibility	-	X	X
Aggregate Computability	-	X	X

To perform these checks, the optimizer uses data relationships on which it can depend. For example, primary key and foreign key relationships tell the optimizer that each row in the foreign key table joins with at most one row in the primary key table. Furthermore, if there is a NOT NULL constraint on the foreign key, it indicates that each row in the foreign key table joins with exactly one row in the primary key table.

Data relationships such as these are very important for query rewrite because they tell what type of result is produced by joins, grouping, or aggregation of data. Therefore, to maximize the rewritability of a large set of queries when such data relationships exist in a database, they should be declared by the user.

Join Compatibility Check

In this check, the joins in a query are compared against the joins in a materialized view. In general, this comparison results in the classification of joins into three categories:

Common joins that occur in both the query and the materialized view
Delta joins that occur in the query but not in the materialized view
Delta joins that occur in the materialized view but not in the query

Common Joins

The common join pairs between the two must be of same type, or the join in the query must be derivable from the join in the materialized view. For example, if a materialized view contains an outer join of table A with table B, and a query contains an inner join of table A with table B, the result of the inner join can be derived by filtering the anti-join rows from the result of the outer join.

For example, consider this query:

SELECT s.store_name, t.time_day, SUM(f.dollar_sales)  
FROM   fact f, store s, time t    
WHERE  f.time_key = t.time_key AND   
       f.store_key = s.store_key AND  
       t.time_day BETWEEN '01-DEC-1997' AND '31-DEC-1997'  
GROUP BY s.store_name, t.time_day;

The common joins between this query and the materialized view join_fact_store_time are:

f.time_key = t.time_key AND  f.store_key = s.store_key

They match exactly and the query can be rewritten as:

SELECT store_name, time_day, SUM(dollar_sales)  
FROM   join_fact_store_time
WHERE  time_day BETWEEN '01-DEC-1997' AND '31-DEC-1997'  
GROUP BY store_name, time_day;

The query could also be answered using the join_fact_store_time_oj materialized view where inner joins in the query can be derived from outer joins in the materialized view. The rewritten version will (transparently to the user) filter out the anti-join rows. The rewritten query will have the structure:

SELECT store_name, time_day, SUM(f.dollar_sales)  
FROM   join_fact_store_time_oj  
WHERE  time_key IS NOT NULL AND store_key IS NOT NULL AND  
       time_day BETWEEN '01-DEC-1997' AND '31-DEC-1997'  
GROUP BY store_name, time_day;

In general, if you use an outer join in a materialized view containing only joins, you should put in the materialized view either the primary key or the rowid on the right side of the outer join. For example, in the previous example, join_fact_store_time_oj there is a primary key on both store and time.

Another example of when a materialized view containing only joins is used is the case of a semi-join rewrites. That is, a query contains either an EXISTS or an IN subquery with a single table.

Consider this query, which reports the stores that had sales greater than $10,000 during the 1997 Christmas season.

SELECT DISTINCT store_name
FROM store s
WHERE EXISTS (SELECT *
              FROM fact f
              WHERE f.store_key = s.store_key
                AND f.dollar_sales > 10000
                and f.time_key between '01-DEC-1997' and '31-DEC-1997');

This query could also be seen as:

SELECT DISTINCT store_name
FROM store s
WHERE s.store_key in (SELECT f.store_key
                     FROM fact f
                     WHERE f.dollar_sales > 10000);

This query contains a semi-join 'f.store_key = s.store_key' between the store and the fact table. This query can be rewritten to use either the join_fact_store_time materialized view, if foreign key constraints are active or join_fact_store_time_oj materialized view, if primary keys are active. Observe that both materialized views contain 'f.store_key = s.store_key' which can be used to derive the semi-join in the query.

The query is rewritten with join_fact_store_time as follows:

SELECT store_name
FROM (SELECT DISTINCT store_name, store_key
      FROM join_fact_store_time
      WHERE dollar_sales > 10000
      AND f.time_key BETWEEN '01-DEC-1997' and '31-DEC-1997');

If the materialized view join_fact_store_time is partitioned by time_key, then this query is likely to be more efficient than the original query because the original join between store and fact has been avoided.

The query could be rewritten using join_fact_store_time_oj as follows.

SELECT store_name
FROM (SELECT DISTINCT store_name, store_key
      FROM join_fact_store_time_oj
      WHERE dollar_sales > 10000
        AND store_key IS NOT NULL
        AND time_key BETWEEN '01-DEC-1997' and '31-DEC-1997');

Rewrites with semi-joins are currently restricted to materialized views with joins only and are not available for materialized views with joins and aggregates.

Query Delta Joins

A query delta join is a join that appears in the query but not in the materialized view. Any number and type of delta joins in a query are allowed and they are simply retained when the query is rewritten with a materialized view. Upon rewrite, the materialized view is joined to the appropriate tables in the delta joins.

For example, consider this query:

SELECT store_name, prod_name, SUM(f.dollar_sales)  
FROM   fact f, store s, time t, product p  
WHERE  f.time_key = t.time_key AND   
       f.store_key = s.store_key AND  
       f.prod_key = p.prod_key AND   
       t.time_day BETWEEN '01-DEC-1997' AND '31-DEC-1997'  
GROUP BY  store_name, prod_name;

Using the materialized view join_fact_store_time, common joins are: f.time_key = t.time_key AND f.store_key = s.store_key. The delta join in the query is f.prod_key = p.prod_key.

The rewritten form will then join the join_fact_store_time materialized view with the product table:

SELECT store_name, prod_name, SUM(f.dollar_sales)  
FROM   join_fact_store_time mv,  product p  
WHERE  mv.prod_key = p.prod_key AND   
       mv.time_day BETWEEN '01-DEC-1997' AND '31-DEC-1997'  
GROUP BY store_name, prod_name;

Materialized View Delta Joins

All delta joins in a materialized view are required to be lossless with respect to the result of common joins. A lossless join guarantees that the result of common joins is not restricted. A lossless join is one where, if two tables called A and B are joined together, rows in table A will always match with rows in table B and no data will be lost, hence the term lossless join. For example, every row with the foreign key matches a row with a primary key provided no nulls are allowed in the foreign key. Therefore, to guarantee a lossless join, it is necessary to have FOREIGN KEY, PRIMARY KEY, and NOT NULL constraints on appropriate join keys. Alternatively, if the join between tables A and B is an outer join (A being the outer table), it is lossless as it preserves all rows of table A.

All delta joins in a materialized view are required to be non-duplicating with respect to the result of common joins. A non-duplicating join guarantees that the result of common joins is not duplicated. For example, a non-duplicating join is one where, if table A and table B are joined together, rows in table A will match with at most one row in table B and no duplication occurs. To guarantee a non-duplicating join, the key in table B must be constrained to unique values by using a primary key or unique constraint.

Consider this query which joins FACT and TIME:

SELECT t.time_day, sum(f.dollar_sales)  
FROM   fact f, time t    
WHERE  f.time_key = t.time_key AND   
       t.time_day BETWEEN '01-DEC-1997' AND '31-DEC-1997'  
GROUP  t.time_day;

The materialized view join_fact_store_time has an additional join between FACT and STORE: 'f.store_key = s.store_key'. This is the delta join in join_fact_store_time.

We can rewrite the query if this join is lossless and non-duplicating. This is the case if f.store_key is a foreign key to p.store_key and is not null. The query is therefore rewritten as:

SELECT time_day, SUM(f.dollar_sales)  
FROM   join_fact_store_time  
WHERE  time_day BETWEEN '01-DEC-1997' AND '31-DEC-1997'  
GROUP BY time_day;

The query could also be rewritten with the materialized view join_fact_store_time_oj where foreign key constraints are not needed. This view contains an outer join between fact and store: 'f.store_key = s.store_key(+)' which makes the join lossless. If s.store_key is a primary key, then the non-duplicating condition is satisfied as well and optimizer will rewrite the query as:

SELECT time_day, SUM(f.dollar_sales)  
FROM   join_fact_store_time_oj  
WHERE  time_key IS NOT NULL AND   
       time_day BETWEEN '01-DEC-1997' AND '31-DEC-1997'  
GROUP BY time_day;

The current limitations restrict most rewrites with outer joins to materialized views with joins only. There is very limited support for rewrites with materialized aggregate views with outer joins. Those views should rely on foreign key constraints to assure losslessness of delta materialized view joins.

Data Sufficiency Check

In this check, the optimizer determines if the necessary column data requested by a query can be obtained from a materialized view. For this, the equivalence of one column with another is used. For example, if an inner join between table A and table B is based on a join predicate A.X = B.X, then the data in column A.X will equal the data in column B.X in the result of the join. This data property is used to match column A.X in a query with column B.X in a materialized view or vice versa.

For example, consider this query:

SELECT s.store_name, f.time_key, SUM(f.dollar_sales)  
FROM   fact f, store s, time t    
WHERE  f.time_key = t.time_key AND   
       f.store_key = s.store_key    
GROUP BY s.store_name, f.time_key;

This query can be answered with join_fact_store_time even though the materialized view doesn't have f.time_key. Instead, it has t.time_key which, through a join condition 'f.time_key = t.time_key', is equivalent to f.time_key.

Thus, the optimizer may select this rewrite:

SELECT store_name, time_day, SUM(dollar_sales)  
FROM   join_fact_store_time  
GROUP BY store_name, time_key;

If some column data requested by a query cannot be obtained from a materialized view, the optimizer further determines if it can be obtained based on a data relationship called functional dependency. When the data in a column can determine data in another column, such a relationship is called functional dependency or functional determinance. For example, if a table contains a primary key column called prod_key and another column called prod_name, then, given a prod_key value, it is possible to look up the corresponding prod_name. The opposite is not true, which means a prod_name value need not relate to a unique prod_key.

When the column data required by a query is not available from a materialized view, such column data can still be obtained by joining the materialized view back to the table that contains required column data provided the materialized view contains a key that functionally determines the required column data.

For example, consider this query:

SELECT  s.store_name, t.time_week, p.prod_name,   
        SUM(f.dollar_sales) AS sum_sales,  
FROM   fact f, store s, time t, product p    
WHERE  f.time_key = t.time_key  AND  f.store_key = s.store_key AND  
       f.prod_key = p.prod_key  AND p.prod_brand = 'KELLOGG'
GROUP BY s.store_name, t.time_week, p.prod_name;

The materialized view sum_fact_store_time_prod contains p.prod_key, but not p.prod_brand. However, we can join sum_fact_store_time_prod back to PRODUCT to retrieve prod_brand because prod_key functionally determines prod_brand. The optimizer rewrites this query using sum_fact_store_time_prod as:

SELECT  mv.store_name, mv.time_week, p.product_key, mv.sum_sales,   
FROM    sum_fact_store_time_prod mv, product p    
WHERE   mv.prod_key = p.prod_key  AND p.prod_brand = 'KELLOGG'
GROUP BY mv.store_name, mv.time_week, p.prod_key;

Here the PRODUCT table is called a joinback table because it was originally joined in the materialized view but joined back again in the rewritten query.

There are two ways to declare functional dependency:

Using the primary key constraint
Using the DETERMINES clause of a dimension

The DETERMINES clause of a dimension definition may be the only way you could declare functional dependency when the column that determines another column cannot be a primary key. For example, the STORE table is a denormalized dimension table which has columns store_key, store_name, store_city, city_name, and store_state. Store_key functionally determines store_name and store_city functionally determines store_state.

The first functional dependency can be established by declaring store_key as the primary key, but not the second functional dependency because the store_city column contains duplicate values. In this situation, you can use the DETERMINES clause of a dimension to declare the second functional dependency.

The following dimension definition illustrates how the functional dependencies are declared.

CREATE DIMENSION store_dim 
LEVEL store_key      IS store.store_key 
LEVEL city           IS store.store_city 
LEVEL state          IS store.store_state 
LEVEL country        IS store.store_country 
  HIERARCHY geographical_rollup           ( 
             store_key     CHILD OF 
             city          CHILD OF 
             state         CHILD OF 
             country                    ) 
ATTRIBUTE store_key DETERMINES store.store_name; 
ATTRIBUTE store_city DETERMINES store.city_name;

The hierarchy geographic_rollup declares hierarchical relationships which are also 1:n functional dependencies. The 1:1 functional dependencies are declared using the DETERMINES clause, such as store_city functionally determines city_name.

The following query:

SELECT  s.store_city, p.prod_name
          SUM(f.dollar_sales) AS sum_sales,
FROM    fact f, store s, product p
WHERE   f.store_key = s.store_key AND f.prod_key = p.prod_key
          AND s.city_name = 'BELMONT'
GROUP BY s.store_city, p.prod_name;

can be rewritten by joining sum_fact_store_prod to the STORE table so that city_name is available to evaluate the predicate. But the join will be based on the store_city column, which is not a primary key in the STORE table; therefore, it allows duplicates. This is accomplished by using an inline view which selects distinct values and this view is joined to the materialized view as shown in the rewritten query below.

SELECT  iv.store_city, mv.prod_name, mv.sum_sales  
FROM    sum_fact_store_prod mv, (SELECT DISTINCT store_city, city_name  
                                   FROM store) iv  
WHERE   mv.store_city = iv.store_city AND  
          iv.store_name = 'BELMONT'  
GROUP BY iv.store_city, mv.prod_name;

This type of rewrite was possible because the fact that store_city functionally determines city_name as declared in the dimension.

Grouping Compatibility Check

This check is required only if both the materialized view and the query contain a GROUP BY clause. The optimizer first determines if the grouping of data requested by a query is exactly the same as the grouping of data stored in a materialized view. That means, the level of grouping is the same in both the query and the materialized view. For example, a query requests data grouped by store_city and a materialized view stores data grouped by store_city and store_state. The grouping is the same in both provided store_city functionally determines store_state, such as the functional dependency shown in the dimension example above.

If the grouping of data requested by a query is at a coarser level compared to the grouping of data stored in a materialized view, the optimizer can still use the materialized view to rewrite the query. For example, the materialized view sum_fact_store_time_prod groups by store_name, time_week, and prod_key. This query groups by store_name, a coarser grouping granularity:

SELECT s.store_name, SUM(f.dollar_sales) AS sum_sales,
FROM   fact f, store s
WHERE  f.store_key = s.store_key
GROUP BY s.store_name;

Therefore, the optimizer will rewrite this query as:

SELECT store_name, SUM(sum_dollar_sales) AS sum_sales,
FROM   sum_fact_store_time_prod
GROUP BY s.store_name;

In another example, a query requests data grouped by store_state whereas a materialized view stores data grouped by store_city. If store_city is a CHILD OF store_state (see the dimension example above), the grouped data stored in the materialized view can be further grouped by store_state when the query is rewritten. In other words, aggregates at store_city level (finer granularity) stored in a materialized view can be rolled up into aggregates at store_state level (coarser granularity).

For example, consider the following query:

SELECT  store_state, prod_name, SUM(f.dollar_sales) AS sum_sales   
FROM    fact f, store s, product p  
WHERE   f.store_key = s.store_key  AND  f.prod_key = p.prod_key   
GROUP BY store_state, prod_name;

Because store_city functionally determines store_state, sum_fact_store_prod can be used with a joinback to store table to retrieve store_state column data, and then aggregates can be rolled up to store_state level, as shown below:

SELECT  store_state, prod_name, sum(mv.sum_sales) AS sum_sales   
FROM    sum_fact_store_prod mv, (SELECT DISTINCT store_city, store_state
                                 FROM store) iv  
WHERE   mv.store_city = iv.store_city   
GROUP BY store_state, prod_name;

Note that for this rewrite, the data sufficiency check determines that a joinback to the STORE table is necessary, and the grouping compatibility check determines that aggregate rollup is necessary.

Aggregate Computability Check

This check is required only if both the query and the materialized view contain aggregates. Here the optimizer determines if the aggregates requested by a query can be derived or computed from one or more aggregates stored in a materialized view. For example, if a query requests AVG(X) and a materialized view contains SUM(X) and COUNT(X), then AVG(X) can be computed as SUM(X) / COUNT(X).

If the grouping compatibility check determined that the rollup of aggregates stored in a materialized view is required, then aggregate computability check determines if it is possible to roll up each aggregate requested by the query using aggregates in the materialized view.

For example, SUM(sales) at the city level can be rolled up to SUM(sales) at the state level by summing all SUM(sales) aggregates in a group with the same state value. However, AVG(sales) cannot be rolled up to a coarser level unless COUNT(sales) is also available in the materialized view. Similarly, VARIANCE(sales) or STDDEV(sales) cannot be rolled up unless COUNT(sales) and SUM(sales) are also available in the materialized view. For example, given the query:

SELECT  p.prod_name, AVG(f.dollar_sales) AS avg_sales   
FROM    fact f, product p  
WHERE   f.prod_key = p.prod_key   
GROUP BY p.prod_name;

The materialized view sum_fact_store_prod can be used to rewrite it provided the join between FACT and STORE is lossless and non-duplicating. Further, the query groups by prod_name whereas the materialized view groups by store_city, prod_name, which means the aggregates stored in the materialized view will have to be rolled up. The optimizer will rewrite the query as:

SELECT  mv.prod_name, SUM(mv.sum_sales)/SUM(mv.count_sales) AS avg_sales  
FROM    sum_fact_store_prod mv  
GROUP BY mv.prod_name;

The argument of an aggregate such as SUM can be an arithmetic expression like A+B. The optimizer will try to match an aggregate SUM(A+B) in a query with an aggregate SUM(A+B) or SUM(B+A) stored in a materialized view. In other words, expression equivalence is used when matching the argument of an aggregate in a query with the argument of a similar aggregate in a materialized view. To accomplish this, Oracle converts the aggregate argument expression into a canonical form such that two different but equivalent expressions convert into the same canonical form. For example, A*(B-C), A*B-C*A, (B-C)*A, and -A*C+A*B all convert into the same canonical form and, therefore, they are successfully matched.

When are Constraints and Dimensions Needed?

To clarify when dimensions and constraints are required for the different types of query rewrite, refer to Table 31-2.

Table 31-2 Dimension and Constraint Requirements for Query Rewrite

Rewrite Checks	Dimensions	Primary Key/Foreign Key/Not Null Constraints
Matching SQL Text	Not Required	Not Required
Join Compatibility	Not Required	Required
Data Sufficiency	Required OR	Required
Grouping Compatibility	Required	Required
Aggregate Computability	Not Required	Not Required

Accuracy of Query Rewrite

Query rewrite offers three levels of rewrite integrity that are controlled by the initialization parameter QUERY_REWRITE_INTEGRITY, which can either be set in your parameter file or controlled using the ALTER SYSTEM or ALTER SESSION command. The three values it can take are:

ENFORCED

This is the default mode. The optimizer will only use materialized views which it knows contain fresh data and only use those relationships that are based on enforced constraints.
TRUSTED

In TRUSTED mode, the optimizer trusts that the data in the materialized views based on prebuilt tables is correct, and the relationships declared in dimensions and RELY constraints are correct. In this mode, the optimizer uses prebuilt materialized views, and uses relationships that are not enforced as well as those that are enforced. In this mode, the optimizer also 'trusts' declared but not enforced constraints and data relationships specified using dimensions.
STALE_TOLERATED

In STALE_TOLERATED mode, the optimizer uses materialized views that are valid but contain stale data as well as those that contain fresh data. This mode offers the maximum rewrite capability but creates the risk of generating wrong results.

If rewrite integrity is set to the safest level, ENFORCED, the optimizer uses only enforced primary key constraints and referential integrity constraints to ensure that the results of the query are the same as the results when accessing the detail tables directly.

If the rewrite integrity is set to levels other than ENFORCED, then there are several situations where the output with rewrite may be different from that without it.

A materialized view can be out of synchronization with the master copy of the data. This generally happens because the materialized view refresh procedure is pending following bulk load or DML operations to one or more detail tables of a materialized view. At some data warehouse sites, this situation is desirable because it is not uncommon for some materialized views to be refreshed at certain time intervals.
The relationships implied by the dimension objects are invalid. For example, values at a certain level in a hierarchy do not roll up to exactly one parent value.
The values stored in a PREBUILT materialized view table may be incorrect.
Partition operations such as DROP and MOVE PARTITION on the detail table could affect the results of the materialized view.

Did Query Rewrite Occur?

Since query rewrite occurs transparently, special steps have to be taken to verify that a query has been rewritten. Of course, if the query runs faster, this should indicate that rewrite has occurred but this isn't proof. Therefore, to confirm that query rewrite does occur, use the EXPLAIN PLAN statement.

Explain Plan

The EXPLAIN PLAN facility is used as described in Oracle8i SQL Reference. For query rewrite, all you need to check is that the object_name column in PLAN_TABLE contains the materialized view name. If it does, then query rewrite will occur when this query is executed.

In this example, the materialized view store_mv has been created.

CREATE MATERIALIZED VIEW store_mv 
 ENABLE QUERY REWRITE 
 AS  
 SELECT  
   s.region, SUM(grocery_sq_ft) AS sum_floor_plan 
 FROM store s 
 GROUP BY s.region;

If EXPLAIN PLAN is used on this SQL statement, the results are placed in the default table PLAN_TABLE.

EXPLAIN PLAN  
FOR 
SELECT  s.region, SUM(grocery_sq_ft) 
FROM store s 
GROUP BY s.region;

For the purposes of query rewrite, the only information of interest from PLAN_TABLE is the OBJECT_NAME, which identifies the objects that will be used to execute this query. Therefore, you would expect to see the object name STORE_MV in the output as illustrated below.

SELECT  object_name FROM plan_table; 

OBJECT_NAME                    
------------------------------ 
                               
STORE_MV                   
2 rows selected.

See Also:
For more information about hints, please refer to Chapter , "Using Hints".

Controlling Query Rewrite

A materialized view is only eligible for query rewrite if the ENABLE QUERY REWRITE clause has been specified, either initially when the materialized view was first created or subsequently via an ALTER MATERIALIZED VIEW command.

The initialization parameters described above can be set using the ALTER SYSTEM SET command. For a given user's session, ALTER SESSION can be used to disable or enable query rewrite for that session only. For example:

ALTER SESSION SET QUERY_REWRITE_ENABLED = TRUE;

The correctness of query rewrite can be set for a session, thus allowing different users to work at different integrity levels.

ALTER SESSION SET QUERY_REWRITE_INTEGRITY = STALE_TOLERATED;
ALTER SESSION SET QUERY_REWRITE_INTEGRITY = TRUSTED;
ALTER SESSION SET QUERY_REWRITE_INTEGRITY = ENFORCED;

Rewrite Hints

Hints may be included in SQL statements to control whether query rewrite occurs. Using the NOREWRITE hint with no argument in a query prevents the optimizer from rewriting it.

The REWRITE hint with no argument in a query forces the optimizer to use a materialized view (if any) to rewrite it regardless of the cost.

The REWRITE (mv1, mv2, ...) hint with argument(s) forces rewrite to select the most suitable materialized view from the list of names specified.

For example, to prevent a rewrite, you can use:

SELECT  /*+ NOREWRITE */ s.city, SUM(s.grocery_sq_ft)
FROM store s
GROUP BY s.city;

and to force a rewrite using mv1, you can use:

SELECT /*+ REWRITE (mv1) */ s.city, SUM(s.grocery_sq_ft)
FROM store s
GROUP BY s.city;

Guidelines for Using Query Rewrite

The following guidelines will help in getting the maximum benefit from query rewrite. They are not mandatory for using query rewrite and rewrite is not guaranteed if you follow them. They are general rules of thumb.

Constraints

Make sure all inner joins referred to in a materialized view have referential integrity (foreign key - primary key constraints) with additional NOT NULL constraints on the foreign key columns. Since constraints tend to impose a large overhead, you could make them NONVALIDATE and RELY and set the parameter QUERY_REWRITE_INTEGRITY to STALE_TOLERATED or TRUSTED. However, if you set QUERY_REWRITE_INTEGRITY to ENFORCED, all constraints must be enforced to get maximum rewritability.

Dimensions

You can express the hierarchical relationships and functional dependencies in normalized or denormalized dimension tables using the HIERARCHY clause of a dimension. Dimensions can express intra-table relationships which cannot be expressed by any constraints. Set the parameter QUERY_REWRITE_INTEGRITY to TRUSTED or STALE_TOLERATED for query rewrite to take advantage of the relationships declared in dimensions.

Outer Joins

Another way of avoiding constraints is to use outer joins in the materialized view. Query rewrite will be able to derive an inner join in the query, such as (A.a = B.b), from an outer join in the materialized view (A.a = B.b(+)), as long as the rowid of B or column B.b is available in the materialized view. Most of the support for rewrites with outer joins is provided for materialized views with joins only. To exploit it, a materialized view with outer joins should store the rowid or primary key of the inner table of an outer join. For example, the materialized view join_fact_store_time_oj stores the primary keys store_key and time_key of the inner tables of outer joins.

SQL Text Match

If you need to speed up an extremely complex, long-running query, you could create a materialized view with the exact text of the query.

Aggregates

In order to get the maximum benefit from query rewrite, make sure that all aggregates which are needed to compute ones in the targeted query are present in the materialized view. The conditions on aggregates are quite similar to those for incremental refresh. For instance, if AVG(x) is in the query, then you should store COUNT(x) and AVG(x) or store SUM(x) and COUNT(x) in the materialized view. Refer to Table 29-1, "Requirements for Fast Refresh of Materialized Views" .

Grouping Conditions

Aggregating data at lower levels in the hierarchy is better than aggregating at higher levels because lower levels can be used to rewrite more queries. Note, however, that doing so will also take up more space. For example, instead of grouping on state, group on city (unless space constraints prohibit it).

Instead of creating multiple materialized views with overlapping or hierarchically related GROUP BY columns, create a single materialized view with all those GROUP BY columns. For example, instead of using a materialized view that groups by city and another materialized view that groups by month, use a materialized view that groups by city and month.

Use GROUP BY on columns which correspond to levels in a dimension but not on columns that are functionally dependent, because query rewrite will be able to use the functional dependencies automatically based on the DETERMINES clause in a dimension. For example, instead of grouping on city_name, group on city_id (as long as there is a dimension which indicates that the attribute city_id determines city_name, you will enable the rewrite of a query involving city_name).

Statistics

Optimization with materialized views is based on cost and the optimizer needs statistics of both the materialized view and the tables in the query to make a cost-based choice. Materialized views should thus have statistics collected using either the ANALYZE TABLE statement or the DBMS_STATISTICS package.

See Also:
For more information about collecting statistics, please refer to "Generating Statistics".

31 Query Rewrite

Overview of Query Rewrite

Cost-Based Rewrite

Enabling Query Rewrite

Initialization Parameters for Query Rewrite

Privileges for Enabling Query Rewrite

When Does Oracle Rewrite a Query?

Query Rewrite Methods

SQL Text Match Rewrite Methods

General Query Rewrite Methods

Table 31-1 Materialized View Types and General Query Rewrite Methods

Join Compatibility Check

Common Joins

Query Delta Joins

Materialized View Delta Joins

Data Sufficiency Check

Grouping Compatibility Check

Aggregate Computability Check

When are Constraints and Dimensions Needed?

Table 31-2 Dimension and Constraint Requirements for Query Rewrite

Accuracy of Query Rewrite

Did Query Rewrite Occur?

Explain Plan

Controlling Query Rewrite

Rewrite Hints

Guidelines for Using Query Rewrite

Constraints

Dimensions

Outer Joins

SQL Text Match

Aggregates

Grouping Conditions

Statistics

31
Query Rewrite