Oracle8i interMedia Text Migration Release 8.1.5 A67845-01 |
|
This chapter describes how to migrate your pre-8.1.5 queries to interMedia Text 8.1.5. The following topics are covered:
The basic interMedia Text query takes a query expression, usually a word with or without operators, as input. Oracle returns all documents (previously indexed) that contain that satisfy the expression along with a relevance score for each document. Scores can be used to order the documents in the result set.
A Text query can include one or more CONTAINS clauses and one or more structured clauses.
The basic Text query has not changed for 8.1.5.
Apart from a few operators (discussed in this chapter) that are no longer supported, the basic Text query expression syntax in 8.1.5 (everything between the single quotes) is the same as in pre-8.1.5.
In pre-8.1.5, the system enabled you to execute queries using one of three methods, namely the one-step, two-step, or cursor query, formerly known as an in-memory query.
In 8.1.5, Oracle no longer supports the two-step method that uses the PL/SQL CONTAINS procedure followed by a join on the result and base table. The two-step query functionality is available through the new Text query which uses the SELECT statement.
In 8.1.5, the only query method is the standard SQL SELECT statement in which you use the CONTAINS operator in the WHERE clause. As this query is standard SQL, you can use it programatically wherever you can use the SELECT statement, such as in PL/SQL cursors.
In 8.1.5, the Text query replaces the pre-8.1.5 two-step method. The Text query is akin to the pre-8.1.5 one-step query in so far as it is executed with a single SELECT statement. In addition, the new 8.1.5 Text query uses no result tables.
This section describes how to migrate your two-step queries to the new Text query.
In the pre-8.1.5 method, you create a result table as follows:
create table CTX_TEMP( textkey varchar2(64), score number, conid number);
Alternatively, you can also create a result table using CTX_QUERY.GETTAB.
You execute the CONTAINS procedure as follows:
execute ctx_query.contains('ARTICLE_POLICY','petroleum','CTX_TEMP');
You then join the result table with the base table to retrieve the document text as follows:
SELECT SCORE, title FROM CTX_TEMP, TEXTTAB WHERE texttab.PK=ctx_temp.textkey ORDER BY SCORE DESC;
In the SELECT statement, specify the query in the WHERE clause with the CONTAINS operator. Also specify the SCORE operator to return the score of each hit in the hitlist. The following example shows how to issue a query:
SELECT SCORE(1) title from news WHERE CONTAINS(text, 'oracle', 1) > 0;
You can order the results from the highest scoring documents to the lowest scoring documents using the ORDER BY clause as follows:
SELECT SCORE(1), title from news WHERE CONTAINS(text, 'oracle', 1) > 0 ORDER BY SCORE(1) DESC;
In a PL/SQL application, you can use a cursor to fetch the results of the query. The following example issues a query against the NEWS table to find all articles that contain the word oracle. The titles and scores of the first ten hits are output to standard out.
declare rowno number := 0; begin for c1 in (SELECT SCORE(1) score, title FROM news WHERE CONTAINS(text, 'oracle', 1) > 0 ORDER BY SCORE(1) DESC) loop rowno := rowno + 1; dbms_output.put_line(c1.title||': '||c1.score); exit when rowno = 10; end loop; end;
This example uses a cursor FOR loop to retrieve the first ten hits. An alias score is declared for the return value of the SCORE operator. The score and title are output to standard out using cursor dot notation.
You can also optimize this query for response time.
See Also:
For more information about optimizing for response time, see "Cursor Query" in this chapter. |
In pre-8.1.5, you use a cursor query, formerly known as an in-memory query, over a Text query when you want only a small portion of a potentially large hitlist.
In 8.1.5, the PL/SQL interface for in-memory queries is obsolete. This means that the following procedures are obsolete in 8.1.5:
To migrate pre-8.1.5 in-memory queries, use a cursor. Use the FIRST_ROWS hint in the SELECT statement to obtain the first n hits of a potentially large hitlist.
The following in-memory query finds all documents that contain the word oracle and returns them in score sorted order. The mechanism of the query returns the hits row by row in order, thus allowing you to extract the first n hits without spending the overhead of obtaining the entire hitlist first.
declare pk varchar2(80); scr number; cur number; begin cur := ctx_query.open_con('mypolicy','oracle',TRUE); while (ctx_query.fetch_hit(cur, pk, scr) > 0) loop -- deal with hit end loop; ctx_query.close_con(cur); end;
The pre-8.1.5 cursor query procedures OPEN_CON, FETCH_HIT, CLOSE_CON, COUNT_LAST are obsolete in 8.1.5.
To obtain the first n hits of a potentially large hitlist, execute the CONTAINS query using a cursor. Use the FIRST_ROWS hint to optimize for response time in the SELECT statement as follows:
begin for c1 in (select /*+ FIRST_ROWS */ pk, score(1) scr from basetable where contains(textcol, 'oracle', 1) > 0 order by scr desc) loop -- deal with hit dbms_output.put_line('KEY is '||c1.pk); dbms_output.put_line('SCORE is '||c1.scr); end loop; end;
See Also: To learn more about using the FIRST_ROWS hint with CONTAINS queries, see the Oracle8i interMedia Text Reference.
A structured Text query, also called a mixed query, is a query that has a CONTAINS predicate to query a text column and has another predicate to query a structured data column.
In pre-8.1.5, you specified the structured predicate as a parameter to the CONTAINS procedure.
In 8.1.5, a Text query uses standard SQL. To issue a structured query, you specify the structured clause in the WHERE condition of the SELECT statement.
To query on structured columns, you use the struct_query parameter in the CONTAINS procedure. The following example returns all articles that contain the word oracle that were written on or after October 1st, 1996:
exec ctx_query.contains('news','oracle','res_tab', struct_query => 'issue_date >= (''1-OCT-1996'')')
In older versions of ConText, the struct_query parameter is not available in the CONTAINS procedure. In these releases, you specify the structured condition when you join the result and base table.
For example, a query on the word oracle against a base table CTX_TEMP, looks like this:
execute ctx_query.contains('ARTICLE_POLICY','oracle','CTX_TEMP')
When you join the result table with the base table, you specify the structured condition to retrieve the document text as follows:
SELECT score, title FROM CTX_TEMP, TEXTTAB WHERE texttab.PK=ctx_temp.textkey AND texttab.issue_date >= ('01-OCT-96') ORDER BY score DESC;
Specify the structured condition in the WHERE condition of the SELECT statement. The following SELECT statement does the same thing as the above query. It returns all articles that contain the word Oracle that were written on or after October 1st, 1997:
SELECT SCORE(1), title, issue_date from news WHERE CONTAINS(text, 'oracle', 1) > 0 AND issue_date >= ('01-OCT-97') ORDER BY SCORE(1) DESC;
A theme query is a query on a concept. The query string is usually a concept or theme that represents the idea to be searched on. Oracle returns the documents that contain the theme.
In pre-8.1.5, you issue a theme query by first creating a theme policy to create a separate theme index. You then specify the theme policy in the CONTAINS procedure.
In 8.1.5, a single Text index contains word and theme information. You issue them queries using the ABOUT operator.
To issue a theme query, you first index your text column with a policy that has a theme lexer associated with it. To issue the query, you specify the same theme policy and the string for the theme query.
For example, assuming that THEME_POL is a theme policy, you retrieve all documents about the theme of insects using a two-step query as follows:
execute ctx_query.contains('THEME_POL', 'insects', 'CTX_TEMP');
Word information and theme information are combined into a single index. To issue a theme query, your index must have a theme component
.
See Also:
For more information about creating a theme component to your index, see Chapter 5, "Indexing". |
You issue a theme query using the ABOUT operator inside the query expression. For example, to retrieve all documents that are about insects, write your query as follows:
SELECT SCORE(1), title FROM news WHERE CONTAINS(text, 'about(insects)', 1) > 0 ORDER BY SCORE(1) DESC;
See Also: For more information about using the ABOUT operator, see Oracle8i interMedia Text Reference
Composite textkey queries are queries on a base table that is indexed with a composite textkey.
The first step in issuing a composite textkey query is to create a result table manually with a composite textkey consisting of two columns as follows:
create table CTX_TEMP2( textkey varchar2(64), textkey2 varchar2(64), score number, conid number);
You then join the result and base table with and AND operator in the WHERE condition. For example:
exec ctx_query.contains('ARTICLE2_POLICY','petroleum','CTX_TEMP2') SELECT score, title FROM CTX_TEMP2, TEXTTAB2 WHERE texttab2.PK=ctx_temp2.textkey AND texttab2.PK2=ctx_temp2.textkey2 ORDER BY score DESC;
The 8.1.5 query is a basic SELECT statement. Because no result tables are used, there is no join between a result table and a base table as in pre-8.1.5 CONTAINS. You thus issue queries against a composite textkey table the same way you issue a query against a table with a single column textkey.
If textab2 is the composite textkey table, the above query is written as:
SELECT SCORE(1),title FROM textab2 WHERE CONTAINS(text,'petroleum') > 0 ORDER BY SCORE(1) DESC;
The max and first/next result-set operators are no longer supported in interMedia Text 8.1.5.
The Max operator is used to obtain a given number of the highest scoring documents in a query result set. For example, to obtain the twenty highest scoring documents that contain the word dog, you can write:
'dog:20'
The first/next operator is used to obtain a range of documents in an unsorted query result-set. For example, to obtain documents 11 through twenty that contain the word dog, you can write:
'dog#11-20'
The max and first/next operators are not supported in interMedia Text 8.1.5. You can use a cursor query optimized for response time in PL/SQL to achieve the results for a max or first/next type of query.
A query optimized for response time provides a fast solution for when you need the highest scoring documents from a hitlist.
The example below returns the first twenty hits to standard out. This example uses the FIRST_ROWS hint and a cursor.
declare cursor c is select /*+ FIRST_ROWS */ title, score(1) score from news where contains(txt_col, 'dog', 1) > 0 order by score(1) desc; begin for c1 in c loop dbms_output.put_line(c1.score||':'||substr(c1.title,1,50)); exit when c%rowcount = 20; end loop; end; /
See Also: To learn more about optimizing queries for response time, see the Oracle8i interMedia Text Reference.
A query optimized for response time provides a fast solution for when you need a range of documents from a hitlist sorted by score.
The solution is similar to the max doc solution in that it uses the FIRST_ROWS hint in a cursor. The code loops through the cursor to process only the hits in the required range. The example below returns the sorted documents 11 to 20 to standard out.
declare cursor c is select /*+ FIRST_ROWS */ title, score(1) score from news where contains(txt_col, 'dog', 1) > 0 order by score(1) desc; begin for c1 in c loop if (c%rowcount > 10) then dbms_output.put_line(c1.score||':'||substr(c1.title,1,50)); end if; exit when c%rowcount = 20; end loop; end; /
See Also: To learn more about optimizing queries for response time, see the Oracle8i interMedia Text Reference.
Oracle8i interMedia Text 8.1.5 no longer supports the execute operator which allows you to call a PL/SQL function in a query.
As a result of interMedia Text's integration with Oracle8, you can use standard SQL, which allows you to call functions in a SELECT statement as long as the function satisfies the requirements for being named in a SQL statement.
Calling a PL/SQL function within a query is useful for converting words to alternate forms. For example, assuming the function french returns the French equivalent of English words, you as ctxuser can search on the French word for cat by issuing:
'@ctxuser.french(cat)'
You can call user functions directly in the CONTAINS clause as long as the function satisfies the requirements for being named in a SQL statement. The caller must also have EXECUTE privilege on the function.
For example, assuming the function french returns the French equivalent of English words, you can search on the French word for cat by writing:
SELECT SCORE(1), title from news WHERE CONTAINS(text, french('cat'), 1) > 0 ORDER BY SCORE(1);
See Also:
For more information about creating user functions and calling user functions from SQL, see Oracle8 SQL Reference. |
The interMedia Text 8.1.5 release supports the CTX_QUERY.COUNT_HITS function, which you use in pre-8.1.5 to count the number of hits in a query before issuing the query. However in 8.1.5, you specify the index name rather than a policy. In addition, the struct_query parameter used in pre-8.1.5 to specify the structured predicate is obsolete.
In 8.1.5, to count the number of hits returned from a query with only a CONTAINS predicate, you can use CTX_QUERY.COUNT_HITS or COUNT(*) in a SELECT statement.
To count the number of hits returned from a query that contains a structured predicate, use the COUNT(*) function in a SELECT statement.
Because in-memory queries are obsolete in 8.1.5, the pre-8.1.5 procedure CTX_QUERY.COUNT_LAST procedure is also obsolete in 8.1.5.
You count query hits with COUNT_HITS as follows:
declare count number; begin count := ctx_query.count_hits(policy_name => my_pol, text_query => 'oracle', exact => TRUE); dbms_output.put_line('Number of docs with oracle:'); dbms_output.put_line(count); end;
To find the number of documents that contain the word oracle, you can do one of the following:
SELECT count(*) FROM news WHERE CONTAINS(text, 'oracle', 1) > 0;
declare count number; begin count := ctx_query.count_hits(index_name => my_index, text_query => 'oracle', exact => TRUE); dbms_output.put_line('Number of docs with oracle:'); dbms_output.put_line(count); end;
See Also:
To learn more about the syntax of CTX_QUERY.COUNT_HITS, see the Oracle8i interMedia Text Reference. |
To find the number of documents returned by a query with a structured predicate, use count(*) as follows:
SELECT count(*) FROM news WHERE CONTAINS(text, 'oracle', 1) > 0 and author = 'jones';
In pre-8.1.5 you can store the definition and results of a query. You can then use the SQE operator in a query expression to obtain the results. For queries such as wildcard queries, using stored query expression improves performance since results are stored.
In 8.1.5, the procedure CTX_QUERY.STORE_SQE stores only the definition of the query. No results are stored. Referencing the query with the SQE operator merely references the definition of the query. In this way, SQEs make it easy for defining long or often used query expressions.
Stored query expressions are not attached to an index. When you call CTX_QUERY.STORE_SQE, you specify only the name of the stored query expression and the query expression.
The pre-8.1.5 notion of a session SQE has gone away. The query definitions are stored in the Text data dictionary. Any user can reference a stored query expression.
See Also:
To learn more about the syntax of CTX_QUERY.STORE_SQE, see the Oracle8i interMedia Text Reference. |
The administrative procedures of REFRESH_SQE and PURGE_SQE are obsolete in interMedia Text 8.1.5.
In pre-8.1.5, you define and use a stored query expression as follows:
Administration of stored query expressions can be performed using the REFRESH_SQE, REMOVE_SQE, and PURGE_SQE procedures in the CTX_QUERY PL/SQL package.
To create a session SQE named PROG_LANG, use CTX_QUERY.STORE_SQE as follows:
exec ctx_query.store_sqe('emp_resumes', 'prog_lang', 'computer science', 'session');
This SQE queries the text column for the EMP_RESUMES policy (in this case, EMP.RESUMES) and returns all documents that contain the term cobol. It stores the results in the SQE table for the policy.
PROG_LANG can then be called within a query expression as follows:
select score, docid from emp where contains(resume, 'sqe(prog_lang)')>0 order by score;
You define and use a stored query expression as follows:
The procedures REFRESH_SQE and PURGE_SQE are obsolete. You delete using REMOVE_SQE.
The following example creates a stored query expression called disaster that searches for documents containing the words tornado, hurricane, or earthquake:
begin ctx_query.store_sqe('disaster', 'tornado | hurricane | earthquake'); end;
To execute this query in an expression, write your query as follows:
SELECT SCORE(1), title from news WHERE CONTAINS(text, 'SQE(disaster)', 1) > 0 ORDER BY SCORE(1);
See Also: To learn more about the syntax of CTX_QUERY.STORE_SQE, see the Oracle8i interMedia Text Reference.
With query explain plan, formerly known as query expression feedback, you can obtain an execution plan of a Text query before actually issuing the query. Oracle returns the explain plan information in a table, from which you can construct a parse tree.
In interMedia Text 8.1.5, this feature has been renamed from query expression feedback to query explain plan. Query explain plan should not be confused with hierarchical query feedback, which is a new, different feature.
The user interface for query explain plan has changed in the following ways:
See Also:
For more information on the syntax of CTX_QUERY.EXPLAIN and the structure of the explain table, see Oracle8i interMedia Text Reference. |
You use the CTX_QUERY.EXPLAIN to obtain expression feedback. The procedure for obtaining this information has not changed from pre-8.1.5 to 8.1.5. You must do the following:
You must use the new 8.1.5 explain syntax when you code the first three steps.
The way you construct and expansion tree from the explain table in step 4 is the same as in pre-8.1.5
.
See Also:
For examples on constructing the expansion tree, see the CTX_QUERY.EXPLAIN command syntax in the Oracle8i interMedia Text Reference. |