Oracle8(TM) Server Replication Release 8.0 A54651-01
This chapter describes several common problems that you may encounter when using the advanced replication facility, and suggests resolutions for them.
Note: When you diagnose a replication problem, you often need to consult one or more data dictionary views. For the replication catalog views, always use the SYS.DBA_name views if you are authorized. Otherwise, use either the ALL_name or USER_name views. The views for deferred remote procedure calls are all owned by SYS and have no prefix.
Note: Most of the activities described in this chapter can be accomplished much more easily by using Oracle's Replication Manager, a graphical user interface for replication. See the documentation for Oracle Replication Manager.
Different database links can have different propagation and refresh intervals.
If you think a database link is not functioning properly, you can drop and recreate it using Oracle Enterprise Manager, Oracle Server Manager, or another tool.
Make sure that the scheduled interval is what you want.
Make sure that the scheduled interval is not shorter than the required execution time.
If you used a connection qualifier in a database link to a given site, the other sites that link to that site must use exactly the same connection qualifier. For example, if you create a database link as follows:
CREATE DATABASE LINK myethernet CONNECT TO repsys IDENTIFIED BY secret
USING 'connect_string_myethernet'
all the sites, whether masters or snapshots, associated with the myethernet database link must include the 'connect_string_myethernet' connect string.
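To confirm that a link resolves as expected, you can inspect its definition and run a trivial remote query. The following is a minimal sketch that reuses the myethernet link name from the example above. It assumes you are authorized to query DBA_DB_LINKS and that the link is visible to the connected user.

```sql
-- Show how the link is defined at this site (owner, remote user, connect string).
SELECT owner, db_link, username, host
  FROM dba_db_links
 WHERE db_link LIKE 'MYETHERNET%';

-- A trivial remote query; an error here points to the link or the network,
-- not to the replication facility itself.
SELECT * FROM dual@myethernet;
```

If the remote query fails, compare the HOST value with the connect string used at the other sites before dropping and recreating the link.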
The DBA_REPCATLOG view shows the interim and final status for asynchronous administrative activities. You should examine this view before enabling a replication environment with RESUME_MASTER_ACTIVITY. You should also examine it whenever you suspect replication administration problems. The master definition site uses its DBA_REPCATLOG view to record both local and remote activities. Each of these activities is explained below.
For each local activity, there is a row in the master definition site's DBA_REPCATLOG view. The STATUS column in this row begins with the value READY. If the activity completes normally, the row is deleted from the DBA_REPCATLOG view. If the activity encounters a problem, the Oracle error number is captured in the ERRNUM column and the error message is captured in the MESSAGE column. These columns are helpful when diagnosing advanced replication problems.
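A query along the following lines lists the administrative requests that have failed, together with the captured error number and message. It is a sketch only: STATUS, ERRNUM, and MESSAGE are the columns described above, while the other columns selected (ID, SOURCE, REQUEST, ONAME) are assumed to be present in your release of the view.

```sql
-- List administrative requests that ended in error, oldest first.
SELECT id, source, request, oname, status, errnum, message
  FROM dba_repcatlog
 WHERE status = 'ERROR'
 ORDER BY id;
```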
For a remote activity, the advanced replication facility creates two rows that appear in the DBA_REPCATLOG view: one at the master definition site with a STATUS value of AWAIT_CALLBACK, and one at the remote master with a STATUS value of READY. What happens to these two log rows depends on whether the remote activity completes normally.
DO_DEFERRED_REPCAT_ADMIN executes the requests in the local DBA_REPCATLOG submitted by the user that invoked DO_DEFERRED_REPCAT_ADMIN in the order determined by the ID column. When DO_DEFERRED_REPCAT_ADMIN is executed at a master that is not the master definition site, it does as much as possible. Some asynchronous activities such as populating a replicated table require communication with the master definition site. If this communication is not possible, DO_DEFERRED_REPCAT_ADMIN stops executing rows from DBA_REPCATLOG to avoid executing DBA_REPCATLOG rows out of order. Some communication with the master definition site, such as the final step of updating or deleting a DBA_REPCATLOG row at the master definition site, can be deferred and will not prevent DO_DEFERRED_REPCAT_ADMIN from executing additional rows in the DBA_REPCATLOG.
Occasionally, you may notice that an entry in the DBA_REPCATLOG view is not removed as anticipated, yet the STATUS is not ERROR. Here are some items to check if the advanced replication facility does not appear to be working properly.
Submit a trivial job at the master site to ensure that it runs as expected. In a newly created database, jobs are not automatically enabled until the database is shut down and restarted. You can call DBMS_IJOB.SET_ENABLED(TRUE) to avoid restarting the database. (Note the I, for internal, in DBMS_IJOB.)
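One way to submit such a trivial job from SQL*Plus is sketched below. The job body ('NULL;') and the bind variable name jobno are illustrative choices, not requirements. The sketch assumes that JOB_QUEUE_PROCESSES is greater than zero at the master site.

```sql
VARIABLE jobno NUMBER

BEGIN
  -- Submit a one-time job that does nothing; a NULL interval means "run once".
  DBMS_JOB.SUBMIT(:jobno, 'NULL;', SYSDATE, NULL);
  COMMIT;  -- the job becomes visible to the job queue only after commit
END;
/

PRINT jobno

-- Wait a minute or so, then check whether the job is still queued.
SELECT job, last_date, next_date, failures, broken
  FROM dba_jobs
 WHERE job = :jobno;
```

Because a job with a NULL interval is removed from the queue after it runs successfully, an empty result from the final query suggests that the job queue is working. A row that remains with a nonzero FAILURES value, or with BROKEN set to Y, suggests that it is not.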
Additionally, check the LOG_USER column in the DBA_JOBS view to ensure that the replication job is being run on behalf of the replication administrator. Check the USERID column of the DBA_REPCATLOG view to ensure that the replication administrator was the user that submitted the request. DBMS_REPCAT.DO_DEFERRED_REPCAT_ADMIN only performs those administrative requests submitted by the user that calls this procedure.
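The two checks described above can be made with queries such as the following. This is a sketch; the LIKE filter on the WHAT column is an assumption about how the replication job was submitted at your site, so adjust it as needed.

```sql
-- Which user does the replication administration job run on behalf of?
SELECT job, log_user, priv_user, schema_user, what
  FROM dba_jobs
 WHERE UPPER(what) LIKE '%DO_DEFERRED_REPCAT_ADMIN%';

-- Which user submitted each pending administrative request?
SELECT id, userid, request, status
  FROM dba_repcatlog
 ORDER BY id;
```

If LOG_USER in the first result differs from USERID in the second, the job does not process those requests.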
When diagnosing a replication problem, you may find it useful to disable the job queue at one or more masters. To do this, shut down the master and restart it with a value of zero for JOB_QUEUE_PROCESSES. To avoid restarting the database, you can call DBMS_IJOB.SET_ENABLED(FALSE). (Note the I, for internal, in DBMS_IJOB.) You must then connect to the master site and execute the procedure DO_DEFERRED_REPCAT_ADMIN, to execute asynchronous administrative activities at that master. This lets you have better control over the execution time of administrative activities.
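With the job queue disabled, you can run the pending administrative requests by hand and then inspect what remains. In the sketch below, SCOTT_GROUP is a hypothetical replicated object group name. Positional notation is used because the name of the first parameter differs between releases; the second argument (FALSE) is assumed to be the flag that limits execution to the local site.

```sql
-- Connect to the master site as the replication administrator, then:
BEGIN
  -- 'SCOTT_GROUP' is a hypothetical object group name.
  DBMS_REPCAT.DO_DEFERRED_REPCAT_ADMIN('SCOTT_GROUP', FALSE);
END;
/

-- See which requests are still pending or have failed.
SELECT id, request, oname, status, errnum, message
  FROM dba_repcatlog
 ORDER BY id;
```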
If a job is not executed as expected, check the following:
There are a number of ways to troubleshoot problems with master sites.
If you add a new master site to your replicated environment, and the appropriate replicated objects are not created at the new site, try the following:
If you call a procedure in the DBMS_REPCAT package to make a schema-level change at the master definition site that is not propagated to a master site, try the following:
DDL submitted to REPCAT executes on behalf of the user who submits the DDL. When a DDL statement applies to an object in a schema other than the submitter's schema, the submitter needs appropriate privileges to execute the statement. In addition, the statement must explicitly name the schema. For example, assume that you, the replication administrator, supply the following as the DDL_TEXT parameter to the DBMS_REPCAT.CREATE_MASTER_REPOBJECT procedure:
CREATE TABLE scott.new_emp AS SELECT * FROM hr.emp WHERE...;
Because each table name contains a schema name, this statement works whether the replication administrator is SCOTT, HR, or another user--as long as the administrator has the required privileges.
Suggestion: Qualify the name of every schema object with the appropriate schema.
If you make an update to your data at a master site, and that change is not properly asynchronously propagated to the other sites in your replicated environment, try the following:
If you receive the DEFERRED_RPC_QUIESCE exception when you attempt to modify a replicated table, one or more replicated object groups at your local site are "quiescing" or "quiesced". To proceed, your replication administrator must call either DBMS_REPCAT.RESUME_MASTER_ACTIVITY or DBMS_REPCAT.DROP_MASTER_REPSCHEMA for each quiesced, replicated object group.
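The following sketch shows how a replication administrator might find the affected object groups and resume activity for one of them. It assumes that the DBA_REPGROUP view in your release exposes the GNAME, MASTER, and STATUS columns. SCOTT_GROUP is a hypothetical object group name.

```sql
-- Find object groups that are not in normal operation at this site.
SELECT gname, master, status
  FROM dba_repgroup
 WHERE status <> 'NORMAL';

-- At the master definition site, resume activity for one group.
BEGIN
  DBMS_REPCAT.RESUME_MASTER_ACTIVITY('SCOTT_GROUP');  -- hypothetical group name
END;
/
```

Before resuming, check DBA_REPCATLOG for outstanding requests or errors, as recommended earlier in this chapter.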
A single update statement applied to a replicated table can update zero or more rows. The update statement causes zero or more update requests to be queued for deferred execution, one for each row updated. This distinction is important when constraints are involved, because Oracle effectively performs constraint checking at the end of each statement. While a bulk update might not violate a uniqueness constraint, for example, some equivalent sequence of individual updates might violate uniqueness.
If the ordering of updates is important, update one row at a time in an appropriate order. This lets you define the order of the update requests in the deferred RPC queue.
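The difference between a bulk update and an equivalent sequence of single-row updates can be seen with a small table. The table name seq_demo is hypothetical and the example is only an illustration of statement-level constraint checking; it does not involve replication itself.

```sql
CREATE TABLE seq_demo (id NUMBER PRIMARY KEY);
INSERT INTO seq_demo VALUES (1);
INSERT INTO seq_demo VALUES (2);
INSERT INTO seq_demo VALUES (3);
COMMIT;

-- Bulk update: succeeds, because uniqueness is checked at the end of the
-- statement, when the values are 2, 3, and 4.
UPDATE seq_demo SET id = id + 1;
ROLLBACK;

-- Row-at-a-time in ascending order: this first statement fails with
-- ORA-00001, because a row with id = 2 still exists.
UPDATE seq_demo SET id = 2 WHERE id = 1;

-- Row-at-a-time in descending order: each statement leaves the table
-- unique, so the whole sequence succeeds.
UPDATE seq_demo SET id = 4 WHERE id = 3;
UPDATE seq_demo SET id = 3 WHERE id = 2;
UPDATE seq_demo SET id = 2 WHERE id = 1;
COMMIT;
```

On a replicated table, the bulk statement is queued as three separate row updates, so the remote site sees something closer to the row-at-a-time sequences than to the single statement.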
If you replicate an object that already exists at the master definition site with DBMS_REPCAT.CREATE_MASTER_REPOBJECT, the status of the object must be VALID. If the status is INVALID, recompile the object, or drop and recreate the object. Then invoke CREATE_MASTER_REPOBJECT with the RETRY argument set to TRUE.
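A typical sequence for an invalid object is sketched below. The schema SCOTT, the package EMP_PKG, and the object group SCOTT_GROUP are hypothetical names. The parameter names sname, oname, type, and gname are assumptions about the procedure's signature in your release; RETRY is the argument named above.

```sql
-- 1. Check the status of the existing object.
SELECT owner, object_name, object_type, status
  FROM dba_objects
 WHERE owner = 'SCOTT'
   AND object_name = 'EMP_PKG';

-- 2. If the status is INVALID, recompile the object.
ALTER PACKAGE scott.emp_pkg COMPILE;

-- 3. Retry the request that failed.
BEGIN
  DBMS_REPCAT.CREATE_MASTER_REPOBJECT(
    sname => 'SCOTT',        -- hypothetical schema
    oname => 'EMP_PKG',      -- hypothetical object
    type  => 'PACKAGE',
    retry => TRUE,
    gname => 'SCOTT_GROUP'); -- hypothetical object group
END;
/
```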
When you call GENERATE_REPLICATION_SUPPORT for a replicated table, Oracle generates a trigger at the local site. If the table will be propagating changes asynchronously, this trigger uses the DBMS_DEFER package to build the calls that are placed in the local deferred transaction queue. EXECUTE privileges for most of the packages involved with advanced replication, such as DBMS_REPCAT and DBMS_DEFER, need to be granted to replication administrators and users that own replicated objects. The DBMS_REPCAT_ADMIN package performs the grants needed by the replication administrators for many typical replication scenarios. When the owner of a replicated object is not a replication administrator, however, you must explicitly grant EXECUTE privilege on DBMS_DEFER to the object owner.
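For an object owner who is not a replication administrator, the explicit grant might look like the following sketch. SCOTT is a hypothetical owner of the replicated objects, and the statement is assumed to be run by a user who can grant privileges on SYS-owned packages.

```sql
-- Allow the generated trigger, which runs with the privileges of the
-- table owner, to call the deferred-transaction package.
GRANT EXECUTE ON sys.dbms_defer TO scott;  -- scott is a hypothetical owner
```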
If you discover an unexpected unresolved conflict, and you were mixing procedural and row-level replication on a table, carefully review the procedure to ensure that the replicated procedure did not cause the conflict. Ensure that ordering conflicts between procedural and row-level updates are not possible. Check if the replicated procedure locks the table in EXCLUSIVE mode before performing updates (or uses some other mechanism of avoiding conflicts with row-level updates). Check that row-level replication is disabled at the start of the replicated procedure and re-enabled at the end. Ensure that row-level replication is re-enabled even if exceptions occur when the procedure executes. In addition, check to be sure that the replicated procedure executed at all master sites. You should perform similar checks on any replicated triggers that you have defined on replicated tables.
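The checks above correspond to a procedure shaped roughly like the sketch below. The package SCOTT.EMP_MAINT, the procedure RAISE_SALARIES, and the table SCOTT.EMP are hypothetical. DBMS_REPUTIL.REPLICATION_OFF and DBMS_REPUTIL.REPLICATION_ON are used here to disable and re-enable row-level replication for the session. Treat this as an outline to compare your own procedure against, not as a complete implementation.

```sql
CREATE OR REPLACE PACKAGE scott.emp_maint AS
  PROCEDURE raise_salaries(p_pct IN NUMBER);
END emp_maint;
/

CREATE OR REPLACE PACKAGE BODY scott.emp_maint AS
  PROCEDURE raise_salaries(p_pct IN NUMBER) IS
  BEGIN
    -- Prevent concurrent row-level updates while the procedure runs.
    LOCK TABLE scott.emp IN EXCLUSIVE MODE;

    -- Disable row-level replication so that the rows changed here are not
    -- also queued individually.
    DBMS_REPUTIL.REPLICATION_OFF;

    UPDATE scott.emp
       SET sal = sal * (1 + p_pct / 100);

    -- Re-enable row-level replication on the normal path.
    DBMS_REPUTIL.REPLICATION_ON;
  EXCEPTION
    WHEN OTHERS THEN
      -- Re-enable row-level replication even if the update fails,
      -- then let the caller see the original error.
      DBMS_REPUTIL.REPLICATION_ON;
      RAISE;
  END raise_salaries;
END emp_maint;
/
```

The table lock is released when the calling transaction commits or rolls back.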
When you call DBMS_DEFER_SYS.SCHEDULE_EXECUTION, Oracle adds this job to the job queue. If you have scheduled your transaction queue to be pushed at a periodic interval, and you encounter a problem, you should first be certain that you are not experiencing a problem with the job queue.
When the advanced replication facility pushes a deferred transaction to a remote site, it uses a distributed transaction to ensure that the transaction has been properly committed at the remote site before the transaction is removed from the queue at the local site.
For information on diagnosing problems with distributed transactions (two-phase commit), see Oracle8 Server Distributed Databases.
If you notice that transactions are not being pushed to a given remote site, you may have a problem with how you have specified the destination for the transaction. If you specify a destination database when you call DBMS_DEFER_SYS.SCHEDULE_EXECUTION (using the DBLINK parameter) or DBMS_DEFER_SYS.EXECUTE (using the DESTINATION parameter), you must provide the full database link name. These procedures do not expand the database link name.
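One way to avoid a mismatch is to look up the remote database's global name and pass it in full. In the sketch below, ORC2.WORLD is a hypothetical global name and the hourly interval is only an example. DBLINK is the parameter named above; INTERVAL and NEXT_DATE are assumed to be the scheduling parameters in your release.

```sql
-- At the remote site: find the name that the link must match in full.
SELECT global_name FROM global_name;

-- At the local site: schedule the push using the complete name.
BEGIN
  DBMS_DEFER_SYS.SCHEDULE_EXECUTION(
    dblink    => 'ORC2.WORLD',        -- hypothetical; not just 'ORC2'
    interval  => 'SYSDATE + 1/24',    -- example: push once an hour
    next_date => SYSDATE);
END;
/
```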
Having the wrong view definitions can lead to erroneous deferred transaction behavior. The DEFCALLDEST and DEFTRANDEST views are defined differently in CATDEFER.SQL and CATREPC.SQL. The definitions in CATREPC.SQL should be used whenever advanced replication is used. If CATDEFER.SQL is ever (re)loaded, ensure that the view definitions in CATREPC.SQL are subsequently loaded.
There are a number of ways to troubleshoot problems with snapshots.
If you unsuccessfully attempt to create a new replicated object at a snapshot site, try the following:
If you have a problem refreshing a snapshot, try the following:
Additional Information: See "Troubleshooting Refresh Problems" on page 2-39.