Troubleshooting

SQL Server replication has been designed to require minimal user interaction. In most cases, once a subscription has been successfully established, and maintenance routines such as scheduled backup and replication cleanup have been configured, little or no user intervention is required.

However, if your particular replication configuration experiences problems during initial setup or while running replication, review the following troubleshooting information. While not inclusive, this troubleshooting overview contains helpful hints and techniques for isolating and correcting problems that may arise during data replication.

The following situations are discussed:

No Subscription Servers Receiving Changes

The best way to troubleshoot a situation where no subscribers are receiving changes is to determine which of these two replication components has a problem:

To determine which component has a problem, first check whether the changes have been stored in the distribution database of the distribution server. The commands themselves can be viewed by examining the command column of the MSjob_commands table.

Note For security reasons, distribution tables are only accessible to the SA.

In the distribution database, execute the following Transact-SQL statement:

SELECT command FROM MSjob_commands

If the changes to the replicated tables are not stored in the table, then the problem resides in the replication log reader task. If the changes are stored in the command column, then the problem resides in the distribution task.

This result set can be further restricted by specifying in the SELECT statement both the publisher_id and publisher_db columns of the table.

    To troubleshoot the log reader task
  1. In the Server Manager window select the distribution server, then from the Tools menu choose Task Scheduling.
  2. In the Task Scheduling dialog box, select the appropriate log reader task and choose the Task History button.

    The Task History is displayed for that log reader task. Review the history, then choose Close. The Task Scheduling dialog box returns.

    Note If the log reader task is running in Auto Start mode, recent status may not be available.

  3. With the log reader task still highlighted, choose the Edit Task button.

    The Edit Task dialog box appears.

  4. Under the Scheduling options, select Recurring and then select Change.

    The Task Schedule dialog box appears.

  5. Within the Task Scheduling dialog select Occurs Daily, and then under Daily Frequency select Occurs every 1 Minute(s).
  6. Choose OK, then in the Edit Task dialog box that returns choose OK.
  7. In the Server Manager window, stop and restart SQL Executive.

    To do this, click the right mouse button on the SQL Executive icon top, and then from the drop-down menu that appears choose Stop. Repeat this process, and choose Start.

    This terminates the currently running autostart log reader task. The log reader will now run once every minute and record the appropriate status messages in the Task History (see step 2) to help you determine why it is not moving table changes to the distribution database.

To troubleshoot the distribution task, follow the same steps outlined above for obtaining status for the log reader task, but focus on a distribution task instead.

If subscribers are all failing on distribution of a replicated transaction, this can be repaired as follows:

  1. For all associated distribution tasks, set the distribution task to Commit Every 1 Transaction.

    To do this, open the Task Scheduling window, select the appropriate distribution task, and choose the Edit Task button. In the Edit Task dialog box that appears, change the -c parameter in the Command window to -c1.

    Once the distribution task commits each individual transaction, it will fail on the specific transaction that is causing your problem.

  2. The job_id for this transaction can be obtained in two steps:
  3. Using a utility such as ISQL/w you may now select the command column of the failing transaction
    SELECT command FROM MSjob_commands
    WHERE publisher_id = 'publisher_id'
    AND publisher_db = 'publisher_db'
    AND job_id = 'job_id'
  4. At this point there are three alternatives:
One Subscription Server (of many) Not Receiving Changes

If only one subscription server is not receiving changes, there can be a problem with the distribution task. Troubleshoot the distribution task following the steps described in "No Subscription Servers Receiving Changes", earlier in this chapter.

If a single subscription server is not receiving changes, also check for (and if necessary correct) the following error conditions:

Transaction Log on the Publishing Database Filling Up

If the transaction log of a publication database is filling up there is a problem with the log reader task. Troubleshoot the log reader task following the instructions provided in "No Subscription Servers Receiving Changes", earlier in this chapter.

You can use the sp_repltrans and sp_replcmds stored procedures to help you determine whether there are replicated transactions residing within the transaction log that have not been moved into the tables of the distribution database.

sp_repltrans
Returns a result set of all the transactions in the publication database transaction log that are marked for replication but that have not been marked as distributed. It returns a results set displaying the row identifier (RID, containing the page and row) and the timestamp for each nondistributed transaction.

If sp_repltrans returns no results, then there are no transactions available within the publication database transaction log for the log reader to move into the distribution database.

sp_replcmds
Used when sp_repltrans returns a replicated transaction that is awaiting distribution. This stored procedure allows you to view the transactions that are currently not distributed (those transactions remaining in the transaction log that have not been sent to the distribution server). sp_replcmds returns article ID, partial command (true or false), and command, page, row, and timestamp.

For more information about sp_repltrans and sp_replcmds, see the Microsoft SQL Server Transact-SQL Reference.

Transaction log on the Publishing Database is Completely Full

If you are unable to resolve a problem with the log reader task, the transaction log of the publication database can become completely full. Replicated transactions that have not been moved into the distribution database may not be truncated from the transaction log. This can prevent you from truncating the transaction log of the publication database. If this occurs, two solutions can allow you to continue using the publication database.

    

Note If non-distributed transactions are removed from the transaction log of the publication database, they will not be delivered to subscribing servers. All servers subscribing to publications within this database may need to re-synchronize to the published tables by unsubscribing and resubscribing.

To remove non-distributed transactions from the transaction log of the publication database, the system administrator must execute the following stored procedure, using the parameters shown:

sp_repldone 0, 0, NULL, 0, 0, 1

This marks all replicated transactions as distributed. The transaction log may then be truncated by executing this statement:

DUMP TRANSACTION database_name WITH NO_LOG

This allows normal processing to continue within the publication database.

Since you don't know which subscribers expect to receive the replicated data that was just truncated from the log, normally you should unsubscribe and resubscribe each subscriber to any publications in that database. This will force new initial synchronization events to occur.

Initial Synchronization Task Not Applied at Subscription Server

The best way to troubleshoot a situation where the initial synchronization has not been applied at a subscriber is to determine which of these two replication components has a problem:

To determine which component has a problem, check to see if the sync events have been stored in the distribution database of the distribution server. The commands themselves can be viewed by examining the command column of the MSjob_commands table.

If the sync events are not stored in the command column, then the problem resides in the sync task. If the changes are stored in the command column, then the problem resides in the distribution task. (Sync commands appears as Sync, followed by table and file information.)

The following actions can help you troubleshoot a sync task:

To troubleshoot the distribution of an initial synchronization, you can perform the following tasks:

Monitoring Scheduled Table Refresh

If you have set up an article to be replicated using scheduled table refresh, how can you determine whether the destination table on the subscriber has been updated? This can be checked by monitoring the MSlast_job_info table in the destination database.

A table refresh will distribute at least two and sometimes three jobs. If the existing table is being deleted from or truncated, there are two jobs:

If you monitor MSlast_job_info on the subscriber and job_id increments by two, the table refresh was successful.

If the existing table is being dropped, then there are three jobs:

If you monitor MSlast_job_info on the subscriber and job_id increments by three, the table refresh was successful.

Deadlocks Caused by Replication Processes

Under some conditions, deadlocks can occur between the log reader and distribution processes due to log reader page locks being escalated to an exclusive table lock. To avoid this, you can:

For more information about using the log reader -c commitbatchsize option and the distribution -b jobbatchsize option, see Chapter 16, Scheduling Tasks.

You can also customize the lock escalation level by increasing the LE threshold maximum setting of sp_configure, allowing more page locks. This change is primarily useful on remote distribution servers. However, be aware that changing this option affects the entire server (not just the replication processes).

Using ODBCPING.EXE to Verify a Connection

If replication between two servers is not functioning properly you can use the ODBCPING.EXE utility to check whether ODBC is properly installed (by connecting to a server using the ODBC SQL Server Driver). This utility is a 32-bit application that is stored in the \SQL60\BINN directory.

    To verify ODBC connectivity

If the ODBC connection is established, this message is displayed:

CONNECTED TO SQL SERVER

If the ODBC connection cannot be established, this message is displayed:

COULD NOT CONNECT TO SQL SERVER