Database Synchronization with Microsoft sync framework 534x423

Database Synchronization with Microsoft Sync Framework

In the real world, there are often scenarios where an application needs to be available offline, allowing the users to proceed with their work and once they go online again to synchronize their data back and forth with the server. This scenario is known as Occasionally Connected Applications (OCA). With OCA, users work with a local copy of the database which is later synchronized into the central database. Undoubtedly, this increases the server’s scalability and availability because it will have more users working with their own data cache while not generating any additional network traffic.

There are new technologies that have recently been released which allow offline capabilities like HTML 5 LocalStorage, Web SQL, and IndexedDB, but those will not be covered in this blog post. I will talk about Database Synchronization using the Microsoft Sync Framework platform where we have SQL Server Integration Services (SSIS) both client and server side.

standaloneDB
Fig. 1 - Occasionally Connected Apps (OCA)

If we look from a database point of view in SQL Server 2005, there is a support for Merge Replication where one central server (Publisher) can synchronize the data with many client database instances (Subscribers). This works only with limited number of client instances. In an instance where remote users of the application will total a few hundred, this solution will not scale well.

The Microsoft Sync Framework provides libraries that can utilize synchronization in applications. Sync Framework database synchronization providers enable synchronization between ADO.NET-enabled databases. Since the database synchronization providers are part of the Sync Framework, any database that uses these providers can then also exchange information with other data sources that are supported by Sync Framework, such as web services, file systems, or custom data stores.

However, we only get support for SQL Server 2008 and SQL Compact databases right out of the box (Sync Framework 4.0 October 2010 CTP adds Sql Azure). So in a common scenario we will use the SQL Server 2008 edition for the server DB instance and a choice between SQL Compact or SQL Express as the client's. SQL Server Express support does not come in the Sync Framework, but you still can extend the provider and add support for it. It will require a substantial amount of code and SQL.

See below:

First for each client DB, we need two client-specific tables:

  1. Guid table - one record that uniquely identifies the client
  2. Anchor table that holds TimeStamps of last update time for each table
table structure
Fig. 2 - Guid and Anchor Tables

An important class is the SyncAgent class. SyncAgent does the actual synchronization process. First, you attach the Remote and Client Database providers:

Then, the SyncTables tables that will be synced. Tables can be grouped into SyncGroup so PK-FK relationship is handled properly. Grouping will ensure that the FK change won't be applied before its PK is applied:

After the tables are added, we need to create ADO.NET style adapters for each table. Adapters have commands that tie to stored procedures or T-SQL code. Each adapter has a command for:

Deletes are stored in the so called "tombstone tables" which serve as a “tombstone” for deleted records. Upon deleting these from original tables, a trigger populates the tombstone table with the deleted record:

We also need store procedures for insert, update, and delete records at the time of synchronization:

In those stored procedures, we can also apply conflict resolution logic by an additional paramater, @sync_force_write, which is provided by the user:

Deleting stored procedure updates the tombstone table, but does not delete the records. A cleanup procedure for the tombstone table has to be created after that:

There are also two helper stored procedures which are used when handling conflicts:

This procedure needs to be applied on both the server and the client databases.
Once we are ready, there are two provider wide commands for the client ID and the Anchor. SelectNewAnchorCommand returns the new watermark (timestamp) for the next sync.

We are ready to kick off the synchronization process, but before we do, we should subscribe to events that provide helpful information when the synchronization process fails:

If everything goes well, the SyncStatistics object is populated with applied changes. Otherwise, if synchronization fails, the ApplyChangeFailedEventArgs object contains valuable information for the conflict that occurred such as:

  1. Client or Server Conflict
  2. Conflict Types - ClientUpdateServerUpdate, ClientUpdateServerDelete, ClientDeleteServerUpdate, ClientInsertServerInsert

Upon evaluating the conflict information, we have three choices to continue with:

  1. ApplyAction.RetryApplyingRow - Simple retry of applying row.
  2. ApplyAction.RetryWithForceWrite - Retry with setting the @sync_force_write parameter set to true. Usually this will allow the client's changes to overwrite the server's.
  3. ApplyAction.Continue - Skip client's change.
Conflict Resolution Window
Fig. 3 - Conflict Resolution Window

As a final result, we can use the SyncProgressEventArgs SyncStage property to create a ProgressForm which lists the result of the synchronization process.

Conflict Resolution Window
Fig. 4 - Progress Window

The disadvantages to the approach mentioned above include:

  • Changes are required in the central database schema to add columns and tables that may affect current applications.
  • Triggers are fired for each change made to a row, which has performance implications.
  • Logic for maintaining proper row versions and row deletions can get extremely complicated.
  • Long running transactions can result in some data being missed during synchronization, resulting in data inconsistencies.

Anyways, if you architect a new system and a SQL Server Compact fits the storage needs for the remote users, SQL Server Compact is a single-file database that lacks triggers, views, and stored procedures in comparison to SQL Server Express. SQL Server 2008 shipped a new feature which can greatly optimize the synchronization process.

SQL Server 2008 had introduced a Change Tracking feature to the database. The concept behind change tracking allows an administrator to mark certain tables to be monitored for changes. From that point forward, SQL Server 2008 will keep track of any inserts, updates, or deletes that are made. The Sync Framework database synchronization providers have been built to take advantage of SQL Server 2008 change tracking.

Fig. 5 - Change Tracking Option

You should enable Change Tracking for both the Server and for each table you want to synchronize. With that set up, it is fairly easy to set up your application in a few steps.

First, you add a “Local Database Cache” item to your project.

Fig. 6 - Local Database Cache Item

Then, a wizard pops up that will help you configure the synchronization options like server’s and client’s connection strings, tables being monitored, whether to SQL Server change tracking or a custom one, the option to synchronize all tables in one transaction, database model (ADO.NET Dataset or Entity Framework model), etc.

Fig. 7 - Configure Data Synchronization

After completing the wizard steps, a .sync XML file is added to the project which contains all configuration options that we have set. If we examine it more carefully, we will see that it consists of all elements we manually configured by using the code above, including the SQL for the adapter commands, anchor commands, etc.

Fig. 8 - Sync Configuration File

With a few mouse clicks we have the synchronization setup, we can easily kick off the synchronization process with a few lines of code:

Adopting this approach has a lot more advantages:

  • No changes to the schema.
  • No triggers are added to the database which means that it has less impact on performance.
  • All of the logic for tracking changes is internal to the SQL Server engine and as such reduces the complexity for setting up this type of system.
  • Includes integrated database administration feature such as Dynamic Management Views and Security.

The features like Conflict Detection and resolution still come out-of-the-box, so the user still has control over it. We should also be aware of the fact that the change tracking tables will grow faster and we should take care of frequent clean up. This can be done by setting the Retention Period property of the Change Tracking setting in SQL Server and taking into account that within this period, no remote user will be left out of sync.

Resources and Links:

Subscribe for Free Updates


2 replies
  1. Bob
    Bob says:

    Martin..awesome article. Very informative and helpful. I am wondering should I just stick with SQL Compact vs trying to make SQL Express work or does SQL Express 2008 or 2012 now support sync. If my master DB is 2008 can I use express 2012 or do they have to be the same version

    Reply
  2. Syed Farhan
    Syed Farhan says:

    Very Good article... I am facing issues with .net 4.5 and 4.5.1 framework installed clients during sync.. it gives an error saying System.MissingMethodException: Method not found:'Boolean System.Data.SqlServerCe.SqlCeChangeTracking.UpgradePublicTracking(System.String)'

    Sync works fine with other frameworks.. please help!!

    Reply

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre class="" title="" data-url=""> <span class="" title="" data-url="">