Sunday, January 10, 2016

Data Compare and Update using SQL Server Database Project

Synchronizing two database is a common scenario but the task sometimes take so much time and effort to synchronize two databases perfectly. Some developers maintain script to sync the databases and runs those script time to time. I found it works well but maintain the script properly is a difficult job if more than one developer works in a single database. In this blog post we will discuss how we can sync two databases using SQL Server Database Project, its advantages and disadvantages.


SQL Server Database Project is a wonderful tool and rescue for database developer and help them to perform some time consuming task easily. We can use this project for sync database schema as well as data itself. In this post our main concern is to sync data. So let's create a SQL Server Database Project first. To add database project open visual studio and click New Project from the File menu. Then select SQL Server Template and SQL Server Database Project.  Give it a meaningful name and click ok.


Figure: Add SQL Server Database Project

Our database project is ready now and we can use it to compare data between two databases. Before go  further lets setup a environment. We have created two database stg_database and prod_database and both database has a single table name Person. Below is the table script.
CREATE TABLE Person
(
 Id int IDENTITY(1,1) PRIMARY KEY NOT NULL,
 FirstName varchar(250) NOT NULL,
 LastName varchar(250) NOT NULL,
 EmailId varchar(200) NULL,
 PhoneNumber varchar(50) NULL,
 Address varchar(250) NULL
)
GO

Now we insert some sample data in the Person table for both database. Below is the insert script.
USE stg_database
GO
INSERT INTO dbo.Person(FirstName, LastName, EmailId, PhoneNumber, [Address])
VALUES
('Shakib','Al-Hasan','shakib@gmail.com','01725446677','Dhaka'),
('Tamim','Iqbal','tamim@gmail.com','01725553311','Chitagong'),
('Mushfiq','Rahim','mushi@gmail.com','01725550099','Bogura'),
('Masrafee','Mortaza','mash@gmail.com','01725552244','Khulna'),
('Nasir','Hossain','nasir@gmail.com','01727752244','Rangpur'),
('Mominul','Haque','mominul@gmail.com','01956896641','Cox''s Bazar')
GO 
USE prod_database
GO
INSERT INTO dbo.Person(FirstName, LastName, EmailId, PhoneNumber, [Address])
VALUES
('Shakib','Al-Hasan','shakib@gmail.com','01725446677','Dhaka'),
('Mushfiq','Rahim','mushi@gmail.com','01725550099','Bogura'),
('Tamim','Iqbal','tamim@gmail.com','01725553311','Chitagong'),
('Masrafee','Mortaza','mash@gmail.com','01725552244','Norail'),
('Somuya','Sarkar','sarkar@gmail.com','01856225633','Satkhira')
GO

After inserting the record in the table we update one record of stg_database table so that we can visualize the effect perfectly. Below is the update script.

USE stg_database
GO
UPDATE dbo.Person
   SET LastName = 'Bin-Mortaza',
    [Address] = 'Norail, Khulna'
WHERE Id = 4

Below screenshot shows the two database records.

Figure: Two tables data comparison.

Let's do some manual comparison before we use our data project.
Id 1: Identical in both database
Id 2 & 3: Id swap in production database.
Id 4: Last Name and Address updated in stg_database
Id 5: Different record in two databases
Id 6: Exists in stg_database but not in prod_database

Now we will use our Database Project and examine how the data comparison works. To start a comparison open a data comparison window by clicking Tools menu from visual studio then SQL Server then New Data Comparison.... Follow the below image


Figure: Initiate new data comparison

In the New Data Comparison window we need to configure source and target database and comparison options. It has four different options to compare records in database, in our case we will select all of them. See the below image and configure the source and target database and select data compare options.


Figure: Configure source and target and Data Compare Options.

Clicking next will forward us to a new page where we can select tables and views that we want to compare. Also we can choose the columns that we actually interested to compare. Please see the below image.


Figure: Choose tables and columns that will be compared.

Clicking Finish will start data comparison and when finish it will open a SqlDataCompare window in visual studio. The lower part of the SqlDataCompare window looks like the below image.

Figure: Data comparison result window.

It shows four different tabs. The first tab shows the records that are different in compared databases. In our case it says four records differ in source and target. Since the data project compare the data based on primary key, it will identify the record different based on the primary key. Second tab shows the records that are exists in source but not in target. In our case it has one record, record 6 is in stg_database but not in prod_database. Third tab shows the records that are in target but not in source and fourth tab shows the identical records. Let's see what happen when we update the target database based on the comparison.

In source database id 2 contains the record for Tamim Iqbal and 3 contains record for Mushfiq Rahim, in target id 2 contains record for Mushfiq Rahim and 3 contains record for Tamim Iqbal. So if we run the update, target id will swap but information does  not lost. Id 4 will updated by the update record in source database. Id 5 source and target contains different records see the above image marked portion. If we run update target will lost Somuya Sarkar record since it will replace by Nasir Hossain :(. See the below image for final result.


Figure: Records of target database after sync with source.

So what can we do if we want both Nasir Hossain and Somuya Sarkar in target database. There is no easy way, for this kind of scenario we need to declare composite primary key so that the data project compare the data based on those keys. Let's try to do an experiment and see what happens.

We will drop our Person table and recreate it with composite primary key (Id, EmailId). Below is the script.
DROP TABLE Person
GO
CREATE TABLE Person
(
 Id int IDENTITY(1,1) NOT NULL,
 FirstName varchar(250) NOT NULL,
 LastName varchar(250) NOT NULL,
 EmailId varchar(200) NOT NULL,
 PhoneNumber varchar(50) NULL,
 Address varchar(250) NULL,
 PRIMARY KEY (Id, EmailId)
)
GO

Data insertion and update will be same as before. After that when run the data comparison again the data comparison result window looks like the below image.


Figure: Different Records tab.

Now it shows only one record in Different Records Tab. And see the below image, the Only in Target tab we see there are three records.


Figure: Only in Target tab.

We uncheck the record 5 because we want it in our target database. Below snapshot shows the target database after running update operation, prod_database sync with the stg_database database.


Figure: Update result when use composite primary key.

We see that it is dangerous to use Data Project blindly for data compare and update. In this case there is a possibility that we lose data. Actually it will work to update/sync database table that is not affected by user interaction. And we can use this tools for update configurations related table easily that is not usually changed by user interactions. Before use update any database using Data Tool carefully check the changes.

1 comment: