Identity Management in Merge Replication
Published: 3rd May 2013
One of the most crucial and most complex stages in selecting the Article Properties of a table with an Identity column is how to utilise replication's Identity Range Management.

Personally I find it best to avoid this if you can, either by having your application take care of your Identities, or by create a composite primary key using a combination of the Identity and a unique server tag. For example if you have a server in Europe and one in America, you could have a composite key made up of ID and Locale. Therefore your European keys would all be of the form "1E, 2E, 3E... 999E" etc, wheras your American server would have "1A, 2A, 3A... 999A" etc. Therefore you avoid the complex world of Identity Range Management.

However, if neither of these options can help you then read on...

When accessing Article Properties (read my webpage here), the Identity Range Management section looks like this...

Okay, so here's what you need to know:

Setting Automatically manage identity range to Manual means that you look after the identity column yourself (for which I gave options above). Changing this to Automatic switches responsibility to SQL Server instead.

Once switched to Automatic the other Properties become available and need to be set depending upon your table usage.

To demonstrate how these are used, let's invent an example. Your setup is as follows:

  • Merge Replication
  • 1 Publisher, 3 Subscribers (Servers A, B, & C)
  • A table with an int Identity Column starting at 1
  • Records are loaded 1000 items at a time from any of the servers

How identity range management works is that you allocate a "pool" of IDs to the Publisher. It then allocates subsets of these to each subscriber.

In our example we shall set the Publisher Range as 1,000,000. This therefore gives the Publisher the set of IDs 1 - 1,000,000.

Let's say we now set the Subscriber Range as 100,000. This will allocate 100,000 IDs to each Subscriber (including the Publisher itself).

So we now have the following:

  • Publisher reserving IDs 1 - 100,000
  • Server A reserving IDs 100,001 - 200,000
  • Server B reserving IDs 200,001 - 300,000
  • Server C reserving IDs 300,001 - 400,000

If we were to insert 2 records into the table on Server A, for example, the table would show those records with IDs of 100,001 and 100,002. We then add two records on Server C and the table would show those records with IDs of 300,001 and 300,002. Therefore as you can see with just these 4 records this can lead to huge gaps in the Identity column of a table, but this is necessary to ensure that the Identity column is kept unique under merge replication conditions.

So, what happens if, on Server A, we have reached ID 199,500 and we try to insert 1000 records? Well... the insert fails.

This is where the "Range Threshold Percentage" field comes into play.

We use this field to specify at what point the publisher allocates the subscriber a new set of IDs.

In this case let's say we set the property to 80. Therefore once the subscriber has used 80% of its allocation it will request a new one.

To put this into our example, if Server A does an insert that puts its ID above 180,000 then the next time the Merge Agent runs (another reason not to use continuous subscriptions) it will request a new set of IDs and will be allocated 400,001 - 500,000. Therefore when you attempt your next insert it will succeed (presuming it's less than 100,000 records of course).

Also, if you were to re-initialise any subscription, they will automatically be allocated a new range of IDs, potentially leaving large gaps in your ID column.

Considerations


This shows that the datatype of your identity column can be very important when the table is being considered for merge replication.

Imagine you are using a tinyint as the identity column and you have 50 subscribers (I've seen this in production... very scary!!!). And you allocate a publisher range of 255, subscriber range of just 5 records... with 5 being used for the publisher and 5 x 50 for the subscribers, this means that you have 0 IDs left. Therefore if a replication fails then you cannot re-initialise because you have no IDs left to allocate. This can cause major problems. As such, always plan your datatype carefully along with your ranges and threshold.
Comments:
NB: Comments will only appear once they have been moderated.