Only Update Stats You Need
Published: 12th September 2015
I’ve had a LOT of issues with stats over the years, mostly due to ascending keys, and I’m not the only one. However, the way in which I see people try to deal with these issues is quite interesting.

Obviously, nearly everyone simply updates stats, but it’s how they do this which can be crucial.

The most common method I see to deal with this is a simple “It’s table X, therefore we’ll update stats on table X”.

Okay, this will likely work, but it could be wasting a lot of time that you might not be able to afford.

Let’s take a fabricated example in order to demonstrate my logic:

I’m going to use a table from my expanded AdventureWorks2012 for this as my sales.salesOrderHeaderEnlarged table has 26 columns and 1.25 million rows.

I’ve also run a few queries to ensure that each column has stats created against it.

You can see that this table now has 27 stats (auto generated and those as part of an index):

select c.name, s.name
from sys.stats s
join sys.stats_columns sc
on s.stats_id = sc.stats_id
and sc.object_id = s.object_id
join sys.columns c
on sc.column_id = c.column_id
and c.object_id = s.object_id
where s.object_id = object_id('sales.salesOrderHeaderEnlarged')


Okay, so now let’s imagine that we have a query which joins this table to another using just the 1 column, orderDate.

This join is having stats issues and it’s all too common that I see the following solution:

update statistics sales.salesOrderHeaderEnlarged with fullscan
go


However, what people don’t realize is that this code will update ALL stats on the table. This could be incredibly time consuming and would involve reading the whole table.

In my case (and with my fully SSD based SQL Server), this took a good while:

So instead we need to be smarter… just update the one stat on that one column:

update statistics sales.salesOrderHeaderEnlarged _WA_Sys_00000003_24B26D99 with fullscan
go


As you can see, this is MUCH faster.

If a thin nonclustered index exists in which SQL Server can obtain this specific column data then that is what it will use, which will invariably be a lot faster than a full table scan.

Therefore when you’re looking to perform UPDATE STATISTICS commands in the future, please don’t go for the sledgehammer approach, but maybe try a more tailored solution.
Comments:
NB: Comments will only appear once they have been moderated.