Statistics Affecting Query Performance
Published: 17th November 2013
Another way in which query performance can suffer is down to statistics. These are the numbers and mathematical information that SQL Server holds about indexes and tables: row counts, densities and histograms. Using these, SQL Server estimates the number of rows a query will return at each stage of the execution plan.

Using these numbers, SQL Server calculates the volume of data that will be processed (row counts, record sizes and so on), how large a memory grant to allocate, and which join type to use at each step of the way. Therefore, if these statistics are incorrect or out of date, they can lead SQL Server to do some crazy things.
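If you want to see what SQL Server actually holds, you can inspect a statistics object directly. A quick sketch against the table used below (the statistics name passed to DBCC SHOW_STATISTICS is an assumption; substitute one of the names that sp_helpstats returns):

```sql
use AdventureWorks2012
go

-- List every statistics object on the table
exec sp_helpstats 'sales.SalesOrderDetailEnlarged', 'ALL'
go

-- Show the header, density vector and histogram for one of them
-- (statistics name below is hypothetical; use one returned above)
dbcc show_statistics ('sales.SalesOrderDetailEnlarged', 'PK_SalesOrderDetailEnlarged')
go
```

The header tells you when the statistics were last updated and how many rows were sampled; the histogram is what the optimiser reads to produce its row estimates.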

To give an absolutely extreme example of this, take the following query:

use AdventureWorks2012
go

select h.SalesOrderID
from sales.SalesOrderHeaderEnlarged h
join sales.SalesOrderDetailEnlarged d
on h.SalesOrderID = d.SalesOrderID
order by d.rowguid


SQL Server knows these are large tables and therefore it, wisely, comes up with the following execution plan:

As you can see, this is using parallelism and a hash join in order to be most efficient. The thick lines also tell you that a LOT of data is passing through this plan… 9.2 million records to be exact:

It also estimates that it needs a memory grant of 1GB.

These estimates are accurate, and they allow the query to run quickly and efficiently.

So what happens if we wreck the statistics on purpose? (Do NOT do this in production… EVER!!!)

-- rowcount is an undocumented option intended for testing: it fakes
-- the row count that the optimiser sees for this table
update statistics sales.SalesOrderDetailEnlarged
with rowcount = 1


Well now if I obtain the same estimated plan I get something entirely different:

Now I get a Nested Loop join and an estimated row count of 1.

You can also see there is no memory grant, as one record simply doesn't require one… nor is there any parallelism.

So what happens in execution?

Well, let's just say that my original (good) execution plan completed in 22 seconds… my second execution plan, with the bad statistics? I got bored after 45 minutes and just stopped it. Seriously.
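If you have run this experiment yourself, you will want to put things right again. A sketch of the repair: DBCC UPDATEUSAGE corrects the falsified row and page counts in the catalog, and a full-scan statistics update rebuilds the histogram from the real data.

```sql
-- Correct the row and page counts that the rowcount hack falsified
dbcc updateusage ('AdventureWorks2012', 'sales.SalesOrderDetailEnlarged')
with count_rows
go

-- Rebuild the statistics by scanning the entire table
update statistics sales.SalesOrderDetailEnlarged with fullscan
go
```

After this the optimiser should produce the original hash-join plan again.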

The lesson here is: please keep your statistics up to date at all times.

By default, SQL Server will only auto-update your statistics once roughly 20% of the rows in a table have changed (500 rows plus 20% of the table, to be precise)… in a large table that can be a huge number of records, and in the meantime the skew in your estimates can be significant. Don't let it happen.
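A sketch of a simple maintenance approach (the table name is just the one from the example above; tailor the targets to your own workload):

```sql
-- Update any statistics the engine considers out of date, database-wide;
-- suitable as a nightly or weekly job
exec sp_updatestats
go

-- For very large, heavily modified tables, a targeted full scan gives
-- the optimiser the most accurate histogram
update statistics sales.SalesOrderDetailEnlarged with fullscan
go

-- Optionally (SQL Server 2008 R2 SP1 and later), trace flag 2371 makes
-- the auto-update threshold shrink below 20% as a table grows
dbcc traceon (2371, -1)
```

Scheduled statistics maintenance is cheap insurance compared to the sort of plan regression shown above.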