Statistics On Views Via Linked Server
Published: 29th July 2015
As a follow-up to my last post about accessing synonyms via a linked server (you can’t), I stated that it was therefore better to use a view. However, that has issues of its own to consider.

Nothing’s ever easy, you can’t access synonyms via a linked server and therefore you want to use views, but they are just asking for trouble as well because SQL Server can’t gauge statistics on them.

We’ve always had issues in SQL Server with linked servers and statistics because SQL Server historically had no access to the stats of objects on the linked server unless you were using a high privilege account such as db_owner or sysadmin. However, this changed in SQL 2012 SP1 in which they opened up the stats so that all you needed was db_datareader and you could see the stats of the remote table you were accessing.

However, what happens if you’re accessing a view?

Let’s create a simple table and populate it on my TRACYSQL lab cluster:

set nocount on
go

create database testDB
go

use testDB
go

create table testRealTable
(
  
id int identity
)
go

insert into testRealTable default values
go 100


Now let’s create a view on my ROBINSQL lab cluster to access this:

use testDB
go

create view testRealView
as
   select
*
  
from TracySQL.testDB.dbo.testRealTable
go


That was simple enough.

Now, my linked server accounts are both sysadmin and therefore we have more than we need to see stats within my SQL 2014 environment… therefore let’s see what execution plan we get when trying to read some data:

select * from RobinSQL.testDB.dbo.testRealView


All as expected… so what about the estimated number of rows?

It’s estimating 10,000? We only have 100 rows? Well that’s actually the default for a remote query in SQL Server… so why is it using that when we have access to see the stats?

Well the problem is that we have access to the stats of objects, and therefore SQL Server asks for the stats on the object we’re querying. But we’re querying a view and the view itself has no stats and therefore it returns a null which SQL then translates into its defaults.

The linked server doesn’t break down the view and look at the underlying tables and then return those stats and therefore if you’re thinking of accessing a view via a linked server then be VERY careful and test your queries carefully because I’ve seen tables in excess of 100 million rows and 300GB of data being seen as “10,000 rows” and therefore SQL Server thinks it’s cheaper to execute a cross server query locally and so tries to pull VERY large tables across the network (thinking they’re small) which is an absolute performance killer.

You have been warned.
Comments:
NB: Comments will only appear once they have been moderated.