Performance Impact of a Key Lookup
Published: 9th July 2015
Following on from my last post about Key Lookups I wanted to emphasize the performance problems that you can get when these appear in your execution plans.

To do this I’m going to use a similar example as per my last post, but focusing more on the IO involved and also showing how a Covering Index can alleviate these issues.

So let’s set up a test scenario similar to the last one we used (but with more data):

use tempDB
go

if object_id('tempDB..#lookup') is not null drop table #lookup
go

create table #lookup
(
  
id int identity primary key clustered,
  
firstName varchar(50),
  
lastName varchar(50),
  
address1 varchar(150),
  
city varchar(50),
  
postalCode varchar(50)
)
go

insert into #lookup
select top 10000 firstName, lastName, AddressLine1, city, PostalCode
from AdventureWorks2012.person.address a
join AdventureWorks2012.person.person p
on a.AddressID = p.BusinessEntityID
go

create nonclustered index ix_tempLookup on #lookup(lastName, firstName)
go


Now, if you’ll read my other posts you’ll know this… if we update less than 20% of the records in the table we will not trigger an update stats… therefore we are going to do just that in order to make 20% of the records in the table have the lastName “Zabokritski”:

update top (1999) #lookup
set lastName = 'Zabokritski'
where lastName != 'Zabokritski'


According to the current table stats there is only one of these records and therefore SQL Server will believe that only 1 record will be returned and therefore will use our non-clustered index with a Key Lookup in order to satisfy the following query:

select lastName, firstName, id, address1
from #lookup
where lastName = 'Zabokritski'


So what happens in regard to IO?

That’s a whole lot of reads!!! SQL had to read each individual row from the non-clustered and then for each read performed on the non-clustered we also had to read the clustered.

So how about changing the index to a covering index (we need to re-run our whole script here to ensure incorrect stats and a fair comparison)?

use tempDB
go

if object_id('tempDB..#lookup') is not null drop table #lookup
go

create table #lookup
(
  
id int identity primary key clustered,
  
firstName varchar(50),
  
lastName varchar(50),
  
address1 varchar(150),
  
city varchar(50),
  
postalCode varchar(50)
)
go

insert into #lookup
select top 10000 firstName, lastName, AddressLine1, city, PostalCode
from AdventureWorks2012.person.address a
join AdventureWorks2012.person.person p
on a.AddressID = p.BusinessEntityID
go

create nonclustered index ix_tempLookup on #lookup(lastName, firstName, address1)
go

update top (1999) #lookup
set lastName = 'Zabokritski'
where lastName != 'Zabokritski'


Okay, so now what execution plan do we get?

Much better. And we get a vastly reduced number of reads:

This is because we are using a single index to obtain all column information and therefore we don’t need to read each row individually (which we do in order to do key lookups) and therefore we can read whole pages instead. Therefore we only needed to read 22 pages which contained our entire 2000 rows.

As a note, if we didn’t have our screwed up stats, then SQL Server would have realized the Key Lookup was expensive and simply used a full Clustered Scan instead:

use tempDB
go

if object_id('tempDB..#lookup') is not null drop table #lookup
go

create table #lookup
(
  
id int identity primary key clustered,
  
firstName varchar(50),
  
lastName varchar(50),
  
address1 varchar(150),
  
city varchar(50),
  
postalCode varchar(50)
)
go

insert into #lookup
select top 10000 firstName, lastName, AddressLine1, city, PostalCode
from AdventureWorks2012.person.address a
join AdventureWorks2012.person.person p
on a.AddressID = p.BusinessEntityID
go

update top (1999) #lookup
set lastName = 'Zabokritski'
where lastName != 'Zabokritski'
go

create nonclustered index ix_tempLookup on #lookup(lastName, firstName)
go

set statistics io on

select
lastName, firstName, id, address1
from #lookup
where lastName = 'Zabokritski'


Still not as good as the covering index, but way better than the Key Lookup.

Therefore, if you do see Key Lookups in your execution plans, please make sure that your stats are up to date and, if you’re still seeing them, consider a covering index instead in order to drastically reduce your IO overhead.
Comments:
NB: Comments will only appear once they have been moderated.