SHRINK a data file? Just say NO!!!
Published: 26th May 2013
DBCC SHRINKFILE, DBCC SHRINKDATABASE, and Auto-shrink… they’re all truly, truly evil and should not be allowed near any system… ever!!!

Now, to be fair, there are times when shrinking a log file is useful and even necessary. I’ve written my fair share of bug-riddled code that has caused the transaction log to spiral into oblivion and left me with a 20GB data file and 500GB of logs. In that case… shrink away. It’s really quite handy. But data files… no… just no… again no… please just no!!!!!!
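For completeness, a log shrink is just DBCC SHRINKFILE pointed at the log file. Here’s a minimal sketch, assuming a hypothetical database whose log has the default logical name 'MyDb_log' (check sys.database_files for the real name before running anything like this):

-- Hedged sketch only: 'MyDb' and 'MyDb_log' are placeholder names.
use MyDb
go

dbcc shrinkfile('MyDb_log', 1024)   -- target size is in MB
go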

This particular problem arose the other day when I was talking to someone who had a data drive holding numerous databases, onto which he then added a 150GB database housing, amongst other things, an 80GB table.

There was no problem at first: the drive was 400GB in size and, even after adding this new database, he still had 90GB of free space. Everyone was happy, and his capacity planning said there would be no issues for a good while to come. However, what he had failed to consider in his capacity planning model was the impact of rebuilding an index.

Effectively, when doing a clustered index rebuild, SQL Server makes a copy of your clustered index (which is the entire table, remember) and then, once it’s happy the copy is sorted, swaps it in for your existing data. That’s not 100% accurate, but it gives the gist.
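As a rough rule of thumb, that means an offline clustered index rebuild wants free space roughly equal to the size of the table itself while the original and the copy exist side by side. A hedged sketch of how you might gauge that up front ('YourBigTable' is just a placeholder name):

-- 'YourBigTable' is a placeholder; the 'reserved' column is roughly the
-- extra room the rebuild will need while it holds the copy.
exec sp_spaceused 'YourBigTable'

-- ...and the current size of the data files, to see how much of that growth
-- will have to come from the drive rather than from free space already
-- inside the files (size is returned in 8KB pages, so divide by 128 for MB).
select name, size / 128 as file_size_mb
from sys.database_files
where type_desc = 'ROWS'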

Anyway, this meant that in the case above, the 150GB database swelled to 230GB after a rebuild of the clustered index on the 80GB table. Suddenly all capacity plans went out of the window and the drive had only 10GB left. Not good. So to counter this, he was performing a shrink on the data file to reduce it back to 150GB.

Sounds sensible, but his performance went through the floor and he couldn’t understand why. Well… here’s why you should avoid shrinking a database at all costs…

Let’s create a basic test database:

if exists(select * from sys.databases where name = 'fragTestDB')
  drop database fragTestDB
go

create database fragTestDB
go


Now, using this we’ll create a table in which each row is purposely large so that the table covers numerous pages in the database and therefore gives us some good metadata. Also note that I’m using identity(1, 2) so that it populates only every other identity value, starting from 1…

use fragTestDB
go

if object_id('fragTest') is not null drop table fragTest
go

create table dbo.fragTest
(
  id int identity(1, 2),
  value char(1000) default('a'),
  constraint pk_fragTest primary key clustered(id)
)
go

set nocount on
insert into fragTest default values
go 100000


So, as you can see, there’s a clustered index on the table and with this database being brand new, there’s only going to be one index listed in the DMV. So let’s have a look at the fragmentation…

declare @db int = (select database_id from sys.databases where name = 'fragTestDB')

select page_count, avg_fragmentation_in_percent
from sys.dm_db_index_physical_stats(@db, null, null, null, null)




As we’d expect, having inserted data through an ever-increasing identity column, there’s almost zero fragmentation.

So now we’re going to perform some inserts to fill in some of the gaps in the identity key which will purposely cause some page splits inside the database.

declare @id int = 14
while @id <= 80000
begin
   set identity_insert fragTest on
   insert into fragTest(id, value)
   select @id, 'b'
   set identity_insert fragTest off

   set @id += 10
end


So now let’s have a look at the index…

declare @db int = (select database_id from sys.databases where name = 'fragTestDB')

select page_count, avg_fragmentation_in_percent
from sys.dm_db_index_physical_stats(@db, null, null, null, null)




Well, we’ve fragmented our index, so it’s time for a rebuild. Let’s have a quick check on the database and table sizes before we do…

select name, (size*8)/1024 from sys.master_files where db_name(database_id) = 'fragTestDB'

exec sp_spaceUsed 'fragTest'




Okay… rebuild…

alter index pk_fragTest on fragTest rebuild


Check on the indexes again…

declare @db int = (select database_id from sys.databases where name = 'fragTestDB')

select page_count, avg_fragmentation_in_percent
from sys.dm_db_index_physical_stats(@db, null, null, null, null)




Well, we couldn’t be happier with those results. Pretty much 0% fragmentation. Now let’s look at our database sizes…

select name, (size*8)/1024 from sys.master_files where db_name(database_id) = 'fragTestDB'

exec sp_spaceUsed 'fragTest'




Ouch. Our database has, as predicted, grown massively and is now 290MB in size while only using around 125MB in the table. Scale that up to the real-world example at the top of this article and we don’t have that kind of capacity on the server, so let’s shrink the data file back down again…

dbcc shrinkfile('fragTestDB')

select name, (size*8)/1024 from sys.master_files where db_name(database_id) = 'fragTestDB'

exec sp_spaceUsed 'fragTest'




All looks good. Exactly what we wanted. We’ve performed a rebuild and removed the unused space. But at what cost?

declare @db int = (select database_id from sys.databases where name = 'fragTestDB')

select page_count, avg_fragmentation_in_percent
from sys.dm_db_index_physical_stats(@db, null, null, null, null)




OUCH!!!!!!!!! What on earth has happened?

Well, simply put, SQL Server isn’t very helpful when it conducts a shrink.

What has happened is that, during the rebuild, SQL Server made a copy of the clustered index (the data table), and the database expanded so there was space in which to place that copy. Once the copy was created, the pages used by the original index were marked as unused. As such, we now have a data file with a LOT of blank pages at the start and used pages at the end.
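You can see this for yourself in the demo database: immediately after the rebuild, the file is far larger than the data it actually holds. A quick, hedged way to compare allocated versus used space inside the file:

use fragTestDB
go

-- File size versus space actually used inside it, in MB
-- (both values come back as 8KB pages, hence the divide by 128).
select
    name,
    size / 128                                     as file_size_mb,
    fileproperty(name, 'SpaceUsed') / 128          as space_used_mb,
    (size - fileproperty(name, 'SpaceUsed')) / 128 as free_space_mb
from sys.database_files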

The problem is that the shrink operation (DBCC SHRINKFILE, DBCC SHRINKDATABASE, and auto-shrink all work the same way) takes the last non-empty page in the data file and moves it to the front… then the next non-empty page to 2nd from the front, the next to 3rd from the front, and so on, until all the used data pages sit at the front of the file and all the empty ones at the end. Then it gives the blank space back to the operating system.
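As discussed in the comments below, the one exception is the TRUNCATEONLY option, which only releases free space from the end of the file and moves no pages at all; of course, straight after a rebuild the used pages are sitting at the end, so it won’t reclaim much here. A hedged illustration of the two forms, not a recommendation:

-- Moves pages (the damaging kind of shrink): shrink the file towards a
-- 150MB target, relocating used pages from the end of the file to the front.
dbcc shrinkfile('fragTestDB', 150)

-- Moves nothing: only trailing free space is handed back to the OS,
-- so fragmentation is untouched (but little space may actually be freed).
dbcc shrinkfile('fragTestDB', truncateonly)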

Therefore it has effectively reversed the order of the data pages by moving them to the front of the data file. As such, our perfect index has now become almost perfectly fragmented, and performance drops through the floor.

Aside from this utter destruction of indexes, it’s also worth noting that a shrink hits your disk IO hard and is a fully logged operation. If the original example above had been in the FULL recovery model, the transaction log would have grown by a good 120GB as well, which is not only painful on the disks when backing up, but if you happen to be running transactional replication then your log reader really won’t appreciate it either!
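If you do find yourself forced into a shrink, it’s worth checking the recovery model and keeping an eye on log usage first. A minimal sketch:

-- Recovery model of the database (FULL means all those page moves stay
-- in the log until the next log backup).
select name, recovery_model_desc
from sys.databases
where name = 'fragTestDB'

-- Log size and percentage used for every database on the instance.
dbcc sqlperf(logspace)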

Luckily, in the example at the top of this article, the guy was able to relocate his entire database to a new, larger drive and not have to consider shrinking after the index maintenance.

However, you may not be so lucky, and the only real way I know to avoid this is to create a new filegroup and rebuild your indexes onto it. You can then either drop the old filegroup (if it isn’t the primary filegroup) or safely shrink its now essentially empty file, freeing up space without affecting the new index.
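Here’s a minimal sketch of that approach using the demo database from this article; the filegroup name, file name, path and size are all hypothetical placeholders:

-- 1. Add a new filegroup and a file for it ('NEW_FG', the file name and the
--    path below are placeholders - adjust to suit).
alter database fragTestDB add filegroup NEW_FG
go

alter database fragTestDB
add file
(
    name = 'fragTestDB_data2',
    filename = 'D:\SQLData\fragTestDB_data2.ndf',
    size = 200MB
)
to filegroup NEW_FG
go

-- 2. Rebuild the clustered index onto the new filegroup. DROP_EXISTING lets
--    you do this even though the index backs the primary key constraint, and
--    because it's the clustered index the whole table moves with it.
create unique clustered index pk_fragTest
on dbo.fragTest(id)
with (drop_existing = on)
on NEW_FG
go

-- 3. The old filegroup's file is now (almost) empty, so shrinking it - or
--    dropping the filegroup if it isn't PRIMARY - is safe and cheap.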

Basically though, the main thing to take away is simply DO NOT SHRINK DATABASES!!!
Comments:
Gert Hauan
28/10/2014 13:54:00
Hi.

Do you know if this also applies to when one uses the TRUNCATEONLY parameter? According to BOL no pages will be moved with this option ("Releases all free space at the end of the file to the operating system but does not perform any page movement inside the file. The data file is shrunk only to the last allocated extent. ")

Thanks.
Kevin (theBoredDBA)
28/10/2014 14:04:00
Hi Gert,

Thanks for the comment.

Truncate_Only is actually only relevant for a database Log file and not a Data file. Therefore it does not work in the same way.

A transaction log doesn't have data pages and therefore, as mentioned in BOL, no pages will be moved.

Truncate_Only simply frees up space in your transaction log, but be careful using this as if your database is in FULL recovery then you will break your backup chain.

Thanks,
Kevin
Gert Hauan
28/10/2014 14:13:00
Hi

Thanks for your answer (and time).
But as I understand the documentation in BOL it is the other way around: "TRUNCATEONLY is applicable only to data files" (http://technet.microsoft.com/en-us/library/ms189493(v=sql.105).aspx)

Am I missing something?

We have deleted a huge amount of data from a DB (1600 GB > 600 GB), and really need to free up some disk space, but wouldn't want to end up in the scenario you are describing :-)
Kevin (theBoredDBA)
28/10/2014 14:20:00
Hi Gert,

Apologies, you're quite right. I was thinking Truncate_Only which is a log file command... not Truncateonly as one word.

Yes, that will affect the data file (AND log file actually - it's not clear in BOL, but it does), and therefore you can safely use it as it does NOT move pages and therefore will not cause you the same fragmentation problems.

The only thing you might find is that if you have pages in use at the end of the data file then you will not actually free up any space, but it's definitely worth a try.

If it doesn't work then I would suggest creating a new filegroup and then have your indexes rebuilt into the new filegroup (and move any Heaps) which you can do one at a time through a maintenance plan and should then leave your current filegroup empty, therefore letting you shrink the filegroup safely.

Hopefully that makes sense?

Thanks,
Kevin
Gert Hauan
28/10/2014 14:36:00
Thanks Kevin.

I will give the TRUNCATEONLY option a try first. I have tried this on some test-dbs, and as you said, not all of the free space would get freed up.
Gert Hauan
29/10/2014 17:16:00
Hello again Kevin

Unfortunately the TRUNCATEONLY did not work.

One (stupid)question regarding the option to create a new filegroup:

Will not the new filegroup have to contain a new file that will get filled up as the indexes are moved? So until I am able to shrink the original mdf-file, I would run out of disk space?
(the physical disk is 2000 GB, the file I am trying to shrink has reserved about 1600 GB, but only 600 GB in use). Did that make sense?

Also, when you say rebuild the indexes into the new filegroup, do you mean also the clustered indexes, or only nonclustered?

Thanks again for your help.
Kevin (theBoredDBA)
29/10/2014 18:20:00
Hi Gert,

That's a tricky situation. You're correct... You would need to move all indexes to the new filegroup and therefore you would need 600GB which you don't have spare.

In this situation you might have to take the hit and maybe schedule a shrink during a non-critical time or scheduled maintenance window and then follow it immediately with a full index maintenance plan to fix all the fragmentation that will result from the shrink.

Not ideal, but that may be your only option now.

Thanks,
Kevin