Scalar Value Functions and Parallelism
Published: 3rd October 2013
I’ve complained before about Scalar Value Functions and how much I dislike them but, in case you weren’t convinced the first time around, here’s another reason for you.

Scalar Value Functions stop the optimizer from using parallelism.

And I’m not saying that these will cause a serial zone in your execution plans and therefore not be TOOOOO bad (serial zones should be avoided where possible, but sometimes it can’t be helped)… these stop parallelism throughout your entire execution plan, making the whole thing run as maxdop 1.

Don’t worry though, all is not lost… but I’ll get to that later…

Firstly let’s have a look at a simple query which is nonsensical but has simply been created because it happens to run in parallel on my machine.

First things first, I’m going to be using the AdventureWorks2012 database, but this is utterly useless for parallelism demos as it’s simply too small. Therefore I’m using an artificially enlarged version which I created using Jonathan Kehayais’ script which you can find here.

select o.SalesOrderID, o.CarrierTrackingNumber, h.DueDate
from sales.salesOrderDetailEnlarged o
join sales.salesOrderHeaderEnlarged h
on o.SalesOrderID = h.SalesOrderID
where h.ShipDate >= '2006-01-01'


So, as you can see… we’re happily using parallelism in order to obtain the data.

Now let’s create an utterly meaningless function, just for the fun of it…

create function returnValue(@value int)
returns int
as
begin
   return
@value
end


Okay, now we have our pointless function which doesn’t change the logic of the query in the slightest, let’s put it into our tSQL and generate a new execution plan:

select dbo.returnValue(o.SalesOrderID), o.CarrierTrackingNumber, h.DueDate
from sales.salesOrderDetailEnlarged o
join sales.salesOrderHeaderEnlarged h
on o.SalesOrderID = h.SalesOrderID
where h.ShipDate >= '2006-01-01'


Now it’s gone to a serial plan. So… to test this, let’s use the trace flag I mention in a previous post (here) to try and force a parallel plan:

select dbo.returnValue(o.SalesOrderID), o.CarrierTrackingNumber, h.DueDate
from sales.salesOrderDetailEnlarged o
join sales.salesOrderHeaderEnlarged h
on o.SalesOrderID = h.SalesOrderID
where h.ShipDate >= '2006-01-01'
option (querytraceon 8649)


Same result… using a scalar value function will not allow a parallel plan to be produced. So what can we do about this? Well, there is actually an answer… the humble CLR.

Note that this is only for a non-data access CLR, but when used in this circumstance a CLR will actually allow parallelism to take place.

I’ll not go into too much detail (as I’ll post later about creating a CLR and importing it into SQL Server as it’s a whole post in itself), but here’s a screenshot of the C# code:

Then we deploy this to SQL Server:

create assembly returnValue from 'C:\ProjectsreturnValuereturnValue

in\DebugreturnValue.dll'
go

create function returnValueCLR(@value int)
returns int
as
external name returnValue.UserDefinedFunctions.returnValue


And now let’s run our code again, but this time using the CLR function:

select dbo.returnValueCLR(o.SalesOrderID), o.CarrierTrackingNumber, h.DueDate
from sales.salesOrderDetailEnlarged o
join sales.salesOrderHeaderEnlarged h
on o.SalesOrderID = h.SalesOrderID
where h.ShipDate >= '2006-01-01'


Success!!! We now have our parallel plan back. Therefore, if you didn’t have one already, this is a good reason to start using a few more pieces of CLR code within your servers.
Comments:
NB: Comments will only appear once they have been moderated.