Job History Timeout
Published: 15th February 2014
This is actually an extension to my previous No Job History Showing post in which I pointed out how easy it is to end up without any job history in your SQL Server Agent and therefore not be able to diagnose problems.

Well this is very similar except that there is so much history that all you see is “Timeout Expired”.

This is basically happening because you have the opposite problem to my previous post. Instead of having a limit set to your history in which it gets deleted too frequently, this generally occurs when you have no limit set and therefore the history builds up to the extent that SQL Server doesn’t return it before the Agent Job History window times out the connection.

There are 2 ways to deal with this really, one is that you write your own code to retrieve job history, therefore bypassing the GUI timeout, the other is to enforce a limit to the amount of history that SQL Server retains.

For the first method you could easily write your own code in order to obtain job history for a specific job, but then why re-invent the wheel… as such, below you’ll find the same code that Microsoft themselves use to populate the GUI. The only tweaks I’ve made are to make the date and time more legible and to include a job name variable:

declare @jobName varchar(250) = 'my Job Name'

SELECT sjh.instance_id, -- This is included just for ordering purposes
  
job_name = sj.name,
  
sjh.step_id,
  
sjh.step_name,
  
sjh.sql_message_id,
  
sjh.sql_severity,
  
sjh.message,
  
sjh.run_status,
  
-- adjusted values to convert to date
  
convert(date, left(sjh.run_date, 4) + '-' +
              
substring(convert(varchar(8), sjh.run_date), 5, 2) + '-' +
              
right(sjh.run_date, 2)) run_date,
  
-- adjusted values to convert to time
  
convert(time, left(right('00000' + convert(varchar(6), sjh.run_time), 6), 2) + ':' +
              
substring(right('00000' + convert(varchar(6), sjh.run_time), 6), 3, 2) + ':' +
              
right(right('00000' + convert(varchar(6), sjh.run_time), 6), 2)) run_time,
  
sjh.run_duration,
  
operator_emailed = so1.name,
  
operator_netsent = so2.name,
  
operator_paged = so3.name,
  
sjh.retries_attempted,
  
sjh.server
FROM msdb.dbo.sysjobhistory sjh
LEFT OUTER JOIN msdb.dbo.sysoperators so1  ON (sjh.operator_id_emailed = so1.id)
LEFT OUTER JOIN msdb.dbo.sysoperators so2  ON (sjh.operator_id_netsent = so2.id)
LEFT OUTER JOIN msdb.dbo.sysoperators so3  ON (sjh.operator_id_paged = so3.id),
msdb.dbo.sysjobs_view sj
WHERE (sj.job_id = sjh.job_id)
and
sj.name = @jobName
ORDER BY sjh.instance_id


The second option is to reduce your job history by placing a limit on the amount held. To do this you can follow the steps laid out in my previous blog, HOWEVER, please note that if you go from having several years of history to enforcing only 4 weeks then this will instantly try to make this delete in one swoop and you’ll end up with a VERY busy and locked server and GUI.

The safest way is to gradually reduce your job history, then enforce a limit. Therefore SQL Server can slowly reduce your history levels without locking up the machine.

The easiest way to do this is through code as follows:

-- this will delete any history older than 30 days
declare @dateLimit datetime = dateadd(dd, -30, current_timestamp)

exec msdb.dbo.sp_purge_jobhistory @oldest_date = @dateLimit
go


Once you’ve removed enough history and set your new limit you should find that those Timeouts never come back.
Comments:
NB: Comments will only appear once they have been moderated.