When the job failed, my systems administrator got an alert that memory usage on the server running the job had exceeded 90%. He sent me a graph showing the peak usage, shown below.
This prompted me to run a script that lists all the jobs for the last 3 days along with how long each run took.
I used the table msdb.dbo.sysjobhistory. In this table the run date and time are stored as two separate columns of type INT, run_date and run_time. Furthermore, the run_time column is in hhmmss format but is not always 6 digits, because leading zeros are dropped: 8:30:15 AM, for example, is stored as 83015.
The good news is that there is a function we can use to get the run datetime: msdb.dbo.agent_datetime. It takes two arguments, run_date and run_time.
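Called directly, the function stitches the two INT columns into a proper DATETIME. A minimal sketch (the literal date and time below are just illustrative values, not from my job history):

```sql
-- agent_datetime combines an INT date (yyyymmdd) and an INT time (hhmmss)
-- into a DATETIME value; 20240315 and 83015 are illustrative literals
select msdb.dbo.agent_datetime(20240315, 83015) as RunDateTime;
-- returns 2024-03-15 08:30:15.000
```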
There is also a run_duration column, also of type INT, which stores the elapsed time in the same hhmmss format as run_time, except that it is a duration rather than a clock time. So, for example, 208 indicates that the job took 2 min 8 sec.
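As a sanity check on that format, the hhmmss value can be unpacked into total seconds with integer division and modulo. A standalone sketch using the example value above:

```sql
declare @run_duration int = 208;           -- 2 minutes 8 seconds in hhmmss form
select (@run_duration / 10000) * 3600      -- hours   -> seconds (0 * 3600)
     + ((@run_duration / 100) % 100) * 60  -- minutes -> seconds (2 * 60)
     + (@run_duration % 100)               -- seconds            (8)
       as TotalSeconds;                    -- 128 seconds = 2 min 8 sec
```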
So I had to transform the duration into minutes, as shown in the script below.
```sql
select
    j.name as JobName,
    h.step_name,
    msdb.dbo.agent_datetime(h.run_date, h.run_time) as RunDateTime,
    -- run_duration is hhmmss: convert to total seconds, then to minutes;
    -- the + 31 biases the integer division so that 29 or more leftover
    -- seconds round up to the next minute
    ((h.run_duration / 10000) * 3600
      + ((h.run_duration / 100) % 100) * 60
      + h.run_duration % 100
      + 31) / 60 as RunDurationMinutes
from msdb.dbo.sysjobs j
inner join msdb.dbo.sysjobhistory h
    on j.job_id = h.job_id
where j.enabled = 1
  -- keep only the last 3 days (run_date is an INT in yyyymmdd form)
  and h.run_date >= convert(int, convert(char(8), dateadd(day, -3, getdate()), 112))
order by JobName, RunDateTime desc;
```
I took the results and analysed them to find the peak times at which the jobs run. The peak times for the three days are shown in the graph below.
After some further analysis, I decided to reschedule the overlapping jobs to run at different times so that the server is not loaded all at once.
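Moving a schedule can be scripted rather than done through the SSMS UI. A hedged sketch using msdb.dbo.sp_update_schedule (the schedule name and new start time here are hypothetical):

```sql
-- Move a hypothetical schedule named 'NightlyLoad' to start at 02:30:00.
-- @active_start_time is an INT in hhmmss form, just like run_time.
exec msdb.dbo.sp_update_schedule
    @name = N'NightlyLoad',
    @active_start_time = 23000;
```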
I will check the memory usage tomorrow and see if there are any failures.
Are there any other ways you analyse CPU/memory usage? I look forward to your comments.