Job dispatching can be slowed down excessively by host loads query


With a significant number of jobs (in history and/or running), job dispatching can become very slow, in excess of 3 minutes per dispatch cycle.

This is caused by the call:

SystemLoad systemLoad = getHostLoads(em, true);

for every job being dispatched. This is the query

On our 3.x production system, New Relic reports this query taking 94% of all database time. Response time can be up to 2s, and the query can be called hundreds of times per minute.

The job dispatching code needs refactoring to avoid this expensive query being called frequently.
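
One possible shape for such a refactoring, as a minimal sketch only: run the host loads query once at the start of a dispatch cycle, then keep the per-host figures up to date in memory as jobs are assigned. Everything below is hypothetical and not taken from the actual codebase or the eventual fix: the class `DispatchCycleSketch`, the `CountingLoadQuery` stand-in for `getHostLoads(em, true)`, the host URLs, and the simple "least loaded host with spare capacity" rule are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Supplier;

/** Sketch: query host loads once per dispatch cycle instead of once per job. */
public class DispatchCycleSketch {

  /** Hypothetical stand-in for the expensive host loads query; counts its calls. */
  static final class CountingLoadQuery implements Supplier<Map<String, Float>> {
    int calls = 0;

    @Override
    public Map<String, Float> get() {
      calls++;
      Map<String, Float> loads = new TreeMap<>();
      loads.put("http://worker1", 1.0f); // made-up hosts and loads
      loads.put("http://worker2", 0.0f);
      return loads;
    }
  }

  /**
   * Dispatches every job of one cycle against a single snapshot of host loads.
   * Returns job id -> chosen host; jobs that fit nowhere are left unassigned.
   */
  static Map<String, String> dispatchCycle(List<String> jobIds, float jobLoad,
      float maxLoad, Supplier<Map<String, Float>> loadQuery) {
    // The only call to the expensive query in this cycle.
    Map<String, Float> loads = new TreeMap<>(loadQuery.get());
    Map<String, String> assignments = new LinkedHashMap<>();

    for (String jobId : jobIds) {
      String best = null;
      for (Map.Entry<String, Float> e : loads.entrySet()) {
        boolean fits = e.getValue() + jobLoad <= maxLoad;
        if (fits && (best == null || e.getValue() < loads.get(best))) {
          best = e.getKey();
        }
      }
      if (best == null) {
        continue; // no host has capacity; leave the job for a later cycle
      }
      assignments.put(jobId, best);
      // Account for the new job in memory rather than re-querying the database.
      loads.merge(best, jobLoad, Float::sum);
    }
    return assignments;
  }
}
```

With this structure the number of host loads queries depends on the number of dispatch cycles, not on the number of jobs being dispatched. The trade-off is that the snapshot can go stale if other nodes change the loads during a cycle, so a real implementation would need to decide how much staleness is acceptable.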


Greg Logan
December 12, 2017, 1:41 AM

We could also add the host to the output of the select. It wouldn't be technically useful, but it might resolve the issue with MSSQL. If I made that change in a separate branch, would you have time to test?

Stephen Marquard
December 12, 2017, 1:44 AM

Looks like it's the other way around - you have to add it to the GROUP BY.

James Perrin
December 12, 2017, 3:28 AM

Looks like changing it to "GROUP BY job.processorServiceRegistration.hostRegistration.baseUrl" fixes things for MS SQL. It should make it faster for MySQL too, as it was previously grouping by every field of job.processorServiceRegistration (mh_service_registration).

I can reopen this ticket or create a new one. Any preference?

Greg Logan
December 12, 2017, 3:39 AM

Let's reopen here. I'll create a branch and file another PR.

James Perrin
December 12, 2017, 3:41 AM

Yeah I guess I'm the only person that can review that it works on MSSQL!

Fixed and reviewed


Greg Logan


Stephen Marquard
