Steps to reproduce:
1. Go to Systems->Servers on a system with a lot of rows in mh_jobs (100'000+++)
2. Open inspect in your web browser to see how long the calls take
Actual Results:
Slow performance (6 seconds+ on our system for a single call)
Expected Results:
Fast performance
It is the following query that makes Systems->Servers so slow:
SELECT t0.id, t1.status, COUNT(t1.status), AVG(t1.queue_time), AVG(t1.run_time) FROM mh_service_registration t0, mh_job t1 WHERE (t0.id = t1.processor_service) GROUP BY t0.id, t0.active, t0.error_state_trigger, t0.job_producer, t0.online, t0.online_from, t0.path, t0.service_state, t0.service_type, t0.state_changed, t0.warning_state_trigger, t0.host_registration, t1.status
Note that this does not scale... on our system with 400'000+ entries in mh_jobs, the SQL-query alone takes more than 5 seconds.
Possible solution: Need to limit the number of mh_job rows to a reasonable number so that the query above has a reasonable runtime.
The query comes from
modules/matterhorn-serviceregistry/impl/SeviceRegistryJpaImpl.java
see:
@NamedQuery(name = "ServiceRegistration.statistics", query = "SELECT job.processorServiceRegistration.id as serviceRegistration, job.status, "
+ "count(job.status) as numJobs, "
+ "avg(job.queueTime) as meanQueue, "
+ "avg(job.runTime) as meanRun FROM Job job group by job.processorServiceRegistration, job.status"),
This should definitely be fixed, but also worth noting that there are configurable processes to remove old jobs from the db, so the mh_jobs table doesn't grow without limits.