Scheduler service needs to restrict queries to episodes owned by it

Description

This is an issue for 2 operations for the scheduler:

1. Retrieving a set of events for a CA for the /recordings/calendar endpoint (ical feed)

2. Getting the list of events for a CA for conflict check /recordings/conflicts.xml, after the change in:

Asset manager conflict checks are very slow #43

The query that fetches the scheduled events for a capture agents looks like this (from the mysql slow query log). Note the relatively large Rows_examined in comparison to Rows_sent.

  1. Time: 2018-01-17T15:24:20.286898Z

  2. User@Host: opencast[opencast] @ devslsmth001.uct.ac.za [137.158.158.193] Id: 896

  3. Query_time: 3.428755 Lock_time: 0.000803 Rows_sent: 436 Rows_examined: 155287
    SET timestamp=1516202660;
    SELECT t0.id, t0.archival_date, t0.availability, t0.mediapackage_id, t0.mediapackage_xml, t0.organization_id, t0.owner, t0.series_id, t0.version, t1.id, t1.val_bool, t1.val_date, t1.val_long, t1.mediapackage_id, t1.namespace, t1.property_name, t1.val_string FROM mh_assets_snapshot t0 LEFT OUTER JOIN mh_assets_properties t1 ON (((t1.mediapackage_id = t0.mediapackage_id) AND ((t1.namespace = 'org.opencastproject.scheduler') AND (t1.property_name = 'start'))) OR ((t1.mediapackage_id = t0.mediapackage_id) AND ((t1.namespace = 'org.opencastproject.scheduler') AND (t1.property_name = 'end')))) WHERE ((((t0.organization_id = 'mh_default_org') AND EXISTS (SELECT 1 FROM mh_assets_properties t2 WHERE ((t0.mediapackage_id = t2.mediapackage_id) AND (t2.namespace = 'org.opencastproject.scheduler'))) ) AND ((NOT EXISTS (SELECT 1 FROM mh_assets_properties t3 WHERE (((t0.mediapackage_id = t3.mediapackage_id) AND (t3.namespace = 'org.opencastproject.scheduler.trx')) AND (t3.property_name = 'transaction_id'))) AND (t0.version = (SELECT MAX(t4.version) FROM mh_assets_snapshot t4 WHERE (t4.mediapackage_id = t0.mediapackage_id)))) OR (EXISTS (SELECT 1 FROM mh_assets_properties t5 WHERE (((t0.mediapackage_id = t5.mediapackage_id) AND (t5.namespace = 'org.opencastproject.scheduler.trx')) AND (t5.property_name = 'transaction_id'))) AND EXISTS (SELECT 1 FROM mh_assets_properties t6 WHERE ((((t0.mediapackage_id = t6.mediapackage_id) AND (t6.namespace = 'org.opencastproject.scheduler')) AND (t6.property_name = 'version')) AND (t0.version = t6.val_long))) ))) AND EXISTS (SELECT 1 FROM mh_assets_properties t7 WHERE ((((t0.mediapackage_id = t7.mediapackage_id) AND (t7.namespace = 'org.opencastproject.scheduler')) AND (t7.property_name = 'agent')) AND (t7.val_string = 'm320'))) );

The runtime of this query (and similar queries) can be dramatically improved by including:

t0.owner='org.opencastproject.scheduler'

in the WHERE condition to reduce the number of rows in mh_assets_snapshot that must be examined to identify the result set. This is significant for databases with a large number of processed recordings (there will be no practical difference in a database with no processed recordings, i.e. only scheduled recordings).

This requires changes to the AssetManager query API.

Activity

Show:
Stephen Marquard
January 18, 2018, 6:32 PM

Incorporated in PR for

Assignee

Unassigned

Reporter

Stephen Marquard

Tags (folksonomy)

None

Components

Fix versions

Affects versions

Priority

Critical
Configure