Opencast needs to better distribute load across the available nodes


Right now we limit the number of active jobs on a given node to the number of processor cores available, a 1-to-1 relationship (unless set otherwise by the user). This means that if you have a worker node with 32 processors, Opencast will happily dispatch up to 32 encoding jobs to that node at a time. This is unlikely to be optimal given the current multithreaded capabilities of ffmpeg, where a single encode can already keep several cores busy. Likewise, a single inspect job is counted as occupying an entire core of the worker node, when in reality a number of inspect jobs could run on the same core simultaneously - dispatching 32 inspect jobs to a single-core machine would be heavy, but not nearly as node-breaking as 32 encodes!

What we need to do is decouple the number of jobs from the load those jobs impose. The load would be stored as a new field on the job itself, and would represent (more or less) how much load the job imposes on the processing host. It would be a floating point value, which would allow low-load jobs (like inspect or distribute) to impose nearly no load (i.e., less than 1.0) and high-load jobs to impose multiples of the current weight of 1.0 per job (e.g., an encode could take 4.0). The load for most jobs would be configured in new service configuration files under etc/services, with the load for the encoding profiles and the execute service jobs being specified in the relevant configuration file(s).
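As a rough illustration of the idea, here is a minimal sketch in Python (not Opencast code). The names `JOB_LOADS`, `Node`, `try_dispatch` and `fill` are hypothetical, and the load values are made up for the example; the real values would come from the configuration files described above. It assumes a node's maximum load defaults to its core count and that a job is accepted only if it fits in the remaining load.

```python
from dataclasses import dataclass, field

# Hypothetical per-job-type loads, chosen only for illustration.
JOB_LOADS = {"inspect": 0.25, "distribute": 0.25, "encode": 4.0}


@dataclass
class Node:
    name: str
    max_load: float  # assumed to default to the number of cores
    running: list = field(default_factory=list)

    def current_load(self) -> float:
        # Sum the load of every job currently running on this node.
        return sum(JOB_LOADS[job_type] for job_type in self.running)

    def try_dispatch(self, job_type: str) -> bool:
        # Accept the job only if its load fits in what is left.
        if self.current_load() + JOB_LOADS[job_type] <= self.max_load:
            self.running.append(job_type)
            return True
        return False


def fill(node: Node, job_type: str) -> int:
    """Dispatch jobs of one type until the node refuses; return the count."""
    count = 0
    while node.try_dispatch(job_type):
        count += 1
    return count


big_worker = Node("worker-32", max_load=32.0)
small_worker = Node("worker-1", max_load=1.0)
encodes_on_big = fill(big_worker, "encode")
inspects_on_small = fill(small_worker, "inspect")
```

With these assumed values the 32-core node stops accepting encodes well before 32 of them are running, while the single-core node can still run several inspect jobs side by side.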

Since these configuration items are not necessarily global, this can lead to interesting scenarios. Consider a cluster with two workers: both are general purpose workers, but the second has hardware encoding assist. If you want to do all of the encoding on the hardware-assisted host, you would set the load value for the encoding profiles on the first host to exceed its maximum number of available cores. Both workers could then do most tasks, but the worker lacking hardware assist would reject encoding jobs as too costly, leaving them to queue on the other worker.
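The two-worker scenario can be sketched the same way. This is again a hypothetical Python illustration, not Opencast code: the `Worker` class, the `eligible_workers` helper, the host names, the core counts and the load values are all assumptions made for the example. Each worker carries its own load configuration, and the worker without hardware assist has an encode load set above its core count.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    cores: int
    job_loads: dict  # this host's own configured loads (hypothetical values)
    current_load: float = 0.0

    def accepts(self, job_type: str) -> bool:
        # A job is rejected when its configured load does not fit on this host.
        return self.current_load + self.job_loads[job_type] <= self.cores


def eligible_workers(workers, job_type):
    """Return the names of the workers that would accept the job."""
    return [w.name for w in workers if w.accepts(job_type)]


# worker1 has no hardware assist: its encode load exceeds its 8 cores.
general = Worker("worker1", cores=8, job_loads={"encode": 9.0, "inspect": 0.25})
# worker2 has hardware assist: encodes are cheap enough to run there.
assisted = Worker("worker2", cores=8, job_loads={"encode": 2.0, "inspect": 0.25})

cluster = [general, assisted]
encode_hosts = eligible_workers(cluster, "encode")
inspect_hosts = eligible_workers(cluster, "inspect")
```

Under these assumptions only `worker2` is eligible for encodes, while both workers remain eligible for inspect jobs, which is the behaviour described above.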

Fixed and reviewed


Greg Logan


