OptimisticLockException on worker node can cause jobs to be stuck in DISPATCHING state
Steps to reproduce
1. Run an Opencast cluster with significant load
2. Observe in Jobs table that some jobs remain in DISPATCHING state
3. Observe in Servers table that some servers have jobs queued but aren't running any jobs
Workaround: place the affected worker node in maintenance mode, then restart it.
This only seems to affect Inspect jobs (not sure why).
@greg_logan, we appear to be hitting this issue, but in our case the OptimisticLockException is wrapped in a RollbackException, so the existing catch clause did not catch it. I'm going to open a pull request referencing this ticket that adds RollbackException to the catch.
From our logs, we can see that the job completed one second before this error.
JobBarrier.suspendWaiterJob - Unable to put Some(240757) into a waiting state, this may cause a deadlock: javax.persistence.RollbackException: javax.persistence.OptimisticLockException: Exception [EclipseLink-5006] (Eclipse Persistence Services - 2.6.4.v20160829-44060b6): org.eclipse.persistence.exceptions.OptimisticLockException
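One way to handle the wrapping seen in the log above is to walk the cause chain instead of matching only the outermost exception type. This is a minimal sketch, not the actual Opencast change: the two nested exception classes are stand-ins for javax.persistence.RollbackException and javax.persistence.OptimisticLockException so that it compiles without a JPA dependency, and the class and method names are hypothetical.

```java
public class OptimisticLockCauseCheck {

    // Stand-in for javax.persistence.OptimisticLockException (assumption for this sketch).
    static class OptimisticLockException extends RuntimeException {
        OptimisticLockException(String message) {
            super(message);
        }
    }

    // Stand-in for javax.persistence.RollbackException (assumption for this sketch).
    static class RollbackException extends RuntimeException {
        RollbackException(Throwable cause) {
            super(cause);
        }
    }

    // Returns true if the throwable, or any exception in its cause chain,
    // is an OptimisticLockException. The depth limit guards against cyclic chains.
    static boolean causedByOptimisticLock(Throwable t) {
        int depth = 0;
        for (Throwable cur = t; cur != null && depth < 32; cur = cur.getCause(), depth++) {
            if (cur instanceof OptimisticLockException) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Throwable direct = new OptimisticLockException("version mismatch");
        Throwable wrapped = new RollbackException(direct);
        Throwable unrelated = new IllegalStateException("something else");

        System.out.println("direct:    " + causedByOptimisticLock(direct));
        System.out.println("wrapped:   " + causedByOptimisticLock(wrapped));
        System.out.println("unrelated: " + causedByOptimisticLock(unrelated));
    }
}
```

With a check like this, a single catch of the broader runtime exception type could treat both the direct and the wrapped case as a retryable optimistic lock failure, and rethrow anything else.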