MH-12267

As a user, I want to be able to do my work when I want without downtime, so I am not inconvenienced.

      Description

      This ticket is related to the OC Dev list conversation titled "Editing location". It is about updating an event's bibliographic metadata while a long-running workflow is in progress.

      My Take:
      1. Use Elasticsearch as the central, shared source of truth.
      2. The running nodes elect a job-lock token node from amongst themselves with a failover algorithm. Only the current "master" job locker can hand out the tokens that allow a job to be dispatched (see the sketch after this list).
      3. UI components can be used on any node. The shared Elasticsearch index makes it possible to alert a user on one node when a user on another node is editing the same mediapackage at the same time.
      4. Configuration determines whether both users may start editing workflows to process their edits on the same mediapackage; the current job-lock token node allows only one of those workflows to run at a time.
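
      A minimal sketch of how the elected "master" job locker could work, assuming a lease row in the
      shared database acts as the arbiter (the table, column names, and class below are illustrative
      only, not existing Opencast code):

```java
// Hypothetical sketch: lease-based election of the single job-dispatching
// ("job lock token") node, using the shared database as the arbiter.
// Assumed table (not part of Opencast):
//   CREATE TABLE dispatcher_lease (id INT PRIMARY KEY, holder VARCHAR(255), expires_at TIMESTAMP);
// with one pre-inserted row: (1, NULL, now()).
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.Instant;

public class DispatcherLease {

  private static final int LEASE_SECONDS = 30;

  private final String nodeId;

  public DispatcherLease(String nodeId) {
    this.nodeId = nodeId;
  }

  /**
   * Try to take or renew the lease. The UPDATE succeeds only when the lease is
   * free, expired, or already held by this node, so at most one node holds it
   * at any time; the others keep polling and take over after a failure.
   */
  public boolean tryAcquire(Connection db) throws SQLException {
    String sql =
        "UPDATE dispatcher_lease SET holder = ?, expires_at = ? "
      + "WHERE id = 1 AND (holder IS NULL OR holder = ? OR expires_at < ?)";
    Timestamp now = Timestamp.from(Instant.now());
    Timestamp expiry = Timestamp.from(Instant.now().plusSeconds(LEASE_SECONDS));
    try (PreparedStatement ps = db.prepareStatement(sql)) {
      ps.setString(1, nodeId);
      ps.setTimestamp(2, expiry);
      ps.setString(3, nodeId);
      ps.setTimestamp(4, now);
      return ps.executeUpdate() == 1; // true => this node is the current job locker
    }
  }
}
```

      Whichever node wins the UPDATE keeps renewing its lease and is the only one allowed to hand
      out dispatch tokens; if it dies, the lease expires and the next node to poll takes over,
      which gives the failover behaviour described in point 2.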

      Summary of user stories from the email chain:

      [James Perrin] It's not just about locking down the UI, our Admin tends to be subjected
       to a lot more updates and configuration changes than other nodes,
       but because of editing it becomes an "end-user service", which is something we wish to avoid.
      
      [kdolan]  A remote endpoint is not useful if its impl is frequently down.
       It sounds like you need the remote editor service to view a mediapackage
       and its resources and save a smil file while the admin is down. Then, when
       the admin starts back up, it starts pending processing workflows.
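
      A minimal sketch of that catch-up step, assuming the remote editor writes SMIL files into a
      shared "pending edits" directory on the common filesystem (the path, file naming, and
      startProcessingWorkflow() below are placeholders, not existing Opencast APIs):

```java
// Hypothetical sketch: on admin start-up, scan a shared "pending edits"
// directory for SMIL files saved by the remote editor while the admin was
// down, and kick off a processing workflow for each one.
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PendingEditCatchUp {

  public static void main(String[] args) throws IOException {
    Path pending = Paths.get("/shared/pending-edits"); // assumed shared directory
    try (DirectoryStream<Path> smils = Files.newDirectoryStream(pending, "*.smil")) {
      for (Path smil : smils) {
        // File name assumed to carry the mediapackage id: <mediaPackageId>.smil
        String mediaPackageId = smil.getFileName().toString().replace(".smil", "");
        startProcessingWorkflow(mediaPackageId, smil);
        Files.delete(smil); // processed: drop it from the pending queue
      }
    }
  }

  /** Placeholder for submitting the pending processing workflow. */
  static void startProcessingWorkflow(String mediaPackageId, Path smil) {
    System.out.printf("Starting workflow for %s using %s%n", mediaPackageId, smil);
  }
}
```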
      
      [James Perrin] I was thinking of a very cut down remote for just our purposes,
       but as you say everything still eventually relies on Admin for so/too much
      
      [Paul Pettit] We allow self-service editing via LTI, which means exposing users to the
       admin UI. Even when heavily cut down, this is not a great user experience, especially
       as logging in to the admin UI via LTI does not log you in to the engage UI, as it is
       on a different server (unless I am doing something very wrong?)
      
      [Stephen Marquard] ... self-service editing is going to be an increasingly important
       use-case, and some LTI use-cases only really work at the moment if you run
       admin+presentation in a single node.
      
      Ideally we should allow multiple admin (job dispatching) nodes as well, or at least a failover
       model for that. Well, you can right now enable job dispatching on more than
       one node, but the locking is not great.
      
      [Stephen Marquard] This is maybe an Opencast 4.x or 5.x goal.
      
      I think the first step would be to ditch Solr and use Elasticsearch for
       everything (the latest version, MH-12165), then test a configuration
       using Elasticsearch's own clustering.
      
      Then there's no reason to have a "special" admin node because the db,
       filesystem and elasticsearch are all shared resources. The running nodes
       could elect a job dispatching node from amongst themselves with a failover algorithm.
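
      A minimal sketch of the shared-index wiring, assuming every node is configured with the same
      Elasticsearch cluster endpoints (the host names are placeholders; the low-level Elasticsearch
      REST client is used here only to illustrate the idea):

```java
// Hypothetical sketch: all nodes talk to one shared Elasticsearch cluster
// instead of per-node Solr indexes, so no node is "special" for the index.
import org.apache.http.HttpHost;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

public class SharedIndexClient {

  public static void main(String[] args) throws Exception {
    // Every Opencast node gets the same endpoint list; Elasticsearch's own
    // clustering handles replication and failover behind these hosts.
    try (RestClient client = RestClient.builder(
        new HttpHost("es-node-1.example.org", 9200, "http"),
        new HttpHost("es-node-2.example.org", 9200, "http")).build()) {
      Response health = client.performRequest(new Request("GET", "/_cluster/health"));
      System.out.println(health.getStatusLine());
    }
  }
}
```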
      
      The clustered UI nodes could all live under a single domain with load-balancing,
       and the worker and ingest nodes can continue to have their own URLs.
      
      I would say also the goal for config changes would be to make them all work
       live (the same way workflows and encoding profiles are reloaded dynamically),
       so you'd hardly ever need to restart a node, and if you wanted you could do
       rolling updates for UI changes by taking some nodes out of the load balancer,
       updating them and putting them back in.
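
      A minimal sketch of such a live reload, assuming a plain file watch on the node's
      configuration directory (the directory path and reload() below are placeholders):

```java
// Hypothetical sketch: watch the config directory and re-read a file when it
// changes, instead of requiring a node restart.
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

public class LiveConfigReload {

  public static void main(String[] args) throws IOException, InterruptedException {
    Path confDir = Paths.get("etc"); // assumed configuration directory
    try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
      confDir.register(watcher, StandardWatchEventKinds.ENTRY_MODIFY);
      while (true) {
        WatchKey key = watcher.take(); // blocks until something changes
        for (WatchEvent<?> event : key.pollEvents()) {
          reload(confDir.resolve((Path) event.context()));
        }
        key.reset();
      }
    }
  }

  /** Placeholder for re-applying the changed configuration without a restart. */
  static void reload(Path changedFile) {
    System.out.println("Reloading " + changedFile);
  }
}
```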
      

              People

              • Assignee: Karen Dolan
              • Reporter: Karen Dolan (Inactive)
