DistributionMigrationService migrates files inefficiently

Steps to reproduce:

Run the opencast-migration feature

Actual Results:

To add top-level organization (tenant) folders to the distribution storage, the DistributionMigrationService renames each file in the mediapackage individually.

This is extremely slow: we estimate a runtime of 19 hours on a NetApp NFS share for our production set of 19000+ mediapackages.

Expected Results:

The DistributionMigrationService should operate at the level of the mediapackage and move the whole folder structure in one filesystem operation, e.g.

move
distribution/downloads/engage-player/76ee35ee-0f56-462a-8979-023389974f39
to
distribution/downloads/mh_default_org/engage-player/76ee35ee-0f56-462a-8979-023389974f39

A mediapackage can easily contain 50 items, so this would reduce the number of I/O operations by an order of magnitude.
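A minimal sketch of the per-mediapackage move, not the actual service code: the class name MediaPackageMover and the method moveMediaPackage are hypothetical, and it assumes the source and target are on the same filesystem so that the directory move is a single rename. ATOMIC_MOVE is requested so that the call fails instead of silently falling back to a file-by-file copy.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper, for illustration only. */
public class MediaPackageMover {

  /**
   * Moves downloadsRoot/channel/mediaPackageId
   * to    downloadsRoot/orgId/channel/mediaPackageId
   * as one directory rename instead of one rename per file.
   */
  public static Path moveMediaPackage(Path downloadsRoot, String orgId,
          String channel, String mediaPackageId) throws IOException {
    Path source = downloadsRoot.resolve(channel).resolve(mediaPackageId);
    Path targetParent = downloadsRoot.resolve(orgId).resolve(channel);

    // The organization/channel folders may not exist yet.
    Files.createDirectories(targetParent);

    // Assumption: same filesystem. Otherwise this throws
    // AtomicMoveNotSupportedException rather than copying file by file.
    return Files.move(source, targetParent.resolve(mediaPackageId),
            StandardCopyOption.ATOMIC_MOVE);
  }
}
```

The URLs stored in each mediapackage would still need to be updated separately; this sketch only covers the filesystem side.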

Also, if there is only one organization (the default organization), the process can be reduced to a few move operations, e.g. move

distribution/downloads/engage-player
to
distribution/downloads/mh_default_org/engage-player/
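The single-organization case could be sketched as below. This is an illustration under stated assumptions, not the service's implementation: the class SingleOrgMigration and the method moveChannels are hypothetical, every directory directly under the downloads root other than the organization folder is assumed to be a channel folder, and everything is assumed to live on one filesystem.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, for illustration only. */
public class SingleOrgMigration {

  /**
   * Moves every channel folder under downloadsRoot into downloadsRoot/orgId
   * and returns the number of folders moved (one rename per channel).
   */
  public static int moveChannels(Path downloadsRoot, String orgId) throws IOException {
    Path orgDir = downloadsRoot.resolve(orgId);
    Files.createDirectories(orgDir);

    // Collect first, so the directory is not modified while it is being listed.
    List<Path> channels = new ArrayList<>();
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(downloadsRoot)) {
      for (Path entry : entries) {
        boolean isOrgDir = entry.getFileName().toString().equals(orgId);
        // Assumption: any other directory at this level is a channel folder.
        if (Files.isDirectory(entry) && !isOrgDir) {
          channels.add(entry);
        }
      }
    }

    for (Path channel : channels) {
      Files.move(channel, orgDir.resolve(channel.getFileName()),
              StandardCopyOption.ATOMIC_MOVE);
    }
    return channels.size();
  }
}
```

Because the organization folder is skipped, running the method a second time moves nothing.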

Workaround (if any):

Activity

Stephen Marquard
July 9, 2017, 5:10 PM

UCT doesn't have an interest in this any longer, and I'm not aware of anyone else working on it.

Stephen Marquard
February 14, 2017, 7:17 PM

We ended up rewriting this significantly to do a single pass through the mediapackages adjusting both download and streaming URLs (if present). This code is only usable as-is if you have a single organization, mh_default_org, and have adjusted the file paths manually at the top of the folder structure as suggested above.

https://bitbucket.org/cilt/opencast/src/3d834e87bb0ef749922bd6e6fe8aae1b7962df08/modules/matterhorn-migration/src/main/java/org/opencastproject/migration/DistributionMigrationService.java?fileviewer=file-view-default

We won't be submitting this as a PR as it was a once-off exercise and not written to be generalizable (for multiple organizations, etc.).

Won't Fix

Assignee

Unassigned

Reporter

Stephen Marquard

Severity

Performance