Ingest Service should only trigger a series metadata update when the metadata has actually changed

Steps to reproduce:
1. Make sure the property "org.opencastproject.series.overwrite" in the file "org.opencastproject.ingest.impl.IngestServiceImpl.cfg" is set to "true"
2. Ingest several mediapackages that belong to the same series, each containing a series metadata catalog with exactly the same metadata. Make sure the workflow that runs after ingestion archives the mediapackage.

Actual Results:
Each time a new mediapackage is ingested, every archived mediapackage belonging to the same series is archived again, in theory to store the updated series metadata. This happens even though the series metadata is exactly the same and does not need to be updated at all.

Expected Results:
Before triggering a potentially expensive update (a series may have a large number of mediapackages, all of which are re-archived), the system should verify that the ingested series metadata catalog actually differs from the current series metadata.
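
The check described above could be sketched roughly as follows. This is only an illustration, not the Ingest Service code: the class `SeriesMetadataCheck`, its methods, and the representation of a catalog as a map from field name to values are all assumptions made for the example. The comparison treats an added field, a changed value, and a removed field as a change.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Opencast. A catalog is modelled here as a
// map from field name to its list of values (an assumption for illustration).
public class SeriesMetadataCheck {

  /**
   * Returns true when the ingested catalog differs from the stored one.
   * Map.equals compares both the key sets and the values, so an added field,
   * a changed value, and a removed field all count as a change.
   * Note that List.equals is order-sensitive.
   */
  public static boolean metadataChanged(Map<String, List<String>> stored,
                                        Map<String, List<String>> ingested) {
    return !stored.equals(ingested);
  }

  /**
   * Hypothetical guard: the expensive update is only worth triggering when
   * overwriting is enabled AND the metadata really changed.
   */
  public static boolean shouldUpdateSeries(boolean overwriteEnabled,
                                           Map<String, List<String>> stored,
                                           Map<String, List<String>> ingested) {
    return overwriteEnabled && metadataChanged(stored, ingested);
  }

  public static void main(String[] args) {
    Map<String, List<String>> stored = Map.of(
        "title", List.of("Physics 101"),
        "creator", List.of("A. Lecturer"));
    Map<String, List<String>> identical = Map.of(
        "title", List.of("Physics 101"),
        "creator", List.of("A. Lecturer"));
    Map<String, List<String>> fieldRemoved = Map.of(
        "title", List.of("Physics 101"));

    // Same metadata: no update needed even with overwrite enabled.
    System.out.println(shouldUpdateSeries(true, stored, identical));
    // A field was removed: this is a real change.
    System.out.println(shouldUpdateSeries(true, stored, fieldRemoved));
  }
}
```

With a guard like this in place, re-ingesting an identical catalog would skip the bulk re-archival, while any real difference, including a removed field, would still trigger it.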

Workaround (if any):
Either remove the series metadata catalog from the mediapackage or set the "org.opencastproject.series.overwrite" property to "false".


Rubén Pérez
July 11, 2017, 12:55 PM

Copied from the GitHub ticket:

Karen said:
> I can't recall if this is merged, but if you want to ensure that a series ingested through the Ingest Service (as opposed to a series updated via the series update REST endpoint, which is a totally different approach) updates the series metadata, be careful to compare for values that are removed from a catalog, not just values that are added to it. [EDIT] This comment relates to merging inbound field value updates versus directly overwriting the entire existing catalog with an inbound one (which is the style of the current Ingest Service).

Rubén said:
> @karendolan The thing is, looking at the code (that is, connecting a debugger, adding a breakpoint in the "addInternal" method of the archive service, and then looking at the stack to see exactly where the call originated), the update is triggered after a certain type of message is sent through ActiveMQ, and the handler does not check at all whether the metadata has changed. It just checks the property org.opencastproject.series.overwrite and triggers the bulk archival without further ado.
> But of course, what you mention is absolutely correct. An update is not only a change to an existing metadata value or the addition of a new parameter, but also the removal of one (or several!).

Rubén Pérez
July 3, 2017, 4:19 PM

I did not. I mostly confirmed that ingesting a catalog with exactly the same metadata causes the other MPs to be updated. What you describe may well happen, but I would consider that a bug that should also be addressed.

Former user
July 3, 2017, 4:17 PM

Did you test removing the series catalog from the ingested mediapackage but leaving the isPartOf/series metadata? I would think that this should create a new, mostly empty series catalog that overwrites any existing series catalog with the same series ID.




Rubén Pérez


Incorrectly Functioning With Workaround