Textanalysis shouldn't create tons of copies of the source video

Steps to reproduce

Background:
The textanalysis workflow operation extracts an image from the source video for each video segment. Tesseract is then called to extract text from these image files, and so on.

Issue:
Opencast creates a separate image extraction job for each video segment. Each job calls workspace.get(URI..., unique: true), which creates a copy of the source video. If you run a textanalysis for 100 segments, you will end up copying the video file 100 times.

Solution:
Pass an array of timestamps to the image extraction job. This creates only one job, which extracts all image files at once, so the source video is copied only once.
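
To illustrate the difference in copy counts, here is a minimal sketch in Python. It is a toy model, not Opencast code: FakeWorkspace, extract_images, per_segment and batched are hypothetical names, and the only behaviour assumed from the description above is that a unique workspace.get produces a new copy of the file each time it is called.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FakeWorkspace:
    """Hypothetical stand-in for the workspace; it only counts copies."""
    copies: int = 0

    def get(self, uri: str, unique: bool = False) -> str:
        # Assumption: a unique request always yields a fresh copy of the file.
        if unique:
            self.copies += 1
            return f"{uri}.copy{self.copies}"
        return uri


def extract_images(workspace: FakeWorkspace, uri: str, timestamps: List[float]) -> List[str]:
    """Model of one image extraction job: one unique get, one image per timestamp."""
    local_file = workspace.get(uri, unique=True)
    return [f"{local_file}@{t:.1f}s.png" for t in timestamps]


def per_segment(uri: str, timestamps: List[float]) -> Tuple[int, List[str]]:
    """Current behaviour: one job, and therefore one copy, per segment."""
    workspace = FakeWorkspace()
    images: List[str] = []
    for t in timestamps:
        images.extend(extract_images(workspace, uri, [t]))
    return workspace.copies, images


def batched(uri: str, timestamps: List[float]) -> Tuple[int, List[str]]:
    """Proposed behaviour: one job receives all timestamps, so one copy in total."""
    workspace = FakeWorkspace()
    images = extract_images(workspace, uri, timestamps)
    return workspace.copies, images


if __name__ == "__main__":
    segment_starts = [float(i * 10) for i in range(100)]
    print(per_segment("source.mp4", segment_starts)[0])  # 100 copies
    print(batched("source.mp4", segment_starts)[0])      # 1 copy
```

Both variants produce the same number of images. Only the number of copies of the source file differs.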

Status

Assignee

Waldemar Smirnow

Reporter

Waldemar Smirnow

Severity

Incorrectly Functioning Without Workaround

Tags (folksonomy)

None

Components

Fix versions

Affects versions

6.0

Priority

Minor