Text extraction does not respect dictionary

Steps to reproduce:
1. Run text extraction on a mediapackage with slides
2. Open the mediapackage in engage: some text is recognized and visible, along with lots of garbage

Actual Results:
Everything the text extraction produces makes it into the text segment, including garbage that is not text.

Expected Results:
Non-words (strings that are not found in the dictionary) should be left out.
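
To illustrate the expected behavior, here is a minimal sketch of a dictionary filter in Python. It is not the project's actual implementation: the function name `filter_ocr_text`, the whitespace tokenization, and the in-memory word set are all assumptions made for this example.

```python
import re


def filter_ocr_text(ocr_text, dictionary):
    """Return the tokens of ocr_text whose normalized form is in dictionary.

    Hypothetical helper: the name, the tokenization and the dictionary
    format are assumptions for illustration only.
    """
    known = {word.lower() for word in dictionary}
    kept = []
    for token in ocr_text.split():
        # Strip leading/trailing non-word characters so "slides," matches "slides".
        normalized = re.sub(r"^\W+|\W+$", "", token).lower()
        if normalized and normalized in known:
            kept.append(normalized)
    return kept


words = {"lecture", "slides", "intro"}
print(filter_ocr_text("Lecture |~_ slides, x#q!z intro", words))
# prints ['lecture', 'slides', 'intro']
```

With a filter like this, garbage tokens such as `|~_` or `x#q!z` would be dropped before the text is attached to a segment, while recognized dictionary words are kept.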

Status:
Assignee: Tobias Wunden
Reporter: Tobias Wunden
Severity: Data Loss/Corruption
Tags (folksonomy): None
Components:
Fix versions:
Affects versions:
Priority: Critical