Intermittent failure to detect hard links when starting a cluster
Steps to reproduce:
1. Start up a worker node with the wfr and file repo on the same NFS share
2. Examine logs for state of hard linking, e.g.
grep Hard /opt/matterhorn/logs/opencast.log
2015-02-22 10:43:50 INFO [FelixStartLevel] (WorkspaceImpl:175) - Hard links between the working file repository and the workspace enabled
Sometimes the worker node reports that hard links are not possible, e.g.:
2015-02-22 13:07:22 WARN [FelixStartLevel] (WorkspaceImpl:177) - Hard links between the working file repository and the workspace are not possible
Restarting the worker node (sometimes a few times) produces a successful detection.
Hard links should be detected consistently.
Workaround (if any):
Examine status on startup and restart if required.
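The startup check can be scripted rather than done by hand. Below is a minimal sketch that scans a log for the two messages quoted in this report and returns the most recent result. The class name HardLinkLogCheck and its Status enum are hypothetical and not part of Opencast, and the default log path is simply the one used in the grep example above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

/** Hypothetical helper (not part of Opencast): reports the last hard-link status found in a log. */
final class HardLinkLogCheck {

  enum Status { ENABLED, NOT_POSSIBLE, UNKNOWN }

  // Message fragments copied from the log lines quoted in this issue.
  private static final String ENABLED_MSG =
      "Hard links between the working file repository and the workspace enabled";
  private static final String NOT_POSSIBLE_MSG =
      "Hard links between the working file repository and the workspace are not possible";

  /** Returns the status of the last matching line, or UNKNOWN if no line matches. */
  static Status lastStatus(List<String> lines) {
    Status status = Status.UNKNOWN;
    for (String line : lines) {
      if (line.contains(ENABLED_MSG)) {
        status = Status.ENABLED;
      } else if (line.contains(NOT_POSSIBLE_MSG)) {
        status = Status.NOT_POSSIBLE;
      }
    }
    return status;
  }

  public static void main(String[] args) throws IOException {
    // Assumed default path, taken from the grep example in this report.
    String logFile = args.length > 0 ? args[0] : "/opt/matterhorn/logs/opencast.log";
    Status status = lastStatus(Files.readAllLines(Paths.get(logFile)));
    System.out.println("Hard link status: " + status);
    // A non-zero exit code lets a wrapper script decide whether to restart the node.
    System.exit(status == Status.ENABLED ? 0 : 1);
  }
}
```

A wrapper script could run this after startup and restart the node when the exit code is non-zero.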
2017-06-21 10:46:43,707 | DEBUG | FelixStartLevel | (FileSupport:374) - Creating hard link from /data/opencast/work/shared/files/.linktest to /data/opencast/work/shared/workspace/.linktest.1377639978094138114.tmp
2017-06-21 10:46:43,713 | INFO | FelixStartLevel | (WorkspaceImpl:223) - Hard links between the working file repository and the workspace enabled
Seen on our 2.3.x production system after a cluster restart:
2017-06-21 06:40:05,762 | DEBUG | FelixStartLevel | (FileSupport:374) - Creating hard link from /data/opencast/work/shared/files/.linktest to /data/opencast/work/shared/workspace/.linktest
2017-06-21 06:40:05,767 | DEBUG | FelixStartLevel | (FileSupport:379) - Unable to create a link from /data/opencast/work/shared/files/.linktest to /data/opencast/work/shared/workspace/.linktest: java.nio.file.FileAlreadyExistsException: /data/opencast/work/shared/workspace/.linktest -> /data/opencast/work/shared/files/.linktest
2017-06-21 06:40:05,767 | WARN | FelixStartLevel | (WorkspaceImpl:211) - Hard links between the working file repository and the workspace are not possible
2017-06-21 06:40:05,768 | WARN | FelixStartLevel | (WorkspaceImpl:212) - This will increase the overall amount of disk space used
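The FileAlreadyExistsException in these lines suggests that the fixed target name .linktest already existed in the shared workspace. One plausible cause is that another node was running the same check at the same time, or that an earlier run left the file behind. Below is a minimal sketch of a probe that uses a unique name for each attempt, much like the .linktest.1377639978094138114.tmp target in the 10:46:43 log lines. The class LinkProbe is hypothetical and is not Opencast's FileSupport.supportsLinking. Unlike the logged behaviour, it also gives the source file a unique name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

/** Hypothetical probe (not Opencast's FileSupport): tests hard linking between two directories. */
final class LinkProbe {

  /**
   * Returns true if a hard link to a file in sourceDir can be created in targetDir.
   * Both test files get unique names, so concurrent probes on a shared volume do not collide.
   */
  static boolean supportsLinking(Path sourceDir, Path targetDir) {
    Path source = null;
    Path link = null;
    try {
      // createTempFile picks a file name that does not exist yet.
      source = Files.createTempFile(sourceDir, ".linktest.", ".tmp");
      // Files.createLink fails if the link path already exists, so use a random name.
      link = targetDir.resolve(".linktest." + UUID.randomUUID() + ".tmp");
      Files.createLink(link, source);
      return true;
    } catch (IOException | UnsupportedOperationException | SecurityException e) {
      // Any failure is treated as "hard links not possible".
      return false;
    } finally {
      // Remove the test files whether or not the link was created.
      deleteQuietly(link);
      deleteQuietly(source);
    }
  }

  private static void deleteQuietly(Path path) {
    if (path == null) {
      return;
    }
    try {
      Files.deleteIfExists(path);
    } catch (IOException ignored) {
      // Best-effort cleanup only.
    }
  }

  public static void main(String[] args) {
    if (args.length != 2) {
      System.err.println("usage: LinkProbe <file-repo-dir> <workspace-dir>");
      return;
    }
    boolean supported = supportsLinking(Paths.get(args[0]), Paths.get(args[1]));
    System.out.println(supported ? "Hard links enabled" : "Hard links not possible");
  }
}
```

Because each attempt cleans up its own uniquely named files, a stale or concurrently created .linktest cannot cause a false negative in this sketch.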
Bulk resolving these all as out of date, and thus not going to be fixed. Please reopen and change affects version to appropriate 2.3.x+ versions if these are still bothering you.
Seems I can do everything on this JIRA except resolve as a non-issue.
This issue appeared as a regression in FileSupport.supportsLinking in a pull request that we had merged into our production branch.
I will comment on the pull request.