Spatio-Temporal Saliency for Action Similarity
conference paper
Human actions are spatio-temporal patterns. A popular representation describes the action by features at interest points. Because interest point detection and feature description are generic processes, they are not tuned to discriminate one particular action from another. In this paper we propose a saliency measure for each individual feature to improve its distinctiveness for a particular action. We propose a spatio-temporal saliency map, for a bag of features, that is specific to the current video and to the action of interest. The novelty is that the saliency map is derived directly from the SVM's support vectors. For the retrieval of 48 human actions from the visint.org database of 3,480 videos, we demonstrate a systematic improvement of 35.3% on average across all actions, with significant improvements for 25 actions. We find that the improvements are achieved in particular for complex human actions such as giving, receiving, burying and replacing an item. © 2013 IEEE.
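The abstract does not spell out how the saliency is computed from the support vectors, so the following is only a minimal sketch of one plausible reading, not the paper's method. It assumes a linear SVM over bag-of-features histograms, in which case the support vectors and their dual coefficients collapse into one weight per codeword, and each local feature can inherit the weight of its codeword as its saliency. The linear kernel, the clipping of negative weights to zero, and every function and variable name here are assumptions made for illustration.

```python
# Illustrative sketch only: the linear kernel, the clipping at zero, and all
# names below are assumptions, not details taken from the paper.

def codeword_weights(support_vectors, dual_coefs):
    """Per-codeword weight w_k = sum_i (alpha_i * y_i) * sv_i[k] for a linear SVM."""
    n_words = len(support_vectors[0])
    weights = [0.0] * n_words
    for sv, coef in zip(support_vectors, dual_coefs):
        for k in range(n_words):
            weights[k] += coef * sv[k]
    return weights


def feature_saliency(codeword_ids, weights):
    """Saliency of each local feature: the weight of its codeword, clipped at zero."""
    return [max(weights[k], 0.0) for k in codeword_ids]


def saliency_weighted_histogram(codeword_ids, saliency, n_words):
    """Bag-of-features histogram in which each feature votes with its saliency."""
    hist = [0.0] * n_words
    for k, s in zip(codeword_ids, saliency):
        hist[k] += s
    return hist


# Toy data: 3 codewords, 2 support vectors (one positive, one negative example).
support_vectors = [[2.0, 0.0, 1.0], [0.0, 2.0, 1.0]]
dual_coefs = [0.5, -0.5]  # alpha_i * y_i for each support vector
weights = codeword_weights(support_vectors, dual_coefs)

# Interest points of one video as (x, y, t, codeword_id).
points = [(10, 20, 0, 0), (12, 22, 1, 1), (30, 40, 2, 0), (5, 5, 3, 2)]
ids = [p[3] for p in points]
saliency = feature_saliency(ids, weights)

# Sparse spatio-temporal saliency map: (x, y, t) -> saliency of the feature there.
saliency_map = {p[:3]: s for p, s in zip(points, saliency)}
hist = saliency_weighted_histogram(ids, saliency, n_words=3)
```

In this toy case the two features assigned to codeword 0 are the only ones with non-zero saliency, so they dominate the re-weighted histogram. A non-linear kernel would need a different decomposition of the decision function into per-codeword contributions.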
TNO Identifier
481444
Publisher
IEEE
Source title
2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2013, 23 - 28 June 2013, Portland, OR, USA
Place of publication
Piscataway, NJ
Pages
257-262