Soft-assignment random-forest with an application to discriminative representation of human actions in videos

article
The bag-of-features model is a distinctive and robust approach to detecting human actions in videos. The discriminative power of this model relies heavily on the quantization of the video features into visual words, because the quantization determines how well the visual words describe the human action. Random forests have proven to transform the features into distinctive visual words efficiently. A major disadvantage of the random forest is that it makes binary decisions on the feature values and thus does not take uncertainty in those values into account. We propose a soft-assignment random forest, a generalization of the random forest in which the binary decisions inside the tree nodes are replaced by a sigmoid function. The slope of the sigmoid models the degree of uncertainty about a feature's value. The results demonstrate that the soft-assignment random forest significantly improves action detection accuracy compared to the original random forest. Human actions that are hard to detect, because they involve interactions with or manipulations of some (typically small) item, improve consistently. The most prominent improvements are reported for a person handing, throwing, dropping, hauling, taking, closing or opening some item. Using the soft-assignment random forest, we improve on the state of the art on the IXMAS and UT-Interaction datasets. © 2013 World Scientific Publishing Company.
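The idea of replacing a node's binary decision by a sigmoid can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact parameterization, so the `Node` class, the `soft_leaf_weights` function, the single global `slope` parameter and the toy tree are all assumptions made for illustration. Each feature vector is spread over all leaves (visual words) with weights that sum to one, instead of landing in exactly one leaf.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class Node:
    # Hypothetical tree node: either an internal split or a leaf (visual word).
    def __init__(self, feature=None, threshold=None, left=None, right=None, leaf_id=None):
        self.feature = feature
        self.threshold = threshold
        self.left = left
        self.right = right
        self.leaf_id = leaf_id


def soft_leaf_weights(node, x, slope, weight=1.0, out=None):
    # Returns {leaf_id: weight}. A hard tree would send x to one child;
    # here both children receive a share given by the sigmoid.
    if out is None:
        out = {}
    if node.leaf_id is not None:
        out[node.leaf_id] = out.get(node.leaf_id, 0.0) + weight
        return out
    # Assumed form: steeper slope means less uncertainty about the feature value.
    p_right = sigmoid(slope * (x[node.feature] - node.threshold))
    soft_leaf_weights(node.left, x, slope, weight * (1.0 - p_right), out)
    soft_leaf_weights(node.right, x, slope, weight * p_right, out)
    return out


# Toy tree with three leaves, used only for illustration.
tree = Node(
    feature=0, threshold=0.5,
    left=Node(leaf_id=0),
    right=Node(feature=1, threshold=0.0, left=Node(leaf_id=1), right=Node(leaf_id=2)),
)

weights = soft_leaf_weights(tree, [0.5, 1.0], slope=2.0)
print(weights)
```

In this sketch a very large slope recovers the behaviour of an ordinary hard-decision tree, while a small slope spreads a feature that lies close to a threshold over both branches. Summing the leaf weights over all features of a video would give a soft bag-of-features histogram.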
TNO Identifier
477618
Source
International Journal of Pattern Recognition and Artificial Intelligence, 27(4)
Article nr.
1350009