Automatic Human Action Recognition in a Scene from Visual Inputs

conference paper
Surveillance is normally performed by humans, since it requires visual intelligence. However, it can be dangerous, especially in military operations. Unmanned visual-intelligence systems are therefore desirable. In this paper, we present a novel system that recognizes human actions. Central to the system is a breakdown of high-level perceptual concepts (verbs) into simpler observable events. The system is trained on 3482 videos and evaluated on 2589 videos from DARPA; for each video, human annotations indicate the presence or absence of 48 verbs. The results show that the system performs well, approaching the average human response.
TNO Identifier
455442
Publisher
SPIE
Article nr.
83880L
Source title
Unattended Ground, Sea, and Air Sensor Technologies and Applications XIV, 23 April 2012, Baltimore, Maryland, USA
Place of publication
Bellingham, WA