Search results (1 - 20 of 23)

document
Burghouts, G.J. (author), Schutte, K. (author)
We investigate how human action recognition can be improved by considering the spatio-temporal layout of actions. From the literature, we adopt a pipeline consisting of STIP features, a random forest to quantize the features into histograms, and an SVM classifier. Our goal is to detect 48 human actions, ranging from simple actions such as walk to...
article 2013
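For readers unfamiliar with this type of pipeline, the sketch below illustrates the general idea described in the abstract above: STIP descriptors are quantized by a random forest into per-video histograms, which then feed an SVM. It runs on synthetic data with scikit-learn defaults; the forest size, histogram normalization and SVM settings are illustrative assumptions, not the configuration reported in the paper.

```python
# Minimal sketch of an STIP -> random-forest quantization -> SVM pipeline.
# Assumes STIP descriptors are already extracted per video (here: random data).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_videos, feat_dim, n_actions = 40, 162, 4          # 162 = typical HOG/HOF STIP length
descriptors = [rng.normal(size=(rng.integers(50, 200), feat_dim)) for _ in range(n_videos)]
video_labels = rng.integers(0, n_actions, size=n_videos)

# 1) Train a small random forest on individual descriptors, each labelled with
#    the action of the video it comes from (a common weak-label shortcut).
X_desc = np.vstack(descriptors)
y_desc = np.repeat(video_labels, [len(d) for d in descriptors])
forest = RandomForestClassifier(n_estimators=10, max_depth=8, random_state=0).fit(X_desc, y_desc)

# 2) Quantize: each descriptor is mapped to the leaves it reaches; per video,
#    count leaf occurrences to form a histogram (the leaves act as visual words).
leaf_ids = forest.apply(X_desc)                      # shape (n_descriptors, n_trees)
n_leaves = leaf_ids.max() + 1

def histogram(leaves):
    h = np.zeros(forest.n_estimators * n_leaves)
    for t in range(forest.n_estimators):
        np.add.at(h, t * n_leaves + leaves[:, t], 1)
    return h / max(len(leaves), 1)

offsets = np.cumsum([0] + [len(d) for d in descriptors])
H = np.stack([histogram(leaf_ids[offsets[i]:offsets[i + 1]]) for i in range(n_videos)])

# 3) Train an SVM on the per-video histograms.
svm = LinearSVC(C=1.0).fit(H, video_labels)
print("training accuracy:", svm.score(H, video_labels))
```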
document
Lefter, I. (author), Burghouts, G.J. (author), Rothkrantz, L.J.M. (author)
Stressful situations are likely to occur at human-operated service desks, as well as at human-computer interfaces used in the public domain. Automatic surveillance can help by notifying when extra assistance is needed. Human communication is inherently multimodal, e.g. speech, gestures and facial expressions. It is expected that automatic surveillance...
article 2014
document
Burghouts, G.J. (author), Marck, J.W. (author)
We propose a mechanism to assess threats that are based on observables. Observables are properties of persons, i.e., their behavior and interaction with other persons and objects. We consider observables that can be extracted from sensor signals and intelligence. In this paper, we discuss situation assessment that is based on observables for...
article 2011
document
Burghouts, G.J. (author)
The bag-of-features model is a distinctive and robust approach to detect human actions in videos. The discriminative power of this model relies heavily on the quantization of the video features into visual words. The quantization determines how well the visual words describe the human action. Random forests have proven to efficiently transform...
article 2013
document
Lefter, I. (author), Rothkrantz, L.J.M. (author), Burghouts, G.J. (author)
Multimodal fusion is a complex topic. For surveillance applications, audio-visual fusion is very promising given the complementary nature of the two streams. However, drawing the correct conclusion from multi-sensor data is not straightforward. In previous work we have analysed a database with audio-visual recordings of unwanted behavior in...
article 2013
document
Burghouts, G.J. (author), Schutte, K. (author), Bouma, H. (author), den Hollander, R.J.M. (author)
In this paper, a system is presented that can detect 48 human actions in realistic videos, ranging from simple actions such as ‘walk’ to complex actions such as ‘exchange’. We propose a method that yields a major improvement in performance. This improvement stems from a different approach to three themes: sample selection...
article 2013
document
Sanromà, G. (author), Patino, L. (author), Burghouts, G.J. (author), Schutte, K. (author), Ferryman, J. (author)
We present a method for the recognition of complex actions. Our method combines automatic learning of simple actions and manual definition of complex actions in a single grammar. Contrary to the general trend in complex action recognition, which divides recognition into two stages, our method performs recognition of simple and...
article 2014
document
Burghouts, G.J. (author), Schutte, K. (author)
Many human actions are correlated, because of compound and/or sequential actions, and similarity. Indeed, human actions are highly correlated in human annotations of 48 actions in the 4,774 videos from visint.org. We exploit such correlations to improve the detection of these 48 human actions, ranging from simple actions such as walk to complex...
conference paper 2012
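The snippet below illustrates one simple way such correlations could be exploited: per-video detector scores are re-scored with an action-action correlation matrix estimated from binary annotations. This is an illustrative assumption for exposition only, not the method proposed in the paper; the synthetic data, the blending weight alpha and the mAP computation are placeholders.

```python
# Illustrative sketch (not the authors' specific method): re-score per-video
# action detections with correlations mined from binary action annotations.
import numpy as np

rng = np.random.default_rng(1)
n_videos, n_actions = 300, 48

# Synthetic annotations with built-in correlations: actions share latent factors.
latent = rng.random((n_videos, 6))
W = (rng.random((6, n_actions)) < 0.35).astype(float)
A = ((latent @ W) > 1.0).astype(float)                  # binary video x action labels
S = 0.6 * A + 0.4 * rng.random((n_videos, n_actions))   # noisy detector scores

# Action-action correlation estimated from the annotations only.
C = np.nan_to_num(np.corrcoef(A, rowvar=False))
np.fill_diagonal(C, 0.0)

# Blend each action's score with the correlation-weighted scores of the others.
alpha = 0.3
S_refined = (1 - alpha) * S + alpha * (S @ C) / np.maximum(np.abs(C).sum(axis=0), 1e-8)

def mean_ap(scores, labels):
    """Mean average precision over actions that occur at least once."""
    aps = []
    for a in range(n_actions):
        rel = labels[np.argsort(-scores[:, a]), a]
        if rel.sum() == 0:
            continue
        prec = np.cumsum(rel) / np.arange(1, len(rel) + 1)
        aps.append((prec * rel).sum() / rel.sum())
    return float(np.mean(aps))

print("mAP raw    :", round(mean_ap(S, A), 3))
print("mAP refined:", round(mean_ap(S_refined, A), 3))
```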
document
Lefter, I. (author), Burghouts, G.J. (author), Rothkrantz, L.J.M. (author)
The focus of this paper is finding a method to predict aggression using a multimodal system, given multiple unimodal features. The mechanism underlying multimodal sensor fusion is complex and not completely clear. We try to understand the process of fusion and make it more transparent. As a case study we use a database with audio-visual...
conference paper 2012
document
Burghouts, G.J. (author), Eendebak, P.T. (author), Bouma, H. (author), ten Hove, R.J.M. (author)
This paper describes the TNO action recognition system that was used to generate the results for our submission to the competition track of the THUMOS ’13 challenge at ICCV ’13. This system deploys only the STIP features that were provided on the website of the challenge. A bag-of-features model is extended with three novelties
conference paper 2013
document
Burghouts, G.J. (author), Bouma, H. (author), den Hollander, R.J.M. (author), van den Broek, S.P. (author), Schutte, K. (author)
We have developed a system that recognizes 48 human behaviors from video. The essential elements are (i) inference of the actors in the scene, (ii) assessment of event-related properties of actors and between actors, (iii) exploiting the event properties to recognize the behaviors. The performance of our recognizer approaches human performance,...
conference paper 2012
document
Lefter, I. (author), Rothkrantz, L.J.M. (author), Burghouts, G. (author), Yang, Z. (author), Wiggers, P. (author)
Automatic detection of aggressive situations has a high societal and scientific relevance. It has been argued that using data from multimodal sensors, for example video and sound, as opposed to unimodal sensors is bound to increase the accuracy of detections. We approach the problem of multimodal aggression detection from the viewpoint of a human...
conference paper 2011
document
Andersson, M. (author), Patino, L. (author), Burghouts, G.J. (author), Flizikowski, A. (author), Evans, M. (author), Gustafsson, D. (author), Petersson, H. (author), Schutte, K. (author), Ferryman, J. (author)
In this paper we present a set of activity recognition and localization algorithms that together assemble a large amount of information about activities on a parking lot. The aim is to detect and recognize events that may pose a threat to truck drivers and trucks. The algorithms perform zone-based activity learning, individual action recognition...
conference paper 2013
document
van Eekeren, A.W.M. (author), Dijk, J. (author), Burghouts, G. (author)
Airborne platforms record large amounts of video data. Extracting the events that need to be seen is a time-consuming task for analysts. The reason for this is that the sensors record hours of video data in which only a fraction of the footage contains events of interest. For the analyst, it is hard to retrieve such events from the...
conference paper 2014
document
Burghouts, G.J. (author), Eendebak, P.T. (author), Bouma, H. (author), ten Hove, R.J.M. (author)
In this paper, we summarize how action recognition can be improved when multiple views are available. The novelty is that we explore various combination schemes within the robust and simple bag-of-words (BoW) framework, from early fusion of features to late fusion of multiple classifiers. In new experiments on the publicly available IXMAS...
conference paper 2014
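The sketch below contrasts the two ends of that fusion spectrum on synthetic data: early fusion concatenates the per-view bag-of-words histograms before a single SVM, while late fusion trains one SVM per view and sums their decision scores. The vocabulary size, number of views and classifier settings are illustrative assumptions, not the IXMAS setup used in the paper.

```python
# Sketch of early vs. late fusion of per-view bag-of-words histograms.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
n_clips, n_views, n_words, n_actions = 120, 5, 100, 11
y = np.arange(n_clips) % n_actions                   # balanced synthetic labels

# One BoW histogram per clip and per camera view (synthetic, label-dependent).
prototypes = rng.random((n_actions, n_views, n_words))
H = prototypes[y] + 0.5 * rng.random((n_clips, n_views, n_words))
H /= H.sum(axis=2, keepdims=True)

train = np.arange(n_clips) % 2 == 0
test = ~train

# Early fusion: concatenate the view histograms, train a single classifier.
X_early = H.reshape(n_clips, -1)
early = SVC(kernel="linear").fit(X_early[train], y[train])
print("early fusion accuracy:", early.score(X_early[test], y[test]))

# Late fusion: one classifier per view, sum their one-vs-rest decision scores.
scores = np.zeros((test.sum(), n_actions))
for v in range(n_views):
    clf = SVC(kernel="linear", decision_function_shape="ovr").fit(H[train, v], y[train])
    scores += clf.decision_function(H[test, v])
late_pred = scores.argmax(axis=1)
print("late fusion accuracy :", float((late_pred == y[test]).mean()))
```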
document
Sanromà, G. (author), Burghouts, G.J. (author), Schutte, K. (author)
Human behavior understanding from visual data has applications such as threat recognition. Many approaches are restricted to actions of limited duration, which we call short-term actions. Long-term behaviors are sequences of short-term actions that are more extended in time. Our hypothesis is that they usually present some structure that can be...
conference paper 2012
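As a toy illustration of learning structure over short-term actions, the sketch below models each long-term behavior with bigram statistics over short-term action labels and classifies a new sequence by likelihood. The behavior names, action vocabulary and training sequences are hypothetical placeholders; the paper's actual learning method is not reproduced here.

```python
# Illustrative sketch: classify a long-term behavior from the sequence of
# short-term action labels it contains, using per-class bigram statistics.
import numpy as np

ACTIONS = ["walk", "stop", "look_around", "pick_up", "drop", "run"]
A = {a: i for i, a in enumerate(ACTIONS)}

def bigram_model(sequences, smoothing=1.0):
    """Estimate P(next action | current action) from example sequences."""
    counts = np.full((len(ACTIONS), len(ACTIONS)), smoothing)
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[A[cur], A[nxt]] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def log_likelihood(seq, P):
    return sum(np.log(P[A[c], A[n]]) for c, n in zip(seq, seq[1:]))

# Hypothetical training sequences for two long-term behaviors.
train = {
    "loitering": [["walk", "stop", "look_around", "stop", "look_around"],
                  ["stop", "look_around", "walk", "stop", "look_around"]],
    "theft":     [["walk", "look_around", "pick_up", "run"],
                  ["stop", "pick_up", "run", "run"]],
}
models = {beh: bigram_model(seqs) for beh, seqs in train.items()}

# Classify a new sequence by the behavior model with the highest likelihood.
query = ["walk", "stop", "look_around", "stop"]
scores = {beh: log_likelihood(query, P) for beh, P in models.items()}
print(max(scores, key=scores.get), scores)
```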
document
Lefter, I. (author), Rothkrantz, L.J.M. (author), Burghouts, G.J. (author)
By analyzing a multimodal (audio-visual) database with aggressive incidents in trains, we have observed that there are no trivial fusion algorithms to successfully predict multimodal aggression based on unimodal sensor inputs. We proposed a fusion framework that contains a set of intermediate level variables (meta-features) between the low level...
conference paper 2012
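The sketch below shows the general shape of such a two-stage scheme: per-modality models first predict intermediate meta-features, and a final classifier maps those predictions to the aggression label. The meta-feature names, feature dimensions and logistic-regression models are hypothetical placeholders, not the variables or classifiers used in the paper.

```python
# Sketch of a two-stage "meta-feature" fusion scheme on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 400
audio = rng.normal(size=(n, 12))        # e.g. prosodic / spectral features (placeholder)
video = rng.normal(size=(n, 20))        # e.g. motion / pose features (placeholder)

# Hypothetical intermediate annotations and the final label.
meta_labels = (rng.random((n, 3)) < 0.3).astype(int)   # e.g. raised_voice, pushing, crowding
aggression = (meta_labels.sum(axis=1) >= 2).astype(int)

# Stage 1: one model per meta-feature, each using the relevant modality.
inputs = [audio, video, video]
stage1 = [LogisticRegression(max_iter=1000).fit(X, meta_labels[:, m])
          for m, X in enumerate(inputs)]
meta_pred = np.column_stack([clf.predict_proba(X)[:, 1]
                             for clf, X in zip(stage1, inputs)])

# Stage 2: final decision from the predicted meta-features.
stage2 = LogisticRegression(max_iter=1000).fit(meta_pred, aggression)
print("training accuracy:", stage2.score(meta_pred, aggression))
```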
document
Lefter, I. (author), Burghouts, G.J. (author), Rothkrantz, L.J.M. (author)
We propose a new method for audio-visual sensor fusion and apply it to automatic aggression detection. While a variety of definitions of aggression exist, in this paper we see it as any kind of behavior that has a disturbing effect on others. We have collected multi- and unimodal assessments by humans, who have given aggression scores on a 3...
conference paper 2012
document
Burghouts, G.J. (author), Eendebak, P.T. (author), Bouma, H. (author), ten Hove, R.J.M. (author)
Action recognition is a hard problem due to the many degrees of freedom of the human body and the movement of its limbs. This is especially hard when only one camera viewpoint is available and when actions involve subtle movements. For instance, when looked from the side, checking one’s watch may look very similar to crossing one’s arms. In this...
conference paper 2013
document
Burghouts, G.J. (author), van den Broek, S.P. (author), ten Hove, R.J.M. (author)
Human actions are spatio-temporal patterns. A popular representation is to describe the action by features at interest points. Because the interest point detection and feature description are generic processes, they are not tuned to discriminate one particular action from the other. In this paper we propose a saliency measure for each individual...
conference paper 2013