Title: Addressing multimodality in overt aggression detection
Authors: Lefter, I.; Rothkrantz, L.J.M.; Burghouts, G.; Yang, Z.; Wiggers, P.
Publication year: 2011
Abstract: Automatic detection of aggressive situations has high societal and scientific relevance. It has been argued that using data from multimodal sensors, such as video and sound, as opposed to a single modality is bound to increase detection accuracy. We approach the problem of multimodal aggression detection from the viewpoint of a human observer and try to reproduce their predictions automatically. Typically, a single ground truth for all available modalities is used when training recognizers. We explore the benefits of adding an extra level of annotations, namely audio-only and video-only. We analyze these annotations and compare them to the multimodal case in order to gain more insight into how humans reason using multimodal data. We train classifiers and compare the results when using unimodal and multimodal labels as ground truth. For both the audio and the video recognizer, performance increases when using the unimodal labels. © 2011 Springer-Verlag.
Subjects: Physics & Electronics; II - Intelligent Imaging; TS - Technical Sciences; Image processing; Behavior detection; Sensors; Multimodal; Aggression detection; Audio; Video
To reference this document use: http://resolver.tudelft.nl/uuid:88c4652b-efa9-4fbe-8952-f6516e5d7191
DOI: https://doi.org/10.1007/978-3-642-23538-2_4
TNO identifier: 435968
Publisher: Springer, Berlin
ISBN: 9783642235375
ISSN: 0302-9743
Source: 14th International Conference on Text, Speech and Dialogue (TSD 2011), 1-5 September 2011, Pilsen. LNAI 6836, pp. 25-32
Series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Document type: conference paper
Files: To receive the publication files, please send an e-mail request to TNO Library.
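The comparison described in the abstract (training the same unimodal classifier against unimodal versus multimodal ground-truth labels) can be sketched with a toy simulation. Everything below is illustrative: the synthetic data, the single "audio intensity" feature, the 20% label-disagreement rate, and the threshold classifier are assumptions for demonstration, not the paper's actual features, dataset, or models.

```python
import random

random.seed(0)

def make_clips(n=200):
    """Generate synthetic clips: one audio feature plus two annotation tracks.

    The audio-only label tracks the audio feature; the multimodal label
    occasionally disagrees (e.g. aggression visible only on video).
    """
    clips = []
    for _ in range(n):
        aggressive = random.random() < 0.5
        audio_feat = random.gauss(1.0 if aggressive else 0.0, 0.5)
        audio_label = int(aggressive)
        multi_label = audio_label if random.random() < 0.8 else 1 - audio_label
        clips.append((audio_feat, audio_label, multi_label))
    return clips

def train_threshold(feats, labels):
    """A minimal 'classifier': the decision threshold with best training accuracy."""
    best_t, best_acc = 0.0, 0.0
    for t in [x / 20 for x in range(-20, 41)]:
        acc = sum(int(f > t) == y for f, y in zip(feats, labels)) / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

clips = make_clips()
train, test = clips[:150], clips[150:]
train_feats = [c[0] for c in train]

results = {}
for name, idx in [("audio-only labels", 1), ("multimodal labels", 2)]:
    t = train_threshold(train_feats, [c[idx] for c in train])
    acc = sum(int(c[0] > t) == c[idx] for c in test) / len(test)
    results[name] = acc
    print(f"audio classifier, ground truth = {name}: accuracy {acc:.2f}")
```

In this toy setup the audio-only labels are, by construction, more consistent with the audio feature than the multimodal labels are, which mirrors the paper's observation that a unimodal recognizer benefits from unimodal ground truth.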