Semantic reasoning in zero example video event retrieval
Article
Searching digital video data for high-level events, such as a parade or a car accident, is challenging when the query is textual and lacks visual example images or videos. Current research in deep neural networks is highly beneficial for the retrieval of high-level events using visual examples, but without examples it remains hard to (1) determine which concepts are useful to pre-train (the Vocabulary challenge) and (2) select which pre-trained concept detectors are relevant to a given unseen high-level event (the Concept Selection challenge). In this article, we present our Semantic Event Retrieval System, which (1) shows the importance of high-level concepts in a vocabulary for the retrieval of complex and generic high-level events and (2) uses a novel concept selection method (i-w2v) based on semantic embeddings. Our experiments on the international TRECVID Multimedia Event Detection benchmark show that a diverse vocabulary including high-level concepts improves performance on the retrieval of high-level events in videos, and that our novel method outperforms a knowledge-based concept selection method. © 2017 ACM.
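The abstract does not spell out how i-w2v works, but the general idea of embedding-based concept selection can be sketched as follows: embed the textual event query and every concept name in a shared word-embedding space, then incrementally add the concept whose embedding, combined with the already-selected ones, moves the aggregate closest to the query embedding. The sketch below is a hypothetical illustration under those assumptions, not the authors' implementation; the embedding lookup, concept vocabulary, and stopping rule are all assumed.

import numpy as np

def cosine(a, b):
    # Cosine similarity with a small epsilon to avoid division by zero.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def select_concepts(query_vec, concept_vecs, max_concepts=5):
    """Greedy, incremental embedding-based concept selection (sketch).

    query_vec: embedding of the textual event query.
    concept_vecs: dict mapping concept names to their embeddings.
    Repeatedly add the concept whose embedding, averaged with the
    already-selected ones, maximally increases cosine similarity to
    the query; stop when no remaining concept improves the score.
    """
    selected, best = [], -1.0
    for _ in range(max_concepts):
        gains = {
            name: cosine(query_vec,
                         np.mean([concept_vecs[n] for n in selected + [name]], axis=0))
            for name in concept_vecs if name not in selected
        }
        if not gains:
            break
        name, score = max(gains.items(), key=lambda kv: kv[1])
        if score <= best:  # adding this concept no longer helps
            break
        selected.append(name)
        best = score
    return selected, best

In a zero-example retrieval pipeline of this kind, the scores of the selected concept detectors on each video would then be fused (e.g., averaged) to rank videos for the unseen event.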
Topics
Content-based visual information retrieval; Multimedia event detection; Semantics; Zero shot; Accidents; Benchmarking; Deep neural networks; Image retrieval; Knowledge based systems; Multimedia systems; Concept selection; Digital video data; Retrieval systems; Semantic reasoning; Computer graphics
TNO Identifier
782443
ISSN
1551-6857
Source
ACM Transactions on Multimedia Computing, Communications and Applications, 13(4)
Article nr.
60
Files
To receive the publication files, please send an e-mail request to TNO Repository.