Evaluation of Spatio-Temporal Small Object Detection in Real-World Adverse Weather Conditions
                                            conference paper
                                        
                                    
                                            Deep learning-based object detection methods, such as YOLO, are promising for surveillance applications. However, detecting small objects in large-scale scenes with cluttered backgrounds and adverse weather remains challenging. Recent advancements leverage spatio-temporal information to enhance small object detection, yet the impact of (temporal) adverse weather conditions on such methods remains largely unexplored due to the lack of comprehensive evaluation datasets. This paper evaluates the performance of spatio-temporal YOLOv8 (TYOLOv8) for detecting small objects in real-world adverse weather conditions, comparing it to spatial YOLOv8 and the 3FN moving object detection method. Additionally, we propose haze augmentation to improve object detection performance in challenging hazy weather. Because no suitable evaluation dataset exists, this paper introduces a novel real-world video dataset for small object detection, referred to as Nano VID-weather, with an average object size of 16.4² pixels, consisting of a Tiny Objects subset and three challenging weather subsets: Wind, Rain and Haze. Our findings reveal that TYOLOv8 is resilient to real-world adverse weather conditions such as wind, rain, and haze. Notably, averaged across all our subsets, TYOLOv8 outperformed both 3FN and YOLOv8 by +0.21 mAP. These results demonstrate that TYOLOv8 can enhance surveillance capabilities for small object detection under real-world adverse weather conditions.
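The abstract does not say how the proposed haze augmentation is implemented. As a rough illustration only, the sketch below applies a commonly used synthetic haze model (the atmospheric scattering model, I = J·t + A·(1 − t) with transmission t = exp(−β·d)) to a training image. The function name `add_haze` and the parameters `beta`, `airlight`, and `depth` are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def add_haze(image, beta=1.0, airlight=0.9, depth=None):
    """Blend an image toward a uniform airlight colour (hypothetical helper).

    image    : float array in [0, 1], shape (H, W) or (H, W, C)
    beta     : assumed scattering coefficient; larger means denser haze
    airlight : assumed global atmospheric light in [0, 1]
    depth    : optional (H, W) relative depth map; defaults to a flat
               scene at unit distance, which yields uniform haze
    """
    img = np.asarray(image, dtype=np.float64)
    if depth is None:
        depth = np.ones(img.shape[:2], dtype=np.float64)

    # Transmission: fraction of scene radiance that reaches the camera.
    transmission = np.exp(-beta * np.asarray(depth, dtype=np.float64))
    if img.ndim == 3:
        # Add a channel axis so the map broadcasts over colour channels.
        transmission = transmission[..., None]

    # Atmospheric scattering model: attenuated scene plus scattered airlight.
    hazy = img * transmission + airlight * (1.0 - transmission)
    return np.clip(hazy, 0.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.random((64, 64, 3))
    # Sample a random haze density per training frame (assumed range).
    hazy_frame = add_haze(frame, beta=rng.uniform(0.5, 2.0))
    print(hazy_frame.shape)
```

With `beta=0` the image is returned unchanged, and as `beta` grows the output converges to the airlight value, so sampling `beta` per frame gives a simple range of haze densities. Bounding boxes are unaffected because the transform changes only pixel intensities.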
                                        
                                    TNO Identifier
                                        
                                            1011990
                                        
                                    Source title
                                        
                                            Proceedings of the Winter Conference on Applications of Computer Vision (WACV) Workshops
                                        
                                    Pages
                                        
                                            844-855