Adversarial AI image perturbation attack invariant to object scale and type

conference paper
Adversarial AI technologies can be used to make AI-based object detection in images malfunction. Evasion attacks apply perturbations to the input images that can be unnoticeable to the human eye and exploit weaknesses in object detectors to prevent detection. However, evasion attacks have weaknesses themselves and can be sensitive to apparent object type, orientation, positioning, and scale. This work evaluates the performance of a white-box evasion attack and its robustness to these factors. Video data from the ATR Algorithm Development Image Database is used, containing military and civilian vehicles at different ranges (1000-5000 m). A white-box evasion attack (adversarial objectness gradient) was trained to disrupt a YOLOv3 vehicle detector previously trained on this dataset. Several experiments were performed to assess whether the attack successfully prevented vehicle detection at different ranges. Results show that for an evasion attack trained on objects at only the 1500 m range and applied to all other ranges, the median mAP reduction is >95%. Similarly, when trained on only two vehicles and applied to all seven remaining vehicles, the median mAP reduction is >95%. This means that evasion attacks can succeed across multiple ranges and vehicles with limited training data. Although a (perfect-knowledge) white-box evasion attack is a worst-case scenario, in which a system is fully compromised and its inner workings are known to an adversary, this work may serve as a basis for research into robustness and into designing AI-based object detectors resilient to these attacks.
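The abstract reports a median mAP reduction without giving a formula. As a minimal sketch, assuming the reduction is the relative drop in mAP between clean and attacked images for each test condition (range or vehicle), the statistic could be computed as follows. The function names and the interpretation of "reduction" are illustrative assumptions and are not taken from the paper.

```python
from statistics import median


def map_reduction_percent(clean_map: float, attacked_map: float) -> float:
    """Relative mAP drop in percent (assumed definition, not from the paper)."""
    if clean_map <= 0:
        raise ValueError("clean mAP must be positive to compute a relative drop")
    return 100.0 * (clean_map - attacked_map) / clean_map


def median_map_reduction(pairs) -> float:
    """Median relative mAP drop over (clean_map, attacked_map) pairs.

    Each pair is assumed to describe one test condition, for example one
    range or one vehicle type.
    """
    return median(map_reduction_percent(clean, attacked) for clean, attacked in pairs)
```

Under this assumed definition, a detector whose mAP falls from 1.0 to 0.5 under attack has a 50% reduction, and a reduction above 95% means the attacked mAP is less than one twentieth of the clean mAP.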
TNO Identifier
1001972
Source
Artificial Intelligence for Security and Defence Applications II. Vol. 13206. SPIE, 2024.
Publisher
SPIE
Source title
SPIE Security + Defence, 16-20 September 2024, Edinburgh, United Kingdom