Epistemic defenses against scientific and empirical adversarial AI attacks
In this paper, we introduce “scientific and empirical adversarial AI attacks” (SEA AI attacks) as an umbrella term for deliberate malicious acts, not yet prevalent but technically feasible, of specifically crafting AI-generated samples to achieve an epistemic distortion in (applied) science or engineering contexts. In view of possible socio-psycho-technological impacts, it seems responsible to ponder countermeasures from the outset rather than in hindsight. In this vein, we consider two illustrative use cases: the example of AI-produced data to mislead security engineering practices and the conceivable prospect of AI-generated content to manipulate scientific writing processes. Firstly, we contextualize the epistemic challenges that such future SEA AI attacks could pose to society in the light of broader efforts relevant to, among others, AI safety, AI ethics and cybersecurity. Secondly, we set forth a corresponding supportive generic epistemic defense approach. Thirdly, we perform a threat modelling for the two use cases and propose tailor-made defenses based on the foregoing generic deliberations. Strikingly, our transdisciplinary analysis suggests that employing distinct explanation-anchored, trust-disentangled and adversarial strategies is one possible principled complementary epistemic defense against SEA AI attacks, albeit with caveats that yield incentives for future work.
To reference this document use:
CEUR Workshop Proceedings, 2021 Workshop on Artificial Intelligence Safety (AISafety 2021), 19–20 August 2021.