New Publication: Detecting Visual Information Manipulation Attacks in Augmented Reality: A Multimodal Semantic Reasoning Approach

August 9, 2025

A new paper from the lab is appearing in IEEE Transactions on Visualization and Computer Graphics (IEEE TVCG; IF: 6.5) and will be presented at IEEE ISMAR 2025: Detecting Visual Information Manipulation Attacks in Augmented Reality: A Multimodal Semantic Reasoning Approach

In this paper, Duke I3T Lab PhD student Yanming Xiu demonstrates how modern large multimodal models (LMMs) can be used to detect visual information manipulation (VIM) attacks in AR, in which virtual content changes the meaning of real-world scenes in subtle but impactful ways. As part of this effort, Yanming developed a taxonomy of VIM attacks and created the first dataset of such attacks, consisting of 452 raw–AR video pairs spanning 202 different scenes, each simulating a real-world AR scenario. The paper presents an LMM-based approach to detecting these attacks that achieves accuracies of 77–95% across the VIM attack subcategories. It was accepted to the IEEE TVCG track of IEEE ISMAR (8% acceptance rate) and will be presented at ISMAR 2025 in October.
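For readers curious what such a pairwise raw-vs-AR check might look like in practice, here is a minimal sketch of the general idea; it is not the paper's VIM Sense pipeline. It sends a raw frame and its AR counterpart to a vision-capable LMM through the OpenAI Python SDK and asks whether the overlay alters the scene's meaning. The model name, prompt wording, and file names are illustrative assumptions.

```python
# Minimal sketch: pairwise raw-vs-AR frame check with an LMM.
# Assumes the OpenAI Python SDK and an OPENAI_API_KEY environment
# variable; model name and prompt are illustrative, not from the paper.
import base64
from openai import OpenAI

client = OpenAI()

def encode_image(path: str) -> str:
    """Read an image file and return it as a base64 data URL."""
    with open(path, "rb") as f:
        data = base64.b64encode(f.read()).decode("utf-8")
    return f"data:image/jpeg;base64,{data}"

def detect_vim(raw_frame: str, ar_frame: str) -> str:
    """Ask the LMM whether the AR overlay changes the scene's meaning."""
    response = client.chat.completions.create(
        model="gpt-4o",  # any vision-capable model would do here
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": (
                    "Image 1 is a raw camera frame; image 2 is the same frame "
                    "with AR content rendered on top. Does the virtual content "
                    "alter, hide, or contradict real-world information (e.g., "
                    "signs, labels, instructions)? Answer ATTACK or BENIGN, "
                    "then explain briefly."
                )},
                {"type": "image_url", "image_url": {"url": encode_image(raw_frame)}},
                {"type": "image_url", "image_url": {"url": encode_image(ar_frame)}},
            ],
        }],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(detect_vim("raw_frame.jpg", "ar_frame.jpg"))
```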

Figure: VIM Sense architecture

We thank DARPA and the NSF for supporting this research.