Date of Award
8-2025
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Computer Engineering
Committee Chair/Advisor
Dr. Xiaoyong (Brian) Yuan
Committee Member
Dr. Fatemeh Afghah
Committee Member
Dr. Long Cheng
Committee Member
Dr. Xiaolong Ma
Abstract
Artificial Intelligence (AI) systems have become central to high-stakes applications such as autonomous driving and language-based decision support. As their deployment accelerates, ensuring the security and trustworthiness of these systems becomes paramount. Among the stealthiest and most potent threats are backdoor attacks, in which models behave as expected under normal conditions but exhibit malicious behavior when triggered by specific inputs, whether digital or physical.
This dissertation investigates novel backdoor and adversarial vulnerabilities across two emerging classes of AI architectures: (1) multimodal 3D object detection systems that fuse LiDAR and camera data, and (2) Retrieval-Augmented Generation (RAG) systems that pair large language models with document retrievers. Our work is structured across four chapters, each addressing a critical dimension of secure AI perception and reasoning.
We begin by establishing a robust baseline in multimodal perception through a weather-aware, attention-based 3D object detector. Designed for adverse driving conditions such as fog, rain, and poor visibility, the proposed model leverages dynamic multi-scale global-local attention to improve detection accuracy under diverse weather scenarios, laying the groundwork for understanding model behavior under complex operational conditions.
In the second part, we investigate digital backdoor attacks on camera-LiDAR fusion models. By inserting small, view-consistent 2D digital triggers into RGB images, we demonstrate that such artifacts can survive the fusion process and significantly distort 3D bounding box predictions. These findings reveal how cross-modal information propagation introduces new vulnerabilities, even in state-of-the-art camera-LiDAR fusion models.
Extending the threat model further, we present the first material-specific physical backdoor attacks on fusion-based detectors. By leveraging materials with high LiDAR reflectivity as physical triggers, we show that attacks can be reliably activated in real-world driving environments across varying angles, lighting conditions, and viewpoints. This chapter bridges the gap between digital and physical threats by aligning simulation-based trigger design with field-level deployability.
Lastly, we examine adversarial threats in Retrieval-Augmented Generation (RAG) systems. We introduce a novel attack strategy, Adaptive Instruction Poisoning (AIP), which injects stealthy, context-aware triggers through instructional prompts and retrieved malicious documents, without requiring model retraining or access to user queries. This work highlights how the modular design of RAG pipelines, particularly the separation between retrieval and generation, creates new and underexplored attack surfaces beyond traditional model-centric vulnerabilities.
Together, these contributions expose blind spots in modern AI pipelines where multimodal fusion, retrieval mechanisms, and deployment interfaces introduce overlooked but critical vulnerabilities. We conclude by outlining principles for designing secure-by-default AI systems and emphasizing the urgent need for holistic security evaluations that go beyond static model architectures.
Recommended Citation
Chaturvedi, Saket Sanjeev, "Towards Securing AI Systems: Investigating Threats in Multimodal Autonomous Driving & RAG Systems" (2025). All Dissertations. 4030.
https://open.clemson.edu/all_dissertations/4030
Author ORCID Identifier
0000-0003-0700-404X