Date of Award

5-2026

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Electrical and Computer Engineering (Holcomb Dept. of)

Committee Chair/Advisor

Fatemeh Afghah

Committee Member

Nathan McNeese

Committee Member

Wael Abd Almageed

Committee Member

Tao Wei

Abstract

Despite remarkable progress in deep learning, a major challenge remains: machine learning models often struggle to generalize to unseen domains under distribution shift. In real-world settings, data often differ from training conditions due to changes in lighting, sensor type, image resolution, and style. These differences can significantly degrade performance, highlighting the need for representations that are both robust and generalizable. This thesis addresses this challenge by developing a set of frameworks for generalization across domains in anomaly detection, deepfake detection, and vision-language image recognition.

For anomaly detection, this thesis introduces ROADS, a robust prompt-driven framework for multi-class unified anomaly detection. ROADS is designed to address two major limitations of existing approaches: interference among anomaly classes and sensitivity to domain shifts. It incorporates hierarchical class-aware prompts to encode class-specific semantics and includes a domain adaptation component to improve robustness under varying conditions. As a result, ROADS improves both anomaly detection and localization performance, especially in out-of-distribution settings.

Beyond conventional anomaly detection, this thesis further studies multimodal anomaly understanding using Multimodal Large Language Models (MLLMs). Qwen-AD addresses multi-task industrial anomaly understanding by reducing task interference and improving generalization across multiple anomaly-related tasks. In addition, RADAR improves the robustness of MLLMs for anomaly understanding at inference time without retraining. It enhances reliability under shifted conditions by estimating uncertainty, strengthening visual grounding, and refining model predictions to reduce language bias and hallucination.

For deepfake detection, this thesis proposes FreqDebias, a framework designed to improve generalization to unseen forgery types. Many deepfake detectors rely on narrow spectral artifacts that do not transfer well across datasets or manipulation methods. FreqDebias reduces this spectral bias through a frequency-based augmentation strategy called Fo-Mixup and a dual consistency regularization that encourages consistent local and global representations. This leads to more robust and generalizable deepfake detection.

For vision-language recognition, this thesis introduces Style-Pro, a style-guided prompt learning framework for models such as CLIP. It improves generalization by modeling style variation while preserving content and cross-modal alignment. As a result, the model adapts more effectively to unseen domains and classes while maintaining strong zero-shot performance.

Extensive experiments validate the proposed methods across diverse benchmarks. The results consistently demonstrate improved robustness and generalization in anomaly detection, multimodal anomaly understanding, deepfake detection, and vision-language recognition under both in-distribution and out-of-distribution settings. Together, these findings show that robust representation learning, modular adaptation, and inference-time refinement can substantially improve generalization across diverse domains and visual environments.

Recommended Citation

Kashiani, Hossein, "Towards Generalizable Representation Learning Across Domains" (2026). All Dissertations. 4237.
https://open.clemson.edu/all_dissertations/4237

Author ORCID Identifier

0000-0001-8338-9987

Included in

Computer Sciences Commons
