Date of Award

5-2026

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Civil Engineering

Committee Chair/Advisor

Mashrur Chowdhury

Committee Member

Yongkai Wu

Committee Member

Chao Fan

Abstract

Autonomous vehicle (AV) systems typically employ modular architectures in which discrete components handle separate tasks such as perception, computation, and path planning. While flexible, this approach allows errors to propagate and compound across the pipeline, and many AI systems offer little transparency into their internal decision-making. Such limitations are particularly concerning in safety-critical domains, where failures can carry lethal consequences. Vision Language Models (VLMs) have emerged as a promising alternative because they support end-to-end implementations that avoid the risk of compounding errors and provide natural language explanations of their outputs. Despite these advantages, prior research has demonstrated that both computer vision systems and large language models can exhibit demographic disparities, with outputs that vary systematically based on the characteristics of the individuals represented in the input. Whether VLMs inherit or introduce similar biases in safety-critical pedestrian detection tasks remains largely unexplored. To investigate this gap, this study benchmarks five state-of-the-art VLMs across two pedestrian detection scenarios using the CityPersons dataset augmented with demographic labels, evaluating model performance across age and gender groups. The primary contribution is a systematic assessment of the demographic robustness of leading VLMs in pedestrian detection, providing insights critical to the unbiased and safe deployment of AI-driven AV systems.

Recommended Citation

Thomas, Ostonya K., "Robustness of Vision Language Models for Pedestrian Detection Tasks" (2026). All Theses. 4730.
https://open.clemson.edu/all_theses/4730

Author ORCID Identifier

0009-0007-6893-0852

Included in

Artificial Intelligence and Robotics Commons, Transportation Engineering Commons

Robustness of Vision Language Models for Pedestrian Detection Tasks
