Date of Award
12-2024
Document Type
Thesis
Degree Name
Master of Science (MS)
Department
Electrical and Computer Engineering
Committee Chair/Advisor
Fatemeh Afghah
Committee Member
Yongkai Wu
Committee Member
Yongqiang Wang
Committee Member
Qi Luo
Abstract
Visual navigation systems are crucial in various applications, including autonomous driving, unmanned aerial systems (UAS), and industrial automation. For these systems to operate efficiently in dynamic environments, they must not only interpret complex surroundings but also anticipate changes over time. Temporal prediction, that is, forecasting environmental changes such as moving obstacles or shifting lighting conditions, enables navigation systems to act proactively, enhancing both safety and performance. This dissertation investigates representation learning methods both as backbone feature extractors for reinforcement learning (RL) agents and as a proxy for systems oriented toward explainable AI (XAI). Two main projects are presented as case studies toward these goals. The first project develops a belief-based system for tracking a wildfire frontline as accurately as possible, with an RL agent acting as a UAV that monitors the area. We show that belief-based representations greatly surpass observation-based representations in highly dynamic wildfire propagation scenarios. The second project focuses on the disentanglement of latent factors, enhancing interpretability by ensuring that the model’s representations align with independent and meaningful environmental features. Unlike baselines that use variational autoencoders with explicit assumptions about the Gaussianity of the generative factors and full supervision over them, we assume a more realistic setting in which a subset of the generative factors is unsupervised and simulated as nuisance and uncontrolled factors of variation. We show that our model is able to separate the nuisance factors from the supervised factors and outperforms the baseline method on a variety of disentanglement metrics. A side project is also presented that addresses UAS traffic optimization in urban airspaces, focusing on energy-efficient and safe aerial mobility.
Through these efforts, this dissertation contributes to more adaptable, interpretable, and robust visual navigation systems suited to real-world applications.
Recommended Citation
Khoshdel, Sahand, "Learning to Represent Temporal Dynamics and Generative Factors for Intelligent Visual Navigation" (2024). All Theses. 4391.
https://open.clemson.edu/all_theses/4391