Date of Award

12-2024

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Automotive Engineering

Committee Chair/Advisor

Qilun Zhu

Committee Member

Laine Mears

Committee Member

Siyu Huang

Committee Member

Bing Li

Abstract

The advancement of autonomous driving technology and intelligent robotic applications has become a focal point in the field of autonomy. One of the driving forces behind this trend is a deeper understanding of the environment, and at the core of this endeavor lies three-dimensional geometric perception. This dissertation explores this domain, emphasizing advances in depth prediction, 3D scene reconstruction, and active vision to enhance geometric perception and scene understanding in autonomous driving, embodied AI, and robotics. In the domain of depth prediction, this research addresses the challenge of accurately inferring three-dimensional depth from monocular cameras. It introduces methods that improve the accuracy and robustness of depth prediction by leveraging low-cost sparse LiDAR and by handling the motion and occlusion of dynamic objects. This is critical for perception in autonomous driving and robotics, especially for vital tasks such as object detection and obstacle avoidance. Furthermore, we examine 3D scene reconstruction, which aims to capture the environment's geometric structure and the positions of objects. In particular, this dissertation underscores the significance of meaningful geometric feature learning as a pivotal component of 3D reconstruction: by integrating multiple observed frames into an improved cost volume, it provides a richer and more accurate geometric encoding of the 3D scene, offering a more precise environmental reconstruction for autonomous driving and robotic systems. The concept of active vision, especially for neural implicit reconstruction, is also explored. In contrast to traditional geometric perception, which uses passively collected data, active vision enables an intelligent agent to explore and reconstruct an unknown scene automatically. This approach eliminates the reliance on human operation and navigation. It achieves faster exploration and perception coverage, providing a more comprehensive scene understanding for embodied AI and robotics applications. In summary, this dissertation represents the cutting edge of geometric perception in 3D computer vision. It is dedicated to enhancing scene understanding in autonomous driving and robotic applications, from depth prediction to 3D scene reconstruction, from sensor fusion to geometric feature learning, and on to active perception. It lays a solid foundation for the future development of intelligent automotive and robotic technologies.

Recommended Citation

Feng, Ziyue, "Advancing Visual Geometric Perception: Camera-Based Depth, Reconstruction, and Active Vision" (2024). All Dissertations. 3823.
https://open.clemson.edu/all_dissertations/3823

ZiyueFeng_Dissertation.docx (9347 kB)

Author ORCID Identifier

0000-0002-0037-3697

Included in

Other Computer Engineering Commons, Robotics Commons
