Identifier: TDX:4357
Author: Masoumian, Armin
Abstract:
As robotics and autonomous vehicles advance, the demand for precise depth measurements grows. Depth estimation (DE), a fundamental task in computer vision, is central to meeting this demand, and deep learning (DL) techniques offer a viable solution. In particular, self-supervised monocular depth estimation (MDE) estimates the depth of objects in a scene from a single image, eliminating the need for expensive stereoscopic or 3D cameras. Graph convolutional networks (GCNs) have further improved the accuracy of DE models by accommodating non-Euclidean data, while combining multiple loss functions has made depth predictions more reliable.
This study explores the applications of self-supervised MDE and reviews recent advances in the field that use DL techniques. It covers key aspects such as input data shapes, training methods, and evaluation criteria, and it addresses the limitations of DL-based MDE models, including challenges related to accuracy, computational efficiency, real-time feasibility, domain adaptation, and generalization. The research also introduces an MDE approach that uses GCNs to estimate depth maps from monocular videos and that outperforms existing state-of-the-art methods. In addition, it presents a novel DL framework that integrates DE and object detection from a single image and achieves high accuracy, particularly in outdoor scenarios. In summary, the study underscores the efficiency of self-supervised MDE based on graph convolutional networks, with quantitative and qualitative comparisons against state-of-the-art methods that highlight the advantages of the proposed depth prediction technique.