DAR-MDE: Depth-Attention Refinement for Multi-Scale Monocular Depth Estimation

  • Identifying data

    Identifier:  imarina:9468454
    Authors:  Abdulwahab, S; Rashwan, HA; El-Melegy, MT; Puig, D
    Abstract:
    Monocular Depth Estimation (MDE) remains a challenging problem due to texture ambiguity, occlusion, and scale variation in real-world scenes. While recent deep learning methods have made significant progress, maintaining structural consistency and robustness across diverse environments remains difficult. In this paper, we propose DAR-MDE, a novel framework that combines an autoencoder backbone with a Multi-Scale Feature Aggregation (MSFA) module and a Refining Attention Network (RAN). The MSFA module enables the model to capture geometric details across multiple resolutions, while the RAN enhances depth predictions by attending to structurally important regions guided by depth-feature similarity. We also introduce a multi-scale loss based on curvilinear saliency to improve edge-aware supervision and depth continuity. The proposed model achieves robust and accurate depth estimation across varying object scales, cluttered scenes, and weak-texture regions. We evaluated DAR-MDE on the NYU Depth v2, SUN RGB-D, and Make3D datasets, demonstrating competitive accuracy and real-time inference speeds (19 ms per image) without relying on auxiliary sensors. Our method achieves a delta < 1.25 accuracy of 87.25% and a relative error of 0.113 on NYU Depth v2, outperforming several recent state-of-the-art models. Our approach highlights the potential of lightweight RGB-only depth estimation models for real-world deployment in robotics and scene understanding.
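The two figures quoted at the end of the abstract (delta < 1.25 accuracy and relative error) are usually computed per pixel against ground-truth depth. The following is a minimal sketch of those two metrics under their common definitions: threshold accuracy as the fraction of pixels where max(pred/gt, gt/pred) < 1.25, and absolute relative error as the mean of |pred - gt| / gt. The function names, the flat-list input, and the rule of skipping non-positive depths are assumptions made for illustration; this record does not describe the paper's exact evaluation protocol (masking, depth caps, cropping).

```python
from typing import List, Sequence, Tuple


def _valid_pairs(pred: Sequence[float], gt: Sequence[float]) -> List[Tuple[float, float]]:
    """Keep only pixels where both depths are positive (assumed validity mask)."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same number of pixels")
    pairs = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    if not pairs:
        raise ValueError("no valid pixels to evaluate")
    return pairs


def threshold_accuracy(pred: Sequence[float], gt: Sequence[float],
                       threshold: float = 1.25) -> float:
    """Fraction of valid pixels with max(pred/gt, gt/pred) < threshold."""
    pairs = _valid_pairs(pred, gt)
    hits = sum(1 for p, g in pairs if max(p / g, g / p) < threshold)
    return hits / len(pairs)


def abs_rel_error(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Mean of |pred - gt| / gt over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy depths in metres, flattened from a hypothetical 2x2 depth map.
    predicted = [1.0, 2.0, 3.0, 4.0]
    ground_truth = [1.0, 2.2, 4.0, 4.0]
    print(threshold_accuracy(predicted, ground_truth))
    print(abs_rel_error(predicted, ground_truth))
```

On the toy input, three of the four pixels fall within the 1.25 ratio, so the threshold accuracy is 0.75. Higher is better for threshold accuracy; lower is better for relative error.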
  • Other:

    Original source link: https://www.mdpi.com/2224-2708/14/5/90
    Item reference in APA style: Abdulwahab, S; Rashwan, HA; El-Melegy, MT; Puig, D (2025). DAR-MDE: Depth-Attention Refinement for Multi-Scale Monocular Depth Estimation. Journal Of Sensor And Actuator Networks, 14(5), 90-. DOI: 10.3390/jsan14050090
    Article reference according to the original source: Journal Of Sensor And Actuator Networks. 14 (5): 90-
    Article DOI: 10.3390/jsan14050090
    Journal publication year: 2025-09-01
    Institution: Universitat Rovira i Virgili
    Version of the deposited article: info:eu-repo/semantics/publishedVersion
    Record creation date: 2026-02-13
    URV author(s): Abdellatif Fatahallah Ibrahim Mahmoud, Hatem / Puig Valls, Domènec Savi
    Department: Computer Engineering and Mathematics
    License document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
    Publication type: Journal Publications
    Author according to the article: Abdulwahab, S; Rashwan, HA; El-Melegy, MT; Puig, D
    Link to the license of use: https://creativecommons.org/licenses/by/3.0/es/
    Subject areas: Computer networks and communications, Computer science, information systems, Control and optimization, Engenharias iv, Instrumentation, Interdisciplinar, Telecommunications
    Author's email address: domenec.puig@urv.cat, hatem.abdellatif@urv.cat
  • Keywords:

    Accurate depth
    Autoencoder network
    Deep learning
    Depth attention
    Depth map estimation
    Multi-scale aggregation
    Refining attention network
    Computer Networks and Communications
    Computer Science
    Information Systems
    Control and Optimization
    Instrumentation
    Telecommunications
    Engenharias iv
    Interdisciplinar