DAR-MDE: Depth-Attention Refinement for Multi-Scale Monocular Depth Estimation

  • Identifying data

    Identifier:  imarina:9468454
    Authors:  Abdulwahab, S; Rashwan, HA; El-Melegy, MT; Puig, D
    Abstract:
    Monocular Depth Estimation (MDE) remains a challenging problem due to texture ambiguity, occlusion, and scale variation in real-world scenes. While recent deep learning methods have made significant progress, maintaining structural consistency and robustness across diverse environments remains difficult. In this paper, we propose DAR-MDE, a novel framework that combines an autoencoder backbone with a Multi-Scale Feature Aggregation (MSFA) module and a Refining Attention Network (RAN). The MSFA module enables the model to capture geometric details across multiple resolutions, while the RAN enhances depth predictions by attending to structurally important regions guided by depth-feature similarity. We also introduce a multi-scale loss based on curvilinear saliency to improve edge-aware supervision and depth continuity. The proposed model achieves robust and accurate depth estimation across varying object scales, cluttered scenes, and weak-texture regions. We evaluated DAR-MDE on the NYU Depth v2, SUN RGB-D, and Make3D datasets, demonstrating competitive accuracy and real-time inference speeds (19 ms per image) without relying on auxiliary sensors. Our method achieves a delta < 1.25 accuracy of 87.25% and a relative error of 0.113 on NYU Depth v2, outperforming several recent state-of-the-art models. Our approach highlights the potential of lightweight RGB-only depth estimation models for real-world deployment in robotics and scene understanding.
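    The abstract reports a delta < 1.25 accuracy and a relative error but does not define them. The sketch below assumes the definitions conventionally used in depth-estimation benchmarks: threshold accuracy as the fraction of pixels where max(pred/gt, gt/pred) < 1.25, and absolute relative error as the mean of |pred - gt| / gt. The function names, the validity rule (positive depths only), and the sample values are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence, Tuple


def _valid_pairs(pred: Sequence[float], gt: Sequence[float]) -> List[Tuple[float, float]]:
    """Pair predicted and ground-truth depths, keeping positive values only.

    Assumption: non-positive depths mark missing measurements and are skipped.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    if not pairs:
        raise ValueError("no valid depth pairs")
    return pairs


def threshold_accuracy(pred: Sequence[float], gt: Sequence[float], thr: float = 1.25) -> float:
    """Fraction of valid pixels whose ratio max(pred/gt, gt/pred) is below thr."""
    pairs = _valid_pairs(pred, gt)
    hits = sum(1 for p, g in pairs if max(p / g, g / p) < thr)
    return hits / len(pairs)


def abs_rel_error(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Mean absolute relative error |pred - gt| / gt over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


# Made-up depths in metres, flattened to 1-D for simplicity.
gt_depth = [1.0, 2.0, 4.0, 5.0]
pred_depth = [1.1, 2.0, 6.0, 4.0]

acc = threshold_accuracy(pred_depth, gt_depth)
rel = abs_rel_error(pred_depth, gt_depth)
```

    With these sample values, two of the four pixels fall strictly inside the 1.25 ratio threshold, so `acc` is 0.5, and `rel` is approximately 0.2. A real evaluation would run over full depth maps, typically with an array library and a dataset-specific validity mask.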
  • Other:

    Link to the original source: https://www.mdpi.com/2224-2708/14/5/90
    Item citation in APA style: Abdulwahab, S; Rashwan, HA; El-Melegy, MT; Puig, D (2025). DAR-MDE: Depth-Attention Refinement for Multi-Scale Monocular Depth Estimation. Journal Of Sensor And Actuator Networks, 14(5), 90-. DOI: 10.3390/jsan14050090
    Article reference per the original source: Journal Of Sensor And Actuator Networks. 14 (5): 90-
    Article DOI: 10.3390/jsan14050090
    Journal publication date: 2025-09-01
    Institution: Universitat Rovira i Virgili
    Deposited article version: info:eu-repo/semantics/publishedVersion
    Record creation date: 2026-02-13
    URV author(s): Abdellatif Fatahallah Ibrahim Mahmoud, Hatem / Puig Valls, Domènec Savi
    Department: Enginyeria Informàtica i Matemàtiques
    License document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
    Publication type: Journal Publications
    Author as listed in the article: Abdulwahab, S; Rashwan, HA; El-Melegy, MT; Puig, D
    Access to the usage license: https://creativecommons.org/licenses/by/3.0/es/
    Subject areas: Computer networks and communications, Computer science, information systems, Control and optimization, Engenharias iv, Instrumentation, Interdisciplinar, Telecommunications
    Author email addresses: domenec.puig@urv.cat, hatem.abdellatif@urv.cat
  • Keywords:

    Accurate depth
    Autoencoder network
    Deep learning
    Depth attention
    Depth map estimation
    Multi-scale aggregation
    Refining attention network
    Computer Networks and Communications
    Computer Science
    Information Systems
    Control and Optimization
    Instrumentation
    Telecommunications
    Engenharias iv
    Interdisciplinar