Efficient deep learning-based semantic mapping approach using monocular vision for resource-limited mobile robots

  • Identification data

    Identifier: imarina:9262032
    Authors:
    Singh, Aditya; Narula, Raghav; Rashwan, Hatem A; Abdel-Nasser, Mohamed; Puig, Domenec; Nandi, G C
    Abstract:
    Semantic mapping is still challenging for household collaborative robots. Deep learning models have proved their capability to extract semantics from the scene and learn robot odometry. To interface semantic information with robot odometry, existing approaches extract semantics and robot odometry separately and then integrate them using fusion techniques. Such approaches face many issues during integration, and the mapping procedure requires a lot of memory and resources to process the information. In an attempt to produce accurate semantic mapping with resource-limited devices, this paper proposes an efficient deep learning-based model that simultaneously estimates robot odometry from monocular sequence frames and detects objects in the frames. The proposed model includes two main components: a YOLOv3 object detector as a backbone and a convolutional long short-term memory (Conv-LSTM) recurrent neural network to model the changes in camera pose. The unique advantage of the proposed model is that it avoids the need for data association and the requirement of multi-sensor fusion. We conducted the experiments on a LoCoBot robot in a laboratory environment, attaining satisfactory results with such limited computational resources. Additionally, we tested the proposed method on the KITTI dataset, reaching an average test loss of 15.93 on various sequences. The experiments are documented in this video: https://www.youtube.com/watch?v=hnmqwxpaTEw
  • Other:

    Author as listed in the article: Singh, Aditya; Narula, Raghav; Rashwan, Hatem A; Abdel-Nasser, Mohamed; Puig, Domenec; Nandi, G C
    Department: Enginyeria Informàtica i Matemàtiques
    URV author(s): Abdellatif Fatahallah Ibrahim Mahmoud, Hatem / Abdelnasser Mohamed Mahmoud, Mohamed / Puig Valls, Domènec Savi / Singh, Aditya
    Keywords: Visual odometry; SLAM; Real-time; Object detection; Mapping; Household robots; Agglomerative clustering
    Subject areas: Animal science / fishery resources; Software; Mathematics / probability and statistics; Interdisciplinary; Engineering IV; Engineering III; Engineering I; Computer science, artificial intelligence; Biological sciences II; Biological sciences I; Environmental sciences; Agricultural sciences I; Computer science; Biotechnology; Artificial intelligence; Public and business administration, accounting and tourism
    Access to the license of use: https://creativecommons.org/licenses/by/3.0/es/
    Author email address: mohamed.abdelnasser@urv.cat hatem.abdellatif@urv.cat aditya.singh@urv.cat domenec.puig@urv.cat
    Author identifier: 0000-0002-1074-2441 0000-0001-5421-1637 0000-0001-6281-9431 0000-0002-0562-4205
    Record creation date: 2024-09-21
    Deposited article version: info:eu-repo/semantics/acceptedVersion
    License document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
    Article reference according to the original source: Neural Computing & Applications. 34 (18): 15617-15631
    Item reference in APA style: Singh, Aditya; Narula, Raghav; Rashwan, Hatem A; Abdel-Nasser, Mohamed; Puig, Domenec; Nandi, G C (2022). Efficient deep learning-based semantic mapping approach using monocular vision for resource-limited mobile robots. Neural Computing & Applications, 34(18), 15617-15631. DOI: 10.1007/s00521-022-07273-7
    Institution: Universitat Rovira i Virgili
    Journal publication year: 2022
    Publication type: Journal Publications
  • Keywords:

    Artificial Intelligence; Computer Science, Artificial Intelligence; Software
    Visual odometry
    SLAM
    Real-time
    Object detection
    Mapping
    Household robots
    Agglomerative clustering
    Animal science / fishery resources
    Software
    Mathematics / probability and statistics
    Interdisciplinary
    Engineering IV
    Engineering III
    Engineering I
    Computer science, artificial intelligence
    Biological sciences II
    Biological sciences I
    Environmental sciences
    Agricultural sciences I
    Computer science
    Biotechnology
    Artificial intelligence
    Public and business administration, accounting and tourism