Promising Depth Map Prediction Method from a Single Image Based on Conditional Generative Adversarial Network

Abdulwahab, Saddam; Rashwan, Hatem A; Masoumian, Armin; Sharaf, Najwa; Puig, Domenec

Identifying data

Identifier: imarina:9380784

Handle: https://hdl.handle.net/20.500.11797/imarina9380784

Authors:
Abdulwahab, Saddam; Rashwan, Hatem A; Masoumian, Armin; Sharaf, Najwa; Puig, Domenec

Abstract:
Pose estimation is typically performed on 3D images, whereas estimating pose from a single RGB image remains a difficult task. RGB images represent not only an object's shape but also intensity, which depends on viewpoint, texture, and lighting conditions. By contrast, 3D pose estimation from depth images is considered a promising approach because a depth image represents only the object's shape. It is therefore necessary to identify an appropriate method for predicting a depth image from a 2D RGB image, which can then be used for 3D pose estimation. In this paper, we propose a deep learning model for depth estimation intended to improve 3D pose estimation. The proposed model consists of two successive networks. The first is an autoencoder network that maps from the RGB domain to the depth domain. The second is a discriminator network that compares a real depth image with a generated depth image, helping the first network generate an accurate depth image. In this work, we do not use real depth images corresponding to the input color images. Our contribution is to use 3D CAD models corresponding to the objects appearing in the color images to render depth images from different viewpoints. These rendered images are then used as ground truth to guide the autoencoder network in learning the mapping from the image domain to the depth domain. The proposed model outperforms state-of-the-art models on the publicly available PASCAL 3D+ dataset.
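As a rough illustration of how the two networks described in the abstract might interact during training, the following minimal sketch computes conditional-GAN-style losses on toy values. It is not the authors' implementation: the function names, the binary cross-entropy formulation, the added L1 reconstruction term, and its weight `l1_weight` are assumptions borrowed from common image-to-image translation practice, since the abstract does not specify the loss. Here `target_depth` stands in for a depth image rendered from a CAD model, and `d_real` and `d_fake` stand in for the discriminator's probability outputs.

```python
import math


def bce(p, target, eps=1e-12):
    """Binary cross-entropy of one probability p against a 0/1 target."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def l1_distance(pred_depth, target_depth):
    """Mean absolute difference between two flat lists of depth values."""
    if len(pred_depth) != len(target_depth):
        raise ValueError("depth maps must have the same number of pixels")
    total = sum(abs(a - b) for a, b in zip(pred_depth, target_depth))
    return total / len(pred_depth)


def discriminator_loss(d_real, d_fake):
    """The discriminator should score rendered depth as 1 and generated depth as 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)


def generator_loss(d_fake, pred_depth, target_depth, l1_weight=100.0):
    """Adversarial term plus an ASSUMED L1 term against the rendered depth."""
    adversarial = bce(d_fake, 1.0)  # the generator wants its output scored as real
    return adversarial + l1_weight * l1_distance(pred_depth, target_depth)


# Toy values (hypothetical): a 4-pixel rendered depth map and a prediction.
rendered = [0.2, 0.5, 0.9, 1.0]
predicted = [0.25, 0.45, 0.9, 0.8]

d_loss = discriminator_loss(d_real=0.9, d_fake=0.3)
g_loss = generator_loss(d_fake=0.3, pred_depth=predicted, target_depth=rendered)
print(f"discriminator loss: {d_loss:.4f}")
print(f"generator loss:     {g_loss:.4f}")
```

In this sketch the generator loss falls both when the discriminator is fooled and when the predicted depth moves closer to the rendered depth, which is one plausible reading of the discriminator "supporting" the autoencoder.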
Other:

Author according to the article: Abdulwahab, Saddam; Rashwan, Hatem A; Masoumian, Armin; Sharaf, Najwa; Puig, Domenec
Department: Enginyeria Informàtica i Matemàtiques (Computer Engineering and Mathematics)
URV author(s): Abdellatif Fatahallah Ibrahim Mahmoud, Hatem / Abdulwahab, Saddam Abdulrhman Hamed / Masoumian, Armin / Puig Valls, Domènec Savi
Keywords: Deep learning; Depth prediction; Image segmentation; Image to image translation; Unet; Unet plus; Unet++
Subject areas: Artificial intelligence; Agricultural sciences I; Communication and information; Engineering III; Engineering IV; General or multidisciplinary; Information and documentation; Interdisciplinary; Medicine II
License of use: https://creativecommons.org/licenses/by/3.0/es/
Author's email address: domenec.puig@urv.cat; saddam.abdulwahab@urv.cat; armin.masoumian@estudiants.urv.cat; hatem.abdellatif@urv.cat
Author identifier: 0000-0002-0562-4205; 0000-0001-5421-1637
Record creation date: 2024-09-21
Version of the deposited article: info:eu-repo/semantics/publishedVersion
Reference to the article in the original source: Frontiers in Artificial Intelligence and Applications. 339, 392-401
Item reference in APA style: Abdulwahab, Saddam; Rashwan, Hatem A; Masoumian, Armin; Sharaf, Najwa; Puig, Domenec (2021). Promising Depth Map Prediction Method from a Single Image Based on Conditional Generative Adversarial Network. Amsterdam: IOS Press
License document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
Entity: Universitat Rovira i Virgili
Journal publication year: 2021
Publication type: Proceedings Paper
