Adversarial Learning for Depth and Viewpoint Estimation From a Single Image

  • Identifying data

    Identifier: imarina:8679846
    Authors:
    Abdulwahab, Saddam; Rashwan, Hatem A; Garcia, Miguel Angel; Jabreel, Mohammed; Chambon, Sylvie; Puig, Domenec
    Abstract:
    Estimating a depth map and, at the same time, predicting the 3D pose of an object from a single 2D color image is a very challenging task. Depth estimation is typically performed through stereo vision by following several time-consuming stages, such as epipolar geometry, rectification and matching. Alternatively, when stereo vision is not useful or applicable, depth relations can be inferred from a single image, as studied in this paper. More precisely, deep learning is applied to estimate a depth map from a single image. That map is then used to predict the 3D pose of the main object depicted in the image. The proposed model consists of two successive neural networks. The first network is based on a Generative Adversarial Network (GAN). It estimates a dense depth map from the given color image. A Convolutional Neural Network (CNN) is then used to predict the 3D pose from the generated depth map through regression. The main difficulty in jointly estimating depth maps and 3D poses using deep networks is the lack of training data with both depth and viewpoint annotations. This contribution assumes a cross-domain training procedure with 3D CAD models corresponding to objects appearing in real images in order to render depth images from different viewpoints. These rendered images are then used to guide the GAN to learn the mapping from the image domain to the depth domain. By exploiting the dataset as a source of training data, the proposed model outperforms state-of-the-art models on the PASCAL 3D+ dataset. The code of the proposed model is publicly available at https://github.com/SaddamAbdulrhman/Depth-and-Viewpoint-Estimation/tree/master.
  • Other:

    Author according to the article: Abdulwahab, Saddam; Rashwan, Hatem A; Garcia, Miguel Angel; Jabreel, Mohammed; Chambon, Sylvie; Puig, Domenec
    Department: Enginyeria Informàtica i Matemàtiques
    URV author(s): Abdellatif Fatahallah Ibrahim Mahmoud, Hatem / Garcia Garcia, Miguel Angel / Puig Valls, Domènec Savi
    Keywords: Three-dimensional displays; Solid modeling; Pose estimation; Generative adversarial networks; Face; Depth prediction; Deep learning; Color
    Thematic areas: Media technology; Mathematics / probability and statistics; Engineering, electrical & electronic; Engineering IV; Engineering III; Electrical and electronic engineering; Computer science; Public and business administration, accounting and tourism
    License of use: https://creativecommons.org/licenses/by/3.0/es/
    Author's email address: miguelangel.garciag@urv.cat hatem.abdellatif@urv.cat domenec.puig@urv.cat
    Author identifier: 0000-0001-9972-2182 0000-0001-5421-1637 0000-0002-0562-4205
    Record creation date: 2024-09-21
    Deposited article version: info:eu-repo/semantics/acceptedVersion
    License document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
    Article reference according to the original source: IEEE Transactions on Circuits and Systems for Video Technology. 30 (9): 2947-2958
    Item reference in APA style: Abdulwahab, Saddam; Rashwan, Hatem A; Garcia, Miguel Angel; Jabreel, Mohammed; Chambon, Sylvie; Puig, Domenec (2020). Adversarial Learning for Depth and Viewpoint Estimation From a Single Image. IEEE Transactions on Circuits and Systems for Video Technology, 30(9), 2947-2958. DOI: 10.1109/TCSVT.2020.2973068
    Entity: Universitat Rovira i Virgili
    Journal publication year: 2020
    Publication type: Journal Publications
  • Keywords:

    Electrical and Electronic Engineering; Engineering, Electrical & Electronic; Media Technology
    Three-dimensional displays
    Solid modeling
    Pose estimation
    Generative adversarial networks
    Face
    Depth prediction
    Deep learning
    Color
    Media technology
    Mathematics / probability and statistics
    Engineering, electrical & electronic
    Engineering IV
    Engineering III
    Electrical and electronic engineering
    Computer science
    Public and business administration, accounting and tourism