
Explaining Image Misclassification in Deep Learning via Adversarial Examples

  • Identifying data

    Identifier:  imarina:9229338
    Authors:  Haffar, R; Jebreel, NM; Domingo-Ferrer, J; Sánchez, D
    Abstract:
    With the increasing use of convolutional neural networks (CNNs) for computer vision and other artificial intelligence tasks, the need arises to interpret their predictions. In this work, we tackle the problem of explaining CNN misclassification of images. We propose to construct adversarial examples that allow identifying the regions of the input images that had the largest impact on the CNN's wrong predictions. More specifically, for each image that was incorrectly classified by the CNN, we implemented an inverted adversarial attack consisting of modifying the input image as little as possible so that it becomes correctly classified. The changes made to the image to fix classification errors explain the causes of misclassification and allow adjusting the model and the data set to obtain more accurate models. We present two methods. The first employs the gradients from the CNN itself to create the adversarial examples and is meant for model developers. End users, however, only have access to the CNN model as a black box. Our second method is intended for them and employs a surrogate model to estimate the gradients of the original CNN model, which are then used to create the adversarial examples. In our experiments, the first method achieved a 99.67% success rate at finding the misclassification explanations and needed on average 1.96 queries per misclassified image to build the corresponding adversarial example. The second method achieved a 73.08% success rate at finding the explanations, with 8.73 queries per image on average.
  • Other:

    Item reference in APA style: Haffar, R; Jebreel, NM; Domingo-Ferrer, J; Sánchez, D (2021). Explaining Image Misclassification in Deep Learning via Adversarial Examples. Springer Science and Business Media Deutschland GmbH
    Article reference according to the original source: Lecture Notes in Computer Science. 12898 LNAI 323-334
    Article DOI: 10.1007/978-3-030-85529-1_26
    Journal publication year: 2021-01-01
    Institution: Universitat Rovira i Virgili
    Deposited article version: info:eu-repo/semantics/submittedVersion
    Record creation date: 2026-05-09
    URV author(s): Domingo Ferrer, Josep / Haffar, Rami / Sánchez Ruenes, David
    Department: Enginyeria Informàtica i Matemàtiques
    License document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
    Publication type: Proceedings Paper
    Authors as listed in the article: Haffar, R; Jebreel, NM; Domingo-Ferrer, J; Sánchez, D
    Subject areas: Theoretical computer science; Urban and regional planning / demography; General or multidisciplinary; General computer science; Communication and information; Computer science, theory & methods; Computer science, artificial intelligence; Computer science (miscellaneous); Computer science (all); Administration, accounting sciences and tourism
    Author email addresses: rami.haffar@urv.cat, david.sanchez@urv.cat, josep.domingo@urv.cat
  • Keywords:

    Image classification
    Explainability
    Deep learning
    Convolutional neural networks
    Adversarial examples
    Computer Science (Miscellaneous)
    Computer Science
    Artificial Intelligence
    Theory & Methods
    Theoretical Computer Science
    Urban and regional planning / demography
    General or multidisciplinary
    General computer science
    Communication and information
    Computer science (all)
    Administration, accounting sciences and tourism
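The inverted adversarial attack described in the abstract (nudging a misclassified image as little as possible until it is classified correctly) can be illustrated with a minimal sketch. This is not the authors' implementation: a tiny linear softmax classifier stands in for the CNN, the update is a plain iterative gradient-sign step, counting one query per iteration is an assumption, and every name below (`inverted_attack`, `loss_gradient`, `step`, `max_queries`) is hypothetical. It corresponds only to the first, white-box method, where the model's own gradients are available.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(W, b, x):
    """Class probabilities of a linear softmax classifier (stand-in for a CNN)."""
    logits = [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]
    return softmax(logits)

def predict(W, b, x):
    """Index of the most probable class."""
    probs = predict_proba(W, b, x)
    return max(range(len(probs)), key=probs.__getitem__)

def loss_gradient(W, b, x, label):
    """Gradient of the cross-entropy loss for `label` with respect to the input x."""
    probs = predict_proba(W, b, x)
    diff = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    return [sum(diff[k] * W[k][i] for k in range(len(W))) for i in range(len(x))]

def sign(g):
    return (g > 0) - (g < 0)

def inverted_attack(W, b, x, true_label, step=0.05, max_queries=50):
    """Move x in small steps toward the true label until the prediction is fixed.

    Returns (corrected input, perturbation, iterations used), or
    (None, None, max_queries) if no correction was found.
    """
    x_adv = list(x)
    for queries in range(1, max_queries + 1):
        grad = loss_gradient(W, b, x_adv, true_label)
        # Step against the gradient to lower the loss of the true label,
        # keeping pixel values inside [0, 1].
        x_adv = [min(1.0, max(0.0, v - step * sign(g))) for v, g in zip(x_adv, grad)]
        if predict(W, b, x_adv) == true_label:
            delta = [a - o for a, o in zip(x_adv, x)]
            return x_adv, delta, queries
    return None, None, max_queries

# Hypothetical 4-pixel "image" with two classes; only pixels 0 and 2 carry weight.
W = [[1.0, 0.0, -1.0, 0.0], [-1.0, 0.0, 1.0, 0.0]]
b = [0.0, 0.0]
image = [0.4, 0.5, 0.6, 0.5]  # predicted as class 1; assume the true class is 0
fixed, delta, queries = inverted_attack(W, b, image, true_label=0)
print(predict(W, b, image), predict(W, b, fixed), delta, queries)
```

In this toy setting the nonzero entries of `delta` mark the pixels that had to change for the prediction to become correct, which is the sense in which the perturbation acts as an explanation of the misclassification.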