
Digital forgetting in large language models: a survey of unlearning methods

  • Identifying data

    Identifier:  imarina:9435540
    Authors:  Blanco-Justicia, Alberto; Jebreel, Najeeb; Manzanares-Salor, Benet; Sanchez, David; Domingo-Ferrer, Josep; Collell, Guillem; Eeik Tan, Kuan
    Abstract:
    Large language models (LLMs) have become the state of the art in natural language processing. The massive adoption of generative LLMs and the capabilities they have shown have prompted public concerns regarding their impact on the labor market, privacy, the use of copyrighted work, and how these models align with human ethics and the rule of law. As a response, new regulations are being pushed, which require developers and service providers to evaluate, monitor, and forestall or at least mitigate the risks posed by their models. One mitigation strategy is digital forgetting: given a model with undesirable knowledge or behavior, the goal is to obtain a new model where the detected issues are no longer present. Digital forgetting is usually enforced via machine unlearning techniques, which modify trained machine learning models for them to behave as models trained on a subset of the original training data. In this work, we describe the motivations and desirable properties of digital forgetting when applied to LLMs, and we survey recent works on machine unlearning. Specifically, we propose a taxonomy of unlearning methods based on the reach and depth of the modifications done on the models, we discuss and compare the effectiveness of machine unlearning methods for LLMs proposed so far, and we survey their evaluation. Finally, we describe open problems of machine unlearning applied to LLMs and we put forward recommendations for developers and practitioners.
  • Other:

    Original source link: https://link.springer.com/article/10.1007/s10462-024-11078-6
    Item reference in APA style: Blanco-Justicia, Alberto; Jebreel, Najeeb; Manzanares-Salor, Benet; Sanchez, David; Domingo-Ferrer, Josep; Collell, Guillem; Eeik Tan, Kuan (2025). Digital forgetting in large language models: a survey of unlearning methods. Artificial Intelligence Review, 58(90), -. DOI: 10.1007/s10462-024-11078-6
    Article reference according to the original source: Artificial Intelligence Review. 58 (90):
    Article DOI: 10.1007/s10462-024-11078-6
    Journal publication year: 2025
    Institution: Universitat Rovira i Virgili
    Deposited article version: info:eu-repo/semantics/publishedVersion
    Record creation date: 2025-03-03
    URV author(s): Blanco Justicia, Alberto / Domingo Ferrer, Josep / Jebreel, Najeeb Moharram Salim / Manzanares Salor, Benet / Sánchez Ruenes, David
    Department: Enginyeria Informàtica i Matemàtiques (Computer Engineering and Mathematics)
    License document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
    Publication type: Journal Publications
    Authors according to the article: Blanco-Justicia, Alberto; Jebreel, Najeeb; Manzanares-Salor, Benet; Sanchez, David; Domingo-Ferrer, Josep; Collell, Guillem; Eeik Tan, Kuan
    Usage license: https://creativecommons.org/licenses/by/3.0/es/
    Subject areas: Artificial intelligence, Biotecnología, Ciência da computação, Ciências biológicas I, Ciencias humanas, Ciencias sociales, Computer science, artificial intelligence, Engenharias IV, Filologia, lingüística i sociolingüística, Language and linguistics, Linguistics, Linguistics and language, Medicina I, Psicología, Psychology
    Author email addresses: josep.domingo@urv.cat, david.sanchez@urv.cat, najeeb.jebreel@urv.cat, alberto.blanco@urv.cat, benet.manzanares@urv.cat
  • Keywords:

    Copyright
    Large language models
    Machine unlearning
    Privacy
    Trustworthy AI
    Artificial Intelligence
    Computer Science
    Language and Linguistics
    Linguistics and Language
    Biotecnología
    Ciência da computação
    Ciências biológicas I
    Ciencias humanas
    Ciencias sociales
    Engenharias IV
    Filologia, lingüística i sociolingüística
    Linguistics
    Medicina I
    Psicología
    Psychology