Utility-preserving privacy protection of nominal data sets via semantic rank swapping

  • Identifying data

    Identifier:  imarina:5131995
    Authors:  Rodriguez-Garcia, Mercedes; Batet, Montserrat; Sanchez, David
    Abstract:
    © 2018 Personal data are of great interest for research but, at the same time, they pose a serious privacy risk. Therefore, the data controller should apply appropriate data protection measures before making personal data available for secondary use. Such protection should also leave the data useful for analysis. In recent years, a plethora of data protection mechanisms have been proposed. Among them, rank swapping is considered one of the best with respect to disclosure risk minimization and data utility preservation. Because rank swapping is based on sorting input data to swap values that are close to each other, it is, in principle, restricted to numerical and ordinal categorical data. However, a significant amount of the personal data currently compiled and used in data analysis are nominal, and their utility depends on the semantics they convey. To properly cope with this type of data, in this paper we present rank swapping methods capable of protecting nominal data from a semantic perspective. Specifically, by exploiting ontologies, our methods are able to protect nominal data while properly preserving their semantics and, thus, their analytical utility. To that end, we provide a suitable binary relation to semantically sort nominal data. Our proposal is capable of managing both independent individual attributes and non-independent multivariate data sets, the latter being especially relevant for data analysis. Empirical experiments carried out on real clinical records and using a standard medical ontology show that our methods preserve the semantic features of nominal data significantly better than standard permutation mechanisms.
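The abstract describes rank swapping as sorting the input data and swapping values that are close to each other. As a rough illustration only, the following Python sketch shows the classic numerical variant for a single attribute. It is not the semantic method proposed in the paper, whose ontology-based ordering is not detailed in this record. The function name `rank_swap`, the window parameter `p` (maximum rank distance as a percentage of the number of records), and the sample values are all assumptions made for this example.

```python
import random


def rank_swap(values, p=10, seed=None):
    """Sketch of classic rank swapping for one numerical attribute.

    `p` (hypothetical parameter) is the maximum rank distance allowed
    between two swapped values, as a percentage of the record count.
    """
    rng = random.Random(seed)
    n = len(values)
    # Record indices ordered by ascending value, i.e. by rank.
    order = sorted(range(n), key=lambda i: values[i])
    window = max(1, n * p // 100)
    result = list(values)
    swapped = [False] * n  # indexed by rank position

    for r in range(n):
        if swapped[r]:
            continue
        # Candidate partners: unswapped ranks at most `window` positions ahead.
        candidates = [s for s in range(r + 1, min(n, r + window + 1))
                      if not swapped[s]]
        if not candidates:
            continue  # no partner left; this value stays in place
        s = rng.choice(candidates)
        i, j = order[r], order[s]
        result[i], result[j] = values[j], values[i]
        swapped[r] = swapped[s] = True
    return result


# Illustrative (made-up) attribute values.
data = [52, 17, 88, 34, 61, 45, 29, 73, 90, 12]
protected = rank_swap(data, p=20, seed=0)
```

Each record ends up with a value whose rank is within the window of its original rank, and the overall set of values is unchanged, which is why the method tends to preserve univariate statistics. A semantic variant for nominal data would presumably replace the numerical sort with an ordering derived from an ontology, as the abstract indicates.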
  • Other:

    Author according to the article: Rodriguez-Garcia, Mercedes; Batet, Montserrat; Sanchez, David
    Department: Enginyeria Informàtica i Matemàtiques (Computer Engineering and Mathematics)
    URV author(s): Batet Sanromà, Montserrat / Sánchez Ruenes, David
    Keywords: Nominal data; Ontologies; Rank swapping; Semantics
    Subject areas: Computer science; Computer science, artificial intelligence; Computer science, theory & methods; Engineering III; Engineering IV; Hardware and architecture; Information systems; Signal processing; Software
    License of use: https://creativecommons.org/licenses/by/3.0/es/
    Author's email address: david.sanchez@urv.cat; montserrat.batet@urv.cat
    ISSN: 1566-2535
    Record creation date: 2025-02-18
    Deposited article version: info:eu-repo/semantics/acceptedVersion
    Original source link: https://www.sciencedirect.com/science/article/abs/pii/S1566253517304657?via%3Dihub
    Article reference according to the original source: Information Fusion. 45 282-295
    Item reference in APA style: Rodriguez-Garcia, Mercedes; Batet, Montserrat; Sanchez, David (2019). Utility-preserving privacy protection of nominal data sets via semantic rank swapping. Information Fusion, 45, 282-295. DOI: 10.1016/j.inffus.2018.02.008
    License document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
    Article DOI: 10.1016/j.inffus.2018.02.008
    Entity: Universitat Rovira i Virgili
    Journal publication year: 2019
    Publication type: Journal Publications
  • Keywords:

    Nominal data
    Ontologies
    Rank swapping
    Semantics
    Computer science
    Computer science, artificial intelligence
    Computer science, theory & methods
    Engineering III
    Engineering IV
    Hardware and architecture
    Information systems
    Signal processing
    Software