
A semantic-preserving differentially private method for releasing query logs

  • Identifying data

    Identifier:  imarina:3934495
    Authors:  Sanchez, David; Batet, Montserrat; Viejo, Alexandre; Rodriguez-Garcia, Mercedes; Castella-Roca, Jordi
    Abstract:
    © 2018 Elsevier Inc. Query logs are of great interest for data analysis. They allow characterizing user profiles, user behaviors and search habits. However, since query logs usually contain personal information, data controllers should implement appropriate data protection mechanisms before releasing them for secondary use. In the past, the anonymization of query logs was tackled from the perspective of statistical disclosure control and by relying on privacy models such as k-anonymity, which do not scale well with the high dimensionality and dynamicity of query logs. To offer better privacy protection, some authors have recently embraced the robust privacy guarantees of ɛ-differential privacy. However, this comes at the cost of limiting the number and types of analyses that can be made on the protected queries. To tackle this issue, in this paper we propose a privacy protection method for query logs that joins the flexibility and convenience of privacy-preserving data releases with the strong privacy guarantees of ɛ-differential privacy. Moreover, to retain the analytical utility of the protected query, we have put special care in capturing, managing and preserving the semantics of the queries during the protection process. The empirical experiments we report show that our method produces differentially private query logs that are more useful for analysis than related works.
  • Other:

    Author according to the article: Sanchez, David; Batet, Montserrat; Viejo, Alexandre; Rodriguez-Garcia, Mercedes; Castella-Roca, Jordi
    Department: Enginyeria Informàtica i Matemàtiques (Computer Engineering and Mathematics)
    URV author(s): Batet Sanromà, Montserrat / Castellà Roca, Jordi / Sánchez Ruenes, David / Viejo Galicia, Luis Alexandre
    Keywords: Data utility; Differential privacy; Query logs; User profiling
    Subject areas: Public and business administration, accounting and tourism; Artificial intelligence; Astronomy / physics; Biodiversity; Computer science; Agricultural sciences I; Environmental sciences; Biological sciences I; Social sciences; Computer science applications; Computer science, information systems; Communication and information; Control and systems engineering; Engineering I; Engineering III; Engineering IV; Teaching; Information systems and management; Interdisciplinary; Mathematics / probability and statistics; Medicine II; Software; Theoretical computer science
    License of use: https://creativecommons.org/licenses/by/3.0/es/
    Author email addresses: alexandre.viejo@urv.cat; jordi.castella@urv.cat; david.sanchez@urv.cat; montserrat.batet@urv.cat
    ISSN: 0020-0255
    Record creation date: 2024-10-12
    Deposited article version: info:eu-repo/semantics/acceptedVersion
    Original source link: https://www.sciencedirect.com/science/article/abs/pii/S002002551830416X?via%3Dihub
    Article reference according to the original source: Information Sciences. 460-461 223-237
    Item reference in APA style: Sanchez, David; Batet, Montserrat; Viejo, Alexandre; Rodriguez-Garcia, Mercedes; Castella-Roca, Jordi (2018). A semantic-preserving differentially private method for releasing query logs. Information Sciences, 460-461, 223-237. DOI: 10.1016/j.ins.2018.05.046
    License document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
    Article DOI: 10.1016/j.ins.2018.05.046
    Entity: Universitat Rovira i Virgili
    Journal publication year: 2018
    Publication type: Journal Publications