
Utility preserving query log anonymization via semantic microaggregation

  • Identifying data

    Identifier: imarina:9285292
    Authors:
    Batet, Montserrat; Erola, Arnau; Sanchez, David; Castella-Roca, Jordi
    Abstract:
    Query logs are of great interest for scientists and companies for research, statistical and commercial purposes. However, the availability of query logs for secondary uses raises privacy issues since they allow the identification and/or revelation of sensitive information about individual users. Hence, query anonymization is crucial to avoid identity disclosure. To enable the publication of privacy-preserved - but still useful - query logs, in this paper, we present an anonymization method based on semantic microaggregation. Our proposal aims at minimizing the disclosure risk of anonymized query logs while retaining their semantics as much as possible. First, a method to map queries to their formal semantics extracted from the structured categories of the Open Directory Project is presented. Then, a microaggregation method is adapted to perform a semantically-grounded anonymization of query logs. To do so, appropriate semantic similarity and semantic aggregation functions are proposed. Experiments performed using real AOL query logs show that our proposal better retains the utility of anonymized query logs than other related works, while also minimizing the disclosure risk. © 2013 Elsevier Inc. All rights reserved.
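    The abstract names three ingredients: queries mapped to categories of a structured directory, a semantic similarity function, and a semantic aggregation function used inside microaggregation. The following is a minimal sketch of how such pieces could fit together, not the authors' algorithm. It assumes each user is already summarized by a single category path; the names `common_prefix`, `similarity` and `microaggregate`, the Wu-Palmer-style score, and the greedy grouping are all illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# A category path such as ("Top", "Sports", "Soccer"); the taxonomy is hypothetical.
Path = Tuple[str, ...]


def common_prefix(paths: Sequence[Path]) -> Path:
    """Deepest category shared by all paths (used here as the aggregate)."""
    prefix = []
    for level in zip(*paths):
        if len(set(level)) != 1:
            break
        prefix.append(level[0])
    return tuple(prefix)


def similarity(a: Path, b: Path) -> float:
    """Wu-Palmer-style score in [0, 1]: 2 * depth(shared) / (depth(a) + depth(b))."""
    total = len(a) + len(b)
    if total == 0:
        return 1.0  # two empty paths are treated as identical
    return 2 * len(common_prefix([a, b])) / total


def microaggregate(paths: List[Path], k: int) -> List[Path]:
    """Greedily build groups of at least k paths; replace each path by its group's prefix."""
    if k < 1 or len(paths) < k:
        raise ValueError("need k >= 1 and at least k records")
    remaining = list(range(len(paths)))
    groups = []
    # Stop while at least k records are left over, so the final group also has size >= k.
    while len(remaining) >= 2 * k:
        seed = remaining[0]
        ranked = sorted(remaining[1:], key=lambda i: -similarity(paths[seed], paths[i]))
        group = [seed] + ranked[: k - 1]
        groups.append(group)
        remaining = [i for i in remaining if i not in group]
    groups.append(remaining)

    result: List[Path] = [()] * len(paths)
    for group in groups:
        representative = common_prefix([paths[i] for i in group])
        for i in group:
            result[i] = representative
    return result


example = [
    ("Top", "Sports", "Soccer"),
    ("Top", "Sports", "Tennis"),
    ("Top", "Health", "Nutrition"),
    ("Top", "Health", "Fitness"),
]
anonymized = microaggregate(example, k=2)
```

    In this toy input, each published path is shared by at least k records, and using the shared ancestor category as the aggregate keeps as much of the original meaning as the group allows. A real system would need per-user profiles with many queries and a more careful grouping heuristic.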
  • Other:

    Author as listed in the article: Batet, Montserrat; Erola, Arnau; Sanchez, David; Castella-Roca, Jordi
    Department: Enginyeria Informàtica i Matemàtiques
    URV author(s): Batet Sanromà, Montserrat / Castellà Roca, Jordi / EROLA CAÑELLAS, ARNAU / Sánchez Ruenes, David
    Keywords: Sensitive informations; Semantics; Semantic similarity; Semantic aggregation; Query logs; Privacy-preservation; Privacy preservation; Open directory projects; Microaggregation; Information retrieval; Data utility; Data utilities
    Subject areas: Theoretical computer science; Software; Medicina ii; Matemática / probabilidade e estatística; Interdisciplinar; Information systems and management; Ensino; Engenharias iv; Engenharias iii; Engenharias i; Control and systems engineering; Comunicação e informação; Computer science, information systems; Computer science applications; Ciencias sociales; Ciências biológicas i; Ciências ambientais; Ciências agrárias i; Ciência da computação; Biodiversidade; Astronomia / física; Artificial intelligence; Administração pública e de empresas, ciências contábeis e turismo
    License of use: https://creativecommons.org/licenses/by/3.0/es/
    Author's email address: montserrat.batet@urv.cat david.sanchez@urv.cat jordi.castella@urv.cat
    Author identifier: 0000-0001-8174-7592 0000-0001-7275-7887 0000-0002-0037-9888
    Record creation date: 2024-10-12
    Deposited article version: info:eu-repo/semantics/acceptedVersion
    License document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
    Article reference according to the original source: Information Sciences, 242, 49-63
    Item reference in APA style: Batet, Montserrat; Erola, Arnau; Sanchez, David; Castella-Roca, Jordi (2013). Utility preserving query log anonymization via semantic microaggregation. Information Sciences, 242, 49-63. DOI: 10.1016/j.ins.2013.04.020
    Institution: Universitat Rovira i Virgili
    Journal publication year: 2013
    Publication type: Journal Publications