Authors, as they appear in the article: Jordi Castellà-Roca; David Pàmies-Estrems; Alexandre Viejo
Department: Enginyeria Informàtica i Matemàtiques
URV's Author/s: CASTELLÀ ROCA, JORDI; David Pàmies-Estrems; VIEJO GALICIA, LUIS ALEXANDRE
Keywords: Web search; Query logs; Privacy
Abstract: The popularity of Web Search Engines (WSEs) enables them to generate large volumes of data in the form of query logs. These files contain all the search queries submitted by users. Selling or releasing those logs to third parties could yield economic benefits. Nevertheless, these data potentially expose sensitive user information, and removing direct identifiers is not sufficient to preserve the privacy of the users. Some existing privacy-preserving approaches process logs in batches but, as logs are generated and consumed in a real-time environment, a continuous anonymization process would be more convenient. To this end, this paper proposes: (i) a new method to anonymize query logs, based on k-anonymity; and (ii) some de-anonymization tools to determine possible privacy problems in case an attacker gains access to the anonymized query logs. The approach preserves the original user interests but spreads possible semi-identifier information over many users, preventing linkage attacks. To assess its performance, all the proposed algorithms are implemented and an extensive set of experiments is conducted using real data.
Research group: Criptografia i Secret Estadístic
Thematic Areas: Computer engineering
Licence for use: https://creativecommons.org/licenses/by/3.0/es/
ISSN: 0957-4174
Author identifier: 0000-0002-0037-9888; N/A; 0000-0003-2342-5100
Record's date: 2016-09-21
Last page: 535
Journal volume: 64
Paper version: info:eu-repo/semantics/acceptedVersion
Licence document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
Entity: Universitat Rovira i Virgili
Journal publication year: 2016
First page: 523
Publication Type: Article