Toward sensitive document release with privacy guarantees

Sánchez, D; Batet, M

doi:10.1016/j.engappai.2016.12.013

Identification data

Identifier: imarina:5130658

Handle: https://hdl.handle.net/20.500.11797/imarina5130658

Authors: Sánchez, D; Batet, M

Abstract:
Privacy has become a serious concern for modern Information Societies. The sensitive nature of much of the data that are exchanged daily or released to untrusted parties requires that responsible organizations undertake appropriate privacy protection measures. Nowadays, many of these data are texts (e.g., emails, messages posted in social media, healthcare outcomes, etc.) that, because of their unstructured and semantic nature, constitute a challenge for automatic data protection methods. In fact, textual documents are usually protected manually, in a process known as document redaction or sanitization. To do so, human experts identify sensitive terms (i.e., terms that may reveal identities and/or confidential information) and protect them accordingly (e.g., via removal or, preferably, generalization). To relieve experts from this burdensome task, in a previous work we introduced the theoretical basis of C-sanitization, an inherently semantic privacy model that provides the basis for the development of automatic document redaction/sanitization algorithms and offers clear and a priori privacy guarantees on data protection. Despite its potential benefits, C-sanitization still presents some limitations when applied in practice (mainly regarding flexibility, efficiency and accuracy). In this paper, we propose a new, more flexible model, named (C, g(C))-sanitization, which enables an intuitive configuration of the trade-off between the desired level of protection (i.e., controlled information disclosure) and the preservation of the utility of the protected data (i.e., amount of semantics to be preserved). Moreover, we present a set of technical solutions and algorithms that provide an efficient and scalable implementation of the model and improve its practical accuracy, as we illustrate through empirical experiments.
Other:

Original source link: https://www.sciencedirect.com/science/article/pii/S0952197616302408
Item reference according to APA style: Sánchez, D; Batet, M (2017). Toward sensitive document release with privacy guarantees. Engineering Applications Of Artificial Intelligence, 59, 23-34. DOI: 10.1016/j.engappai.2016.12.013
Article reference according to the original source: Engineering Applications Of Artificial Intelligence. 59 23-34
Article DOI: 10.1016/j.engappai.2016.12.013
Journal publication year: 2017-03-01
Entity: Universitat Rovira i Virgili
Version of the deposited article: info:eu-repo/semantics/acceptedVersion
Record creation date: 2026-05-09
URV author(s): Batet Sanromà, Montserrat / Sánchez Ruenes, David
Department: Enginyeria Informàtica i Matemàtiques
License document URL: https://repositori.urv.cat/ca/proteccio-de-dades/
Publication type: Journal Publications
Author according to the article: Sánchez, D; Batet, M
Link to the usage license: https://creativecommons.org/licenses/by/3.0/es/
Subject areas: Robotics & automatic control, Engineering, multidisciplinary, Engineering, electrical & electronic, Engineering, Engenharias iv, Electrical and electronic engineering, Control and systems engineering, Computer science, artificial intelligence, Ciência da computação, Automation & control systems, Artificial intelligence
Author's email address: montserrat.batet@urv.cat, david.sanchez@urv.cat

Keywords:

Semantics
Sanitization
Privacy
Ontologies
Document redaction
Artificial Intelligence
Automation & Control Systems
Computer Science
Control and Systems Engineering
Electrical and Electronic Engineering
Engineering
Electrical & Electronic
Multidisciplinary
Robotics & Automatic Control
Engenharias iv
Ciência da computação