4 research outputs found

    Mathematically optimized, recursive prepartitioning strategies for k-anonymous microaggregation of large-scale datasets

    Get PDF
    © Elsevier. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/The technical contents of this work fall within the statistical disclosure control (SDC) field, which concerns the postprocessing of the demographic portion of the statistical results of surveys containing sensitive personal information, in order to effectively safeguard the anonymity of the participating respondents. A widely known technique to solve the problem of protecting the privacy of the respondents involved beyond the mere suppression of their identifiers is the k-anonymous microaggregation. Unfortunately, most microaggregation algorithms that produce competitively low levels of distortions exhibit a superlinear running time, typically scaling with the square of the number of records in the dataset. This work proposes and analyzes an optimized prepartitioning strategy to reduce significantly the running time for the k-anonymous microaggregation algorithm operating on large datasets, with mild loss in data utility with respect to that of MDAV, the underlying method. The optimization strategy is based on prepartitioning a dataset recursively until the desired k-anonymity parameter is achieved. Traditional microaggregation algorithms have quadratic computational complexity in the form T(n2). By using the proposed method and fixing the number of recurrent prepartitions we obtain subquadratic complexity in the form T(n3/2), T(n4/3), ..., depending on the number of prepartitions. Alternatively, fixing the ratio between the size of the microcell and the macrocell on each prepartition, quasilinear complexity in the form T(nlog¿n) is achieved. Our method is readily applicable to large-scale datasets with numerical demographic attributes.Peer ReviewedPostprint (author's final draft

    Differentially private publication of database streams via hybrid video coding

    Get PDF
    While most anonymization technology available today is designed for static and small data, the current picture is of massive volumes of dynamic data arriving at unprecedented velocities. From the standpoint of anonymization, the most challenging type of dynamic data is data streams. However, while the majority of proposals deal with publishing either count-based or aggregated statistics about the underlying stream, little attention has been paid to the problem of continuously publishing the stream itself with differential privacy guarantees. In this work, we propose an anonymization method that can publish multiple numerical-attribute, finite microdata streams with high protection as well as high utility, the latter aspect measured as data distortion, delay and record reordering. Our method, which relies on the well-known differential pulse-code modulation scheme, adapts techniques originally intended for hybrid video encoding, to favor and leverage dependencies among the blocks of the original stream and thereby reduce data distortion. The proposed solution is assessed experimentally on two of the largest data sets in the scientific community working in data anonymization. Our extensive empirical evaluation shows the trade-off among privacy protection, data distortion, delay and record reordering, and demonstrates the suitability of adapting video-compression techniques to anonymize database streams

    XIII Jornadas de ingeniería telemática (JITEL 2017)

    Full text link
    Las Jornadas de Ingeniería Telemática (JITEL), organizadas por la Asociación de Telemática (ATEL), constituyen un foro propicio de reunión, debate y divulgación para los grupos que imparten docencia e investigan en temas relacionados con las redes y los servicios telemáticos. Con la organización de este evento se pretende fomentar, por un lado el intercambio de experiencias y resultados, además de la comunicación y cooperación entre los grupos de investigación que trabajan en temas relacionados con la telemática. En paralelo a las tradicionales sesiones que caracterizan los congresos científicos, se desea potenciar actividades más abiertas, que estimulen el intercambio de ideas entre los investigadores experimentados y los noveles, así como la creación de vínculos y puntos de encuentro entre los diferentes grupos o equipos de investigación. Para ello, además de invitar a personas relevantes en los campos correspondientes, se van a incluir sesiones de presentación y debate de las líneas y proyectos activos de los mencionados equiposLloret Mauri, J.; Casares Giner, V. (2018). XIII Jornadas de ingeniería telemática (JITEL 2017). Editorial Universitat Politècnica de València. http://hdl.handle.net/10251/97612EDITORIA
    corecore