
    Cor-Split: Defending Privacy in Data Re-Publication from Historical Correlations and Compromised Tuples

    Abstract. Several approaches have been proposed for privacy preserving data publication. In this paper we consider the important case in which a certain view over a dynamic dataset has to be released a number of times during its history. The insufficiency of techniques used for one-shot publication in the case of subsequent releases has been previously recognized, and some new approaches have been proposed. Our research shows that relevant privacy threats, not recognized by previous proposals, can occur in practice. In particular, we show the cascading effects that a single (or a few) compromised tuples can have in data re-publication when coupled with the ability of an adversary to recognize historical correlations among released tuples. A theoretical study of the threats leads us to a defense algorithm, implemented as a significant extension of the m-invariance technique. Extensive experiments using publicly available datasets show that the proposed technique preserves the utility of published data and effectively protects from the identified privacy threats.

    QuerioCity: A Linked Data Platform for Urban Information Management

    Abstract. In this paper, we present QuerioCity, a platform to catalog, index and query highly heterogeneous information coming from complex systems, such as cities. A series of challenges are identified: namely, the heterogeneity of the domain and the lack of a common model, the volume of information and the number of data sets, the requirement for a low entry threshold to the system, the diversity of the input data, in terms of format, syntax and update frequency (streams vs static data), and the sensitivity of the information. We propose an approach for incremental and continuous integration of static and streaming data, based on Semantic Web technologies. The proposed system is unique in the literature in terms of handling of multiple integrations of available data sets in combination with flexible provenance tracking, privacy protection and continuous integration of streams. We report on lessons learnt from building the first prototype for Dublin.

    Access Control for Data Integration in Presence of Data Dependencies

    Defining access control policies in a data integration scenario is a challenging task. In such a scenario typically each source specifies its local access control policy and cannot anticipate data inferences that can arise when data is integrated at the mediator level. Inferences, e.g., using functional dependencies, can allow malicious users to obtain, at the mediator level, prohibited information by linking multiple queries and thus violating the local policies. In this paper, we propose a framework, i.e., a methodology and a set of algorithms, to prevent such violations. First, we use a graph-based approach to identify sets of queries, called violating transactions, and then we propose an approach to forbid the execution of those transactions by identifying additional access control rules that should be added to the mediator. We also state the complexity of the algorithms and discuss a set of experiments we conducted by using both real and synthetic datasets. Tests also confirm the complexity and upper bounds in worst-case scenarios of the proposed algorithms.

    The prevalence of mild cognitive impairment in diverse geographical and ethnocultural regions: The COSMIC Collaboration

    Background: Changes in criteria and differences in populations studied and methodology have produced a wide range of prevalence estimates for mild cognitive impairment (MCI). Methods: Uniform criteria were applied to harmonized data from 11 studies from USA, Europe, Asia and Australia, and MCI prevalence estimates determined using three separate definitions of cognitive impairment. Results: The published range of MCI prevalence estimates was 5.0%-36.7%. This was reduced with all cognitive impairment definitions: performance in the bottom 6.681% (3.2%-10.8%); Clinical Dementia Rating of 0.5 (1.8%-14.9%); Mini-Mental State Examination score of 24-27 (2.1%-20.7%). Prevalences using the first definition were 5.9% overall, and increased with age (P < .001) but were unaffected by sex or the main races/ethnicities investigated (Whites and Chinese). Not completing high school increased the likelihood of MCI (P = .01). Conclusion: Applying uniform criteria to harmonized data greatly reduced the variation in MCI prevalence internationally.

    Secure Re-publication of Dynamic Big Data


    Development of building energy saving advisory: A data mining approach

    Occupants’ behavior and their interaction with home appliances are crucial for assessing building energy consumption. This study proposes a new methodology for monitoring the energy consumed in building end-use loads to build an advisory system. The built system alerts occupants to take certain measures (prioritized recommendations) to reduce energy consumption of end-use loads. The quantification of potential savings is also provided upon following said measures. The proposed methodology is also capable of evaluating the energy savings achieved by the occupants. The system works based on the analysis of historical data generated by occupants using data mining techniques to output highly feasible recommendations. For demonstration purposes, the methodology was tested on the real dataset of a building in Japan. The dataset includes detailed energy consumption of end-use loads, categorized as hot water supply, lighting, kitchen, refrigerator, entertainment & information, housework & sanitary, and others. Results suggest that the developed models are accurate, and that it is possible to save up to 21% of total energy consumption by only changing occupants’ energy use habits.

    Thwarting Passive Privacy Attacks in Collaborative Filtering
