2,870 research outputs found

    Feedback-based integration of the whole process of data anonymization in a graphical interface

    The interactive, web-based point-and-click application presented in this article allows users to anonymize data without any knowledge of a programming language. Anonymization is important in data mining, but creating safe, anonymized data is by no means a trivial task. Both methodological issues and know-how from subject-matter specialists should be taken into account when anonymizing data. Even though specialized software such as sdcMicro exists, it is often difficult for users who are not experts in that software and who lack programming skills to anonymize datasets without an appropriate app. The presented app is not restricted to applying disclosure limitation techniques but rather facilitates the entire anonymization process. The interface allows users to upload data to the system, modify them, and create an object defining the disclosure scenario. Once such a statistical disclosure control (SDC) problem has been defined, users can apply anonymization techniques to this object and get instant feedback on the impact on risk and data utility after SDC methods have been applied. Additional features, such as an Undo button, the possibility to export the anonymized dataset or the code required for reproducibility, and the app's interactive features, make it convenient for both experts and nonexperts in R (the free software environment for statistical computing and graphics) to protect a dataset.
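The feedback loop described in this abstract (apply an SDC method, then check how the risk changes) can be illustrated with a minimal sketch. It is written in Python rather than R and does not use or reproduce the sdcMicro API; the function names, the toy records, and the choice of k-anonymity violations as the risk measure are all assumptions made for illustration.

```python
from collections import Counter


def k_anonymity_violations(records, key_vars, k=2):
    """Count records whose combination of key variables occurs fewer than k times."""
    keys = [tuple(rec[v] for v in key_vars) for rec in records]
    freq = Counter(keys)
    return sum(1 for key in keys if freq[key] < k)


def recode_age(rec, width=10):
    """Global recoding (hypothetical helper): replace exact age with an age band."""
    lo = (rec["age"] // width) * width
    return {**rec, "age": f"{lo}-{lo + width - 1}"}


# Toy data, invented for this example only.
records = [
    {"age": 23, "sex": "f", "region": "north"},
    {"age": 27, "sex": "f", "region": "north"},
    {"age": 25, "sex": "f", "region": "north"},
    {"age": 41, "sex": "m", "region": "south"},
    {"age": 45, "sex": "m", "region": "south"},
    {"age": 62, "sex": "m", "region": "south"},
]
key_vars = ["age", "sex", "region"]

# Risk before and after applying one anonymization step.
before = k_anonymity_violations(records, key_vars, k=2)
recoded = [recode_age(r) for r in records]
after = k_anonymity_violations(recoded, key_vars, k=2)
print(f"records violating 2-anonymity: before={before}, after={after}")
```

Comparing `before` and `after` is the kind of instant risk feedback the abstract refers to. A real tool would also report a utility measure, since recoding age into bands discards information.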

    Synthetic sequence generator for recommender systems - memory biased random walk on sequence multilayer network

    Personalized recommender systems rely on each user's personal usage data to assist in decision making. However, privacy policies protecting users' rights prevent these highly personal data from being made publicly available to the wider research community. In this work, we propose a memory biased random walk model on a multilayer sequence network as a generator of synthetic sequential data for recommender systems. We demonstrate the applicability of the synthetic data in training recommender system models for cases when privacy policies restrict clickstream publishing.
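The abstract does not spell out the walk rule, so the following is only a rough sketch of one way a memory-biased random walk could generate synthetic item sequences. It assumes a single-layer item graph given as adjacency lists (the multilayer structure is omitted), and the parameters `memory_size` and `p_memory` are hypothetical, not taken from the paper.

```python
import random


def memory_biased_walk(graph, start, length, memory_size=3, p_memory=0.3, seed=None):
    """Generate a synthetic item sequence from a random walk with memory.

    Assumed rule: with probability p_memory the next step departs from a
    recently visited item (drawn from the last memory_size items) instead of
    the current item.
    """
    rng = random.Random(seed)
    walk = [start]
    while len(walk) < length:
        memory = walk[-memory_size:]
        source = rng.choice(memory) if rng.random() < p_memory else walk[-1]
        neighbors = graph[source]
        if not neighbors:
            break  # dead end: stop the sequence early
        walk.append(rng.choice(neighbors))
    return walk


# Toy item graph, invented for this example: edges link items seen consecutively.
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
sequence = memory_biased_walk(graph, start="a", length=10, seed=42)
print(sequence)
```

Sequences generated this way contain no real user's clickstream, which is the property that makes synthetic data attractive when privacy policies restrict publishing.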

    Clinical Scores for Dyspnoea Severity in Children: A Prospective Validation Study

    In acutely dyspnoeic children, assessment of dyspnoea severity and treatment response is frequently based on clinical dyspnoea scores. Our study aim was to validate five commonly used paediatric dyspnoea scores. Fifty children aged 0-8 years with acute dyspnoea were clinically assessed before and after bronchodilator treatment; a subset of 27 children were videotaped and assessed twice by nine observers. The observers scored the clinical signs necessary to calculate the Asthma Score (AS), Asthma Severity Score (ASS), Clinical Asthma Evaluation Score 2 (CAES-2), Pediatric Respiratory Assessment Measure (PRAM), and the respiratory rate, accessory muscle use, decreased breath sounds (RAD) score. A total of 1120 observations were used to assess fourteen measurement properties within the domains of validity, reliability and utility. All five dyspnoea scores showed overall poor results, scoring insufficiently on more than half of the quality criteria for measurement properties. The AS and PRAM were the most valid, with good values on six and moderate values on three properties. Poor results were mainly due to insufficient measurement properties in the validity and reliability domains, whereas utility properties were moderate to good in all scores. This study shows that commonly used dyspnoea scores have insufficient validity and reliability to allow for clinical use without caution.

    Assessing Maine’s ERAM experiment

    Maine’s utility regulators have occasionally ventured into the uncharted waters of utility regulation reform. Some such efforts have been more successful than others. Leslie Hudson and Stephanie Seguino document the process and outcomes of one such attempt at alternative electric utility regulation, the Electric Revenue Adjustment Mechanism, or ERAM. They endeavor to answer several questions arising from this brief and failed, but interesting, regulatory experiment.

    Peak-ratio analysis method for enhancement of LOM protection using M class PMUs

    A novel technique for loss of mains (LOM) detection using Phasor Measurement Unit (PMU) data is described in this paper. The technique, known as the Peak Ratio Analysis Method (PRAM), improves both the sensitivity and the stability of LOM protection when compared to prevailing techniques. It is based on a Rate of Change of Frequency (ROCOF) measurement from M-class PMUs, but its key novelty is a new “peak-ratio” analysis of the measured ROCOF waveform during any frequency disturbance, which determines whether the potentially islanded element of the network is grid connected or not. The proposed technique is described, and several examples of its operation are compared with three competing LOM protection methods that have all been widely used by industry and/or reported in the literature: standard ROCOF, Phase Offset Relay (POR) and Phase Angle Difference (PAD) methods. It is shown that the PRAM technique exhibits comparable performance to the others and in many cases improves upon their abilities, in particular for systems where the inertia of the main power system is reduced, which may arise in future systems with increased penetrations of renewable generation and HVDC infeed.
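For orientation, the standard ROCOF baseline named in this abstract can be sketched as a finite-difference estimate of df/dt compared against a trip setting. This is not the PRAM peak-ratio analysis, which the abstract does not define. The function names, the 50 frames/s reporting rate, the 0.5 Hz/s setting, and the sample values are illustrative assumptions.

```python
def rocof(freqs_hz, report_rate_hz):
    """Estimate ROCOF (Hz/s) by differencing consecutive frequency samples."""
    dt = 1.0 / report_rate_hz
    return [(b - a) / dt for a, b in zip(freqs_hz, freqs_hz[1:])]


def rocof_trip(freqs_hz, report_rate_hz, threshold_hz_per_s):
    """Return True if any ROCOF estimate exceeds the setting in magnitude."""
    return any(abs(r) > threshold_hz_per_s for r in rocof(freqs_hz, report_rate_hz))


# Invented frequency samples at an assumed 50 frames/s reporting rate.
steady = [50.0, 50.001, 50.0, 49.999]
disturbance = [50.0, 50.0, 49.98, 49.96, 49.94]  # falling at about 1 Hz/s

print(rocof_trip(steady, 50, threshold_hz_per_s=0.5))
print(rocof_trip(disturbance, 50, threshold_hz_per_s=0.5))
```

A practical relay would filter the measurement and apply a time delay before tripping. The sketch omits both, so a single noisy sample could trigger it.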

    Avoiding disclosure of individually identifiable health information: a literature review

    Achieving data and information dissemination without harming anyone is a central task of any entity in charge of collecting data. In this article, the authors examine the literature on data and statistical confidentiality. Rather than comparing the theoretical properties of specific methods, they emphasize the main themes that emerge from the ongoing discussion among scientists regarding how best to achieve the appropriate balance between data protection, data utility, and data dissemination. They cover the literature on de-identification and reidentification methods with emphasis on health care data. The authors also discuss the benefits and limitations of the most common access methods. Although there is abundant theoretical and empirical research, their review reveals a lack of consensus on fundamental questions for empirical practice: how to assess disclosure risk, how to choose among disclosure methods, how to assess reidentification risk, and how to measure utility loss. Keywords: public use files, disclosure avoidance, reidentification, de-identification, data utility