1,897 research outputs found

    De-identifying a public use microdata file from the Canadian national discharge abstract database

    Abstract

    Background: The Canadian Institute for Health Information (CIHI) collects hospital discharge abstract data (DAD) from Canadian provinces and territories. There are many demands for the disclosure of this data for research and analysis to inform policy making. To expedite the disclosure of data for some of these purposes, the construction of a DAD public use microdata file (PUMF) was considered. Such purposes include: confirming some published results, providing broader feedback to CIHI to improve data quality, training students and fellows, providing an easily accessible data set for researchers to prepare for analyses on the full DAD data set, and serving as a large health data set for computer scientists and statisticians to evaluate analysis and data mining techniques. The objective of this study was to measure the probability of re-identification for records in a PUMF, and to de-identify a national DAD PUMF consisting of 10% of records.

    Methods: Plausible attacks on a PUMF were evaluated. Based on these attacks, the 2008-2009 national DAD was de-identified. A new algorithm was developed to minimize the amount of suppression while maximizing the precision of the data. The acceptable threshold for the probability of correct re-identification of a record was set between 0.04 and 0.05. Information loss was measured in terms of the extent of suppression and entropy.

    Results: Two different PUMF files were produced, one with geographic information, and one with no geographic information but more clinical information. At a threshold of 0.05, the maximum proportion of records with the diagnosis code suppressed was 20%, but these suppressions represented only 8-9% of all values in the DAD. Our suppression algorithm has less information loss than a more traditional approach to suppression. Smaller regions, patients with longer stays, and age groups that are infrequently admitted to hospitals tend to be the ones with the highest rates of suppression.

    Conclusions: The strategies we used to maximize data utility and minimize information loss can result in a PUMF that would be useful for the specific purposes noted earlier. However, to create a more detailed file with less information loss suitable for more complex health services research, the risk would need to be mitigated by requiring the data recipient to commit to a data sharing agreement.
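To make the idea of a re-identification threshold concrete, here is a minimal sketch of threshold-based suppression. It is not the algorithm from the paper, which the abstract does not describe. It assumes a simplified risk measure in which a record's risk is 1 divided by the number of records sharing its quasi-identifier values, and it blanks those values when the risk exceeds the threshold. The function name, field names, and sample data are all hypothetical.

```python
from collections import Counter


def suppress_risky_records(records, quasi_identifiers, threshold=0.05):
    """Blank the quasi-identifiers of records whose estimated risk exceeds threshold.

    Assumption (not from the paper): risk = 1 / size of the group of records
    that share the same quasi-identifier values.
    """
    def key(record):
        return tuple(record[q] for q in quasi_identifiers)

    class_sizes = Counter(key(r) for r in records)
    result = []
    for record in records:
        risk = 1.0 / class_sizes[key(record)]
        if risk > threshold:
            # Suppress only the quasi-identifiers; other fields are kept.
            record = {**record, **{q: None for q in quasi_identifiers}}
        result.append(record)
    return result


# Hypothetical data: 20 records share one combination, 1 record is unique.
common = {"region": "A", "age_group": "40-49", "diagnosis": "J18"}
unique = {"region": "B", "age_group": "90+", "diagnosis": "I21"}
released = suppress_risky_records([common] * 20 + [unique], ["region", "age_group"])
```

Under this simplified measure, a threshold of 0.05 corresponds to requiring groups of at least 20 records, so the 20 shared records are released unchanged and only the unique record has its region and age group suppressed.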

    MUSiC: a model-unspecific search for new physics in proton–proton collisions at √s=13TeV

    Results of the Model Unspecific Search in CMS (MUSiC), using proton–proton collision data recorded at the LHC at a centre-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 35.9 fb−1, are presented. The MUSiC analysis searches for anomalies that could be signatures of physics beyond the standard model. The analysis is based on the comparison of observed data with the standard model prediction, as determined from simulation, in several hundred final states and multiple kinematic distributions. Events containing at least one electron or muon are classified based on their final state topology, and an automated search algorithm surveys the observed data for deviations from the prediction. The sensitivity of the search is validated using multiple methods. No significant deviations from the predictions have been observed. For a wide range of final state topologies, agreement is found between the data and the standard model simulation. This analysis complements dedicated search analyses by significantly expanding the range of final states covered, using a model-independent approach with the largest data set to date to probe phase space regions beyond the reach of previous general searches.

    Search for low-mass dilepton resonances in Higgs boson decays to four-lepton final states in proton–proton collisions at √s=13TeV

    A search for low-mass dilepton resonances in Higgs boson decays is conducted in the four-lepton final state. The decay is assumed to proceed via a pair of beyond the standard model particles, or one such particle and a Z boson. The search uses proton–proton collision data collected with the CMS detector at the CERN LHC, corresponding to an integrated luminosity of 137 fb−1, at a center-of-mass energy √s = 13 TeV. No significant deviation from the standard model expectation is observed. Upper limits at 95% confidence level are set on model-independent Higgs boson decay branching fractions. Additionally, limits on dark photon and axion-like particle production, based on two specific models, are reported.

    Combined searches for the production of supersymmetric top quark partners in proton–proton collisions at √s=13TeV

    A combination of searches for top squark pair production using proton–proton collision data at a center-of-mass energy of 13 TeV at the CERN LHC, corresponding to an integrated luminosity of 137 fb−1 collected by the CMS experiment, is presented. Signatures with at least 2 jets and large missing transverse momentum are categorized into events with 0, 1, or 2 leptons. New results are presented for regions of parameter space where the kinematical properties of top squark pair production and top quark pair production are very similar. Depending on the model, the combined result excludes a top squark mass up to 1325 GeV for a massless neutralino, and a neutralino mass up to 700 GeV for a top squark mass of 1150 GeV. Top squarks with masses from 145 to 295 GeV, for neutralino masses from 0 to 100 GeV, with a mass difference between the top squark and the neutralino in a window of 30 GeV around the mass of the top quark, are excluded for the first time with CMS data. The results of these searches are also interpreted in an alternative signal model of dark matter production via a spin-0 mediator in association with a top quark pair. Upper limits are set on the cross section for mediator particle masses of up to 420 GeV.