126 research outputs found

    Monitoring data in R with the lumberjack package

    Monitoring data while it is processed and transformed can yield detailed insight into the dynamics of a (running) production system. The lumberjack package is a lightweight package allowing users to follow how an R object is transformed as it is manipulated by R code. The package abstracts all logging code from the user, who only needs to specify which objects are logged and what information should be logged. A few default loggers are included with the package but the package is extensible through user-defined logger objects. Comment: Accepted for publication in the Journal of Statistical Software

    Data Validation Infrastructure for R

    Checking data quality against domain knowledge is a common activity that pervades statistical analysis from raw data to output. The R package validate facilitates this task by capturing and applying expert knowledge in the form of validation rules: logical restrictions on variables, records, or data sets that should be satisfied before they are considered valid input for further analysis. In the validate package, validation rules are objects of computation that can be manipulated, investigated, and confronted with data or versions of a data set. The results of a confrontation are then available for further investigation, summarization or visualization. Validation rules can also be endowed with metadata and documentation and they may be stored or retrieved from external sources such as text files or tabular formats. This data validation infrastructure thus allows for systematic, user-defined definition of data quality requirements that can be reused for various versions of a data set or by data correction algorithms that are parameterized by validation rules

    Direct measurement of the radiative lifetime of vibrationally excited OH radicals

    Neutral molecules, isolated in the gas-phase, can be prepared in a long-lived excited state and stored in a trap. The long observation time afforded by the trap can then be exploited to measure the radiative lifetime of this state by monitoring the temporal decay of the population in the trap. This method is demonstrated here and used to benchmark the Einstein $A$-coefficients in the Meinel system of OH. A pulsed beam of vibrationally excited OH radicals is Stark decelerated and loaded into an electrostatic quadrupole trap. The radiative lifetime of the upper $\Lambda$-doublet component of the $X\,^2\Pi_{3/2}, v=1, J=3/2$ level is determined as $59.0 \pm 2.0$ ms, in good agreement with the calculated value of $57.7 \pm 1.0$ ms. Comment: 4 pages, 3 figures, submitted to Phys. Rev. Lett.

    Beyond the ego network: The effect of distant connections on node anonymity

    Ensuring privacy of individuals is of paramount importance to social network analysis research. Previous work assessed anonymity in a network based on the non-uniqueness of a node's ego network. In this work, we show that this approach does not adequately account for the strong de-anonymizing effect of distant connections. We first propose the use of d-k-anonymity, a novel measure that takes knowledge up to distance d of a considered node into account. Second, we introduce anonymity-cascade, which exploits the so-called infectiousness of uniqueness: mere information about being connected to another unique node can make a given node uniquely identifiable. These two approaches, together with relevant "twin node" processing steps in the underlying graph structure, offer practitioners flexible solutions, tunable in precision and computation time. This enables the assessment of anonymity in large-scale networks with up to millions of nodes and edges. Experiments on graph models and a wide range of real-world networks show drastic decreases in anonymity when connections at distance 2 are considered. Moreover, extending the knowledge beyond the ego network with just one extra link often already decreases overall anonymity by over 50%. These findings have important implications for privacy-aware sharing of sensitive network data

    Substituted anilides from chitin-based 3-acetamido-furfural

    The synthesis of aromatic compounds from biomass-derived furans is a key strategy in the pursuit of a sustainable economy. Within this field, a Diels-Alder/aromatization cascade reaction with chitin-based furans is emerging as a powerful tool for the synthesis of nitrogen-containing aromatics. In this study we present the conversion of chitin-based 3-acetamido-furfural (3A5F) into an array of di- and tri-substituted anilides in good to high yields (62-90%) via a hydrazone-mediated Diels-Alder/aromatization sequence. The addition of acetic anhydride expands the dienophile scope and improves yields. Moreover, replacing the typically used dimethyl hydrazone with its pyrrolidine analogue shortens reaction times and further increases yields. The hydrazone auxiliary is readily converted into either an aldehyde or a nitrile group, thereby providing a plethora of functionalized anilides. The developed procedure was also applied to 3-acetamido-5-acetylfuran (3A5AF) to successfully prepare a phthalimide.

    Why Are Outcomes Different for Registry Patients Enrolled Prospectively and Retrospectively? Insights from the Global Anticoagulant Registry in the FIELD-Atrial Fibrillation (GARFIELD-AF).

    Background: Retrospective and prospective observational studies are designed to reflect real-world evidence on clinical practice, but can yield conflicting results. The GARFIELD-AF Registry includes both methods of enrolment and allows analysis of differences in patient characteristics and outcomes that may result. Methods and Results: Patients with atrial fibrillation (AF) and ≥1 risk factor for stroke at diagnosis of AF were recruited either retrospectively (n = 5069) or prospectively (n = 5501) from 19 countries and then followed prospectively. The retrospectively enrolled cohort comprised patients with established AF (for at least 6, and up to 24, months before enrolment), who were identified retrospectively (and baseline and partial follow-up data were collected from the emedical records) and then followed prospectively for 0-18 months (such that the total time of follow-up was 24 months; data collection between Dec-2009 and Oct-2010). In the prospectively enrolled cohort, patients with newly diagnosed AF (≤6 weeks after diagnosis) were recruited between Mar-2010 and Oct-2011 and were followed for 24 months after enrolment. Differences between the cohorts were observed in clinical characteristics, including type of AF, stroke prevention strategies, and event rates. More patients in the retrospectively identified cohort received vitamin K antagonists (62.1% vs. 53.2%) and fewer received non-vitamin K oral anticoagulants (1.8% vs. 4.2%). All-cause mortality rates per 100 person-years during the prospective follow-up (from the first study visit up to 1 year) were significantly lower in the retrospectively than in the prospectively identified cohort (3.04 [95% CI 2.51 to 3.67] vs. 4.05 [95% CI 3.53 to 4.63]; p = 0.016).
    Conclusions: Interpretations of data from registries that aim to evaluate the characteristics and outcomes of patients with AF must take account of differences in registry design and the impact of recall bias and survivorship bias that is incurred with retrospective enrolment. Clinical Trial Registration: URL: http://www.clinicaltrials.gov. Unique identifier for GARFIELD-AF: NCT01090362.