3,686 research outputs found

    Detecting clinically meaningful biomarkers with repeated measurements in an Electronic Health Record

    Electronic health record (EHR) data are becoming an increasingly common data source for understanding clinical risk of acute events. While their longitudinal nature presents opportunities to observe changing risk over time, these analyses are complicated by the sparse and irregular measurements of many of the clinical metrics, making typical statistical methods unsuitable for these data. In this paper, we present an analytic procedure to both sample from an EHR and analyze the data to detect clinically meaningful markers of acute myocardial infarction (MI). Using an EHR from a large national dialysis organization, we abstracted the records of 64,318 individuals and identified 5,314 people who had an MI during the study period. We describe a nested case-control design to sample appropriate controls and an analytic approach using regression splines. Fitting a mixed model with truncated power splines, we perform a series of goodness-of-fit tests to determine whether any of 11 regularly collected laboratory markers are useful clinical predictors. We test the clinical utility of each marker using an independent test set. The results suggest that EHR data can be easily used to detect markers of clinically acute events. Special software or analytic tools are not needed, even with irregular EHR data. Comment: 23 pages, 3 figures

    Scale-dependent angle of alignment between velocity and magnetic field fluctuations in solar wind turbulence

    Under certain conditions, freely decaying magnetohydrodynamic (MHD) turbulence evolves in such a way that velocity and magnetic field fluctuations $\delta v$ and $\delta B$ approach a state of alignment in which $\delta v \propto \delta B$. This process is called dynamic alignment. Boldyrev has suggested that a similar kind of alignment process occurs as energy cascades from large to small scales through the inertial range in strong incompressible MHD turbulence. In this study, plasma and magnetic field data from the Wind spacecraft, data acquired in the ecliptic plane near 1 AU, are employed to investigate the angle $\theta(\tau)$ between velocity and magnetic field fluctuations in the solar wind as a function of the time scale $\tau$ of the fluctuations and to look for the scaling relation $\theta \sim \tau^{1/4}$ predicted by Boldyrev. We find that the angle appears to scale like a power law at large inertial range scales, but then deviates from power law behavior at medium to small inertial range scales. We also find that small errors in the velocity vector measurements can lead to large errors in the angle measurements at small time scales. As a result, we cannot rule out the possibility that the observed deviations from power law behavior arise from errors in the velocity measurements. When we fit the data from $2 \times 10^{3}$ s to $2 \times 10^{4}$ s with a power law of the form $\propto \tau^{p}$, our best fit values for $p$ are in the range 0.27-0.36.

    Galactic Spiral Structure

    We describe the structure and composition of six major stellar streams in a population of 20 574 local stars in the New Hipparcos Reduction with known radial velocities. We find that, once fast moving stars are excluded, almost all stars belong to one of these streams. The results of our investigation have led us to re-examine the hydrogen maps of the Milky Way, from which we identify the possibility of a symmetric two-armed spiral with half the conventionally accepted pitch angle. We describe a model of spiral arm motions which matches the observed velocities and composition of the six major streams, as well as the observed velocities of the Hyades and Praesepe clusters at the extreme of the Hyades stream. We model stellar orbits as perturbed ellipses aligned at a focus in coordinates rotating at the rate of precession of apocentre. Stars join a spiral arm just before apocentre, follow the arm for more than half an orbit, and leave the arm soon after pericentre. Spiral pattern speed equals the mean rate of precession of apocentre. Spiral arms are shown to be stable configurations of stellar orbits, up to the formation of a bar and/or ring. Pitch angle is directly related to the distribution of orbital eccentricities in a given spiral galaxy. We show how spiral galaxies can evolve to form bars and rings. We show that orbits of gas clouds are stable only in bisymmetric spirals. We conclude that spiral galaxies evolve toward grand design two-armed spirals. We infer from the velocity distributions that the Milky Way evolved into this form about 9 Gyrs ago. Comment: Published in Proc Roy Soc A. A high resolution version of this file can be downloaded from http://papers.rqgravity.net/SpiralStructure.pdf. A simplified account with animations begins at http://rqgravity.net/SpiralStructur

    A Generalized Approach for Testing the Association of a Set of Predictors with an Outcome: A Gene Based Test

    In many analyses, one has data on one level but desires to draw inference on another level. For example, in genetic association studies, one observes units of DNA referred to as SNPs, but wants to determine whether genes that are composed of SNPs are associated with disease. While there are some available approaches for addressing this issue, they usually involve making parametric assumptions and are not easily generalizable. A statistical test is proposed for testing the association of a set of variables with an outcome of interest. No assumptions are made about the functional form relating the variables to the outcome. A general function is fit using any statistical learning algorithm, with the SuperLearner algorithm suggested. The parameter of interest is the cross-validated risk, and this is compared to an expected risk. A Wald test is proposed using the influence curve of the cross-validated risk to obtain the variance. It is shown both theoretically and via simulation that the test maintains appropriate type I error control and is more powerful than parametric tests under more general alternatives. The test is applied to an MS candidate gene study. Three separate analyses are performed, highlighting the flexibility of the approach.

    An application of Random Forests to a genome-wide association dataset: Methodological considerations & new findings

    Background: As computational power improves, the application of more advanced machine learning techniques to the analysis of large genome-wide association (GWA) datasets becomes possible. While most traditional statistical methods can only elucidate main effects of genetic variants on risk for disease, certain machine learning approaches are particularly suited to discover higher order and non-linear effects. One such approach is the Random Forests (RF) algorithm. The use of RF for SNP discovery related to human disease has grown in recent years; however, most work has focused on small datasets or simulation studies which are limited. Results: Using a multiple sclerosis (MS) case-control dataset comprised of 300 K SNP genotypes across the genome, we outline an approach and some considerations for optimally tuning the RF algorithm based on the empirical dataset. Importantly, results show that typical default parameter values are not appropriate for large GWA datasets. Furthermore, gains can be made by sub-sampling the data, pruning based on linkage disequilibrium (LD), and removing strong effects from RF analyses. The new RF results are compared to findings from the original MS GWA study and demonstrate overlap. In addition, four new interesting candidate MS genes are identified, MPHOSPH9, CTNNA3, PHACTR2 and IL7, by RF analysis and warrant further follow-up in independent studies. Conclusions: This study presents one of the first illustrations of successfully analyzing GWA data with a machine learning algorithm. It is shown that RF is computationally feasible for GWA data and the results obtained make biologic sense based on previous studies. More importantly, new genes were identified as potentially being associated with MS, suggesting new avenues of investigation for this complex disease.

    Assessing the Accuracy of Ancestral Protein Reconstruction Methods

    The phylogenetic inference of ancestral protein sequences is a powerful technique for the study of molecular evolution, but any conclusions drawn from such studies are only as good as the accuracy of the reconstruction method. Every inference method leads to errors in the ancestral protein sequence, resulting in potentially misleading estimates of the ancestral protein's properties. To assess the accuracy of ancestral protein reconstruction methods, we performed computational population evolution simulations featuring near-neutral evolution under purifying selection, speciation, and divergence using an off-lattice protein model where fitness depends on the ability to be stable in a specified target structure. We were thus able to compare the thermodynamic properties of the true ancestral sequences with the properties of “ancestral sequences” inferred by maximum parsimony, maximum likelihood, and Bayesian methods. Surprisingly, we found that methods such as maximum parsimony and maximum likelihood that reconstruct a “best guess” amino acid at each position overestimate thermostability, while a Bayesian method that sometimes chooses less-probable residues from the posterior probability distribution does not. Maximum likelihood and maximum parsimony apparently tend to eliminate variants at a position that are slightly detrimental to structural stability simply because such detrimental variants are less frequent. Other properties of ancestral proteins might be similarly overestimated. This suggests that ancestral reconstruction studies require greater care to come to credible conclusions regarding functional evolution. Inferred functional patterns that mimic reconstruction bias should be reevaluated.

    Turning back

    This response to Miri Rozmarin’s paper, Staying Alive, focuses on the question of what it might mean to create a response to matricide and patriarchal violence that is grounded in the particularities of cultural and personal history. Rozmarin’s rendering of a possible response to matricide through the mother-daughter genealogy is illustrated in her analysis of the Biblical myth of Lot’s wife. She claims that this story of destruction, punishment and incest reveals ‘an option of non-matricidal relations’ and she gives a compelling account of how this could be so. In my response, I suggest that there are alternative ‘against the grain’ readings that are grounded in the Jewish traditions and sensibilities in which such ‘mythic’ material is embedded and from which it draws its vitality. I offer an example of this, not to refute Rozmarin’s claims, but to suggest that something more nuanced and even loving can be found in the specificity of this cultured and gendered encounter, and that this better meets the conditions for ‘concrete’ ethical resistance that she seeks.