
    "Virus hunting" using radial distance weighted discrimination

    Motivated by the challenge of using DNA-seq data to identify viruses in human blood samples, we propose a novel classification algorithm called "Radial Distance Weighted Discrimination" (or Radial DWD). This classifier is designed for binary classification in which one class is surrounded by the other class in very diverse radial directions, a configuration typical of our virus detection data. This separation of the two classes in multiple radial directions naturally motivates the development of Radial DWD. While classical machine learning methods such as the Support Vector Machine and linear Distance Weighted Discrimination can sometimes give reasonable answers for a given data set, their generalizability is severely compromised by the linear separating boundary. Radial DWD addresses this challenge by using a spherical separating boundary, which is more appropriate in this case. Simulations show that for appropriate radial contexts this gives much better generalizability than linear methods, and also than conventional kernel-based (nonlinear) Support Vector Machines, because the latter methods use up much of the information in the data in determining the shape of the separating boundary. The effectiveness of Radial DWD is demonstrated for real virus detection. Comment: Published at http://dx.doi.org/10.1214/15-AOAS869 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
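
    The published method chooses its sphere by optimizing a DWD-type criterion; as a rough illustration of the spherical-boundary idea only (not the actual Radial DWD fit), a toy classifier can place a sphere around the "inner" class and label points by radial distance. The mean-based fitting rule and function names below are illustrative assumptions.

        import numpy as np

        def fit_spherical_boundary(X_inner, X_outer):
            """Toy spherical boundary: center at the inner-class mean, radius midway
            between the two classes' average radial distances from that center."""
            center = X_inner.mean(axis=0)
            r_inner = np.linalg.norm(X_inner - center, axis=1).mean()
            r_outer = np.linalg.norm(X_outer - center, axis=1).mean()
            return center, 0.5 * (r_inner + r_outer)

        def predict(X, center, radius):
            """+1 for points inside the sphere (inner class), -1 otherwise."""
            return np.where(np.linalg.norm(X - center, axis=1) <= radius, 1, -1)

        # Inner class clustered near the origin, outer class pushed out radially.
        rng = np.random.default_rng(0)
        X_in = rng.normal(size=(100, 5))
        X_out = rng.normal(size=(100, 5))
        X_out *= 5.0 / np.linalg.norm(X_out, axis=1, keepdims=True)
        center, radius = fit_spherical_boundary(X_in, X_out)
        print(predict(X_in, center, radius).mean(), predict(X_out, center, radius).mean())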

    A Fast and Compact Quantum Random Number Generator

    We present the realization of a physical quantum random number generator based on the process of splitting a beam of photons on a beam splitter, a quantum mechanical source of true randomness. By utilizing either a beam splitter or a polarizing beam splitter, single-photon detectors, and high-speed electronics, the presented devices are capable of generating a binary random signal with an autocorrelation time of 11.8 ns and a continuous stream of random numbers at a rate of 1 Mbit/s. The randomness of the generated signals and numbers is demonstrated by running a series of tests on data samples. The devices described in this paper are built into compact housings and are simple to operate. Comment: 23 pages, 6 figures. To appear in Rev. Sci. Inst.
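
    As an illustration only (the paper's actual post-processing chain is not described here), the sketch below simulates raw bits from a slightly unbalanced beam splitter, one detector per bit value, and applies von Neumann pairwise debiasing, a generic way to remove bias from such a stream. The splitting probability and function names are assumptions.

        import numpy as np

        def simulate_clicks(n, p_transmit=0.52, seed=1):
            """Raw bits from an (imperfect) beam splitter: 1 if the 'transmitted'
            detector fires for a photon, 0 if the 'reflected' detector fires."""
            rng = np.random.default_rng(seed)
            return (rng.random(n) < p_transmit).astype(np.uint8)

        def von_neumann_debias(bits):
            """Pairwise debiasing: 01 -> 0, 10 -> 1, discard 00 and 11 pairs."""
            pairs = bits[: len(bits) // 2 * 2].reshape(-1, 2)
            keep = pairs[:, 0] != pairs[:, 1]
            return pairs[keep, 0]

        raw = simulate_clicks(1_000_000)
        unbiased = von_neumann_debias(raw)
        print(raw.mean(), unbiased.mean(), unbiased.size)  # bias removed, roughly half the pairs kept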

    A General Framework for Constrained Smoothing

    There is a wide array of smoothing methods available for finding structure in data. We develop a general framework which shows that many of these can be viewed as a projection of the data with respect to appropriate norms. The underlying vector space is an unusually large product space, which allows a wide range of smoothers to be included in our setup (including many methods not typically considered to be projections). We give several applications of this simple geometric interpretation of smoothing. A major payoff is the natural and computationally frugal incorporation of constraints. Our point of view also motivates new estimates and helps in understanding the finite sample and asymptotic behaviour of these estimates.
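
    The simplest concrete instance of the "smoother as a projection under constraints" viewpoint is isotonic regression: the fit is literally the Euclidean projection of the data onto the cone of non-decreasing sequences. The pool-adjacent-violators sketch below is an illustrative toy example, not the paper's general product-space framework.

        import numpy as np

        def isotonic_projection(y):
            """Project y onto the cone of non-decreasing sequences (Euclidean norm)
            using the pool-adjacent-violators algorithm."""
            merged = []  # list of [block mean, block size]
            for v in np.asarray(y, dtype=float):
                merged.append([v, 1.0])
                # Pool adjacent blocks while monotonicity is violated.
                while len(merged) > 1 and merged[-2][0] > merged[-1][0]:
                    m2, w2 = merged.pop()
                    m1, w1 = merged.pop()
                    merged.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2])
            return np.concatenate([np.full(int(w), m) for m, w in merged])

        print(isotonic_projection([1.0, 3.0, 2.0, 4.0, 3.5]))  # [1.  2.5  2.5  3.75 3.75]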

    Smoke gets in your eyes: what is sociological about cigarettes?

    Contemporary public health approaches increasingly draw attention to the unequal social distribution of cigarette smoking. In contrast, critical accounts emphasize the importance of smokers’ situated agency, the relevance of embodiment and how public health measures against smoking potentially play upon and exacerbate social divisions and inequality. Nevertheless, if the social context of cigarettes is worthy of such attention, and sociology lays a distinct claim to understanding the social, we need to articulate a distinct, positive and systematic claim for smoking as an object of sociological enquiry. This article attempts to address this by situating smoking across three main dimensions of sociological thinking: history and social change; individual agency and experience; and social structures and power. It locates the emergence and development of cigarettes in everyday life within the project of modernity of the nineteenth and twentieth centuries. It goes on to assess the habituated, temporal and experiential aspects of individual smoking practices in everyday lifeworlds. Finally, it argues that smoking, while distributed in important ways by social class, also works relationally to render and inscribe it.

    Climate Policies with Burden Sharing: The Economics of Climate Financing

    Maintaining a favorable climate is among the most challenging contemporary global governance predicaments, one that seems to pit today’s generation against future world inhabitants. In a trade-off of economic growth versus sustainability, a broad-based international coalition could establish climate justice. As a novel angle on climate justice, this paper proposes (1) a well-balanced climate mitigation and adaptation public policy mix guided by micro- and macroeconomic analysis, and (2) a new way of funding climate change mitigation and adaptation policies through a carbon tax and broad-based climate bonds that also involve future generations. Contemporary climate financing strategies (e.g., the Sachs Model) are thereby added to Integrated Assessment Models of the Nordhaus type. Overall, the paper delves into how market economies can be brought to a path consistent with prosperity and sustainability. Finding innovative ways to finance climate abatement over time, coupled with future risk prevention and adaptation to higher temperatures, appears to be an innovative and easily implementable way to nudge overlapping generations towards climate justice in the sustainability domain.

    Efficient algorithms for analyzing segmental duplications with deletions and inversions in genomes

    Background: Segmental duplications, or low-copy repeats, are common in mammalian genomes. In the human genome, most segmental duplications are mosaics comprising multiple duplicated fragments. This complex genomic organization complicates analysis of the evolutionary history of these sequences. One model proposed to explain this mosaic pattern is repeated aggregation and subsequent duplication of genomic sequences. Results: We describe a polynomial-time exact algorithm to compute duplication distance, a genomic distance defined as the most parsimonious way to build a target string by repeatedly copying substrings of a fixed source string. This distance models the process of repeated aggregation and duplication. We also describe extensions of this distance to include certain types of substring deletions and inversions. Finally, we provide a description of a sequence of duplication events as a context-free grammar (CFG). Conclusion: These new genomic distances will permit more biologically realistic analyses of segmental duplications in genomes.
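
    For intuition about the operation that duplication distance counts, the sketch below greedily builds a target string by repeatedly copying the longest substring of a fixed source string that matches a prefix of what remains. Greedy matching only gives an upper bound on the number of copies; it is not the paper's polynomial-time exact algorithm, and the function name is hypothetical.

        def greedy_copy_count(source, target):
            """Upper bound on the number of 'copy a substring of source' operations
            needed to spell out target, using greedy longest-prefix matches."""
            ops, i = 0, 0
            while i < len(target):
                length = 0
                # Longest prefix of the remaining target that occurs in source.
                for j in range(len(target) - i, 0, -1):
                    if target[i:i + j] in source:
                        length = j
                        break
                if length == 0:
                    raise ValueError(f"character {target[i]!r} never occurs in source")
                ops += 1
                i += length
            return ops

        print(greedy_copy_count("abcde", "abcabde"))  # 3: copies 'abc', 'ab', 'de'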

    Contingent Kernel Density Estimation

    Kernel density estimation is a widely used method for estimating a distribution based on a sample of points drawn from that distribution. In practice, some form of error generally contaminates the sample of observed points; such error can result from imprecise measurements or observation bias. Often this error is negligible and may be disregarded in analysis. In cases where the error is non-negligible, estimation methods should be adjusted to reduce the resulting bias. Several modifications of kernel density estimation have been developed to address specific forms of error. One form of error that has not yet been addressed is the case where observations are nominally placed at the centers of areas from which the points are assumed to have been drawn, and these areas are of varying sizes. In this scenario bias arises because the size of the error can vary among points: some subset of points may be known to have smaller error than another, or the form of the error may change among points. This paper proposes a “contingent kernel density estimation” technique to address this form of error. The new technique adjusts the standard kernel on a point-by-point basis, adapting to the changing structure and magnitude of the error. We derive equations for the contingent kernel technique, validate the technique using numerical simulations, and work an example using the geographic locations of social networking users to demonstrate the utility of the method.
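
    One way to picture a kernel adjusted point by point is a variable-bandwidth Gaussian density estimate in which each observation's kernel width reflects the size of the area it was aggregated to. The sketch below illustrates that idea only; the paper's specific contingent kernel may differ, and the widths and function name are assumptions.

        import numpy as np

        def per_point_bandwidth_kde(x_grid, points, widths):
            """Average of Gaussian kernels, one per observation, each with its own
            width (larger aggregation area -> wider, flatter kernel)."""
            x = np.asarray(x_grid)[:, None]      # (grid, 1)
            mu = np.asarray(points)[None, :]     # (1, n)
            h = np.asarray(widths)[None, :]      # (1, n)
            k = np.exp(-0.5 * ((x - mu) / h) ** 2) / (h * np.sqrt(2 * np.pi))
            return k.mean(axis=1)

        # Two precisely located points plus one known only to a wide region.
        grid = np.linspace(-5, 10, 301)
        dens = per_point_bandwidth_kde(grid, points=[0.0, 1.0, 6.0], widths=[0.3, 0.3, 2.0])
        print(np.trapz(dens, grid))  # close to 1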

    Quantifying anatomical shape variations in neurological disorders

    We develop a multivariate analysis of brain anatomy to identify the relevant shape deformation patterns and quantify the shape changes that explain corresponding variations in clinical neuropsychological measures. We use kernel Partial Least Squares (PLS) and formulate a regression model in the tangent space of the manifold of diffeomorphisms characterized by deformation momenta. The scalar deformation momenta completely encode the diffeomorphic changes in anatomical shape. In this model the clinical measures are the response variables, while the anatomical variability is treated as the independent variable. To better understand the “shape–clinical response” relationship, we also control for demographic confounders, such as age, gender, and years of education, in our regression model. We evaluate the proposed methodology on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database using baseline structural MR imaging data and neuropsychological evaluation test scores. We demonstrate the ability of our model to quantify anatomical deformations in units of clinical response. Our results also demonstrate that the proposed method is generic and generates reliable shape deformations, both in terms of the extracted patterns and the amount of shape change. We found that while the hippocampus and amygdala emerge as mainly responsible for changes in test scores for global measures of dementia and memory function, they are not a determinant factor for executive function. Another critical finding was the appearance of the thalamus and putamen as the regions most strongly related to executive function. These anatomical regions were identified with very high confidence irrespective of the size of the population used in the study. This data-driven global analysis of brain anatomy reached conclusions similar to those of other studies of Alzheimer’s Disease based on predefined ROIs, while also identifying new patterns of deformation. The proposed methodology thus holds promise for discovering new patterns of shape change in the human brain that could add to our understanding of disease progression in neurological disorders.
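
    As a rough sketch of the regression setup only (not the paper's kernel PLS on deformation momenta in the tangent space of diffeomorphisms), one can regress a residualized clinical score on high-dimensional shape features with ordinary linear PLS. The synthetic data, the residualize-then-regress handling of confounders, and the dimensions below are assumptions made for illustration.

        import numpy as np
        from sklearn.cross_decomposition import PLSRegression
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(0)
        n, p = 120, 500                      # subjects, stand-in shape features
        momenta = rng.normal(size=(n, p))    # placeholder for scalar deformation momenta
        confounds = rng.normal(size=(n, 3))  # placeholder for age, gender, education
        scores = momenta[:, :5].sum(axis=1) + 0.5 * confounds[:, 0] + rng.normal(size=n)

        # Remove the demographic confounders from the clinical score, then fit PLS.
        resid = scores - LinearRegression().fit(confounds, scores).predict(confounds)
        pls = PLSRegression(n_components=2).fit(momenta, resid)

        print(pls.x_weights_.shape)          # (500, 2): extracted shape-variation patterns
        print(round(pls.score(momenta, resid), 3))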