    Identifying statistical dependence in genomic sequences via mutual information estimates

    Questions of understanding and quantifying the representation and amount of information in organisms have become a central part of biological research, as they potentially hold the key to fundamental advances. In this paper, we demonstrate the use of information-theoretic tools for the task of identifying segments of biomolecules (DNA or RNA) that are statistically correlated. We develop a precise and reliable methodology, based on the notion of mutual information, for finding and extracting statistical as well as structural dependencies. A simple threshold function is defined, and its use in quantifying the level of significance of dependencies between biological segments is explored. These tools are used in two specific applications. The first is the identification of correlations between different parts of the maize zmSRp32 gene, where we find significant dependencies between the 5' untranslated region in zmSRp32 and its alternatively spliced exons. This observation may indicate the presence of as-yet unknown alternative splicing mechanisms or structural scaffolds. Second, using data from the FBI's Combined DNA Index System (CODIS), we demonstrate that our approach is particularly well suited for the problem of discovering short tandem repeats, an application of importance in genetic profiling. Comment: Preliminary version. Final version in EURASIP Journal on Bioinformatics and Systems Biology. See http://www.hindawi.com/journals/bsb
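As a rough illustration of the quantity this abstract builds on, the sketch below computes a plug-in (empirical frequency) estimate of the mutual information between two aligned symbol sequences. It is a minimal sketch, not the paper's estimator: the function name `mutual_information` is hypothetical, and the paper's threshold function for significance is not reproduced here.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired symbols.

    Assumption: xs and ys are aligned sequences (e.g. nucleotides at
    corresponding positions of two segments); this is an illustrative
    estimator, not the one defined in the paper.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of (x, y) pairs
    px = Counter(xs)               # marginal counts of x
    py = Counter(ys)               # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi
```

Two identical sequences over four equally frequent symbols give 2 bits, while sequences whose symbols pair up uniformly give 0 bits. For short segments the plug-in estimate is biased upward, which is one reason a significance threshold is needed in practice.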

    Scalable Parallel Numerical Constraint Solver Using Global Load Balancing

    We present a scalable parallel solver for numerical constraint satisfaction problems (NCSPs). Our parallelization scheme consists of homogeneous worker solvers, each of which runs on an available core and communicates with the others via the global load balancing (GLB) method. The parallel solver is implemented in X10, which provides an implementation of GLB as a library. In experiments, several NCSPs from the literature were solved, attaining up to 516-fold speedup using 600 cores of the TSUBAME2.5 supercomputer. Comment: To be presented at X10'15 Worksho

    LINVIEW: Incremental View Maintenance for Complex Analytical Queries

    Many analytics tasks and machine learning problems can be naturally expressed by iterative linear algebra programs. In this paper, we study the incremental view maintenance problem for such complex analytical queries. We develop a framework, called LINVIEW, for capturing deltas of linear algebra programs and understanding their computational cost. Linear algebra operations tend to cause an avalanche effect where even very local changes to the input matrices spread out and infect all of the intermediate results and the final view, causing incremental view maintenance to lose its performance benefit over re-evaluation. We develop techniques based on matrix factorizations to contain such epidemics of change. As a consequence, our techniques make incremental view maintenance of linear algebra practical and usually substantially cheaper than re-evaluation. We show, both analytically and experimentally, the usefulness of these techniques when applied to standard analytics tasks. Our evaluation demonstrates the efficiency of LINVIEW in generating parallel incremental programs that outperform re-evaluation techniques by more than an order of magnitude. Comment: 14 pages, SIGMO
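The idea of keeping a delta in factored form can be illustrated with a small example. Suppose a view stores the product A·B and A changes by a rank-1 update u·vᵀ. Then the view changes by u·(vᵀB), which costs O(n²) to apply instead of the O(n³) of recomputing the product. The sketch below is an assumption-laden illustration of that identity in plain Python, not LINVIEW's actual algorithm or API; the helper names are hypothetical.

```python
def matmul(a, b):
    """Dense matrix product of two lists-of-lists (full re-evaluation)."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def apply_rank1_delta(view, u, v, b):
    """Update view = A @ B after A += outer(u, v), without recomputing A @ B.

    Uses (A + u v^T) B = A B + u (v^T B): the delta stays a product of a
    column vector and a row vector, so it is never materialized as a
    full dense matrix before being added to the view.
    """
    # Row vector v^T B, computed once in O(n^2).
    vt_b = [sum(v[k] * b[k][j] for k in range(len(b)))
            for j in range(len(b[0]))]
    # Add the outer product u (v^T B) to the stored view.
    return [[view[i][j] + u[i] * vt_b[j] for j in range(len(vt_b))]
            for i in range(len(u))]


# Usage: maintain A @ B incrementally and compare with re-evaluation.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
u, v = [1, 0], [0, 2]                       # A[0][1] increases by 2
view = apply_rank1_delta(matmul(A, B), u, v, B)
A_new = [[A[i][j] + u[i] * v[j] for j in range(2)] for i in range(2)]
```

After this runs, `view` equals `matmul(A_new, B)`. For iterated products such as powers of A, a naive delta becomes dense after one step, which is the avalanche effect the abstract describes.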

    Developed Adomian method for quadratic Kaluza-Klein relativity

    We develop and modify the Adomian decomposition method (ADecM) to work for a new type of nonlinear matrix differential equations (MDEs) which arise in general relativity (GR) and possibly in other applications. The approach consists in modifying both the ADecM linear operator with highest order derivative and ADecM polynomials. We specialize in the case of a 4×4 nonlinear MDE along with a scalar one describing stationary cylindrically symmetric metrics in quadratic 5-dimensional GR, derive some of their properties using ADecM and construct the most general unique power series solutions. However, because of the constraint imposed on the MDE by the scalar one, the series solutions terminate in closed forms exhausting all possible solutions. Comment: 17 pages (minor changes in reference [30]

    Asymptotic equivalence of discretely observed diffusion processes and their Euler scheme: small variance case

    This paper establishes the global asymptotic equivalence, in the sense of the Le Cam Δ-distance, between scalar diffusion models with unknown drift function and small variance on the one side, and nonparametric autoregressive models on the other side. The time horizon T is kept fixed, and both the cases of discrete and continuous observation of the path are treated. We allow a nonconstant diffusion coefficient, bounded but possibly tending to zero. The asymptotic equivalences are established by constructing explicit equivalence mappings. Comment: 21 page
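To make the Euler scheme named in the title concrete, the sketch below simulates the autoregressive recursion obtained by discretizing a scalar diffusion on a fixed horizon T. It assumes a model of the form dX = b(X) dt + ε σ(X) dW, which is a common way to write a small-variance diffusion; the exact parametrization used in the paper may differ, and all names here are hypothetical.

```python
import math
import random


def euler_path(drift, sigma, x0, horizon, n_steps, eps, rng=None):
    """Simulate the Euler scheme for dX = drift(X) dt + eps * sigma(X) dW.

    Returns the list [X_0, X_1, ..., X_n] on the grid t_i = i * horizon / n.
    Assumption: this parametrization (small noise level eps, fixed horizon)
    is chosen for illustration and is not taken from the paper.
    """
    rng = rng or random.Random()
    h = horizon / n_steps                  # step size; horizon stays fixed
    path = [x0]
    x = x0
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)        # standard normal increment
        # Autoregressive step: next value depends only on the current one.
        x = x + drift(x) * h + eps * sigma(x) * math.sqrt(h) * noise
        path.append(x)
    return path


# Usage: mean-reverting drift with a small noise level.
sample = euler_path(lambda x: -x, lambda x: 1.0, x0=1.0,
                    horizon=1.0, n_steps=100, eps=0.05,
                    rng=random.Random(0))
```

Each step has the form X_{i+1} = X_i + b(X_i) h + noise, which is why the discretized model can be read as a nonparametric autoregression with unknown regression function b.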

    Self-interrupted synthesis of sterically hindered aliphatic polyamide dendrimers

    Hydrolytically and enzymatically stable nanoscale synthetic constructs, with well-defined structures that exhibit antimicrobial activity, offer exciting possibilities for diverse applications in the emerging field of nanomedicine. Herein, we demonstrate that it is the core conformation, rather than periodicity, that ultimately controls the synthesis of sterically hindered aliphatic polyamide dendrimers. The latter self-interrupt at a predictable low generation number due to backfolding of their peripheral groups, which in turn leads to well-defined nanoarchitectures