    Simplifying the mosaic description of DNA sequences

    By using the Jensen-Shannon divergence, genomic DNA can be divided into compositionally distinct domains through a standard recursive segmentation procedure. Each domain, while significantly different from its neighbours, may however share compositional similarity with one or more distant (non-neighbouring) domains. We thus obtain a coarse-grained description of the given DNA string in terms of a smaller set of distinct domain labels. This yields a minimal domain description of a given DNA sequence, significantly reducing its organizational complexity. This procedure gives a new means of evaluating genomic complexity as one examines organisms ranging from bacteria to human. The mosaic organization of DNA sequences could have originated from the insertion of fragments of one genome (the parasite) inside another (the host), and we present numerical experiments that are suggestive of this scenario.
    Comment: 16 pages, 1 figure, Accepted for publication in Phys. Rev.
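
    The segmentation step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the length-weighted Jensen-Shannon divergence between the two halves of a sequence and picks the single cut that maximizes it. The function names (`shannon_entropy`, `js_divergence`, `best_split`) are hypothetical, and the significance test and recursion used in a full segmentation procedure are omitted.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (in bits) of the symbol frequencies in seq."""
    n = len(seq)
    counts = Counter(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def js_divergence(left, right):
    """Length-weighted Jensen-Shannon divergence between two subsequences."""
    n1, n2 = len(left), len(right)
    n = n1 + n2
    return (
        shannon_entropy(left + right)
        - (n1 / n) * shannon_entropy(left)
        - (n2 / n) * shannon_entropy(right)
    )


def best_split(seq, min_len=1):
    """Return (cut index, divergence) for the cut maximizing the divergence.

    Illustrative only: a real recursive segmentation would also test
    whether the maximum is statistically significant before cutting,
    then repeat on each resulting segment.
    """
    best_i, best_d = None, -1.0
    for i in range(min_len, len(seq) - min_len + 1):
        d = js_divergence(seq[:i], seq[i:])
        if d > best_d:
            best_i, best_d = i, d
    return best_i, best_d


if __name__ == "__main__":
    # Toy sequence with an A-rich block followed by a C-rich block.
    print(best_split("AAAAAAAA" + "CCCCCCCC"))
```

    On the toy input, the cut falls at the boundary between the two homogeneous blocks, which is where the composition of the left and right parts differs most.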

    On the efficiency of estimating penetrating rank on large graphs

    P-Rank (Penetrating Rank) has been suggested as a useful measure of structural similarity that takes account of both incoming and outgoing edges in ubiquitous networks. Existing work often utilizes memoization to compute P-Rank similarity in an iterative fashion, which requires cubic time in the worst case. Moreover, previous methods mainly focus on the deterministic computation of P-Rank and lack a probabilistic framework that scales well to large graphs. In this paper, we propose two efficient algorithms for computing P-Rank on large graphs. The first observation is that a large body of objects in a real graph usually share similar neighborhood structures. By merging such objects with an explicit low-rank factorization, we devise a deterministic algorithm to compute P-Rank in quadratic time. The second observation is that by converting the iterative form of P-Rank into a matrix power series form, we can leverage random sampling to compute P-Rank probabilistically in linear time with provable accuracy guarantees. The empirical results on both real and synthetic datasets show that our approaches achieve high time efficiency with controlled error and outperform the baseline algorithms by at least one order of magnitude.
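
    For orientation, the iterative form of P-Rank that the abstract refers to can be sketched as a naive reference implementation. This is not either of the paper's two algorithms: it is a plain, unoptimized iteration, assuming the usual definition in which the similarity of two distinct nodes is a weighted mix of the average similarity of their in-neighbors and of their out-neighbors. The function name `p_rank` and the default values for `lam`, `c_in`, `c_out`, and `iters` are assumptions chosen for illustration.

```python
def p_rank(edges, n, lam=0.5, c_in=0.8, c_out=0.8, iters=10):
    """Naive iterative P-Rank on a simple directed graph with nodes 0..n-1.

    edges: iterable of (u, v) pairs meaning u -> v (no duplicates assumed).
    lam:   weight of the in-link term versus the out-link term.
    c_in, c_out: damping factors for the in-link and out-link terms.
    Returns an n x n similarity matrix as a list of lists.
    """
    ins = [[] for _ in range(n)]
    outs = [[] for _ in range(n)]
    for u, v in edges:
        outs[u].append(v)
        ins[v].append(u)

    # Start from the identity: each node is fully similar only to itself.
    s = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    def avg(prev, xs, ys):
        # Average pairwise similarity; defined as 0 if either set is empty.
        if not xs or not ys:
            return 0.0
        total = sum(prev[x][y] for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    for _ in range(iters):
        nxt = [[0.0] * n for _ in range(n)]
        for a in range(n):
            nxt[a][a] = 1.0
            for b in range(a + 1, n):
                val = (lam * c_in * avg(s, ins[a], ins[b])
                       + (1.0 - lam) * c_out * avg(s, outs[a], outs[b]))
                nxt[a][b] = nxt[b][a] = val
        s = nxt
    return s


if __name__ == "__main__":
    # Node 0 points to nodes 1 and 2, so 1 and 2 share an in-neighbor.
    sim = p_rank([(0, 1), (0, 2)], n=3)
    print(sim[1][2])
```

    Each iteration visits every node pair and every pair of their neighbors, which is the kind of cost that motivates faster methods on large graphs.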

    Earthquake source parameters of the 2009 Mw 7.8 Fiordland (New Zealand) earthquake from L-band InSAR observations

    The 2009 Mw 7.8 Fiordland (New Zealand) earthquake is the largest to have occurred in New Zealand since the 1931 Mw 7.8 Hawke’s Bay earthquake, 1,000 km to the northwest. In this paper, two tracks of ALOS PALSAR interferograms (one ascending and one descending) are used to determine the fault geometry and slip distribution of this large earthquake. Modeling the event as a dislocation in an elastic half-space suggests that the earthquake resulted from slip on a SSW-NNE oriented thrust fault that is associated with the subduction between the Pacific and Australian Plates, with oblique displacement of up to 6.3 m. This finding is consistent with the preliminary studies undertaken by the USGS using seismic data.

    The optimized kinematic dynamo in a sphere

    Estimation and Testing for Unit Root Processes with GARCH(1,1) Errors: Theory and Monte Carlo Evidence

    Least squares (LS) and maximum likelihood (ML) estimation are considered for unit root processes with GARCH (1, 1) errors. The asymptotic distributions of LS and ML estimators are derived under the condition alpha + beta