    Particle rearrangements during transitions between local minima of the potential energy landscape of a supercooled Lennard-Jones liquid

    The potential energy landscape (PEL) of supercooled binary Lennard-Jones (BLJ) mixtures exhibits local minima, or inherent structures (IS), which are organized into meta-basins (MB). We study the particle rearrangements related to transitions between both successive IS and successive MB for a small 80:20 BLJ system near the mode-coupling temperature T_MCT. The analysis includes the displacements of individual particles, the localization of the rearrangements and the relevance of string-like motion. We find that the particle rearrangements during IS and MB transitions do not change significantly at T_MCT. Further, it is demonstrated that IS and MB dynamics are spatially heterogeneous and facilitated by string-like motion. To investigate the mechanism of string-like motion, we follow the particle rearrangements during suitable sequences of IS transitions. We find that most strings observed after a series of transitions do not move coherently during a single transition, but subunits of different sizes are active at different times. Several findings suggest that the occurrence of a successful string enables the system to exit an MB. Moreover, we show that the particle rearrangements during two consecutive MB transitions are basically uncorrelated. Specifically, different groups of particles are highly mobile during subsequent MB transitions. Finally, the relation between the features of the PEL and the relaxation processes in supercooled liquids is discussed. Comment: 13 pages, 10 figures

    Statistical analysis of simple repeats in the human genome

    The human genome contains repetitive DNA at different levels of sequence length, number and dispersion. Highly repetitive DNA is particularly rich in homo- and di-nucleotide repeats, while middle repetitive DNA is rich in families of interspersed, mobile elements hundreds of base pairs (bp) long, among them the Alu families. A link between homo- and di-polymeric tracts and mobile elements has been recently highlighted. In particular, the mobility of Alu repeats, which form 10% of the human genome, has been correlated with the length of poly(A) tracts located at one end of the Alu. These tracts have a rigid and non-bendable structure and have an inhibitory effect on nucleosomes, which normally compact the DNA. We performed a statistical analysis of the genome-wide distribution of lengths and inter-tract separations of poly(X) and poly(XY) tracts in the human genome. Our study shows that in humans the length distributions of these sequences reflect the dynamics of their expansion and DNA replication. By means of general tools from linguistics, we show that the latter play the role of highly significant content-bearing terms in the DNA text. Furthermore, we find that such tracts are positioned in a non-random fashion, with an apparent periodicity of 150 bases. This allows us to extend the link between repetitive, highly mobile elements such as Alus and low-complexity words in human DNA. More precisely, we show that Alus are sources of poly(X) tracts, which in turn subtly affect the combination and diversification of gene expression and the fixation of multigene families.
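
The two quantities named in this abstract, tract lengths and inter-tract separations, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the `min_len` cutoff, and the toy sequence are assumptions made here for illustration, and only poly(X) (single-base) tracts are handled.

```python
import re
from collections import Counter

def poly_x_tracts(seq, base="A", min_len=3):
    """Return (start, length) for each maximal run of `base` with length >= min_len.

    `min_len` is an illustrative cutoff, not a value taken from the abstract.
    """
    pattern = re.compile(f"{re.escape(base)}{{{min_len},}}")
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(seq.upper())]

def length_distribution(tracts):
    """Count how many tracts have each length."""
    return Counter(length for _, length in tracts)

def inter_tract_separations(tracts):
    """Number of bases between the end of one tract and the start of the next."""
    return [s2 - (s1 + l1) for (s1, l1), (s2, _) in zip(tracts, tracts[1:])]

# Toy sequence (hypothetical, not genomic data).
seq = "GGAAAATCCAAAGTTAAAAAC"
tracts = poly_x_tracts(seq)
lengths = length_distribution(tracts)
gaps = inter_tract_separations(tracts)
```

On real data, the histogram of `gaps` is where a periodicity such as the one reported in the abstract would be looked for.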

    Query by String word spotting based on character bi-gram indexing

    In this paper we propose a segmentation-free query by string word spotting method. Both the documents and query strings are encoded using a recently proposed word representation that projects images and strings into a common attribute space based on a pyramidal histogram of characters (PHOC). These attribute models are learned using linear SVMs over the Fisher Vector representation of the images along with the PHOC labels of the corresponding strings. In order to search through the whole page, document regions are indexed per character bi-gram using a similar attribute representation. On top of that, we propose an integral image representation of the document using a simplified version of the attribute model for efficient computation. Finally, we introduce a re-ranking step in order to boost retrieval performance. We show state-of-the-art results for segmentation-free query by string word spotting in single-writer and multi-writer standard datasets. Comment: To be published in ICDAR201
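
As a rough illustration of the string side of a pyramidal histogram of characters, the sketch below builds a binary PHOC-style vector for a query string. It is a simplified version under stated assumptions: the 36-symbol alphabet, the pyramid levels, and the 50% overlap rule are choices made here, and the bi-gram components, SVM attribute models, and image side described in the abstract are not covered.

```python
import string

# Assumed alphabet: lowercase letters plus digits (36 symbols).
ALPHABET = string.ascii_lowercase + string.digits

def phoc(word, levels=(1, 2, 3)):
    """Binary PHOC-style vector: one alphabet-sized block per region per level.

    A character is assigned to a region when at least half of the character's
    normalized span falls inside that region. Integer arithmetic in units of
    1 / (len(word) * level) keeps the 50% test exact.
    """
    word = word.lower()
    n = len(word)
    vec = []
    for level in levels:
        for region in range(level):
            r_lo, r_hi = region * n, (region + 1) * n
            bits = [0] * len(ALPHABET)
            for k, ch in enumerate(word):
                if ch not in ALPHABET:
                    continue  # symbols outside the assumed alphabet are skipped
                c_lo, c_hi = k * level, (k + 1) * level
                overlap = min(c_hi, r_hi) - max(c_lo, r_lo)
                if 2 * overlap >= level:  # overlap covers >= 50% of the character
                    bits[ALPHABET.index(ch)] = 1
            vec.extend(bits)
    return vec

v = phoc("abc", levels=(2,))
```

For `"abc"` at level 2, the middle character straddles the boundary and is set in both halves, so the first half encodes `a, b` and the second half encodes `b, c`.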

    Identifying statistical dependence in genomic sequences via mutual information estimates

    Questions of understanding and quantifying the representation and amount of information in organisms have become a central part of biological research, as they potentially hold the key to fundamental advances. In this paper, we demonstrate the use of information-theoretic tools for the task of identifying segments of biomolecules (DNA or RNA) that are statistically correlated. We develop a precise and reliable methodology, based on the notion of mutual information, for finding and extracting statistical as well as structural dependencies. A simple threshold function is defined, and its use in quantifying the level of significance of dependencies between biological segments is explored. These tools are used in two specific applications. First, for the identification of correlations between different parts of the maize zmSRp32 gene. There, we find significant dependencies between the 5' untranslated region in zmSRp32 and its alternatively spliced exons. This observation may indicate the presence of as-yet unknown alternative splicing mechanisms or structural scaffolds. Second, using data from the FBI's Combined DNA Index System (CODIS), we demonstrate that our approach is particularly well suited for the problem of discovering short tandem repeats, an application of importance in genetic profiling. Comment: Preliminary version. Final version in EURASIP Journal on Bioinformatics and Systems Biology. See http://www.hindawi.com/journals/bsb
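
The core quantity in this abstract, mutual information between two sequence segments, can be sketched with the standard plug-in (empirical frequency) estimator. This is a generic illustration, not the paper's estimator: the function name and the position-by-position pairing of two equal-length segments are assumptions made here, and the paper's threshold function is not reproduced.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from two aligned symbol sequences.

    xs[i] and ys[i] are treated as one joint observation (an assumed pairing).
    """
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # sum over observed pairs of p(x,y) * log2( p(x,y) / (p(x) p(y)) )
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )

mi_identical = mutual_information("ACGTACGT", "ACGTACGT")  # fully dependent
mi_independent = mutual_information("AACC", "ACAC")        # empirically independent
```

Identical segments over four equally frequent bases give the full 2 bits, while the second pair gives 0. For short segments the plug-in estimate is biased upward, which is one reason a significance threshold is needed before calling a dependency real.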