Particle rearrangements during transitions between local minima of the potential energy landscape of a supercooled Lennard-Jones liquid
The potential energy landscape (PEL) of supercooled binary Lennard-Jones
(BLJ) mixtures exhibits local minima, or inherent structures (IS), which are
organized into meta-basins (MB). We study the particle rearrangements related
to transitions between both successive IS and successive MB for a small 80:20
BLJ system near the mode-coupling temperature T_MCT. The analysis includes the
displacements of individual particles, the localization of the rearrangements
and the relevance of string-like motion. We find that the particle
rearrangements during IS and MB transitions do not change significantly at
T_MCT. Further, it is demonstrated that IS and MB dynamics are spatially
heterogeneous and facilitated by string-like motion. To investigate the
mechanism of string-like motion, we follow the particle rearrangements during
suitable sequences of IS transitions. We find that most strings observed after
a series of transitions do not move coherently during a single transition, but
subunits of different sizes are active at different times. Several findings
suggest that the occurrence of a successful string enables the system to exit a
MB. Moreover, we show that the particle rearrangements during two consecutive
MB transitions are basically uncorrelated. Specifically, different groups of
particles are highly mobile during subsequent MB transitions. Finally, the
relation between the features of the PEL and the relaxation processes in
supercooled liquids is discussed. Comment: 13 pages, 10 figure
Statistical analysis of simple repeats in the human genome
The human genome contains repetitive DNA at different levels of sequence
length, number and dispersion. Highly repetitive DNA is particularly rich in
homo- and di-nucleotide repeats, while middle repetitive DNA is rich in
families of interspersed, mobile elements hundreds of base pairs (bp) long,
among them the Alu families. A link between homo- and di-polymeric tracts and
mobile elements has been recently highlighted. In particular, the mobility of
Alu repeats, which form 10% of the human genome, has been correlated with the
length of poly(A) tracts located at one end of the Alu. These tracts have a
rigid and non-bendable structure and have an inhibitory effect on nucleosomes,
which normally compact the DNA. We performed a statistical analysis of the
genome-wide distribution of lengths and inter-tract separations of poly(X) and
poly(XY) tracts in the human genome. Our study shows that in humans the length
distributions of these sequences reflect the dynamics of their expansion and
DNA replication. By means of general tools from linguistics, we show that the
latter play the role of highly-significant content-bearing terms in the DNA
text. Furthermore, we find that such tracts are positioned in a non-random
fashion, with an apparent periodicity of 150 bases. This allows us to extend
the link between repetitive, highly mobile elements such as Alus and
low-complexity words in human DNA. More precisely, we show that Alus are
sources of poly(X) tracts, which in turn affect in a subtle way the combination
and diversification of gene expression and the fixation of multigene families.
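The tract statistics described in this abstract can be illustrated with a minimal Python sketch. It only covers poly(X) (single-base) tracts, and the function names, the `min_len` parameter, and the toy sequence are assumptions made for illustration; they are not taken from the paper.

```python
import re
from collections import Counter


def poly_x_tracts(seq, min_len=2):
    """Return (base, start, length) for every run of one base with length >= min_len."""
    # ([ACGT]) captures a base; \1{k,} requires at least k more copies of it.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.group(1), m.start(), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]


def length_histogram(tracts):
    """Count how many tracts have each length."""
    return Counter(length for _, _, length in tracts)


def inter_tract_separations(tracts):
    """Number of bases between the end of one tract and the start of the next."""
    return [nxt[1] - (cur[1] + cur[2]) for cur, nxt in zip(tracts, tracts[1:])]


tracts = poly_x_tracts("AAATCGGGGTTA")
print(tracts)
print(length_histogram(tracts))
print(inter_tract_separations(tracts))
```

On a real chromosome the same functions would be applied to the full sequence string, and the two outputs would give the empirical length and separation distributions that the abstract refers to.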
Query by String word spotting based on character bi-gram indexing
In this paper we propose a segmentation-free query by string word spotting
method. Both the documents and query strings are encoded using a recently
proposed word representation that projects images and strings into a common
attribute space based on a pyramidal histogram of characters (PHOC). These
attribute models are learned using linear SVMs over the Fisher Vector
representation of the images along with the PHOC labels of the corresponding
strings. In order to search through the whole page, document regions are
indexed per character bi-gram using a similar attribute representation. On top
of that, we propose an integral image representation of the document using a
simplified version of the attribute model for efficient computation. Finally we
introduce a re-ranking step in order to boost retrieval performance. We show
state-of-the-art results for segmentation-free query by string word spotting in
single-writer and multi-writer standard datasets. Comment: To be published in ICDAR201
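To make the string side of the PHOC encoding concrete, here is a simplified Python sketch. It assumes the commonly described assignment rule (a character is assigned to a pyramid region when at least half of its normalized span overlaps that region). The chosen levels, the alphabet, and the function name `phoc` are assumptions for illustration, and the image side (Fisher Vectors, SVM attribute models) is not covered.

```python
def phoc(word, levels=(1, 2, 3), alphabet="abcdefghijklmnopqrstuvwxyz0123456789"):
    """Binary pyramidal histogram of characters for a query string (simplified)."""
    word = word.lower()
    n = len(word)
    index = {ch: i for i, ch in enumerate(alphabet)}
    vector = []
    for level in levels:
        for region in range(level):
            # Region and character spans are both normalized to [0, 1].
            r_start, r_end = region / level, (region + 1) / level
            bits = [0] * len(alphabet)
            for k, ch in enumerate(word):
                if ch not in index:
                    continue  # characters outside the assumed alphabet are ignored
                c_start, c_end = k / n, (k + 1) / n
                overlap = min(r_end, c_end) - max(r_start, c_start)
                # Assign the character if at least half of its span lies in the region.
                if overlap / (c_end - c_start) >= 0.5:
                    bits[index[ch]] = 1
            vector.extend(bits)
    return vector


print(phoc("ab", levels=(1, 2), alphabet="ab"))
```

The vector length is the alphabet size times the total number of regions across all levels, so two strings can be compared with any vector similarity once both are encoded.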
Identifying statistical dependence in genomic sequences via mutual information estimates
Questions of understanding and quantifying the representation and amount of
information in organisms have become a central part of biological research, as
they potentially hold the key to fundamental advances. In this paper, we
demonstrate the use of information-theoretic tools for the task of identifying
segments of biomolecules (DNA or RNA) that are statistically correlated. We
develop a precise and reliable methodology, based on the notion of mutual
information, for finding and extracting statistical as well as structural
dependencies. A simple threshold function is defined, and its use in
quantifying the level of significance of dependencies between biological
segments is explored. These tools are used in two specific applications. First,
we identify correlations between different parts of the maize
zmSRp32 gene. There, we find significant dependencies between the 5'
untranslated region in zmSRp32 and its alternatively spliced exons. This
observation may indicate the presence of as-yet unknown alternative splicing
mechanisms or structural scaffolds. Second, using data from the FBI's Combined
DNA Index System (CODIS), we demonstrate that our approach is particularly well
suited for the problem of discovering short tandem repeats, an application of
importance in genetic profiling. Comment: Preliminary version. Final version in EURASIP Journal on
Bioinformatics and Systems Biology. See http://www.hindawi.com/journals/bsb
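As a rough illustration of the quantity this abstract builds on, the following Python sketch computes a plain plug-in (empirical-frequency) estimate of the mutual information between two equal-length symbol sequences. It is not the paper's methodology: the threshold function and significance analysis are omitted, and the function name and toy inputs are assumptions.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired symbols xs[i], ys[i]."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of symbol pairs
    px = Counter(xs)               # marginal counts for xs
    py = Counter(ys)               # marginal counts for ys
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


print(mutual_information("ACGT", "ACGT"))  # identical sequences
print(mutual_information("AACC", "ACAC"))  # empirically independent sequences
```

With short sequences this estimator is biased upward, which is one reason a significance threshold of the kind the abstract mentions is needed before calling a dependence real.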