Statistical properties of thermodynamically predicted RNA secondary structures in viral genomes
By performing a comprehensive study on 1832 segments of 1212 complete genomes
of viruses, we show that in viral genomes the hairpin structures of
thermodynamically predicted RNA secondary structures are more abundant than
expected under a simple random null hypothesis. The detected hairpin structures
of RNA secondary structures are present both in coding and in noncoding regions
for the four groups of viruses categorized as dsDNA, dsRNA, ssDNA and ssRNA.
For all groups hairpin structures of RNA secondary structures are detected more
frequently than expected for a random null hypothesis in noncoding rather than
in coding regions. However, potential RNA secondary structures are also present
in coding regions of the dsDNA group. In fact, we detect evolutionarily conserved RNA
secondary structures in conserved coding and noncoding regions of a large set
of complete genomes of dsDNA herpesviruses. Comment: 9 pages, 2 figures
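As an illustration of testing hairpin abundance against a random null, here is a minimal sketch. It is not the authors' method: it swaps thermodynamic structure prediction for a naive search for perfectly complementary stem arms, and uses mononucleotide shuffling as the null model. The function names, the stem length, and the loop-length range are all assumptions made for this example.

```python
import random

# Watson-Crick pairs only; G-U wobble pairs are ignored in this sketch.
# Input is assumed to use the RNA alphabet A, C, G, U.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of an RNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def count_hairpins(seq, stem=4, min_loop=3, max_loop=8):
    """Count (start, loop length) pairs whose two stem arms pair perfectly."""
    count = 0
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        # The 3' arm must equal the reverse complement of the 5' arm.
        left_rc = reverse_complement(seq[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == left_rc:
                count += 1
    return count


def shuffle_p_value(seq, n_shuffles=200, seed=0, **kwargs):
    """Empirical p-value of the hairpin count under a shuffled-sequence null."""
    rng = random.Random(seed)
    observed = count_hairpins(seq, **kwargs)
    bases = list(seq)
    at_least = 0
    for _ in range(n_shuffles):
        rng.shuffle(bases)  # preserves base composition, destroys order
        if count_hairpins("".join(bases), **kwargs) >= observed:
            at_least += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (1 + at_least) / (1 + n_shuffles)
```

Calling `shuffle_p_value(segment)` on a genome segment returns the observed count and the fraction of shuffled sequences with at least as many candidate hairpins. A small value would suggest an excess over this simple null, though a real analysis would use a folding program and a more careful null model.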
Towards Reliable Automatic Protein Structure Alignment
A variety of methods have been proposed for structure similarity calculation,
which are called structure alignment or superposition. One major shortcoming in
current structure alignment algorithms is in their inherent design, which is
based on local structure similarity. In this work, we propose a method to
incorporate global information in obtaining optimal alignments and
superpositions. Our method, when applied to optimizing the TM-score and the GDT
score, produces significantly better results than current state-of-the-art
protein structure alignment tools. Specifically, if the highest TM-score found
by TMalign is lower than 0.6 and the highest TM-score found by one of the
tested methods is higher than 0.5, there is a probability of 42% that
TMalign failed to find TM-scores higher than 0.5, while the same probability
is reduced to 2% if our method is used. This could significantly improve the
accuracy of fold detection if the cutoff TM-score of 0.5 is used.
In addition, existing structure alignment algorithms focus on structure
similarity alone and simply ignore other important similarities, such as
sequence similarity. Our approach has the capacity to incorporate multiple
similarities into the scoring function. Results show that sequence similarity
aids in finding high quality protein structure alignments that are more
consistent with eye-examined alignments in HOMSTRAD. Even when structure
similarity itself fails to find alignments with any consistency with
eye-examined alignments, our method remains capable of finding alignments
highly similar to, or even identical to, eye-examined alignments. Comment: Peer-reviewed and presented as part of the 13th Workshop on
Algorithms in Bioinformatics (WABI2013)
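For readers unfamiliar with the TM-score mentioned in this abstract, the sketch below evaluates the standard TM-score formula for one fixed alignment and superposition. It is not the alignment method proposed in the paper, and it does not search over superpositions as real tools do. The function names and the 0.5 Å floor on the distance scale are assumptions made for this example.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in angstroms) used by the TM-score."""
    if target_length > 15:
        d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    else:
        d0 = 0.5
    # Assumption: floor d0 at 0.5 so very short targets stay well defined.
    return max(d0, 0.5)


def tm_score(distances, target_length):
    """TM-score for a fixed superposition.

    distances: distances (angstroms) between aligned residue pairs
    target_length: number of residues in the target structure
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    # Normalising by the target length penalises unaligned residues.
    return total / target_length
```

With this definition a perfect superposition of every residue scores 1.0, and each aligned pair at distance d0 contributes half of its maximum, which is why a length-independent cutoff such as 0.5 can be used for fold detection.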
Joint Loop End Modeling Improves Covariance Model Based Non-coding RNA Gene Search
The effect of more detailed modeling of the interface between stem and loop in non-coding RNA hairpin structures on the efficacy of covariance-model-based non-coding RNA gene search is examined. Currently, the prior probabilities of the two stem nucleotides and two loop-end nucleotides at the interface are treated the same as any other stem and loop nucleotides, respectively. Laboratory thermodynamic studies show that hairpin stability depends on the identities of these four nucleotides, but this is not taken into account in current covariance models. It is shown that separate estimation of emission priors for these nucleotides and joint treatment of substitution probabilities for the two loop-end nucleotides lead to improved non-coding RNA gene search.
Ecological Complex Systems
The main aim of this topical issue is to report recent advances in noisy
nonequilibrium processes useful to describe the dynamics of ecological systems
and to address the mechanisms of spatio-temporal pattern formation in ecology
both from the experimental and theoretical points of view. This is in order to
understand the dynamical behaviour of ecological complex systems through the
interplay between nonlinearity, noise, random and periodic environmental
interactions. Discovering the microscopic rules and the local interactions
which lead to the emergence of specific global patterns or global dynamical
behaviour, and the role of noise in the nonlinear dynamics, is a key
aspect of understanding and then modelling ecological complex systems. Comment: 13 pages, Editorial of a topical issue on Ecological Complex Systems
to appear in EPJ B, Vol. 65 (2008)
Detecting the Dependent Evolution of Biosequences
A probabilistic graphical model is developed in order to detect the dependent evolution between different sites in biological sequences. Given a multiple sequence alignment for each molecule of interest and a phylogenetic tree, the model can predict potential interactions within or between nucleic acids and proteins. Initial validation of the model is carried out using tRNA sequence data. The model is able to accurately identify the secondary structure of tRNA as well as several known tertiary interactions
COSNet : a cost sensitive neural network for semi-supervised learning in graphs
The semi-supervised problem of learning node labels in graphs consists, given a partial graph labeling, in inferring the unknown labels of the unlabeled vertices. Several machine learning algorithms have been proposed for solving this problem, including Hopfield networks and label propagation methods; however, some issues have been only partially considered, e.g. the preservation of the prior knowledge and the imbalance between positive and negative labels. To address these issues, we propose a Hopfield-based cost-sensitive neural network algorithm (COSNet). The method factorizes the solution of the problem into two parts: 1) the subnetwork composed of the labelled vertices is considered, and the network parameters are estimated through a supervised algorithm; 2) the estimated parameters are extended to the subnetwork composed of the unlabeled vertices, and the attractor reached by the dynamics of this subnetwork is used to predict the labeling of the unlabeled vertices. The proposed method embeds in the neural algorithm the "a priori" knowledge coded in the labelled part of the graph, and separates node labels and neuron states, allowing positive and negative node labels to be weighted differently. Moreover, COSNet introduces an efficient cost-sensitive strategy for learning near-optimal network parameters that take into account the imbalance between positive and negative node labels. Finally, the dynamics of the network is restricted to its unlabeled part, preserving the minimization of the overall objective function and significantly reducing the time complexity of the learning algorithm. COSNet has been applied to the genome-wide prediction of gene function in a model organism. The results, compared with those obtained by other semi-supervised label propagation algorithms and supervised machine learning methods, show the effectiveness of the proposed approach.
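The second step described in this abstract, running Hopfield dynamics only on the unlabeled part of the graph, can be sketched roughly as follows. This is a simplified illustration and not the COSNet implementation: the function name is hypothetical, and `pos_state`, `neg_state`, and `threshold` stand in for the parameters that the abstract says are estimated from the labelled subnetwork (that estimation step is not shown). Starting unlabeled neurons at 0.0 is also an assumption.

```python
def propagate_labels(weights, labels, pos_state=1.0, neg_state=-1.0,
                     threshold=0.0, max_sweeps=100):
    """Asynchronous Hopfield-style updates restricted to unlabeled nodes.

    weights: symmetric adjacency as {node: {neighbour: weight}}; every node
             must appear as a key, and there are no self-loops
    labels:  {node: True/False} for the labelled nodes only
    Returns {unlabeled node: predicted label}.
    """
    # Labelled neurons are clamped; unlabeled neurons start at 0.0 (assumption).
    state = {node: 0.0 for node in weights}
    for node, is_positive in labels.items():
        state[node] = pos_state if is_positive else neg_state

    unlabeled = [node for node in weights if node not in labels]
    for _ in range(max_sweeps):
        changed = False
        for node in unlabeled:
            net_input = sum(w * state[nbr] for nbr, w in weights[node].items())
            new_state = pos_state if net_input - threshold > 0 else neg_state
            if new_state != state[node]:
                state[node] = new_state
                changed = True
        if not changed:
            break  # a fixed point (attractor) of the dynamics has been reached
    return {node: state[node] == pos_state for node in unlabeled}
```

Choosing `pos_state` and `neg_state` with different magnitudes is one way to weight positive and negative labels differently, which is the kind of separation between node labels and neuron states the abstract describes.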
Genome-Wide Association Study in BRCA1 Mutation Carriers Identifies Novel Loci Associated with Breast and Ovarian Cancer Risk
BRCA1-associated breast and ovarian cancer risks can be modified by common genetic variants. To identify further cancer risk-modifying loci, we performed a multi-stage GWAS of 11,705 BRCA1 carriers (of whom 5,920 were diagnosed with breast and 1,839 were diagnosed with ovarian cancer), with a further replication in an additional sample of 2,646 BRCA1 carriers. We identified a novel breast cancer risk modifier locus at 1q32 for BRCA1 carriers (rs2290854, P = 2.7×10^-8, HR = 1.14, 95% CI: 1.09-1.20). In addition, we identified two novel ovarian cancer risk modifier loci: 17q21.31 (rs17631303, P = 1.4×10^-8, HR = 1.27, 95% CI: 1.17-1.38) and 4q32.3 (rs4691139, P = 3.4×10^-8, HR = 1.20, 95% CI: 1.17-1.38). The 4q32.3 locus was not associated with ovarian cancer risk in the general population or BRCA2 carriers, suggesting a BRCA1-specific associat
Measurement of the W+W- Production Cross Section in ppbar Collisions at sqrt(s)=1.96 TeV using Dilepton Events
We present a measurement of the W+W- production cross section using 184/pb of
ppbar collisions at a center-of-mass energy of 1.96 TeV collected with the
Collider Detector at Fermilab. Using the dilepton decay channel W+W- ->
l+l-vvbar, where the charged leptons can be either electrons or muons, we find
17 candidate events compared to an expected background of 5.0 +2.2/-0.8 events.
The resulting W+W- production cross section measurement of sigma(ppbar -> W+W-)
= 14.6 +5.8/-5.1 (stat) +1.8/-3.0 (syst) ±0.9 (lum) pb agrees well with the
Standard Model expectation. Comment: 8 pages, 2 figures, 2 tables. To be submitted to Physical Review
Letters
Structural Genomics of Minimal Organisms: Pipeline and Results
The initial objective of the Berkeley Structural Genomics Center was to obtain near-complete three-dimensional (3D) structural information for all soluble proteins of two minimal organisms, the closely related pathogens Mycoplasma genitalium and M. pneumoniae. The former has fewer than 500 genes and the latter has fewer than 700 genes. A semiautomated structural genomics pipeline was set up, running from target selection through cloning, expression, and purification to structure determination. At the time of this writing, structural information for more than 93 percent of all soluble proteins of M. genitalium is available. This chapter summarizes the approaches taken by the authors' center.
- …