150 research outputs found

    Statistical properties of thermodynamically predicted RNA secondary structures in viral genomes

    By performing a comprehensive study on 1832 segments of 1212 complete genomes of viruses, we show that in viral genomes the hairpin structures of thermodynamically predicted RNA secondary structures are more abundant than expected under a simple random null hypothesis. The detected hairpin structures of RNA secondary structures are present both in coding and in noncoding regions for the four groups of viruses categorized as dsDNA, dsRNA, ssDNA and ssRNA. For all groups, hairpin structures of RNA secondary structures are detected more frequently than expected for a random null hypothesis in noncoding rather than in coding regions. However, potential RNA secondary structures are also present in coding regions of the dsDNA group. In fact, we detect evolutionarily conserved RNA secondary structures in conserved coding and noncoding regions of a large set of complete genomes of dsDNA herpesviruses. Comment: 9 pages, 2 figures

    Towards Reliable Automatic Protein Structure Alignment

    A variety of methods have been proposed for structure similarity calculation, which are called structure alignment or superposition. One major shortcoming in current structure alignment algorithms is in their inherent design, which is based on local structure similarity. In this work, we propose a method to incorporate global information in obtaining optimal alignments and superpositions. Our method, when applied to optimizing the TM-score and the GDT score, produces significantly better results than current state-of-the-art protein structure alignment tools. Specifically, if the highest TM-score found by TMalign is lower than 0.6 and the highest TM-score found by one of the tested methods is higher than 0.5, there is a probability of 42% that TMalign failed to find TM-scores higher than 0.5, while the same probability is reduced to 2% if our method is used. This could significantly improve the accuracy of fold detection if the cutoff TM-score of 0.5 is used. In addition, existing structure alignment algorithms focus on structure similarity alone and simply ignore other important similarities, such as sequence similarity. Our approach has the capacity to incorporate multiple similarities into the scoring function. Results show that sequence similarity aids in finding high quality protein structure alignments that are more consistent with eye-examined alignments in HOMSTRAD. Even when structure similarity itself fails to find alignments with any consistency with eye-examined alignments, our method remains capable of finding alignments highly similar to, or even identical to, eye-examined alignments. Comment: Peer-reviewed and presented as part of the 13th Workshop on Algorithms in Bioinformatics (WABI2013)
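
    The abstract refers to the TM-score and its 0.5 fold-detection cutoff without defining the score. As a minimal sketch, the snippet below evaluates the commonly cited TM-score formula for one fixed superposition, given the distances between aligned residue pairs. It is an assumption that the paper uses this standard normalization; the function names and the 0.5 floor on the distance scale for short chains are illustrative choices, and the sketch does not reproduce the alignment search proposed in the paper.

```python
def d0_scale(target_length):
    """Length-dependent distance scale d0 (in angstroms) of the TM-score.

    The floor of 0.5 for short chains is an assumption made here so the
    cube root is never taken of a negative number.
    """
    if target_length <= 21:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length):
    """TM-score of a fixed superposition.

    distances: distances (angstroms) between aligned residue pairs.
    target_length: length of the protein used for normalization.
    """
    d0 = d0_scale(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


def same_fold(score, cutoff=0.5):
    """Apply the 0.5 cutoff mentioned in the abstract."""
    return score > cutoff


if __name__ == "__main__":
    # Hypothetical distances for an 8-residue alignment on a 100-residue target.
    example = [0.8, 1.1, 0.5, 2.3, 1.9, 0.7, 3.0, 1.2]
    score = tm_score(example, target_length=100)
    print(score, same_fold(score))
```

    Because the sum runs only over aligned pairs while the normalization uses the full target length, unaligned residues lower the score, which is why short partial alignments rarely pass the cutoff.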

    Joint Loop End Modeling Improves Covariance Model Based Non-coding RNA Gene Search

    The effect of more detailed modeling of the interface between stem and loop in non-coding RNA hairpin structures on the efficacy of covariance-model-based non-coding RNA gene search is examined. Currently, the prior probabilities of the two stem nucleotides and two loop-end nucleotides at the interface are treated the same as any other stem and loop nucleotides respectively. Laboratory thermodynamic studies show that hairpin stability is dependent on the identities of these four nucleotides, but this is not taken into account in current covariance models. It is shown that separate estimation of emission priors for these nucleotides and joint treatment of substitution probabilities for the two loop-end nucleotides leads to improved non-coding RNA gene search.

    Ecological Complex Systems

    The main aim of this topical issue is to report recent advances in noisy nonequilibrium processes useful for describing the dynamics of ecological systems, and to address the mechanisms of spatio-temporal pattern formation in ecology from both the experimental and theoretical points of view. The goal is to understand the dynamical behaviour of ecological complex systems through the interplay between nonlinearity, noise, and random and periodic environmental interactions. Discovering the microscopic rules and local interactions that lead to the emergence of specific global patterns or global dynamical behaviour, and the role of noise in the nonlinear dynamics, is key to understanding and then modeling ecological complex systems. Comment: 13 pages, Editorial of a topical issue on Ecological Complex System to appear in EPJ B, Vol. 65 (2008)

    Detecting the Dependent Evolution of Biosequences

    A probabilistic graphical model is developed to detect dependent evolution between different sites in biological sequences. Given a multiple sequence alignment for each molecule of interest and a phylogenetic tree, the model can predict potential interactions within or between nucleic acids and proteins. Initial validation of the model is carried out using tRNA sequence data. The model is able to accurately identify the secondary structure of tRNA as well as several known tertiary interactions.

    COSNet : a cost sensitive neural network for semi-supervised learning in graphs

    The semi-supervised problem of learning node labels in graphs consists, given a partial graph labeling, of inferring the unknown labels of the unlabeled vertices. Several machine learning algorithms have been proposed for solving this problem, including Hopfield networks and label propagation methods; however, some issues have been only partially considered, e.g. the preservation of prior knowledge and the imbalance between positive and negative labels. To address these issues, we propose a Hopfield-based cost-sensitive neural network algorithm (COSNet). The method splits the solution of the problem into two parts: 1) the subnetwork composed of the labeled vertices is considered, and the network parameters are estimated through a supervised algorithm; 2) the estimated parameters are extended to the subnetwork composed of the unlabeled vertices, and the attractor reached by the dynamics of this subnetwork is used to predict the labeling of the unlabeled vertices. The proposed method embeds in the neural algorithm the "a priori" knowledge coded in the labeled part of the graph, and separates node labels and neuron states, which allows positive and negative node labels to be weighted differently. Moreover, COSNet introduces an efficient cost-sensitive strategy for learning near-optimal network parameters that take into account the imbalance between positive and negative node labels. Finally, the dynamics of the network is restricted to its unlabeled part, preserving the minimization of the overall objective function and significantly reducing the time complexity of the learning algorithm. COSNet has been applied to the genome-wide prediction of gene function in a model organism. The results, compared with those obtained by other semi-supervised label propagation algorithms and supervised machine learning methods, show the effectiveness of the proposed approach.
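
    The second step described above, running network dynamics only on the unlabeled vertices until an attractor is reached, can be illustrated with a generic Hopfield-style update. This is a rough sketch under stated assumptions, not the COSNet algorithm itself: the function name, the asynchronous update order, and the parameters `pos_value`, `neg_value`, and `threshold` are hypothetical stand-ins for the quantities that the paper estimates on the labeled subnetwork.

```python
def hopfield_label_unlabeled(weights, labels, pos_value=1.0, neg_value=-1.0,
                             threshold=0.0, max_sweeps=100):
    """Predict labels of unlabeled nodes with Hopfield-style dynamics.

    weights: symmetric n x n list of lists of edge weights.
    labels:  list of length n with +1, -1, or None (unlabeled).
    pos_value, neg_value, threshold: hypothetical parameters; in a
        cost-sensitive setting they would be learned from labeled nodes.
    """
    n = len(labels)
    # Labeled nodes are clamped; unlabeled nodes start in a neutral state.
    state = [pos_value if lab == 1 else neg_value if lab == -1 else 0.0
             for lab in labels]
    unlabeled = [i for i, lab in enumerate(labels) if lab is None]

    for _ in range(max_sweeps):
        changed = False
        for i in unlabeled:  # asynchronous update, unlabeled part only
            field = sum(weights[i][j] * state[j] for j in range(n) if j != i)
            new_state = pos_value if field - threshold > 0 else neg_value
            if new_state != state[i]:
                state[i] = new_state
                changed = True
        if not changed:  # fixed point (attractor) reached
            break

    return {i: (1 if state[i] == pos_value else -1) for i in unlabeled}


if __name__ == "__main__":
    # Toy graph: nodes 0-1 and 2-3 are strongly tied, 1-2 weakly tied.
    w = [[0.0, 1.0, 0.0, 0.0],
         [1.0, 0.0, 0.1, 0.0],
         [0.0, 0.1, 0.0, 1.0],
         [0.0, 0.0, 1.0, 0.0]]
    print(hopfield_label_unlabeled(w, [1, None, None, -1]))
```

    Choosing `pos_value` and `neg_value` with different magnitudes is one simple way to weight the rare positive class more heavily, which mirrors the separation of node labels from neuron states mentioned in the abstract.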

    Genome-Wide Association Study in BRCA1 Mutation Carriers Identifies Novel Loci Associated with Breast and Ovarian Cancer Risk

    BRCA1-associated breast and ovarian cancer risks can be modified by common genetic variants. To identify further cancer risk-modifying loci, we performed a multi-stage GWAS of 11,705 BRCA1 carriers (of whom 5,920 were diagnosed with breast and 1,839 were diagnosed with ovarian cancer), with a further replication in an additional sample of 2,646 BRCA1 carriers. We identified a novel breast cancer risk modifier locus at 1q32 for BRCA1 carriers (rs2290854, P = 2.7×10^-8, HR = 1.14, 95% CI: 1.09-1.20). In addition, we identified two novel ovarian cancer risk modifier loci: 17q21.31 (rs17631303, P = 1.4×10^-8, HR = 1.27, 95% CI: 1.17-1.38) and 4q32.3 (rs4691139, P = 3.4×10^-8, HR = 1.20, 95% CI: 1.17-1.38). The 4q32.3 locus was not associated with ovarian cancer risk in the general population or BRCA2 carriers, suggesting a BRCA1-specific association.

    Measurement of the W+W- Production Cross Section in ppbar Collisions at sqrt(s)=1.96 TeV using Dilepton Events

    We present a measurement of the W+W- production cross section using 184/pb of ppbar collisions at a center-of-mass energy of 1.96 TeV collected with the Collider Detector at Fermilab. Using the dilepton decay channel W+W- -> l+l-vvbar, where the charged leptons can be either electrons or muons, we find 17 candidate events compared to an expected background of 5.0 +2.2 -0.8 events. The resulting W+W- production cross section measurement of sigma(ppbar -> W+W-) = 14.6 +5.8 -5.1 (stat) +1.8 -3.0 (syst) +-0.9 (lum) pb agrees well with the Standard Model expectation. Comment: 8 pages, 2 figures, 2 tables. To be submitted to Physical Review Letters
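
    The numbers quoted above fit the usual counting-experiment relation, cross section = (observed events - expected background) / (overall signal efficiency × integrated luminosity). The abstract does not state the efficiency, so the sketch below only shows how the pieces relate: it back-solves the overall factor (acceptance, selection efficiency, and branching fraction combined) implied by the central values. The function names are illustrative, and the implied factor is an inference from the quoted central values, not a figure reported by the authors.

```python
def cross_section_pb(n_observed, n_background, efficiency, luminosity_inv_pb):
    """Counting-experiment cross section in pb.

    efficiency: overall signal efficiency, taken here to include detector
        acceptance and the dilepton branching fraction (an assumption).
    luminosity_inv_pb: integrated luminosity in inverse picobarns.
    """
    return (n_observed - n_background) / (efficiency * luminosity_inv_pb)


def implied_efficiency(n_observed, n_background, sigma_pb, luminosity_inv_pb):
    """Overall efficiency implied by a quoted cross section (inference only)."""
    return (n_observed - n_background) / (sigma_pb * luminosity_inv_pb)


if __name__ == "__main__":
    # Central values quoted in the abstract; uncertainties are ignored here.
    n_obs, n_bkg, lumi, sigma = 17, 5.0, 184.0, 14.6
    eff = implied_efficiency(n_obs, n_bkg, sigma, lumi)
    print("background-subtracted yield:", n_obs - n_bkg)
    print("implied overall efficiency:", eff)
    print("round trip cross section (pb):", cross_section_pb(n_obs, n_bkg, eff, lumi))
```

    With only 17 candidates, the statistical uncertainty dominates, which is consistent with the large (stat) term in the quoted result.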

    Structural Genomics of Minimal Organisms: Pipeline and Results

    The initial objective of the Berkeley Structural Genomics Center was to obtain near-complete three-dimensional (3D) structural information for all soluble proteins of two minimal organisms, the closely related pathogens Mycoplasma genitalium and M. pneumoniae. The former has fewer than 500 genes and the latter has fewer than 700 genes. A semiautomated structural genomics pipeline was set up, covering target selection, cloning, expression, purification, and ultimately structure determination. At the time of this writing, structural information for more than 93 percent of all soluble proteins of M. genitalium is available. This chapter summarizes the approaches taken by the authors' center.