37 research outputs found

    Gene Function Classification Using Bayesian Models with Hierarchy-Based Priors

    We investigate the application of hierarchical classification schemes to the annotation of gene function based on several characteristics of protein sequences including phylogenic descriptors, sequence based attributes, and predicted secondary structure. We discuss three Bayesian models and compare their performance in terms of predictive accuracy. These models are the ordinary multinomial logit (MNL) model, a hierarchical model based on a set of nested MNL models, and an MNL model with a prior that introduces correlations between the parameters for classes that are nearby in the hierarchy. We also provide a new scheme for combining different sources of information. We use these models to predict the functional class of Open Reading Frames (ORFs) from the E. coli genome. The results from all three models show substantial improvement over previous methods, which were based on the C5 algorithm. The MNL model using a prior based on the hierarchy outperforms both the non-hierarchical MNL model and the nested MNL model. In contrast to previous attempts at combining these sources of information, our approach results in a higher accuracy rate when compared to models that use each data source alone. Together, these results show that gene function can be predicted with higher accuracy than previously achieved, using Bayesian models that incorporate suitable prior information.
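The idea of an MNL model whose class parameters are correlated through the hierarchy can be pictured with a small sketch. It assumes, purely for illustration, that each class's coefficient vector is the sum of parameter vectors attached to the branches on its path from the root, so classes that share a branch share part of their coefficients. The abstract does not state this construction; the toy hierarchy `PATHS` and both function names are hypothetical.

```python
import math

# Hypothetical toy hierarchy: each class lists the branches on its path from the root.
PATHS = {
    "A1": ["A", "A1"],
    "A2": ["A", "A2"],
    "B1": ["B", "B1"],
}

def class_coefficients(branch_params, paths):
    """Build each class's coefficient vector by summing the branch vectors on its path.

    Assumption for illustration: sharing a branch (e.g. "A") is what makes the
    coefficients of nearby classes correlated under independent branch priors.
    """
    coefs = {}
    for cls, path in paths.items():
        dim = len(branch_params[path[0]])
        coefs[cls] = [sum(branch_params[b][j] for b in path) for j in range(dim)]
    return coefs

def mnl_probabilities(x, coefs):
    """Multinomial logit (softmax) class probabilities for one feature vector x."""
    scores = {c: sum(w * xi for w, xi in zip(beta, x)) for c, beta in coefs.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}

# Made-up branch parameters, not fitted values.
branch_params = {
    "A": [1.0, 0.0], "A1": [0.5, 0.0], "A2": [-0.5, 0.0],
    "B": [-1.0, 0.0], "B1": [0.0, 0.0],
}
coefs = class_coefficients(branch_params, PATHS)
probs = mnl_probabilities([1.0, 2.0], coefs)
```

In a Bayesian treatment the branch parameters would be given priors and sampled rather than fixed as they are here; the sketch only shows how the hierarchy enters the class coefficients.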

    Physics, Astrophysics and Cosmology with Gravitational Waves

    Gravitational wave detectors are already operating at interesting sensitivity levels, and they have an upgrade path that should result in secure detections by 2014. We review the physics of gravitational waves, how they interact with detectors (bars and interferometers), and how these detectors operate. We study the most likely sources of gravitational waves and review the data analysis methods that are used to extract their signals from detector noise. Then we consider the consequences of gravitational wave detections and observations for physics, astrophysics, and cosmology. Comment: 137 pages, 16 figures, Published version <http://www.livingreviews.org/lrr-2009-2

    The Evolution of Compact Binary Star Systems

    We review the formation and evolution of compact binary stars consisting of white dwarfs (WDs), neutron stars (NSs), and black holes (BHs). Binary NSs and BHs are thought to be the primary astrophysical sources of gravitational waves (GWs) within the frequency band of ground-based detectors, while compact binaries of WDs are important sources of GWs at lower frequencies to be covered by space interferometers (LISA). Major uncertainties in the current understanding of properties of NSs and BHs most relevant to the GW studies are discussed, including the treatment of the natal kicks which compact stellar remnants acquire during the core collapse of massive stars and the common envelope phase of binary evolution. We discuss the coalescence rates of binary NSs and BHs and prospects for their detections, the formation and evolution of binary WDs and their observational manifestations. Special attention is given to AM CVn-stars -- compact binaries in which the Roche lobe is filled by another WD or a low-mass partially degenerate helium-star, as these stars are thought to be the best LISA verification binary GW sources. Comment: 105 pages, 18 figure

    Dinucleotide controlled null models for comparative RNA gene prediction

    Background: Comparative prediction of RNA structures can be used to identify functional noncoding RNAs in genomic screens. It was shown recently by Babak et al. [BMC Bioinformatics. 8:33] that RNA gene prediction programs can be biased by the genomic dinucleotide content, in particular those programs using a thermodynamic folding model including stacking energies. As a consequence, there is need for dinucleotide-preserving control strategies to assess the significance of such predictions. While there have been randomization algorithms for single sequences for many years, the problem has remained challenging for multiple alignments and there is currently no algorithm available. Results: We present a program called SISSIz that simulates multiple alignments of a given average dinucleotide content. Meeting additional requirements of an accurate null model, the randomized alignments are on average of the same sequence diversity and preserve local conservation and gap patterns. We make use of a phylogenetic substitution model that includes overlapping dependencies and site-specific rates. Using fast heuristics and a distance based approach, a tree is estimated under this model which is used to guide the simulations. The new algorithm is tested on vertebrate genomic alignments and the effect on RNA structure predictions is studied. In addition, we directly combined the new null model with the RNAalifold consensus folding algorithm giving a new variant of a thermodynamic structure based RNA gene finding program that is not biased by the dinucleotide content. Conclusion: SISSIz implements an efficient algorithm to randomize multiple alignments preserving dinucleotide content. It can be used to get more accurate estimates of false positive rates of existing programs, to produce negative controls for the training of machine learning based programs, or as standalone RNA gene finding program. Other applications in comparative genomics that require randomization of multiple alignments can be considered. Availability: SISSIz is available as open source C code that can be compiled for every major platform and downloaded here: http://sourceforge.net/projects/sissiz.
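To make "average dinucleotide content" concrete, here is a minimal sketch of how the dinucleotide frequencies of an alignment might be tallied, for example to compare an original alignment with a randomized one. It is not SISSIz's algorithm or interface. The function name is hypothetical, and the gap handling (any adjacent pair containing a gap character is skipped) is a simplifying assumption.

```python
from collections import Counter

def dinucleotide_frequencies(sequences):
    """Relative frequencies of adjacent nucleotide pairs, pooled over all sequences.

    Simplifying assumption: a pair is skipped if either position is a gap ("-"),
    so pairs are not bridged across gaps.
    """
    counts = Counter()
    for seq in sequences:
        s = seq.upper()
        for a, b in zip(s, s[1:]):
            if a != "-" and b != "-":
                counts[a + b] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}

# Toy alignment rows, made up for illustration.
freqs = dinucleotide_frequencies(["ACGU", "AC-U"])
```

A randomized control that preserves dinucleotide content should give frequencies close to those of the input alignment, averaged over many samples.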

    Conserved Secondary Structures in Aspergillus

    Background: Recent evidence suggests that the number and variety of functional RNAs (ncRNAs as well as cis-acting RNA elements within mRNAs) is much higher than previously thought; thus, the ability to computationally predict and analyze RNAs has taken on new importance. We have computationally studied the secondary structures in an alignment of six Aspergillus genomes. Little is known about the RNAs present in this set of fungi, and this diverse set of genomes has an optimal level of sequence conservation for observing the correlated evolution of base-pairs seen in RNAs. Methodology/Principal Findings: We report the results of a whole-genome search for evolutionarily conserved secondary structures, as well as the results of clustering these predicted secondary structures by structural similarity. We find a total of 7450 predicted secondary structures, including a new predicted ~60 bp long hairpin motif found primarily inside introns. We find no evidence for microRNAs. Different types of genomic regions are over-represented in different classes of predicted secondary structures. Exons contain the longest motifs (primarily long, branched hairpins), 5′ UTRs primarily contain groupings of short hairpins located near the start codon, and 3′ UTRs contain very little secondary structure compared to other regions. There is a large concentration of short hairpins just inside the boundaries of exons. The density of predicted intronic RNAs increases with the length of introns, and the density of predicted secondary structures within mRNA coding regions increases with the number of introns in a gene.

    A Genome-Wide Survey of Imprinted Genes in Rice Seeds Reveals Imprinting Primarily Occurs in the Endosperm

    Genomic imprinting causes the expression of an allele depending on its parental origin. In plants, most imprinted genes have been identified in Arabidopsis endosperm, a transient structure consumed by the embryo during seed formation. We identified imprinted genes in rice seed where both the endosperm and embryo are present at seed maturity. RNA was extracted from embryos and endosperm of seeds obtained from reciprocal crosses between two subspecies Nipponbare (Japonica rice) and 93-11 (Indica rice). Sequenced reads from cDNA libraries were aligned to their respective parental genomes using single-nucleotide polymorphisms (SNPs). Reads across SNPs enabled derivation of parental expression bias ratios. A continuum of parental expression bias states was observed. Statistical analyses indicated 262 candidate imprinted loci in the endosperm and three in the embryo (168 genic and 97 non-genic). Fifty-six of the 67 loci investigated were confirmed to be imprinted in the seed. Imprinted loci are not clustered in the rice genome as found in mammals. All of these imprinted loci were expressed in the endosperm, and one of these was also imprinted in the embryo, confirming that in both rice and Arabidopsis imprinted expression is primarily confined to the endosperm. Some rice imprinted genes were also expressed in vegetative tissues, indicating that they have additional roles in plant growth. Comparison of candidate imprinted genes found in rice with imprinted candidate loci obtained from genome-wide surveys of imprinted genes in Arabidopsis to date shows a low degree of conservation, suggesting that imprinting has evolved independently in eudicots and monocots

    Relativistic Dynamics and Extreme Mass Ratio Inspirals

    It is now well-established that a dark, compact object (DCO), very likely a massive black hole (MBH) of around four million solar masses is lurking at the centre of the Milky Way. While a consensus is emerging about the origin and growth of supermassive black holes (with masses larger than a billion solar masses), MBHs with smaller masses, such as the one in our galactic centre, remain understudied and enigmatic. The key to understanding these holes - how some of them grow by orders of magnitude in mass - lies in understanding the dynamics of the stars in the galactic neighbourhood. Stars interact with the central MBH primarily through their gradual inspiral due to the emission of gravitational radiation. Also stars produce gases which will subsequently be accreted by the MBH through collisions and disruptions brought about by the strong central tidal field. Such processes can contribute significantly to the mass of the MBH and progress in understanding them requires theoretical work in preparation for future gravitational radiation millihertz missions and X-ray observatories. In particular, a unique probe of these regions is the gravitational radiation that is emitted by some compact stars very close to the black holes and which could be surveyed by a millihertz gravitational wave interferometer scrutinizing the range of masses fundamental to understanding the origin and growth of supermassive black holes. By extracting the information carried by the gravitational radiation, we can determine the mass and spin of the central MBH with unprecedented precision and we can determine how the holes "eat" stars that happen to be near them. Comment: Update from the first version, 151 pages, accepted for publication @ Living Reviews in Relativit