Gene Function Classification Using Bayesian Models with Hierarchy-Based Priors
We investigate the application of hierarchical classification schemes to the
annotation of gene function based on several characteristics of protein
sequences including phylogenic descriptors, sequence-based attributes, and
predicted secondary structure. We discuss three Bayesian models and compare
their performance in terms of predictive accuracy. These models are the
ordinary multinomial logit (MNL) model, a hierarchical model based on a set of
nested MNL models, and an MNL model with a prior that introduces correlations
between the parameters for classes that are nearby in the hierarchy. We also
provide a new scheme for combining different sources of information. We use
these models to predict the functional class of Open Reading Frames (ORFs) from
the E. coli genome. The results from all three models show substantial
improvement over previous methods, which were based on the C5 algorithm. The
MNL model using a prior based on the hierarchy outperforms both the
non-hierarchical MNL model and the nested MNL model. In contrast to previous
attempts at combining these sources of information, our approach results in a
higher accuracy rate when compared to models that use each data source alone.
Together, these results show that gene function can be predicted with higher
accuracy than previously achieved, using Bayesian models that incorporate
suitable prior information.
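As a rough illustration of the models named in this abstract, the sketch below computes class probabilities under a multinomial logit model and builds class coefficients by summing parameters along each class's path in a hierarchy, which is one way a prior can correlate nearby classes. The abstract does not spell out the construction, so the path-sum scheme, the function names, and the toy hierarchy are all assumptions made for illustration.

```python
import numpy as np

def mnl_probabilities(x, beta):
    """Class probabilities under a multinomial logit (softmax) model.

    x    : feature vector, shape (n_features,)
    beta : coefficient matrix, shape (n_classes, n_features)
    """
    scores = beta @ x
    scores = scores - scores.max()  # subtract the max for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()

def class_coefficients(paths, branch_params):
    """Assumed construction: each class's coefficients are the sum of the
    parameters attached to the branches on its root-to-leaf path, so classes
    that share a branch share part of their coefficients."""
    return np.array([sum(branch_params[b] for b in path) for path in paths])

# Hypothetical hierarchy: classes A1 and A2 share branch A; B1 sits under B.
rng = np.random.default_rng(0)
branches = ["A", "B", "A1", "A2", "B1"]
branch_params = {b: rng.normal(size=3) for b in branches}
paths = [("A", "A1"), ("A", "A2"), ("B", "B1")]

beta = class_coefficients(paths, branch_params)
probs = mnl_probabilities(np.array([1.0, 0.5, -0.2]), beta)
```

Because A1 and A2 both include the parameters of branch A, independent priors on the branch parameters induce correlated priors on the coefficients of those two classes.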
Physics, Astrophysics and Cosmology with Gravitational Waves
Gravitational wave detectors are already operating at interesting sensitivity
levels, and they have an upgrade path that should result in secure detections
by 2014. We review the physics of gravitational waves, how they interact with
detectors (bars and interferometers), and how these detectors operate. We study
the most likely sources of gravitational waves and review the data analysis
methods that are used to extract their signals from detector noise. Then we
consider the consequences of gravitational wave detections and observations for
physics, astrophysics, and cosmology.
Comment: 137 pages, 16 figures. Published version:
<http://www.livingreviews.org/lrr-2009-2>
The Evolution of Compact Binary Star Systems
We review the formation and evolution of compact binary stars consisting of
white dwarfs (WDs), neutron stars (NSs), and black holes (BHs). Binary NSs and
BHs are thought to be the primary astrophysical sources of gravitational waves
(GWs) within the frequency band of ground-based detectors, while compact
binaries of WDs are important sources of GWs at lower frequencies to be covered
by space interferometers (LISA). Major uncertainties in the current
understanding of properties of NSs and BHs most relevant to the GW studies are
discussed, including the treatment of the natal kicks which compact stellar
remnants acquire during the core collapse of massive stars and the common
envelope phase of binary evolution. We discuss the coalescence rates of binary
NSs and BHs and prospects for their detection, as well as the formation and
evolution of binary WDs and their observational manifestations. Special
attention is given to AM CVn stars, compact binaries in which the Roche lobe is
filled by another WD or a low-mass partially degenerate helium star, as these
stars are thought to be the best LISA verification binary GW sources.
Comment: 105 pages, 18 figures
Dinucleotide controlled null models for comparative RNA gene prediction
Background: Comparative prediction of RNA structures can be used to identify functional noncoding RNAs in genomic screens. It was shown recently by Babak et al. [BMC Bioinformatics. 8:33] that RNA gene prediction programs can be biased by the genomic dinucleotide content, in particular those programs using a thermodynamic folding model including stacking energies. As a consequence, there is a need for dinucleotide-preserving control strategies to assess the significance of such predictions. While there have been randomization algorithms for single sequences for many years, the problem has remained challenging for multiple alignments and there is currently no algorithm available. Results: We present a program called SISSIz that simulates multiple alignments of a given average dinucleotide content. Meeting additional requirements of an accurate null model, the randomized alignments are on average of the same sequence diversity and preserve local conservation and gap patterns. We make use of a phylogenetic substitution model that includes overlapping dependencies and site-specific rates. Using fast heuristics and a distance-based approach, a tree is estimated under this model and used to guide the simulations. The new algorithm is tested on vertebrate genomic alignments and the effect on RNA structure predictions is studied. In addition, we directly combined the new null model with the RNAalifold consensus folding algorithm, giving a new variant of a thermodynamic structure-based RNA gene finding program that is not biased by the dinucleotide content. Conclusion: SISSIz implements an efficient algorithm to randomize multiple alignments while preserving dinucleotide content. It can be used to get more accurate estimates of false positive rates of existing programs, to produce negative controls for the training of machine-learning-based programs, or as a standalone RNA gene finding program. Other applications in comparative genomics that require randomization of multiple alignments can be considered. Availability: SISSIz is available as open source C code that can be compiled for every major platform and downloaded here: <http://sourceforge.net/projects/sissiz>.
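To make concrete the quantity that a dinucleotide-preserving null model is meant to hold fixed, the sketch below computes dinucleotide frequencies for each row of an alignment and averages them. It is not SISSIz's algorithm. The function names are hypothetical, and dropping gap characters before counting (so that residues on either side of a gap count as adjacent) is an assumption.

```python
from collections import Counter

def dinucleotide_frequencies(seq):
    """Relative frequencies of overlapping dinucleotides in one sequence.
    Gap characters ('-') are removed first (an assumption of this sketch)."""
    s = seq.upper().replace("-", "")
    counts = Counter(s[i:i + 2] for i in range(len(s) - 1))
    total = sum(counts.values())
    return {d: c / total for d, c in counts.items()} if total else {}

def average_dinucleotide_content(alignment):
    """Average the per-sequence dinucleotide frequencies over all rows."""
    freqs = [dinucleotide_frequencies(row) for row in alignment]
    keys = set().union(*freqs)
    return {k: sum(f.get(k, 0.0) for f in freqs) / len(freqs)
            for k in sorted(keys)}

# Toy alignment with one gapped row (made-up sequences).
alignment = ["ACGUACGU", "ACGU-CGU", "AAGUACGA"]
content = average_dinucleotide_content(alignment)
```

Comparing this summary between an original alignment and its randomized counterparts is one simple way to check whether a shuffling procedure has kept the dinucleotide content close to the original.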
Conserved Secondary Structures in Aspergillus
Background: Recent evidence suggests that the number and variety of functional RNAs (ncRNAs as well as cis-acting RNA elements within mRNAs) is much higher than previously thought; thus, the ability to computationally predict and analyze RNAs has taken on new importance. We have computationally studied the secondary structures in an alignment of six Aspergillus genomes. Little is known about the RNAs present in this set of fungi, and this diverse set of genomes has an optimal level of sequence conservation for observing the correlated evolution of base-pairs seen in RNAs. Methodology/Principal Findings: We report the results of a whole-genome search for evolutionarily conserved secondary structures, as well as the results of clustering these predicted secondary structures by structural similarity. We find a total of 7450 predicted secondary structures, including a new predicted ~60 bp long hairpin motif found primarily inside introns. We find no evidence for microRNAs. Different types of genomic regions are over-represented in different classes of predicted secondary structures. Exons contain the longest motifs (primarily long, branched hairpins), 5' UTRs primarily contain groupings of short hairpins located near the start codon, and 3' UTRs contain very little secondary structure compared to other regions. There is a large concentration of short hairpins just inside the boundaries of exons. The density of predicted intronic RNAs increases with the length of introns, and the density of predicted secondary structures within mRNA coding regions increases with the number of introns in a gene.
A Genome-Wide Survey of Imprinted Genes in Rice Seeds Reveals Imprinting Primarily Occurs in the Endosperm
Genomic imprinting causes the expression of an allele depending on its parental origin. In plants, most imprinted genes have been identified in Arabidopsis endosperm, a transient structure consumed by the embryo during seed formation. We identified imprinted genes in rice seed, where both the endosperm and embryo are present at seed maturity. RNA was extracted from embryos and endosperm of seeds obtained from reciprocal crosses between two subspecies, Nipponbare (Japonica rice) and 93-11 (Indica rice). Sequenced reads from cDNA libraries were aligned to their respective parental genomes using single-nucleotide polymorphisms (SNPs). Reads across SNPs enabled derivation of parental expression bias ratios. A continuum of parental expression bias states was observed. Statistical analyses indicated 262 candidate imprinted loci in the endosperm and three in the embryo (168 genic and 97 non-genic). Fifty-six of the 67 loci investigated were confirmed to be imprinted in the seed. Imprinted loci are not clustered in the rice genome as found in mammals. All of these imprinted loci were expressed in the endosperm, and one of these was also imprinted in the embryo, confirming that in both rice and Arabidopsis imprinted expression is primarily confined to the endosperm. Some rice imprinted genes were also expressed in vegetative tissues, indicating that they have additional roles in plant growth. Comparison of candidate imprinted genes found in rice with imprinted candidate loci obtained from genome-wide surveys of imprinted genes in Arabidopsis to date shows a low degree of conservation, suggesting that imprinting has evolved independently in eudicots and monocots.
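The parental expression bias ratio mentioned in this abstract can be sketched from allele-specific read counts at SNPs. The abstract does not describe the statistical procedure, so the code below is only an illustrative assumption: the function names, the 2/3 expected maternal fraction (chosen because endosperm carries two maternal genome copies and one paternal), and the 0.2 margin are hypothetical choices, not the authors' thresholds.

```python
def maternal_fraction(maternal_reads, paternal_reads):
    """Fraction of SNP-covering reads that carry the maternal allele."""
    total = maternal_reads + paternal_reads
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return maternal_reads / total

def consistent_bias(cross_a, cross_b, expected=2 / 3, margin=0.2):
    """Classify a locus from two reciprocal crosses.

    cross_a, cross_b : (maternal_reads, paternal_reads) in each direction.
    A locus counts as biased only if both crosses deviate from the expected
    maternal fraction in the same direction, which separates parent-of-origin
    effects from differences between the two parental lines.
    """
    fa = maternal_fraction(*cross_a)
    fb = maternal_fraction(*cross_b)
    if fa > expected + margin and fb > expected + margin:
        return "maternal"
    if fa < expected - margin and fb < expected - margin:
        return "paternal"
    return "none"

# Made-up read counts for three hypothetical loci.
calls = [
    consistent_bias((95, 5), (90, 10)),
    consistent_bias((10, 90), (20, 80)),
    consistent_bias((66, 34), (70, 30)),
]
```

Requiring agreement across both directions of the reciprocal cross is the point of the design: a locus that favours the Nipponbare allele regardless of which parent supplied it would show opposite maternal fractions in the two crosses and would not be called.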
Relativistic Dynamics and Extreme Mass Ratio Inspirals
It is now well established that a dark, compact object (DCO), very likely a
massive black hole (MBH) of around four million solar masses, is lurking at the
centre of the Milky Way. While a consensus is emerging about the origin and
growth of supermassive black holes (with masses larger than a billion solar
masses), MBHs with smaller masses, such as the one in our galactic centre,
remain understudied and enigmatic. The key to understanding these holes - how
some of them grow by orders of magnitude in mass - lies in understanding the
dynamics of the stars in the galactic neighbourhood. Stars interact with the
central MBH primarily through their gradual inspiral due to the emission of
gravitational radiation. Stars also produce gases which will subsequently be
accreted by the MBH through collisions and disruptions brought about by the
strong central tidal field. Such processes can contribute significantly to the
mass of the MBH and progress in understanding them requires theoretical work in
preparation for future gravitational radiation millihertz missions and X-ray
observatories. In particular, a unique probe of these regions is the
gravitational radiation that is emitted by some compact stars very close to the
black holes and which could be surveyed by a millihertz gravitational wave
interferometer scrutinizing the range of masses fundamental to understanding
the origin and growth of supermassive black holes. By extracting the
information carried by the gravitational radiation, we can determine the mass
and spin of the central MBH with unprecedented precision and we can determine
how the holes "eat" stars that happen to be near them.
Comment: Update from the first version, 151 pages, accepted for publication in
Living Reviews in Relativity