Regulation of splicing factors by alternative splicing and NMD is conserved between kingdoms yet evolutionarily flexible.
Ultraconserved elements, unusually long regions of perfect sequence identity, are found in genes encoding numerous RNA-binding proteins including arginine-serine rich (SR) splicing factors. Expression of these genes is regulated via alternative splicing of the ultraconserved regions to yield mRNAs that are degraded by nonsense-mediated mRNA decay (NMD), a process termed unproductive splicing (Lareau et al. 2007; Ni et al. 2007). As all human SR genes are affected by alternative splicing and NMD, one might expect this regulation to have originated in an early SR gene and persisted as duplications expanded the SR family. But in fact, unproductive splicing of most human SR genes arose independently (Lareau et al. 2007). This paradox led us to investigate the origin and proliferation of unproductive splicing in SR genes. We demonstrate that unproductive splicing of the splicing factor SRSF5 (SRp40) is conserved among all animals and even observed in fungi; this is a rare example of alternative splicing conserved between kingdoms, yet its effect is to trigger mRNA degradation. As the gene duplicated, the ancient unproductive splicing was lost in paralogs, and distinct unproductive splicing evolved rapidly and repeatedly to take its place. SR genes have consistently employed unproductive splicing, and while it is exceptionally conserved in some of these genes, turnover in specific events among paralogs shows flexible means to the same regulatory end.
Protein secondary structure: Entropy, correlations and prediction
Is protein secondary structure primarily determined by local interactions
between residues closely spaced along the amino acid backbone, or by non-local
tertiary interactions? To answer this question we have measured the entropy
densities of primary structure and secondary structure sequences, and the local
inter-sequence mutual information density. We find that the important
inter-sequence interactions are short ranged, that correlations between
neighboring amino acids are essentially uninformative, and that only 1/4 of the
total information needed to determine the secondary structure is available from
local inter-sequence correlations. Since the remaining information must come
from non-local interactions, this observation supports the view that the
majority of most proteins fold via a cooperative process where secondary and
tertiary structure form concurrently. To provide a more direct comparison to
existing secondary structure prediction methods, we construct a simple hidden
Markov model (HMM) of the sequences. This HMM achieves a prediction accuracy
comparable to other single sequence secondary structure prediction algorithms,
and can extract almost all of the inter-sequence mutual information. This
suggests that these algorithms are almost optimal, and that we should not
expect a dramatic improvement in prediction accuracy. However, local
correlations between secondary and primary structure are probably of
under-appreciated importance in many tertiary structure prediction methods,
such as threading.
Comment: 8 pages, 5 figures
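The mutual information quantity mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' estimator: it uses a simple plug-in (frequency-count) estimate on single aligned positions, and the function names and toy sequences are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Plug-in Shannon entropy (bits per symbol) of a sequence of symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two aligned, equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must be aligned and of equal length")
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


# Toy data (hypothetical): residue letters and secondary structure states.
residues = "AAGGAAGG"
structure = "HHCCHHCC"
mi = mutual_information(residues, structure)
```

In this toy case the structure state is fully determined by the residue, so the mutual information equals the entropy of either sequence. Real estimates need large data sets and bias corrections that this sketch omits.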
The Role of Reliability in the Publisher/Supplier Relationship in Textbook Production
Most publishers in the textbook publishing industry experience production delays. Serious delays can result in a postponed publication date. The ramifications of a postponed publication date range from anxiety for the author and publisher to lost revenues. Most textbook publishers are required to have their new texts reviewed by adoption committees; therefore it is imperative that their books be ready by a certain date. There are many functions involved in the book making process. A delay in any one area may hold things up in other areas, thus compounding the situation. The purpose of this thesis is to identify where the greatest frequency of delays occurs. The hypothesis assumes that suppliers are the main suspect for delays, since this is where the publisher usually jobs out the work. It is therefore important for publishers to use suppliers who are reliable. Reliability is also considered in its deeper dimensions to consider its total role in the publisher/supplier relationship. The research was conducted in three parts. The first two parts consisted of survey questionnaires. The surveys were conducted at the Book Manufacturer's Institute Seminar for publishers at Rochester Institute of Technology and the Pubmart Convention in New York City. The third part consisted of several interviews with the production managers of textbook houses in Boston, New York and Philadelphia. The two surveys tested the hypothesis objectively as well as providing pertinent background information on the problem. All three parts of the research indicated that most delays are not attributable to suppliers, but to other areas of the book making process, most notably the author. The hypothesis was discredited. The deeper dimension of reliability surfaced as the real problem; viz. integrity. This aspect proved to be the foundation of the publisher/supplier relationship.
SCOPe: Structural Classification of Proteins--extended, integrating SCOP and ASTRAL data and classification of new structures.
Structural Classification of Proteins-extended (SCOPe, http://scop.berkeley.edu) is a database of protein structural relationships that extends the SCOP database. SCOP is a manually curated ordering of domains from the majority of proteins of known structure in a hierarchy according to structural and evolutionary relationships. Development of the SCOP 1.x series concluded with SCOP 1.75. The ASTRAL compendium provides several databases and tools to aid in the analysis of the protein structures classified in SCOP, particularly through the use of their sequences. SCOPe extends version 1.75 of the SCOP database, using automated curation methods to classify many structures released since SCOP 1.75. We have rigorously benchmarked our automated methods to ensure that they are as accurate as manual curation, though there are many proteins to which our methods cannot be applied. SCOPe is also partially manually curated to correct some errors in SCOP. SCOPe aims to be backward compatible with SCOP, providing the same parseable files and a history of changes between all stable SCOP and SCOPe releases. SCOPe also incorporates and updates the ASTRAL database. The latest release of SCOPe, 2.03, contains 59 514 Protein Data Bank (PDB) entries, increasing the number of structures classified in SCOP by 55% and including more than 65% of the protein structures in the PDB.
SIFTER search: a web server for accurate phylogeny-based protein function prediction.
We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Here, we introduce a user-friendly web interface for accurate protein function prediction using the SIFTER algorithm. SIFTER is a state-of-the-art sequence-based gene molecular function prediction algorithm that uses a statistical model of function evolution to incorporate annotations throughout the phylogenetic tree. Due to the resources needed by the SIFTER algorithm, running SIFTER locally is not trivial for most users, especially for large-scale problems. The SIFTER web server thus provides access to precomputed predictions on 16 863 537 proteins from 232 403 species. Users can explore SIFTER predictions with queries for proteins, species, functions, and homologs of sequences not in the precomputed prediction set. The SIFTER web server is accessible at http://sifter.berkeley.edu/ and the source code can be downloaded.
Biases in Illumina transcriptome sequencing caused by random hexamer priming
Generation of cDNA using random hexamer priming induces biases in the nucleotide composition at the beginning of transcriptome sequencing reads from the Illumina Genome Analyzer. The bias is independent of organism and laboratory and impacts the uniformity of the reads along the transcriptome. We provide a read count reweighting scheme, based on the nucleotide frequencies of the reads, that mitigates the impact of the bias.
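The abstract does not spell out the reweighting scheme, so the following is only one plausible sketch of a frequency-based reweighting, not the published procedure. It assumes a read's weight is the frequency of its leading k-mer at interior read positions divided by that k-mer's frequency at the read start. The function names, the value of k, and the chosen interior positions are all hypothetical.

```python
from collections import Counter


def kmer_freqs(reads, pos, k):
    """Relative frequency of each k-mer observed at offset `pos` in the reads."""
    counts = Counter(r[pos:pos + k] for r in reads if len(r) >= pos + k)
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()}


def start_bias_weights(reads, k=2, interior_positions=(4,)):
    """Weight per leading k-mer: mean interior frequency / start frequency.

    Assumption: interior positions are unaffected by priming bias, so they
    approximate the unbiased k-mer composition.
    """
    start = kmer_freqs(reads, 0, k)
    interior = [kmer_freqs(reads, p, k) for p in interior_positions]
    weights = {}
    for kmer, f_start in start.items():
        f_mid = sum(f.get(kmer, 0.0) for f in interior) / len(interior)
        weights[kmer] = f_mid / f_start
    return weights


# Toy reads (hypothetical): "AA" is over-represented at the read start.
toy_reads = ["AACCCC", "AACCAA", "AAAACC", "CCAACC"]
weights = start_bias_weights(toy_reads, k=2, interior_positions=(4,))
```

Under these assumptions, reads starting with an over-represented k-mer receive a weight below 1 and those starting with an under-represented k-mer a weight above 1, so weighted read counts are more uniform than raw counts.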
A geometrical approach to computing free-energy landscapes from short-ranged potentials
Particles interacting with short-ranged potentials have attracted increasing interest, partly for their ability to model mesoscale systems such as colloids interacting via DNA or depletion. We consider the free-energy landscape of such systems as the range of the potential goes to zero. In this limit, the landscape is entirely defined by geometrical manifolds, plus a single control parameter. These manifolds are fundamental objects that do not depend on the details of the interaction potential and provide the starting point from which any quantity characterizing the system (equilibrium or nonequilibrium) can be computed for arbitrary potentials. To consider dynamical quantities we compute the asymptotic limit of the Fokker-Planck equation and show that it becomes restricted to the low-dimensional manifolds connected by "sticky" boundary conditions. To illustrate our theory, we compute the low-dimensional manifolds for [formula] identical particles, providing a complete description of the lowest-energy parts of the landscape including floppy modes with up to 2 internal degrees of freedom. The results can be directly tested on colloidal clusters. This limit is a unique approach for understanding energy landscapes, and our hope is that it can also provide insight into finite-range potentials.
Engineering and Applied Science