TRAPID : an efficient online tool for the functional and comparative analysis of de novo RNA-Seq transcriptomes
Transcriptome analysis through next-generation sequencing technologies allows the generation of detailed gene catalogs for non-model species, at the cost of new challenges with regard to computational requirements and bioinformatics expertise. Here, we present TRAPID, an online tool for the fast and efficient processing of assembled RNA-Seq transcriptome data, developed to mitigate these challenges. TRAPID offers high-throughput open reading frame detection and frameshift correction, and includes a functional, comparative and phylogenetic toolbox, making use of 175 reference proteomes. Benchmarking and comparison against state-of-the-art transcript analysis tools reveal the efficiency and unique features of the TRAPID system.
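As an illustration of what open reading frame detection involves, the following is a minimal sketch that scans all six reading frames of a transcript for the longest ATG-to-stop ORF. It is not TRAPID's implementation: the function names, the standard stop codon set, and the "longest complete ORF" criterion are assumptions made for this example, and no frameshift correction is attempted.

```python
# Illustrative sketch only; not TRAPID's actual ORF-finding code.
# Assumes the standard genetic code and an uppercase/lowercase ACGT input.
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_orf(seq):
    """Return (strand, frame, start, end, orf) for the longest ATG..stop ORF.

    Coordinates are 0-based and end-exclusive, relative to the scanned
    strand. Returns None if no complete ORF is found.
    """
    best = None
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None
            # Walk the frame codon by codon.
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    orf = s[start:i + 3]  # include the stop codon
                    if best is None or len(orf) > len(best[4]):
                        best = (strand, frame, start, i + 3, orf)
                    start = None  # look for the next ORF in this frame
    return best


print(longest_orf("AAATGGCCTAAGG"))
```

A production tool would additionally handle partial ORFs at transcript ends, ambiguous bases, and alternative genetic codes, none of which are covered here.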
Extensive mass spectrometry-based analysis of the fission yeast proteome: the Schizosaccharomyces pombe PeptideAtlas
We report a high quality and system-wide proteome catalogue covering 71% (3,542 proteins) of the predicted genes of fission yeast, Schizosaccharomyces pombe, presenting the largest protein dataset to date for this important model organism. We obtained this high proteome and peptide (11.4 peptides/protein) coverage by a combination of extensive sample fractionation, high resolution Orbitrap mass spectrometry, and combined database searching using the iProphet software as part of the Trans-Proteomics Pipeline. All raw and processed data are made accessible in the S. pombe PeptideAtlas. The identified proteins showed no biases in functional properties and allowed global estimation of protein abundances. The high coverage of the PeptideAtlas allowed correlation with transcriptomic data in a system-wide manner indicating that post-transcriptional processes control the levels of at least half of all identified proteins. Interestingly, the correlation was not equally tight for all functional categories ranging from r(s) >0.80 for proteins involved in translation to r(s) <0.45 for signal transduction proteins. Moreover, many proteins involved in DNA damage repair could not be detected in the PeptideAtlas despite their high mRNA levels, strengthening the translation-on-demand hypothesis for members of this protein class. In summary, the extensive and publicly available S. pombe PeptideAtlas together with the generated proteotypic peptide spectral library will be a useful resource for future targeted, in-depth, and quantitative proteomic studies on this microorganism
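The r(s) values quoted above are Spearman rank correlations. As a minimal sketch of how such a coefficient is computed between paired protein and mRNA abundance estimates, the code below ranks each list (averaging ranks for ties) and takes the Pearson correlation of the ranks. The function names and the toy numbers are assumptions for illustration and are not data from the study.

```python
# Minimal sketch of a Spearman rank correlation; the example values are
# invented and do not come from the S. pombe PeptideAtlas.

def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman r_s: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


# Hypothetical abundances for five genes (arbitrary units).
mrna = [12.0, 3.5, 40.1, 7.2, 18.9]
protein = [150.0, 20.0, 900.0, 310.0, 95.0]
print(spearman(mrna, protein))
```

Because only ranks enter the calculation, the coefficient is insensitive to the different scales and non-linear relationships typical of transcript and protein measurements.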
PRED-CLASS: cascading neural networks for generalized protein classification and genome-wide applications
A cascading system of hierarchical, artificial neural networks (named
PRED-CLASS) is presented for the generalized classification of proteins into
four distinct classes (transmembrane, fibrous, globular, and mixed) from
information solely encoded in their amino acid sequences. The architecture of
the individual component networks is kept very simple, reducing the number of
free parameters (network synaptic weights) for faster training, improved
generalization, and the avoidance of data overfitting. Capturing information
from as few as 50 protein sequences spread among the four target classes (6
transmembrane, 10 fibrous, 13 globular, and 17 mixed), PRED-CLASS was able to
obtain 371 correct predictions out of a set of 387 proteins (success rate
approximately 96%) unambiguously assigned into one of the target classes. The
application of PRED-CLASS to several test sets and complete proteomes of
several organisms demonstrates that such a method could serve as a valuable
tool in the annotation of genomic open reading frames with no functional
assignment or as a preliminary step in fold recognition and ab initio structure
prediction methods. Detailed results obtained for various data sets and
completed genomes, along with a web server running the PRED-CLASS algorithm, can
be accessed over the World Wide Web at http://o2.biol.uoa.gr/PRED-CLAS
Multiplierz: An Extensible API Based Desktop Environment for Proteomics Data Analysis
BACKGROUND. Efficient analysis of results from mass spectrometry-based proteomics experiments requires access to disparate data types, including native mass spectrometry files, output from algorithms that assign peptide sequence to MS/MS spectra, and annotation for proteins and pathways from various database sources. Moreover, proteomics technologies and experimental methods are not yet standardized; hence a high degree of flexibility is necessary for efficient support of high- and low-throughput data analytic tasks. Development of a desktop environment that is sufficiently robust for deployment in data analytic pipelines, and simultaneously supports customization for programmers and non-programmers alike, has proven to be a significant challenge. RESULTS. We describe multiplierz, a flexible and open-source desktop environment for comprehensive proteomics data analysis. We use this framework to expose a prototype version of our recently proposed common API (mzAPI) designed for direct access to proprietary mass spectrometry files. In addition to routine data analytic tasks, multiplierz supports generation of information rich, portable spreadsheet-based reports. Moreover, multiplierz is designed around a "zero infrastructure" philosophy, meaning that it can be deployed by end users with little or no system administration support. Finally, access to multiplierz functionality is provided via high-level Python scripts, resulting in a fully extensible data analytic environment for rapid development of custom algorithms and deployment of high-throughput data pipelines. CONCLUSION. Collectively, mzAPI and multiplierz facilitate a wide range of data analysis tasks, spanning technology development to biological annotation, for mass spectrometry-based proteomics research.
Dana-Farber Cancer Institute; National Human Genome Research Institute (P50HG004233); National Science Foundation Integrative Graduate Education and Research Traineeship grant (DGE-0654108
Investigating hookworm genomes by comparative analysis of two Ancylostoma species
Background
Hookworms, which infect over one billion people, are the major human parasites most closely related to the model nematode Caenorhabditis elegans. Applying genomics techniques to these species, we analyzed 3,840 and 3,149 genes from Ancylostoma caninum and A. ceylanicum, respectively.
Results
Transcripts originated from libraries representing infective L3 larva, stimulated L3, arrested L3, and adults. Most genes are represented in single stages, including abundant transcripts like hsp-20 in infective L3 and vit-3 in adults. Over 80% of the genes have homologs in C. elegans, and nearly 30% of these homologs have observable RNA interference phenotypes. Homologies were identified to nematode-specific and clade V specific gene families. To study the evolution of hookworm genes, 574 A. caninum / A. ceylanicum orthologs were identified, all of which were found to be under purifying selection, with a distribution of ratios of nonsynonymous to synonymous amino acid substitutions similar to that reported for C. elegans / C. briggsae orthologs. The phylogenetic distance between A. caninum and A. ceylanicum is almost identical to that for C. elegans / C. briggsae.
Conclusion
The genes discovered should substantially accelerate research toward better understanding of the parasites' basic biology as well as new therapies including vaccines and novel anthelmintics
Human resource management and learning for innovation: pharmaceuticals in Mexico
This paper investigates the influence of human resource management on learning from internal and external sources of knowledge. Learning for innovation is a key ingredient of catching-up processes. The analysis builds on survey data about pharmaceutical firms in Mexico. Results show that the influence of human resource management is contingent on the knowledge flows and innovation goals pursued by the firm. Practices such as training, particularly from external partners, and remuneration for performance are conducive to learning for innovation.
Keywords: Learning, R&D, human resource management, pharmaceuticals, Mexico
Uneven distribution of cobamide biosynthesis and dependence in bacteria predicted by comparative genomics.
The vitamin B12 family of cofactors known as cobamides are essential for a variety of microbial metabolisms. We used comparative genomics of 11,000 bacterial species to analyze the extent and distribution of cobamide production and use across bacteria. We find that 86% of bacteria in this data set have at least one of 15 cobamide-dependent enzyme families, but only 37% are predicted to synthesize cobamides de novo. The distribution of cobamide biosynthesis and use varies at the phylum level. While 57% of Actinobacteria are predicted to biosynthesize cobamides, only 0.6% of Bacteroidetes have the complete pathway, yet 96% of species in this phylum have cobamide-dependent enzymes. The form of cobamide produced by the bacteria could be predicted for 58% of cobamide-producing species, based on the presence of signature lower ligand biosynthesis and attachment genes. Our predictions also revealed that 17% of bacteria have partial biosynthetic pathways, yet have the potential to salvage cobamide precursors. Bacteria with a partial cobamide biosynthesis pathway include those in a newly defined, experimentally verified category of bacteria lacking the first step in the biosynthesis pathway. These predictions highlight the importance of cobamide and cobamide precursor salvaging as examples of nutritional dependencies in bacteria.
Adapting Real Quantifier Elimination Methods for Conflict Set Computation
The satisfiability problem in real closed fields is decidable. In the context
of satisfiability modulo theories, the problem restricted to conjunctive sets
of literals, that is, sets of polynomial constraints, is of particular
importance. One of the central problems is the computation of good explanations
of the unsatisfiability of such sets, i.e., obtaining a small subset of the
input constraints whose conjunction is already unsatisfiable. We adapt two
commonly used real quantifier elimination methods, cylindrical algebraic
decomposition and virtual substitution, to provide such conflict sets and
demonstrate the performance of our method in practice
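To make the notion of a conflict set concrete, here is a minimal sketch of a generic deletion-based minimization: given an unsatisfiable list of constraints and a satisfiability oracle, it drops every constraint that is not needed for the conflict. This is a textbook technique shown for illustration, not the adapted cylindrical algebraic decomposition or virtual substitution procedures the abstract describes. The toy oracle handles only strict bounds on a single real variable; its name and the tuple encoding are assumptions of this example.

```python
# Sketch of deletion-based conflict set minimization with a toy oracle.
# This is NOT the paper's CAD / virtual substitution method.

def bounds_satisfiable(constraints):
    """Toy oracle: constraints are ("<", b) or (">", b) on one real variable x.

    The conjunction is satisfiable iff the largest strict lower bound is
    below the smallest strict upper bound.
    """
    lower = max((b for op, b in constraints if op == ">"), default=float("-inf"))
    upper = min((b for op, b in constraints if op == "<"), default=float("inf"))
    return lower < upper


def minimal_conflict_set(constraints, is_satisfiable):
    """Shrink an unsatisfiable list to a subset-minimal unsatisfiable subset."""
    if is_satisfiable(constraints):
        raise ValueError("constraints are satisfiable; no conflict set exists")
    core = list(constraints)
    i = 0
    while i < len(core):
        candidate = core[:i] + core[i + 1:]
        if is_satisfiable(candidate):
            i += 1            # core[i] is needed for the conflict; keep it
        else:
            core = candidate  # still unsatisfiable without it; drop it
    return core


# x > 0, x < 5, x > 7, x < 10: the conflict is between x < 5 and x > 7.
constraints = [(">", 0), ("<", 5), (">", 7), ("<", 10)]
print(minimal_conflict_set(constraints, bounds_satisfiable))
```

The loop makes one oracle call per input constraint, which is expensive when each call is a full decision procedure; extracting the explanation from the internal state of the solving method avoids these repeated calls, which is the kind of saving the adapted methods aim for.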
Toward Informative Assessment and a Culture of Evidence
Examines how campuses in the Strengthening Pre-collegiate Education in Community Colleges initiative combined traditional and innovative measures of student performance such as "think-aloud" protocol and pre-post testing to improve teaching and learning