PROTEOFORMER: deep proteome coverage through ribosome profiling and MS integration
An increasing number of studies integrate mRNA sequencing data into MS-based proteomics to complement the translation product search space. However, several factors, including extensive regulation of mRNA translation and the need for three- or six-frame translation, impede the use of mRNA-seq data for the construction of a protein sequence search database. With that in mind, we developed the PROTEOFORMER tool that automatically processes data of the recently developed ribosome profiling method (sequencing of ribosome-protected mRNA fragments), resulting in genome-wide visualization of ribosome occupancy. Our tool also includes a translation initiation site calling algorithm allowing the delineation of the open reading frames (ORFs) of all translation products. A complete protein synthesis-based sequence database can thus be compiled for mass spectrometry-based identification. This approach increases the overall protein identification rates by 3% and 11% (improved and new identifications) for human and mouse, respectively, and enables proteome-wide detection of 5'-extended proteoforms, upstream ORF translation and near-cognate translation start sites. The PROTEOFORMER tool is available as a stand-alone pipeline and has been implemented in the Galaxy framework for ease of use.
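The ORF delineation step mentioned in the abstract can be illustrated with a minimal sketch. This is not PROTEOFORMER's code: the function name `delineate_orf`, the 0-based `tis_index` argument, and the choice to decode the initiator codon as methionine even at a near-cognate start are assumptions made for illustration. The sketch takes a called translation initiation site on an uppercase DNA transcript and translates in frame up to the first stop codon, using the standard genetic code.

```python
# Standard genetic code, laid out in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def delineate_orf(transcript, tis_index):
    """Translate from a called initiation site to the first in-frame stop.

    `transcript` is an uppercase DNA string (A, C, G, T only) and
    `tis_index` is the 0-based position of the first base of the start codon.
    Returns the protein sequence, or None if no in-frame stop codon is found.
    """
    protein = []
    for pos in range(tis_index, len(transcript) - 2, 3):
        amino_acid = CODON_TABLE[transcript[pos:pos + 3]]
        if amino_acid == "*":
            return "".join(protein)
        # Assumption: the initiator codon is decoded as Met, including at
        # near-cognate start sites such as CTG.
        protein.append("M" if pos == tis_index else amino_acid)
    return None


# Canonical ATG start, and a near-cognate CTG start at position 2.
print(delineate_orf("ATGGCTTAA", 0))
print(delineate_orf("GGCTGAAATTTTAGCC", 2))
```

Collecting the returned sequences for every called initiation site would give a search database of the kind the abstract describes, without the need for three- or six-frame translation.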
Methods to study splicing from high-throughput RNA Sequencing data
The development of novel high-throughput sequencing (HTS) methods for RNA
(RNA-Seq) has provided a very powerful means to study splicing under multiple
conditions at unprecedented depth. However, the complexity of the information
to be analyzed has turned this into a challenging task. In the last few years,
a plethora of tools have been developed, allowing researchers to process
RNA-Seq data to study the expression of isoforms and splicing events, and their
relative changes under different conditions. We provide an overview of the
methods available to study splicing from short RNA-Seq data. We group the
methods according to the different questions they address: 1) Assignment of the
sequencing reads to their likely gene of origin. This is addressed by methods
that map reads to the genome and/or to the available gene annotations. 2)
Recovering the sequence of splicing events and isoforms. This is addressed by
transcript reconstruction and de novo assembly methods. 3) Quantification of
events and isoforms. Either after reconstructing transcripts or using an
annotation, many methods estimate the expression level or the relative usage of
isoforms and/or events. 4) Providing an isoform or event view of differential
splicing or expression. These include methods that compare relative
event/isoform abundance or isoform expression across two or more conditions. 5)
Visualizing splicing regulation. Various tools facilitate the visualization of
the RNA-Seq data in the context of alternative splicing. In this review, we do
not describe the specific mathematical models behind each method. Our aim is
rather to provide an overview that could serve as an entry point for users who
need to decide on a suitable tool for a specific analysis. We also attempt to
propose a classification of the tools according to the operations they perform, to
facilitate the comparison and choice of methods. Comment: 31 pages, 1 figure, 9 tables. Small corrections added
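As an illustration of questions 3 and 4 above (relative usage of an event and its change across conditions), the following minimal sketch computes the widely used "percent spliced in" (PSI) value from read counts. The review itself does not prescribe this measure; the function names and the raw-count inputs are assumptions, and real tools typically also correct for effective lengths and model count uncertainty.

```python
def percent_spliced_in(inclusion_reads, exclusion_reads):
    """Fraction of reads supporting inclusion of an event (PSI, between 0 and 1).

    Returns None when no reads cover the event, since PSI is undefined there.
    """
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None
    return inclusion_reads / total


def delta_psi(condition_a, condition_b):
    """Change in PSI from condition A to condition B.

    Each argument is an (inclusion_reads, exclusion_reads) pair.
    Returns None if PSI is undefined in either condition.
    """
    psi_a = percent_spliced_in(*condition_a)
    psi_b = percent_spliced_in(*condition_b)
    if psi_a is None or psi_b is None:
        return None
    return psi_b - psi_a


# Hypothetical counts: the event is mostly included in A, mostly skipped in B.
print(percent_spliced_in(30, 10))
print(delta_psi((30, 10), (10, 30)))
```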
Detecting Repetitions and Periodicities in Proteins by Tiling the Structural Space
The notion of energy landscapes provides conceptual tools for understanding
the complexities of protein folding and function. Energy Landscape Theory
indicates that it is much easier to find sequences that satisfy the "Principle
of Minimal Frustration" when the folded structure is symmetric (Wolynes, P. G.
Symmetry and the Energy Landscapes of Biomolecules. Proc. Natl. Acad. Sci.
U.S.A. 1996, 93, 14249-14255). Similarly, repeats and structural mosaics may be
fundamentally related to landscapes with multiple embedded funnels. Here we
present analytical tools to detect and compare structural repetitions in
protein molecules. By an exhaustive analysis of the distribution of structural
repeats using a robust metric, we define those portions of a protein molecule
that best describe the overall structure as a tessellation of basic units. The
patterns produced by such tessellations provide intuitive representations of
the repeating regions and their association into higher-order arrangements.
We find that some protein architectures can be described as nearly periodic,
while in others clear separations between repetitions exist. Since the method
is independent of amino acid sequence information, we can identify structural
units that can be encoded by a variety of distinct amino acid sequences.
A Metric for genus-zero surfaces
We present a new method to compare the shapes of genus-zero surfaces. We
introduce a measure of mutual stretching, the symmetric distortion energy, and
establish the existence of a conformal diffeomorphism between any two
genus-zero surfaces that minimizes this energy. We then prove that the energies
of the minimizing diffeomorphisms give a metric on the space of genus-zero
Riemannian surfaces. This metric and the corresponding optimal diffeomorphisms
are shown to have properties that are highly desirable for applications. Comment: 33 pages, 8 figures
Pycortex: an interactive surface visualizer for fMRI.
Surface visualizations of fMRI provide a comprehensive view of cortical activity. However, surface visualizations are difficult to generate, and most common visualization techniques rely on unnecessary interpolation, which limits the fidelity of the resulting maps. Furthermore, it is difficult to understand the relationship between flattened cortical surfaces and the underlying 3D anatomy using currently available tools. To address these problems, we have developed pycortex, a Python toolbox for interactive surface mapping and visualization. Pycortex exploits the power of modern graphics cards to sample volumetric data on a per-pixel basis, allowing dense and accurate mapping of the voxel grid across the surface. Anatomical and functional information can be projected onto the cortical surface. The surface can be inflated and flattened interactively, aiding interpretation of the correspondence between the anatomical surface and the flattened cortical sheet. The output of pycortex can be viewed using WebGL, a technology compatible with modern web browsers. This allows complex fMRI surface maps to be distributed broadly online without requiring installation of complex software.