42 research outputs found

    Estimation of alternative splicing isoform frequencies from RNA-Seq data

    Background: Massively parallel whole transcriptome sequencing, commonly referred to as RNA-Seq, is quickly becoming the technology of choice for gene expression profiling. However, due to the short read length delivered by current sequencing technologies, estimation of expression levels for alternative splicing gene isoforms remains challenging. Results: In this paper we present a novel expectation-maximization algorithm for inference of isoform- and gene-specific expression levels from RNA-Seq data. Our algorithm, referred to as IsoEM, is based on disambiguating information provided by the distribution of insert sizes generated during sequencing library preparation, and takes advantage of base quality scores, strand and read pairing information when available. The open source Java implementation of IsoEM is freely available at http://dna.engr.uconn.edu/software/IsoEM/. Conclusions: Empirical experiments on both synthetic and real RNA-Seq datasets show that IsoEM has scalable running time and outperforms existing methods of isoform and gene expression level estimation. Simulation experiments confirm previous findings that, for a fixed sequencing cost, using reads longer than 25-36 bases does not necessarily lead to better accuracy for estimating expression levels of annotated isoforms and genes.

    A Comparison of Different Approaches to Unravel the Latent Structure within Metabolic Syndrome

    Background: Exploratory factor analysis is a commonly used statistical technique in metabolic syndrome research to uncover latent structure amongst metabolic variables. The application of factor analysis requires methodological decisions that reflect the hypothesis of the metabolic syndrome construct. These decisions often make the output harder to interpret. We propose two alternative techniques developed from cluster analysis which can achieve a clinically relevant structure, whilst maintaining the intuitive advantages of clustering methodology. Methods: Two advanced clustering techniques, the VARCLUS and matroid methods, are discussed and implemented on a metabolic syndrome data set to analyze the structure of ten metabolic risk factors. The subjects were selected from the Normative Aging Study based in Boston, Massachusetts. The sample included a total of 847 men aged between 21 and 81 years who provided complete data on selected risk factors during the period 1987 to 1991. Results: Four core components were identified by the clustering methods. These are labelled obesity, lipids, insulin resistance and blood pressure. The exploratory factor analysis with oblique rotation suggested an overlap of the loadings identified on the insulin resistance and obesity factors. The VARCLUS and matroid analyses separated these components and were able to demonstrate associations between individual risk factors. Conclusions: An oblique rotation can be selected to reflect the clinical concept of a single underlying syndrome, howeve

    Unbiased Reconstruction of a Mammalian Transcriptional Network Mediating Pathogen Responses

    Models of mammalian regulatory networks controlling gene expression have been inferred from genomic data but have largely not been validated. We present an unbiased strategy to systematically perturb candidate regulators and monitor cellular transcriptional responses. We applied this approach to derive regulatory networks that control the transcriptional response of mouse primary dendritic cells to pathogens. Our approach revealed the regulatory functions of 125 transcription factors, chromatin modifiers, and RNA binding proteins, which enabled the construction of a network model consisting of 24 core regulators and 76 fine-tuners that help to explain how pathogen-sensing pathways achieve specificity. This study establishes a broadly applicable, comprehensive, and unbiased approach to reveal the wiring and functions of a regulatory network controlling a major transcriptional response in primary mammalian cells

    Rapid De Novo Evolution of X Chromosome Dosage Compensation in Silene latifolia, a Plant with Young Sex Chromosomes

    Evidence for dosage compensation in Silene latifolia, a plant with 10-million-year-old sex chromosomes, reveals that dosage compensation can evolve rapidly in young XY systems and is not an animal-specific phenomenon

    Proteolytic enzyme-immobilization techniques for MS-based protein analysis

    Protein digestion utilizing proteases (e.g., trypsin, Lys C and other proteolytic enzymes) is one of the key sample-preparation steps in contemporary proteomics, followed by liquid chromatography coupled to mass spectrometry (MS). Tryptic digestion is traditionally performed in aqueous solutions, usually applying the enzyme and the sample in a 50:1 protein-to-protease ratio. Long digestion times (up to 24 h), auto-digestion sub-products and poor enzyme-to-substrate ratio are common issues with liquid-phase protein-digestion processes. The use of enzymes immobilized onto solid supports can minimize these problems by increasing enzyme-to-substrate ratios, significantly speeding up digestion times and reducing autolysis. The other main goal of protease immobilization is to obtain rugged, efficient enzyme reactors. In this article, we review the most important proteolytic enzyme-immobilization techniques with the main emphasis on fabrication of trypsin microreactors and nanoreactors and their utilization in bottom-up proteomics. We also discuss data reportedly obtained using the various immobilization protocols with respect to enzyme activity and MS-sequence coverage

    Boronic acid–lectin affinity chromatography. 1. Simultaneous glycoprotein binding with selective or combined elution

    We introduce a novel combination of boronic acid affinity chromatography with lectin affinity chromatography, dubbed boronic acid–lectin affinity chromatography (BLAC). Concanavalin A and wheat germ agglutinin lectins were mixed with the pseudo-lectin boronic acid to form the BLAC affinity column, and their performance was evaluated with standard glycoproteins. Optimization of the binding and elution buffers for the BLAC system is described. The BLAC columns were employed to isolate glycoproteins of interest using selective and/or combined elution

    Joint Regularization

    We present a principled method to combine kernels under joint regularization constraints. Central to our method is an extension of the representer theorem for handling multiple joint regularization constraints. Experimental evidence shows the feasibility of our approach