61 research outputs found

    Bridging the gap between protein-tyrosine phosphorylation networks, metabolism and physiology in liver-specific PTP1b deletion mice

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Computational and Systems Biology Program, 2012. Cataloged from PDF version of thesis. Includes bibliographical references. Metabolic syndrome describes a complex set of obesity-related disorders that enhance diabetes, cardiovascular, and mortality risk. Studies of liver-specific protein-tyrosine phosphatase 1b (PTP1b) deletion mice (L-PTP1b-/-) suggest that hepatic PTP1b inhibition would mitigate metabolic syndrome progression through amelioration of hepatic insulin resistance, endoplasmic reticulum stress, and whole-body lipid metabolism. However, the network alterations underlying these phenotypes are poorly understood. Mass spectrometry was used to quantitatively discover protein phosphotyrosine network changes in L-PTP1b-/- mice relative to control mice under both normal and high-fat diet conditions. A phosphosite set enrichment analysis was developed to identify numerous pathways exhibiting PTP1b- and diet-dependent phosphotyrosine regulation. Detection of PTP1b-dependent phosphotyrosine sites on lipid metabolic proteins initiated global lipidomics characterization of corresponding liver samples and revealed altered fatty acid and triglyceride metabolism in L-PTP1b-/- mice. Multivariate modeling techniques were developed to infer molecular dependencies between phosphosites and lipid metabolic changes, resulting in quantitatively predictive phenotypic models. By Emily R. Miraldi. Ph.D.

    Hybrid Gene Origination Creates Human-Virus Chimeric Proteins during Infection

    RNA viruses are a major human health threat. The life cycles of many highly pathogenic RNA viruses like influenza A virus (IAV) and Lassa virus depend on host mRNA, because viral polymerases cleave 5′-m7G-capped host transcripts to prime viral mRNA synthesis (“cap-snatching”). We hypothesized that start codons within cap-snatched host transcripts could generate chimeric human-viral mRNAs with coding potential. We report the existence of this mechanism of gene origination, which we named “start-snatching.” Depending on the reading frame, start-snatching allows the translation of host and viral “untranslated regions” (UTRs) to create N-terminally extended viral proteins or entirely novel polypeptides by genetic overprinting. We show that both types of chimeric proteins are made in IAV-infected cells, generate T cell responses, and contribute to virulence. Our results indicate that during infection with IAV, and likely a multitude of other human, animal and plant viruses, a host-dependent mechanism allows the genesis of hybrid genes.

    Multi-study inference of regulatory networks for more accurate models of gene regulation.

    Gene regulatory networks are composed of sub-networks that are often shared across biological processes, cell-types, and organisms. Leveraging multiple sources of information, such as publicly available gene expression datasets, could therefore be helpful when learning a network of interest. Integrating data across different studies, however, raises numerous technical concerns. Hence, a common approach in network inference, and broadly in genomics research, is to separately learn models from each dataset and combine the results. Individual models, however, often suffer from under-sampling, poor generalization and limited network recovery. In this study, we explore previous integration strategies, such as batch-correction and model ensembles, and introduce a new multitask learning approach for joint network inference across several datasets. Our method initially estimates the activities of transcription factors, and subsequently, infers the relevant network topology. As regulatory interactions are context-dependent, we estimate model coefficients as a combination of both dataset-specific and conserved components. In addition, adaptive penalties may be used to favor models that include interactions derived from multiple sources of prior knowledge including orthogonal genomics experiments. We evaluate generalization and network recovery using examples from Bacillus subtilis and Saccharomyces cerevisiae, and show that sharing information across models improves network reconstruction. Finally, we demonstrate robustness to both false positives in the prior information and heterogeneity among datasets

    Analysis of 3D genomic interactions identifies candidate host genes that transposable elements potentially regulate

    Background: The organization of chromatin in the nucleus plays an essential role in gene regulation. About half of the mammalian genome comprises transposable elements. Given their repetitive nature, reads associated with these elements are generally discarded or randomly distributed among elements of the same type in genome-wide analyses. Thus, it is challenging to identify the activities and properties of individual transposons. As a result, we only have a partial understanding of how transposons contribute to chromatin folding and how they impact gene regulation. Results: Using PCR and Capture-based chromosome conformation capture (3C) approaches, collectively called 4Tran, we take advantage of the repetitive nature of transposons to capture interactions from multiple copies of endogenous retroviruses (ERVs) in the human and mouse genomes. With 4Tran-PCR, reads are selectively mapped to unique regions in the genome. This enables the identification of transposable element interaction profiles for individual ERV families and integration events specific to particular genomes. With this approach, we demonstrate that transposons engage in long-range intra-chromosomal interactions guided by the separation of chromosomes into A and B compartments as well as topologically associated domains (TADs). In contrast to 4Tran-PCR, Capture-4Tran can uniquely identify both ends of an interaction that involves retroviral repeat sequences, providing a powerful tool for uncovering the individual transposable element insertions that interact with and potentially regulate target genes. Conclusions: 4Tran provides new insight into the manner in which transposons contribute to chromosome architecture and identifies target genes that transposable elements can potentially control.

    4C-ker: A Method to Reproducibly Identify Genome-Wide Interactions Captured by 4C-Seq Experiments.

    4C-Seq has proven to be a powerful technique to identify genome-wide interactions with a single locus of interest (or "bait") that can be important for gene regulation. However, analysis of 4C-Seq data is complicated by the many biases inherent to the technique. An important consideration when dealing with 4C-Seq data is the differences in resolution of signal across the genome that result from differences in 3D distance separation from the bait. This leads to the highest signal in the region immediately surrounding the bait and increasingly lower signals in far-cis and trans. Another important aspect of 4C-Seq experiments is the resolution, which is greatly influenced by the choice of restriction enzyme and the frequency at which it can cut the genome. Thus, it is important that a 4C-Seq analysis method is flexible enough to analyze data generated using different enzymes and to identify interactions across the entire genome. Current methods for 4C-Seq analysis only identify interactions in regions near the bait or in regions located in far-cis and trans, but no method comprehensively analyzes 4C signals of different length scales. In addition, some methods also fail in experiments where chromatin fragments are generated using frequent cutter restriction enzymes. Here, we describe 4C-ker, a Hidden-Markov Model based pipeline that identifies regions throughout the genome that interact with the 4C bait locus. In addition, we incorporate methods for the identification of differential interactions in multiple 4C-seq datasets collected from different genotypes or experimental conditions. Adaptive window sizes are used to correct for differences in signal coverage in near-bait regions, far-cis and trans chromosomes. Using several datasets, we demonstrate that 4C-ker outperforms all existing 4C-Seq pipelines in its ability to reproducibly identify interaction domains at all genomic ranges with different resolution enzymes

    Sparse and Compositionally Robust Inference of Microbial Ecological Networks

    16S ribosomal RNA (rRNA) gene and other environmental sequencing techniques provide snapshots of microbial communities, revealing phylogeny and the abundances of microbial populations across diverse ecosystems. While changes in microbial community structure are demonstrably associated with certain environmental conditions (from metabolic and immunological health in mammals to ecological stability in soils and oceans), identification of underlying mechanisms requires new statistical tools, as these datasets present several technical challenges. First, the abundances of microbial operational taxonomic units (OTUs) from amplicon-based datasets are compositional. Counts are normalized to the total number of counts in the sample. Thus, microbial abundances are not independent, and traditional statistical metrics (e.g., correlation) for the detection of OTU-OTU relationships can lead to spurious results. Second, microbial sequencing-based studies typically measure hundreds of OTUs on only tens to hundreds of samples; thus, inference of OTU-OTU association networks is severely under-powered, and additional information (or assumptions) is required for accurate inference. Here, we present SPIEC-EASI (SParse InversE Covariance Estimation for Ecological Association Inference), a statistical method for the inference of microbial ecological networks from amplicon sequencing datasets that addresses both of these issues. SPIEC-EASI combines data transformations developed for compositional data analysis with a graphical model inference framework that assumes the underlying ecological association network is sparse. To reconstruct the network, SPIEC-EASI relies on algorithms for sparse neighborhood and inverse covariance selection. To provide a synthetic benchmark in the absence of an experimentally validated gold-standard network, SPIEC-EASI is accompanied by a set of computational tools to generate OTU count data from a set of diverse underlying network topologies. SPIEC-EASI outperforms state-of-the-art methods to recover edges and network properties on synthetic data under a variety of scenarios. SPIEC-EASI also reproducibly predicts previously unknown microbial associations using data from the American Gut project.
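
The compositional issue described above can be made concrete with the centered log-ratio (CLR) transform, a standard transformation from compositional data analysis. The following is a minimal sketch, not the SPIEC-EASI implementation: the function name `clr` and the `pseudocount` parameter are assumptions chosen for illustration, and adding a pseudocount is only one possible way to handle zero counts.

```python
import math


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of one sample's OTU counts.

    Each value becomes log(count) minus the mean of the logs, so the
    result no longer depends on the sample's total read count.
    """
    # Assumption: a pseudocount keeps zero counts from producing log(0).
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical OTU counts for a single sample.
sample = [120, 30, 0, 850]
transformed = clr(sample)
```

With `pseudocount=0` and strictly positive counts, multiplying every count by the same factor leaves the output unchanged, which is the scale invariance that makes the transform suitable for data normalized to a total count.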

    Alveolar epithelial progenitor cells require Nkx2-1 to maintain progenitor-specific epigenomic state during lung homeostasis and regeneration

    Lung epithelial regeneration after acute injury requires cellular coordination to pattern the morphologically complex alveolar gas exchange surface. During adult lung regeneration, Wnt-responsive alveolar epithelial progenitor (AEP) cells, a subset of alveolar type 2 (AT2) cells, proliferate and transition to alveolar type 1 (AT1) cells. Here, we report a refined primary murine alveolar organoid, which recapitulates critical aspects of in vivo regeneration. Paired scRNAseq and scATACseq followed by transcriptional regulatory network (TRN) analysis identified two AT1 transition states driven by distinct regulatory networks controlled in part by differential activity of Nkx2-1. Genetic ablation of Nkx2-1 in AEP-derived organoids was sufficient to cause transition to a proliferative stressed Krt8+ state, and AEP-specific deletion of Nkx2-1 in adult mice led to rapid loss of progenitor state and uncontrolled growth of Krt8+ cells. Together, these data implicate dynamic epigenetic maintenance via Nkx2-1 as central to the control of facultative progenitor activity in AEPs.

    Workflow of the SPIEC-EASI pipeline.

    The SPIEC-EASI pipeline consists of two independent parts for a) synthetic data generation and b) network inference. a) Synthetic data generation requires an OTU count table and a user-selected network topology. Internally, the parameters of a statistical distribution (the zero-inflated negative binomial model is suggested) are fit to the OTU marginals of the real data, and are combined with the randomly generated network in the Normal to Anything (NORTA) approach to generate correlated count data. b) Network inference proceeds in three stages on synthetic or real OTU count data: First, data are pre-processed and centered log-ratio (CLR) transformed to ensure compositional robustness. Next, the user selects one of two graphical model inference procedures: 1) neighborhood selection (the MB method) or 2) inverse covariance selection (the glasso method). SPIEC-EASI network inference assumes that the underlying network is sparse. We infer the correct model sparseness by the Stability Approach to Regularization Selection (StARS), which involves random subsampling of the dataset to find a network with low variability in the selected set of edges. SPIEC-EASI outputs include an ecological network (from the non-zero entries of the inverse covariance network) and an invertible covariance matrix. If the network was inferred from synthetic data, it can be compared with the input network to assess inference quality.
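
The subsampling idea behind StARS can be sketched as a measure of how much the selected edge set varies across subsamples. The code below is an illustrative sketch under stated assumptions, not the pipeline's own code: it assumes the graphs have already been estimated on each subsample by some method, and the function name `edge_instability` and the representation of each graph as a set of `(i, j)` node pairs with `i < j` are hypothetical.

```python
from itertools import combinations


def edge_instability(edge_sets, n_nodes):
    """Average edge variability across graphs estimated from subsamples.

    For each node pair, theta is the fraction of subsampled graphs that
    contain the edge; 2 * theta * (1 - theta) is zero when every
    subsample agrees and largest (0.5) when half of them include it.
    """
    n_graphs = len(edge_sets)
    n_pairs = n_nodes * (n_nodes - 1) // 2
    total = 0.0
    for pair in combinations(range(n_nodes), 2):
        theta = sum(pair in edges for edges in edge_sets) / n_graphs
        total += 2.0 * theta * (1.0 - theta)
    return total / n_pairs


# Hypothetical edge sets from three subsamples of a 4-node problem.
graphs = [{(0, 1), (2, 3)}, {(0, 1)}, {(0, 1), (1, 2)}]
score = edge_instability(graphs, n_nodes=4)
```

A selection rule would then compare this score across regularization levels and keep a level whose score stays below a chosen threshold; the threshold value is a tuning choice that the text above does not specify.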

    a) Bivariate illustration of the NorTA approach.

    First, normal data incorporating the target correlation structure are generated. Uniform data are then generated for each margin via the normal cumulative distribution function. These are then converted to an arbitrary marginal distribution (Poisson and zero-inflated negative binomial shown as examples) via its quantile function. To generate realistic synthetic data, parameters for these margins are fit to real data. b) Examples of band-like, cluster, and scale-free network topologies.
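
The three steps in this caption can be sketched for the bivariate Poisson case using only the standard library. This is a hedged illustration, not the authors' generator: the function names, the Poisson rates, and the correlation value are assumptions, and the correlation of the resulting counts will generally differ somewhat from the `rho` applied to the normal data.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def poisson_quantile(u, lam):
    """Smallest k with P(X <= k) >= u for X ~ Poisson(lam)."""
    u = min(u, 1.0 - 1e-12)  # guard against u == 1.0, which has no finite quantile
    k = 0
    pmf = math.exp(-lam)
    cdf = pmf
    while cdf < u:
        k += 1
        pmf *= lam / k
        cdf += pmf
    return k


def norta_pair(rho, lam1, lam2, n, seed=0):
    """Draw n correlated Poisson pairs via normal -> uniform -> Poisson."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Step 1: correlated standard normal pair with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Step 2: map each margin to the unit interval.
        u1, u2 = normal_cdf(z1), normal_cdf(z2)
        # Step 3: apply the target margin's quantile function.
        samples.append((poisson_quantile(u1, lam1), poisson_quantile(u2, lam2)))
    return samples


# Hypothetical parameters chosen only for illustration.
pairs = norta_pair(rho=0.8, lam1=3.0, lam2=10.0, n=2000)
```

Swapping `poisson_quantile` for the quantile function of another distribution, such as a zero-inflated negative binomial fit to real data, changes the margins without touching the first two steps.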