58 research outputs found

    The Escherichia coli transcriptome mostly consists of independently regulated modules

    Underlying cellular responses is a transcriptional regulatory network (TRN) that modulates gene expression. A useful description of the TRN would decompose the transcriptome into targeted effects of individual transcriptional regulators. Here, we apply unsupervised machine learning to a diverse compendium of over 250 high-quality Escherichia coli RNA-seq datasets to identify 92 statistically independent signals that modulate the expression of specific gene sets. We show that 61 of these transcriptomic signals represent the effects of currently characterized transcriptional regulators. Condition-specific activation of signals is validated by exposure of E. coli to new environmental conditions. The resulting decomposition of the transcriptome provides: a mechanistic, systems-level, network-based explanation of responses to environmental and genetic perturbations; a guide to gene and regulator function discovery; and a basis for characterizing transcriptomic differences in multiple strains. Taken together, our results show that signal summation describes the composition of a model prokaryotic transcriptome
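
The "signal summation" idea in the abstract above can be illustrated with a toy linear model, in which a sample's expression profile is the sum of gene-weight vectors (signals), each scaled by a condition-specific activity. This is only a minimal sketch assuming a linear mixing model; the abstract does not name the decomposition algorithm, and the function name, gene names, weights, and activities below are hypothetical values chosen for illustration, not data from the study.

```python
def reconstruct_expression(signals, activities):
    """Return a gene -> expression mapping built by summing each signal's
    gene weights multiplied by that signal's activity in one sample."""
    genes = sorted({gene for weights in signals.values() for gene in weights})
    profile = {gene: 0.0 for gene in genes}
    for name, weights in signals.items():
        activity = activities.get(name, 0.0)  # inactive signals contribute nothing
        for gene, weight in weights.items():
            profile[gene] += activity * weight
    return profile


# Hypothetical signals: each maps a small gene set to weights.
signals = {
    "signal_A": {"gene1": 1.0, "gene2": 0.5},
    "signal_B": {"gene2": -1.0, "gene3": 2.0},
}
# Hypothetical activities of each signal in a single sample.
activities = {"signal_A": 2.0, "signal_B": 0.5}

profile = reconstruct_expression(signals, activities)
print(profile)
```

In this toy case gene2 is shared by both signals, so its modeled expression is the sum of two opposing contributions.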

    Domain Organization of Long Signal Peptides of Single-Pass Integral Membrane Proteins Reveals Multiple Functional Capacity

    Targeting signals direct proteins to their extra- or intracellular destination such as the plasma membrane or cellular organelles. Here we investigated the structure and function of exceptionally long signal peptides encompassing at least 40 amino acid residues. We discovered a two-domain organization (“NtraC model”) in many long signals from vertebrate precursor proteins. Accordingly, long signal peptides may contain an N-terminal domain (N-domain) and a C-terminal domain (C-domain) with different signal or targeting capabilities, separable by a presumably turn-rich transition area (tra). Individual domain functions were probed by cellular targeting experiments with fusion proteins containing parts of the long signal peptide of human membrane protein shrew-1 and secreted alkaline phosphatase as a reporter protein. As predicted, the N-domain of the fusion protein alone was shown to act as a mitochondrial targeting signal, whereas the C-domain alone functions as an export signal. Selective disruption of the transition area in the signal peptide impairs the export efficiency of the reporter protein. Altogether, the results of cellular targeting studies provide a proof-of-principle for our NtraC model and highlight the particular functional importance of the predicted transition area, which critically affects the rate of protein export. In conclusion, the NtraC approach enables the systematic detection and prediction of cryptic targeting signals present in one coherent sequence, and provides a structurally motivated basis for decoding the functional complexity of long protein targeting signals

    Large-Scale Discovery and Characterization of Protein Regulatory Motifs in Eukaryotes

    The increasing ability to generate large-scale, quantitative proteomic data has brought with it the challenge of analyzing such data to discover the sequence elements that underlie systems-level protein behavior. Here we show that short, linear protein motifs can be efficiently recovered from proteome-scale datasets such as sub-cellular localization, molecular function, half-life, and protein abundance data using an information theoretic approach. Using this approach, we have identified many known protein motifs, such as phosphorylation sites and localization signals, and discovered a large number of candidate elements. We estimate that ∼80% of these are novel predictions in that they do not match a known motif in both sequence and biological context, suggesting that post-translational regulation of protein behavior is still largely unexplored. These predicted motifs, many of which display preferential association with specific biological pathways and non-random positioning in the linear protein sequence, provide focused hypotheses for experimental validation
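
One simple way to picture an information-theoretic motif score is the mutual information between "motif present in the sequence" and a protein-level annotation such as localization. The sketch below is an assumption-laden illustration, not the authors' method: the function names are hypothetical, the sequences are invented toy strings, and only the KDEL tetrapeptide is a real, well-known ER retention signal.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written with raw counts to avoid extra divisions
        mi += p_xy * math.log2(p_xy * n * n / (count_x[x] * count_y[y]))
    return mi


def motif_information(sequences, labels, motif):
    """Score a motif by how much its presence tells us about the labels."""
    presence = [motif in seq for seq in sequences]
    return mutual_information(presence, labels)


# Toy data (invented): two sequences carry KDEL and are labeled "ER".
sequences = ["MAAGLKDEL", "MSTPVKDEL", "MAAGLQRST", "MSTPVGHWA"]
labels = ["ER", "ER", "other", "other"]

informative = motif_information(sequences, labels, "KDEL")
uninformative = motif_information(sequences, labels, "M")
print(informative, uninformative)
```

Here the presence of KDEL separates the two label groups perfectly, while "M" occurs in every sequence and therefore carries no information about the labels.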

    The Plasmodium Export Element Revisited

    We performed a bioinformatic analysis of protein export elements (PEXEL) in the putative proteome of the malaria parasite Plasmodium falciparum. A protein family-specific conservation of physicochemical residue profiles was found for PEXEL-flanking sequence regions. We demonstrate that the family members can be clustered based on the flanking regions only and display characteristic hydrophobicity patterns. This raises the possibility that the flanking regions may contain additional information for a family-specific role of PEXEL. We further show that signal peptide cleavage results in a positional alignment of PEXEL from both proteins with, and without, a signal peptide

    A Metabolomic Approach to the Study of Wine Micro-Oxygenation

    Wine micro-oxygenation is a globally used treatment; its effects were studied here by analysing the wine metabolomic fingerprint with untargeted LC-MS. Eight different procedural variations, defined by the addition of oxygen (four levels) and iron (two levels), were applied to Sangiovese wine, before and after malolactic fermentation

    Identification of Single- and Multiple-Class Specific Signature Genes from Gene Expression Profiles by Group Marker Index

    Informative genes from microarray data can be used to construct prediction models and investigate biological mechanisms. Differentially expressed genes, the main targets of most gene selection methods, can be classified as single- and multiple-class specific signature genes. Here, we present a novel gene selection algorithm based on a Group Marker Index (GMI), which is intuitive, of low computational complexity, and efficient at identifying both types of genes. Most gene selection methods identify only single-class specific signature genes and cannot identify multiple-class specific signature genes easily. Our algorithm can detect de novo certain conditions of multiple-class specificity of a gene and makes use of a novel non-parametric indicator to assess the discrimination ability between classes. Our method is effective even when the sample size is small as well as when the class sizes are significantly different. To compare the effectiveness and robustness, we formulate an intuitive template-based method and use four well-known datasets. We demonstrate that our algorithm outperforms the template-based method in difficult cases with unbalanced distribution. Moreover, the multiple-class specific genes are good biomarkers and play important roles in biological pathways. Our literature survey supports the finding that the proposed method identifies unique multiple-class specific marker genes (not reported earlier to be related to cancer) in the Central Nervous System data. It also discovers unique biomarkers indicating the intrinsic difference between subtypes of lung cancer. We also associate the pathway information with the multiple-class specific signature genes and cross-reference to published studies. We find that the identified genes participate in the pathways directly involved in cancer development in leukemia data. Our method offers a promising way to find genes that may be involved in pathways of multiple diseases and hence opens up the possibility of using an existing drug on other diseases as well as designing a single drug for multiple diseases

    Lateral opening of a translocon upon entry of protein suggests the mechanism of insertion into membranes

    The structure of the protein-translocating channel SecYEβ from Pyrococcus furiosus at 3.1-Å resolution suggests a mechanism for chaperoning transmembrane regions of a protein substrate during its lateral delivery into the lipid bilayer. Cytoplasmic segments of SecY orient the C-terminal α-helical region of another molecule, suggesting a general binding mode and a promiscuous guiding surface capable of accommodating diverse nascent chains at the exit of the ribosomal tunnel. To accommodate this putative nascent chain mimic, the cytoplasmic vestibule widens, and a lateral exit portal is opened throughout its entire length for partition of transmembrane helical segments to the lipid bilayer. In this primed channel, the central plug still occludes the pore while the lateral gate is opened, enabling topological arbitration during early protein insertion. In vivo, a 15 amino acid truncation of the cytoplasmic C-terminal helix of SecY fails to rescue a secY-deficient strain, supporting the essential role of this helix as suggested from the structure

    The Fas ligand intracellular domain is released by ADAM10 and SPPL2a cleavage in T-cells.

    Fas ligand (FasL) is a type II transmembrane protein belonging to the tumor necrosis factor family. Its binding to the cognate Fas receptor triggers the apoptosis that plays a pivotal role in the maintenance of immune system homeostasis. The cell death-inducing property of FasL has been associated with its extracellular domain, which can be cleaved off by metalloprotease activity to produce soluble FasL. The fate of the remaining membrane-anchored N-terminal part of the FasL molecule has not been determined. Here we show that post-translational processing of overexpressed and endogenous FasL in T-cells by the disintegrin and metalloprotease ADAM10 generates a 17-kDa N-terminal fragment, which lacks the receptor-binding extracellular domain. This FasL remnant is membrane anchored and further processed by SPPL2a, a member of the signal peptide peptidase-like family of intramembrane-cleaving proteases. SPPL2a cleavage liberates a smaller and highly unstable fragment mainly containing the intracellular FasL domain (FasL ICD). We show that this fragment translocates to the nucleus and is capable of inhibiting gene transcription. With ADAM10 and SPPL2a we have identified two proteases implicated in FasL processing and release of the FasL ICD, which has been shown to be important for retrograde FasL signaling