197 research outputs found

    Mass Spectrometry-Based Proteomics for Studying Microbial Physiology from Isolates to Communities

    With the advent of whole genome sequencing, a new era of biology was ushered in, allowing for “systems-biology” approaches to characterizing microbial systems. The field of systems biology aims to catalogue and understand all of the biological components, their functions, and all of their interactions in a living system as well as in communities of living systems. Systems biology can be considered an attempt to measure all of the components of a living system and then produce a data-driven model of the system. This model can then be used to generate hypotheses about how the system will respond to perturbations, which can be tested experimentally. The first step in the process is the determination of a microbial genome. This process has, to a large extent, been fully developed, with hundreds of microbial genome sequences completed and hundreds more being characterized at a breathtaking pace. Technologies to use this information and to further probe the functional components of microbes at a global level are currently being developed. The field of gene expression analysis at the transcript level is one example; it is now possible to simultaneously measure and compare the expression of thousands of mRNA products in a single experiment. The natural extension of these experiments is to simultaneously measure and compare the expression of all the proteins present in a microbial system. This is the field of proteomics. With the development of electrospray ionization, rapid tandem mass spectrometry and database-searching algorithms, mass spectrometry (MS) has become the leading technology in attempts to decipher proteomes. This research effort is very young and many challenges still exist. The goal of the work described here was to build a state-of-the-art robust MS-based proteomics platform for the characterization of microbial proteomes from isolates to communities. The research presented here describes the successes and challenges of this objective.
Proteome analyses of the metal-reducing bacterium Shewanella oneidensis and the metabolically versatile bacterium Rhodopseudomonas palustris are given as examples of the power of this technology to elucidate proteins important to different metabolic states at a global level. The analysis of microbial proteomes from isolates is only the first step of the challenge. In nature, microbial species do not act alone but are always found in mixtures with other species where their intricate interactions are critical for survival. These studies conclude with some of the first efforts to develop methodologies to measure proteomes of simple controlled mixtures of microbial species and then present the first attempt at measuring the proteome of a natural microbial community, a biofilm from an acid mine drainage system. This microbial system illustrates life at the extreme of nature, where life not only exists but flourishes in very acidic conditions with high metal concentrations and high temperatures. The technologies developed through these studies were applied to the first deep characterization of a microbial community proteome, the deciphering of the expressed proteome of the acid mine drainage biofilm. The research presented here has led to the development of a state-of-the-art robust proteome pipeline, which can now be applied to the proteome analysis of any microbial isolate for a sequenced species. The first steps have also been made toward developing methodologies to characterize microbial proteomes in their natural environments. These developments are key to integrating proteome technologies with genome and transcriptome technologies for global characterizations of microbial species at the systems level. This will lead to an understanding of microbial physiology from a global view where, instead of analyzing one gene or protein at a time, hundreds of genes/proteins will be interrogated in microbial species as they adapt and survive in the environment.

    Characterization of the Extracellular Proteome of a Natural Microbial Community with an Integrated Mass Spectrometric / Bioinformatic Approach

    Proteomics comprises the identification and characterization of the complete suite of expressed proteins in a given cell, organism or community. The coupling of high performance liquid chromatography (LC) with high throughput mass spectrometry (MS) has provided the foundation for current progress in proteomics. The transition from proteomic analysis of a single cultivated microbe to that of natural microbial assemblages has required significant advancement in technology and has provided greater biological understanding of microbial community diversity and function. To enhance the capabilities of a mass spectrometry-based proteomic analysis, an integrated approach combining bioinformatics with analytical preparations and experimental data collection was developed and applied. This has resulted in a deep characterization of the extracellular fraction of a community of microbes thriving in an acid mine drainage system. A notable feature of this relatively low-complexity community is that it exists in a solution that is highly acidic (pH < 1) and hot (temperature > 40°C), with molar concentrations of metals. The extracellular fraction is of particular interest due to the potential to identify and characterize novel proteins that are critical for survival and interactions with the harsh environment. The following analyses have resulted in the specific identification and characterization of novel extracellular proteins. In order to more accurately identify which proteins are present in the extracellular space, a combined computational prediction and experimental identification of the extracellular fraction was performed. Among the hundreds of proteins identified, a highly abundant novel cytochrome was targeted and ultimately characterized through high performance MS.
In order to achieve deep proteomic coverage of the extracellular fraction, a metal affinity based protein enrichment utilizing seven different metals was developed and employed, resulting in novel protein identifications. A combined top down and bottom up analysis resulted in the characterization of the intact molecular forms of extracellular proteins, including the identification of post-translational modifications. Finally, in order to determine the effectiveness of current MS methodologies, a software package was designed to characterize the > 100,000 mass spectra collected during an MS experiment, revealing that specific optimizations in the LC, MS and protein sequence database have a significant impact on proteomic depth.

    Evolution and Biological Roles of Three-Finger Toxins in Snake Venoms

    Snake venoms are complex mixtures of many enzymatic and non-enzymatic proteins, as well as small peptides. Several major venom protein superfamilies, including three-finger toxins, phospholipases A2, serine proteinases, metalloproteinases, proteinase inhibitors and lectins, are found in almost all snake venoms, from front-fanged viperids (vipers and pit vipers) and elapids (cobras, mambas, sea snakes, etc.) to rear-fanged colubrids. However, these proteins vary in abundance and functionality between species. Variation in snake venom composition is attributed to both differences in the expression levels of toxin encoding genes and occurrence of amino acid sequence polymorphisms. Documenting intraspecific venom variation has both clinical (antiserum development) and biological (predator and prey coevolution) implications. Venom is primarily a trophic adaptation and as such, the evolution and abundance of venom proteins relates directly to prey capture success and organism natural history. Without this biologically relevant perspective, proteomic and transcriptomic approaches could produce simply a list of proteins, peptides, and transcripts. It is therefore important to consider the presence and evolution of venom proteins in terms of their biological significance to the organism. Three-finger toxins (3FTx) comprise a particularly common venom protein superfamily that contributes significantly to differences in envenomation symptomology, toxicity, and overall venom composition. Three-finger toxins are non-enzymatic proteins that maintain a common molecular scaffold, and bind to different receptors/acceptors and exhibit a wide variety of biological effects. These toxins are the main lethal neurotoxins in some snake venoms and are currently the only known venom proteins associated with prey-specific toxicity. 
This dissertation has four major objectives: (i) to examine 3FTxs in front-fanged Elapidae and rear-fanged snake venoms for prey-specific toxicity, (ii) to examine differences in 3FTx expression within rear-fanged snake venom glands, (iii) to determine if mRNA transcripts obtained from crude venoms can be utilized for molecular evolutionary studies and venom proteomic studies, and (iv) to determine if a transcriptomic and proteomic integrated approach can more thoroughly characterize differences in rear-fanged snake venom composition. Three-finger toxins were isolated from the venom of the front-fanged Naja kaouthia (Family Elapidae; Monocled Cobra) and rear-fanged Spilotes (Pseustes) sulphureus (Family Colubridae; Amazon Puffing Snake) using chromatographic techniques, and toxicity assays were performed to evaluate prey specificity. Despite various 3FTxs being present in abundance within N. kaouthia venom, only one 3FTx (alpha-cobratoxin) demonstrated lethal toxicity (<5 µg/g) toward both NSA mice (Mus musculus) and House Geckos (Hemidactylus frenatus). For P. sulphureus, the most abundant 3FTx (sulmotoxin A), a heterodimeric complex, displayed prey-specific toxicity towards House Geckos, and the second most abundant 3FTx (sulmotoxin B) displayed prey-specific toxicity towards mice. This demonstrates how a relatively simple venom with toxins dominated by one venom protein superfamily (3FTxs) can still allow for the targeting of a diversity of prey. Venom gland toxin transcriptomes and crude venom transcriptomes were obtained via individual transcripts with 3' RACE (Rapid Amplification of cDNA Ends) and next-generation sequencing to evaluate the abundance, diversity, and molecular evolution of 3FTxs. Venom protein gene expression within rear-fanged snake venom glands revealed trends towards either viper-like expression, dominated by snake venom metalloproteinases, or elapid-like expression, dominated by 3FTxs.
For non-conventional 3FTx transcripts within these glands and within crude venom, approximately 32% of 3FTx amino acid sites were under positive selection, and approximately 20% of sites were functionally critical and conserved. Isolating RNA from crude venom proved to be a successful way to obtain venom protein transcripts for molecular evolutionary analyses, providing a novel approach that does not require sacrificing snakes for tissue. The use of a combined venom gland transcriptome with proteomic approaches aided in characterizing venom composition from previously unstudied rear-fanged snake venoms. This dissertation represents an important step in the incorporation of multiple high-throughput characterization methods and the addition of multiple assays to explore the biological roles of toxins, in particular 3FTxs, within these venoms.

    Needles in a haystack of protein diversity: Interrogation of complex biological samples through specialized strategies in bottom-up proteomics uncover peptides of interest for diverse applications

    Get PDF
    Peptide identification is at the core of bottom-up proteomics measurements. However, even with state-of-the-art mass spectrometric instrumentation, peptide-level information is still lost or missing in these types of experiments. Reasons behind missing peptide identifications in bottom-up proteomics include variable peptide ionization efficiencies, ion suppression effects, as well as the occurrence of chimeric spectra that can lower the efficacy of database search strategies. Peptides derived from naturally abundant proteins in a biological system also have better chances of being identified in comparison to the ones produced from less abundant proteins, at least in regular discovery-based proteomics experiments. This dissertation focused on the recovery of the “missing or hidden proteome” information in complex biological matrices by approaching this challenge under a peptide-centric view and implementing different liquid chromatography tandem mass spectrometry (LC-MS/MS) experimental workflows. In particular, the projects presented here covered: (1) the feasibility of applying a liquid chromatography-multiple reaction monitoring MS methodology for the targeted identification of peptides serving as surrogates of protein biomarkers in environmental matrices with unknown microbial diversities; (2) the evaluation of selecting unique tryptic peptides in silico that can distinguish groups of proteins, instead of individual proteins, for targeted proteomics workflows; (3) maximizing peptide identification in spectral data collected from different LC-MS/MS setups by applying a multi-peptide-spectrum-match algorithm; and (4) showing that LC-MS/MS combined with de novo assisted-database searches is a feasible strategy for the comprehensive identification of peptides derived from native proteolytic mechanisms in biological systems.
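
The in silico selection of group-distinguishing tryptic peptides mentioned in item (2) can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the toy sequences, the minimum peptide length, and the simple cleavage rule (cut after K or R unless the next residue is P, no missed cleavages) are all assumptions made for illustration.

```python
from collections import defaultdict


def tryptic_digest(sequence, min_length=6):
    """Cleave after K or R unless the next residue is P (assumed rule, no missed cleavages)."""
    peptides, start = set(), 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            if i + 1 - start >= min_length:
                peptides.add(sequence[start:i + 1])
            start = i + 1
    # Keep the C-terminal peptide if it is long enough.
    if len(sequence) - start >= min_length:
        peptides.add(sequence[start:])
    return peptides


def group_unique_peptides(proteins, groups, min_length=6):
    """Return {group: peptides found in that group and in no other group}.

    proteins: {protein_id: sequence}; groups: {protein_id: group_name}.
    A peptide shared by several proteins of the same group still counts.
    """
    peptide_groups = defaultdict(set)
    for protein_id, sequence in proteins.items():
        for peptide in tryptic_digest(sequence, min_length):
            peptide_groups[peptide].add(groups[protein_id])
    unique = defaultdict(set)
    for peptide, owners in peptide_groups.items():
        if len(owners) == 1:
            unique[next(iter(owners))].add(peptide)
    return dict(unique)


# Hypothetical toy data: p1 and p2 belong to group A, p3 to group B.
toy_proteins = {"p1": "MAAAAAKGGGGGGR", "p2": "MAAAAAKTTTTTTR", "p3": "MCCCCCKGGGGGGR"}
toy_groups = {"p1": "A", "p2": "A", "p3": "B"}
result = group_unique_peptides(toy_proteins, toy_groups)
```

In the toy data, the peptide shared by p1 and p2 is kept because it identifies group A as a whole, while the peptide shared between groups A and B is discarded.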

    Bioinformatic and Experimental Approaches for Deeper Metaproteomic Characterization of Complex Environmental Samples

    The coupling of high performance multi-dimensional liquid chromatography and tandem mass spectrometry for characterization of microbial proteins from complex environmental samples has paved the way for a new era in scientific discovery. The field of metaproteomics, which is the study of the protein suite of all the organisms in a biological system, has taken a tremendous leap with the introduction of high-throughput proteomics. However, with a corresponding increase in sample complexity, novel challenges have been raised with respect to efficient peptide separation via chromatography and bioinformatic analysis of the resulting high throughput data. In this dissertation, various aspects of metaproteomic characterization, including experimental and computational approaches, have been systematically evaluated. First, robust separation protocols employing strong cation exchange and reverse phase have been designed for efficient peptide separation, offering excellent orthogonality and ease of automation. These findings will be useful to the proteomics community for obtaining deeper non-redundant peptide identifications, which in turn will improve the overall depth of semi-quantitative proteomics. Secondly, computational bottlenecks associated with screening the vast amount of raw mass spectra generated in these proteomic measurements have been addressed. Computational matching of tandem mass spectra via conventional database search strategies leads to modest peptide/protein identifications. This seriously restricts the amount of information retrieved from these complex samples, mainly because of the high complexity and heterogeneity of the sample, which contains hundreds of proteins shared between different microbial species, often with a high level of homology.
Hence, the challenges associated with metaproteomic data analysis have been addressed by utilizing multiple iterative search engines coupled with de novo sequencing algorithms for a comprehensive and in-depth characterization of complex environmental samples. The work presented here utilizes various sample types ranging from isolates and mock microbial mixtures prepared in the laboratory to complex community samples extracted from industrial waste water, acid-mine drainage and methane seep sediments. In a broad perspective, this dissertation aims to provide tools for gaining deeper insights into proteome characterization in complex environmental ecosystems.

    Development and Integration of Informatic Tools for Qualitative and Quantitative Characterization of Proteomic Datasets Generated by Tandem Mass Spectrometry

    Shotgun proteomic experiments provide qualitative and quantitative analytical information from biological samples ranging in complexity from simple bacterial isolates to higher eukaryotes such as plants and humans and even to communities of microbial organisms. Improvements to instrument performance, sample preparation, and informatic tools are increasing the scope and volume of data that can be analyzed by mass spectrometry (MS). To accommodate these advances, it is becoming increasingly essential to choose and/or create tools that not only scale well but also make more informed decisions using additional features within the data. Incorporating novel and existing tools into a scalable, modular workflow not only provides more accurate, contextualized perspectives of processed data, but it also generates detailed, standardized outputs that can be used for future studies dedicated to mining general analytical or biological features, anomalies, and trends. This research developed cyber-infrastructure that would allow a user to seamlessly run multiple analyses, store the results, and share processed data with other users. The work represented in this dissertation demonstrates successful implementation of an enhanced bioinformatics workflow designed to analyze raw data directly generated from MS instruments and to create fully-annotated reports of qualitative and quantitative protein information for large-scale proteomics experiments. Answering these questions requires several points of engagement between informatics and analytical understanding of the underlying biochemistry of the system under observation. Deriving meaningful information from analytical data can be achieved through linking together the concerted efforts of more focused, logistical questions. This study focuses on the following aspects of proteomics experiments: spectra to peptide matching, peptide to protein mapping, and protein quantification and differential expression.
The interaction and usability of these analyses and other existing tools are also described. By constructing a workflow that allows high-throughput processing of massive datasets, data collected within the past decade can be standardized and updated with the most recent analyses.
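
Two of the steps named in this abstract, peptide to protein mapping and protein quantification, can be sketched in a few lines. The abstract does not say which quantification measure the workflow uses; the sketch below swaps in the normalized spectral abundance factor (NSAF), a common spectral-counting measure, purely as an assumption. The function names and toy sequences are hypothetical.

```python
from collections import Counter


def map_peptides_to_proteins(peptides, proteins):
    """Return {peptide: sorted ids of proteins whose sequence contains the peptide}."""
    return {
        peptide: sorted(pid for pid, seq in proteins.items() if peptide in seq)
        for peptide in peptides
    }


def nsaf(psm_peptides, proteins):
    """Normalized spectral abundance factor from a list of identified peptides.

    psm_peptides has one entry per identified spectrum. Only peptides that map
    to exactly one protein are counted (a simplifying assumption).
    """
    mapping = map_peptides_to_proteins(set(psm_peptides), proteins)
    counts = Counter()
    for peptide in psm_peptides:
        owners = mapping[peptide]
        if len(owners) == 1:
            counts[owners[0]] += 1
    # Spectral count divided by protein length, then normalized to sum to 1.
    saf = {pid: counts[pid] / len(proteins[pid]) for pid in counts}
    total = sum(saf.values())
    return {pid: value / total for pid, value in saf.items()} if total else {}


# Hypothetical toy data: AAAK is shared by both proteins and is ignored.
toy_proteins = {"P1": "AAAKCCCK", "P2": "AAAK" + "D" * 11 + "K"}
toy_psms = ["CCCK", "CCCK", "AAAK", "D" * 11 + "K"]
abundances = nsaf(toy_psms, toy_proteins)
```

Dropping shared peptides keeps the sketch short; real workflows usually apply a protein inference step to decide how shared peptides are assigned.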

    Algorithms for Peptide Identification from Mixture Tandem Mass Spectra

    The large amount of data collected in a mass spectrometry experiment requires effective computational approaches for automated analysis. Though the proteomics community has conducted extensive research for this purpose, challenges remain; one in particular is that the identification rate of the collected MS/MS spectra is rather low. One significant reason for this is the frequent occurrence of mixture spectra, which result from the concurrent fragmentation of multiple precursors in a single MS/MS spectrum. However, nearly all the mainstream computational methods still assume that the acquired spectra come from a single precursor, so they are not suitable for the identification of mixture spectra. In this research, we focused on developing effective algorithms for interpreting mixture tandem mass spectra, and our work comprises two main components: de novo sequencing of mixture spectra and mixture spectra identification by database search. For the de novo sequencing approach, we first formulated the mixture spectra de novo sequencing problem mathematically and proposed a dynamic programming algorithm for the problem. Additionally, we used both simulated and real mixture spectra datasets to verify the efficiency of the algorithm described in the research. For the database search identification, we proposed an approach for matching mixture tandem mass spectra with a pair of peptide sequences acquired from the protein sequence database by incorporating a special de novo assisted filtration strategy. Besides the filtration strategy, we also introduced a method to give a reasonable estimation of the mixture coefficient, which represents the relative abundance level of the co-sequenced precursors.
The preliminary experimental results demonstrated the efficiency of the integrated filtration strategy and the mixture coefficient estimation method in reducing the examination space and also verified the effectiveness of the proposed matching scheme.
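
The idea of a mixture coefficient can be made concrete with a small sketch. The abstract does not describe how the coefficient is estimated, so the approach below is an assumption and not the authors' method: the observed spectrum is modelled as a weighted sum of two fragment-intensity templates on a shared m/z grid, the weights are fit by ordinary least squares, and negative weights are crudely clipped to zero. All names and numbers are hypothetical.

```python
def estimate_mixture_coefficient(observed, template_a, template_b):
    """Fit observed ~ wa * template_a + wb * template_b and return wa / (wa + wb).

    All three inputs are intensity lists on the same m/z bins. The weights come
    from the 2x2 normal equations of ordinary least squares.
    """
    saa = sum(a * a for a in template_a)
    sbb = sum(b * b for b in template_b)
    sab = sum(a * b for a, b in zip(template_a, template_b))
    sao = sum(a * o for a, o in zip(template_a, observed))
    sbo = sum(b * o for b, o in zip(template_b, observed))
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("templates are collinear; the weights are not identifiable")
    # Clip negative weights to zero (a crude stand-in for non-negative least squares).
    wa = max((sao * sbb - sbo * sab) / det, 0.0)
    wb = max((sbo * saa - sao * sab) / det, 0.0)
    if wa + wb == 0:
        raise ValueError("neither template explains the observed spectrum")
    return wa / (wa + wb)


# Hypothetical toy spectrum: three parts precursor A to one part precursor B.
template_a = [1.0, 0.0, 1.0, 0.0]
template_b = [0.0, 1.0, 1.0, 0.0]
observed = [3.0, 1.0, 4.0, 0.0]
coefficient = estimate_mixture_coefficient(observed, template_a, template_b)
```

A coefficient near 0.5 would indicate two precursors of similar abundance, while values near 0 or 1 indicate that one precursor dominates the spectrum.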