438 research outputs found

    QUANTITATIVE AND FUNCTIONAL ANALYSIS PIPELINE FOR LABEL-FREE METAPROTEOMICS DATA AND ITS APPLICATIONS

    Since the first large-scale metaproteome was reported in 2005, metaproteomics has advanced at a tremendous rate in both its quantitative and qualitative metrics. Furthermore, metaproteomics is now being applied as a general tool in microbial ecology across a wide variety of environmental studies. Although metaproteomics is becoming a useful and even standard tool for the microbial ecologist, standardized bioinformatics pipelines are not readily available. Therefore, we developed a quantitative and functional analysis pipeline for metaproteomics (QFAM) to help analyze large and complicated metaproteomics data in a robust and timely fashion, with outputs designed to be simple and clearly understood by the microbial ecologist. QFAM starts by running peptide-spectrum searches of the resultant MS/MS datasets against a mixed metagenome/appropriate protein FASTA database. Its primary search algorithm is MyriMatch/IDPicker. MyriMatch/IDPicker uses multiple CPUs effectively, has an accurate scoring system, correctly uses high-mass-accuracy MS data, and has a robust method for protein determination. These features are required for metaproteomics, which involves large protein databases and complicated peptide structures. QFAM has quantitative (QAM) and functional (FAM) analysis components to provide dependable protein signatures and confident information for understanding the characteristics of the metaproteome. QAM employs the ’selfea’ R package, which provides probability models as well as Cohen’s effect sizes. Our benchmark data test and Monte Carlo simulation results show that selfea can reduce false positives efficiently while losing few true positives, one of the key goals of proteomics and metaproteomics experiments. FAM has two modules: BioSystems and COG analysis. The BioSystems module is most appropriate for well-annotated model organisms, such as humans, whereas the COG module is useful for less-annotated microorganisms and metagenome sequences. Both modules provide an enrichment test using Fisher’s exact test and a significance test using selfea. With these two statistics, FAM generates differentially enriched functional terms that are insightful for discerning the biological information behind the metaproteome data. Two application studies in chapters 4 and 5 show how QFAM can be employed for metaproteomics data analysis. QFAM is distinguished from other proteomics pipelines by its multiprocessing and by its combined quantitative and functional analysis.

    Data analysis tools for mass spectrometry proteomics

    ABSTRACT Proteins are large biomolecules that consist of amino acid chains. They differ from one another in their amino acid sequences, which are mainly dictated by the nucleotide sequences of their corresponding genes. Proteins fold into specific three-dimensional structures that determine their activity. Because many proteins act as catalysts in biochemical reactions, they are considered the executive molecules of the cell, and research on them is therefore fundamental in biotechnology and medicine. Currently, the most common method for investigating the activity, interactions, and functions of proteins on a large scale is high-throughput mass spectrometry (MS). Mass spectrometers measure the masses of molecules or, more specifically, their mass-to-charge ratios. Typically, the proteins are digested into peptides, whose masses are measured by mass spectrometry. The masses are matched against known sequences to obtain peptide identifications, and subsequently the proteins from which the peptides originated are quantified. The data gathered from these experiments contain a lot of noise, leading to loss of relevant information and even to wrong conclusions. The noise can be related, for example, to differences in sample preparation or to technical limitations of the analysis equipment. In addition, assumptions regarding the data might be wrong, or the chosen statistical methods might not be suitable. Taken together, these can lead to irreproducible results. Developing algorithms and computational tools to overcome the underlying issues is of utmost importance. Thus, this work aims to develop new computational tools to address these problems. In this PhD thesis, the performance of existing label-free proteomics methods is evaluated and new statistical data analysis methods are proposed. The tested methods include several widely used normalization methods, which are thoroughly evaluated using multiple gold standard datasets. Various statistical methods for differential expression analysis are also evaluated. Furthermore, new methods for calculating differential expression statistics are developed, and their superior performance compared to the existing methods is shown using a wide set of metrics. The tools are published as open source software packages.

    TIIVISTELMÄ (Finnish abstract, translated) Proteins are large biomolecules formed from amino acid chains. They differ from one another in the order of their amino acids, which is mainly determined by the genes encoding the proteins. In addition, proteins fold into three-dimensional structures that partly determine their function. Because proteins act as catalysts in biochemical reactions, they are considered to play a central role in cells, and their study is therefore also considered important. Currently, the most common method for large-scale study of protein activity, interactions, and functions is high-throughput mass spectrometry (MS). Mass spectrometers are used to measure the masses of molecules or, more precisely, the mass-to-charge ratio. Typically, proteins are broken down into peptides for mass measurement. The masses observed by the mass spectrometer are compared against a database compiled from known protein sequences so that the peptides can be identified. Through the peptides, the proteins can also be inferred and quantified. The data collected in the experiments normally contain a great deal of noise, which may cause relevant information to be lost and, at worst, lead to wrong conclusions. This noise can arise, for example, from differences in sample handling or from technical limitations of the measuring instruments. In addition, assumptions about the nature of the data may be wrong, or statistical models unsuitable for the data may be used. At worst, this leads to situations in which the results of a study cannot be reproduced. Developing computational tools and algorithms to prevent these problems is therefore of utmost importance for the reliability of research. This work accordingly focuses on applications that aim to solve problems arising in this area. The study compares commonly used quantitative proteomics software and the most common data normalization methods, and develops new data analysis tools. The methods are compared using several standard datasets whose true content is known. The study also compares a set of statistical methods for detecting differences between samples, develops entirely new efficient methods, and demonstrates their better performance relative to earlier methods. All tools developed in the study have been published as open source software.

    Metaproteomic evidence of changes in protein expression following a change in electrode potential in a robust biocathode microbiome

    Microorganisms that respire electrodes may be exploited for biotechnology applications if key pathways for extracellular electron transfer (EET) can be identified and manipulated through bioengineering. To determine whether expression of proposed Biocathode-MCL EET proteins is changed by modulating electrode potential without disrupting the relative distribution of microbial constituents, metaproteomic and 16S rRNA gene expression analyses were performed after switching from an optimal to a suboptimal potential, chosen based on an expected decrease in electrode respiration. Five hundred and seventy-nine unique proteins were identified across both potentials, the majority of which were assigned to three previously defined Biocathode-MCL metagenomic clusters: a Marinobacter sp., a member of the family Chromatiaceae, and a Labrenzia sp. Statistical analysis of spectral counts using Fisher's exact test identified 16 proteins associated with the optimal potential, five of which are predicted electron transfer proteins. The majority of proteins associated with the suboptimal potential were involved in protein turnover, motility, and membrane transport. Unipept and 16S rRNA gene expression analyses indicated that the taxonomic profile of the microbiome did not change after 52 hours at the suboptimal potential. These findings show that protein expression is sensitive to the electrode potential without inducing shifts in community composition, a feature that may be exploited for engineering Biocathode-MCL.
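
    As a rough illustration of the kind of test mentioned in this abstract, the sketch below applies a two-sided Fisher's exact test to the spectral counts of a single protein in two conditions. It is not the authors' code: the function names, the 2x2 table layout (spectra assigned to the protein versus all other spectra, per condition), and the example counts are assumptions made here for illustration only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of every table no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


def spectral_count_test(k_a, n_a, k_b, n_b):
    """Hypothetical helper: compare one protein's spectral counts.

    k_a, k_b: spectra assigned to the protein in conditions A and B.
    n_a, n_b: total identified spectra in conditions A and B.
    """
    return fisher_exact_two_sided(k_a, n_a - k_a, k_b, n_b - k_b)


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    print(spectral_count_test(30, 10000, 8, 10000))
```

    In a real analysis one such test would be run per protein, so the resulting p-values would normally need a multiple-testing correction before any protein is called differentially abundant.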

    Integrative Analysis To Select Cancer Candidate Biomarkers To Targeted Validation

    Targeted proteomics has flourished as the method of choice for prospecting for and validating potential candidate biomarkers in many diseases. However, challenges still remain due to the lack of standardized routines that can prioritize a limited number of proteins to be further validated in human samples. To help researchers identify candidate biomarkers that best characterize their samples under study, a well-designed integrative analysis pipeline, comprising MS-based discovery, feature selection methods, clustering techniques, bioinformatic analyses and targeted approaches was performed using discovery-based proteomic data from the secretomes of three classes of human cell lines (carcinoma, melanoma and non-cancerous). Three feature selection algorithms, namely, Beta-binomial, Nearest Shrunken Centroids (NSC), and Support Vector Machine-Recursive Features Elimination (SVM-RFE), indicated a panel of 137 candidate biomarkers for carcinoma and 271 for melanoma, which were differentially abundant between the tumor classes. We further tested the strength of the pipeline in selecting candidate biomarkers by immunoblotting, human tissue microarrays, label-free targeted MS and functional experiments. In conclusion, the proposed integrative analysis was able to pre-qualify and prioritize candidate biomarkers from discovery-based proteomics to targeted MS.

    Integrative analysis to select cancer candidate biomarkers to targeted validation

    Targeted proteomics has flourished as the method of choice for prospecting for and validating potential candidate biomarkers in many diseases. However, challenges still remain due to the lack of standardized routines that can prioritize a limited number of proteins to be further validated in human samples. To help researchers identify candidate biomarkers that best characterize their samples under study, a well-designed integrative analysis pipeline, comprising MS-based discovery, feature selection methods, clustering techniques, bioinformatic analyses and targeted approaches was performed using discovery-based proteomic data from the secretomes of three classes of human cell lines (carcinoma, melanoma and non-cancerous). Three feature selection algorithms, namely, Beta-binomial, Nearest Shrunken Centroids (NSC), and Support Vector Machine-Recursive Features Elimination (SVM-RFE), indicated a panel of 137 candidate biomarkers for carcinoma and 271 for melanoma, which were differentially abundant between the tumor classes. We further tested the strength of the pipeline in selecting candidate biomarkers by immunoblotting, human tissue microarrays, label-free targeted MS and functional experiments. In conclusion, the proposed integrative analysis was able to pre-qualify and prioritize candidate biomarkers from discovery-based proteomics to targeted MS.
    Funding: FAPESP - Fundação de Amparo à Pesquisa do Estado de São Paulo; CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico. Grants: 2009/54067-3; 2010/19278-0; 2011/22421-2; 2009/53839-2; 470567/2009-0; 470549/2011-4; 301702/2011-0; 470268/2013-

    Development and Integration of Informatic Tools for Qualitative and Quantitative Characterization of Proteomic Datasets Generated by Tandem Mass Spectrometry

    Shotgun proteomic experiments provide qualitative and quantitative analytical information from biological samples ranging in complexity from simple bacterial isolates to higher eukaryotes such as plants and humans, and even to communities of microbial organisms. Improvements to instrument performance, sample preparation, and informatic tools are increasing the scope and volume of data that can be analyzed by mass spectrometry (MS). To accommodate these advances, it is becoming increasingly essential to choose or create tools that not only scale well but also make more informed decisions using additional features within the data. Incorporating novel and existing tools into a scalable, modular workflow not only provides more accurate, contextualized perspectives of processed data, but also generates detailed, standardized outputs that can be used for future studies dedicated to mining general analytical or biological features, anomalies, and trends. This research developed cyber-infrastructure that allows a user to seamlessly run multiple analyses, store the results, and share processed data with other users. The work represented in this dissertation demonstrates successful implementation of an enhanced bioinformatics workflow designed to analyze raw data generated directly from MS instruments and to create fully annotated reports of qualitative and quantitative protein information for large-scale proteomics experiments. Answering these questions requires several points of engagement between informatics and analytical understanding of the underlying biochemistry of the system under observation. Deriving meaningful information from analytical data can be achieved by linking together the concerted efforts of more focused, logistical questions. This study focuses on the following aspects of proteomics experiments: spectra to peptide matching, peptide to protein mapping, and protein quantification and differential expression. The interaction and usability of these analyses and other existing tools are also described. By constructing a workflow that allows high-throughput processing of massive datasets, data collected within the past decade can be standardized and updated with the most recent analyses.

    Proteomic and transcriptomic analysis of near-isogenic soybean lines differing in seed protein content

    This thesis provides insights into the molecular mechanisms and regulatory networks related to changes in seed protein content in near-isogenic soybean lines. The insights into metabolic regulation from this study will help to identify the factors that may change overall seed composition during seed development, and will guide the enhancement of protein and oil accumulation in soybean. Ultimately, the understanding of metabolic regulation gained from this study will guide the expansion of the marketability and the increase in economic value of soybean. This thesis focuses on identifying genes and proteins that are important in controlling soybean seed composition through a combination of proteomic and transcriptomic analyses of near-isogenic soybean lines differing in seed protein content.

    Statistical methods for differential proteomics at peptide and protein level
