Resources for Translational Bioinformaticians

Author: Chen Jake
Publication venue: Office of the Vice Chancellor for Research
Publication date: 13/04/2012

Poster abstract: In this project, researchers developed software that extracts results from the PubMed literature into a comprehensive connectivity map, capturing the relationships among drugs, proteins, and diseases. The relationships mined from the literature can be thoroughly curated with the tool's web-based annotation graphical user interface. These comprehensive connectivity maps cover disease-specific information and will become a valuable resource for translational bioinformaticians.

IUPUIScholarWorks

Predictive and Personalized Medicine with Systems Biology Solutions

Author: Chen Jake Y.
Wu Xiaogang
Publication venue: Office of the Vice Chancellor for Research
Publication date: 08/04/2011

Poster abstract: Systems biology refers to the application of systems engineering and systems science techniques to the understanding of biological systems. At the Indiana Center for Systems Biology and Personalized Medicine (ICSBPM), we are particularly interested in developing systems biology techniques that can help narrow the gap between basic biomedical research and clinical applications of genome sciences toward predictive and personalized medicine. In the past several years, ICSBPM has developed many critical informatics resources for the systems biology and personalized medicine community. The databases and software tools that we developed have supported systems biology and personalized medicine research communities at the national scale. These tools include: HPD, an integrated human pathway database and analysis tool (Chowbina et al., in BMC Bioinformatics 2009, 10(S11):S5); HAPPI, a human annotated and predicted protein interaction database (Chen et al., in BMC Genomics 2009, 10(S1):S16); HIP2, a database of healthy human individuals' integrated plasma proteome (Saha et al., in BMC Medical Genomics 2008, 1(1):12); PEPPI, a peptidomic database of protein isoforms (Zhou et al., in BMC Bioinformatics 2010, 11(S6):S7); ProteoLens, a multi-scale network visualization and data mining tool (Huan et al., in BMC Bioinformatics 2008, 9(S9):S5); GeneTerrain, a visual exploration tool for network-organized expression panel biomarker development (You et al., in Information Visualization 2010, 9(1)); and C-Maps, comprehensive molecular connectivity maps between disease-specific proteins and drugs (Li et al., in PLoS Computational Biology, 5(7), e1000450). These tools have been demonstrated to help improve tumor classification, understand cancer biology at the systems scale, tackle biomarker discovery challenges, and facilitate clinical adoption of predictive models developed from computational techniques. We hope that our experience and resources can cement collaborative translational medicine research towards predictive and personalized medicine applications.

IUPUIScholarWorks

Discovery of pathway biomarkers from coupled proteomics and systems biology methods

Author: Chen Jake Y
Zhang Fan
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Worldwide, breast cancer is the second most common type of cancer after lung cancer. Plasma proteome profiling may have a better chance of identifying protein changes between plasma samples, such as those from healthy individuals and breast cancer patients. Breast cancer cell lines have long been used by researchers as model systems for identifying protein biomarkers. Comparing the set of proteins that change in plasma with previously published findings from proteomic analyses of human breast cancer cell lines may identify a subset of candidate protein biomarkers with higher confidence.

Results: In this study, we analyzed a liquid chromatography (LC) coupled tandem mass spectrometry (MS/MS) proteomics dataset from plasma samples of 40 healthy women and 40 women diagnosed with breast cancer. Using two-sample t-statistics and a permutation procedure, we identified 254 statistically significant, differentially expressed proteins, of which 208 are over-expressed and 46 are under-expressed in breast cancer plasma. We validated this result against previously published proteomic results of human breast cancer cell lines and signaling pathways to derive a panel of 25 candidate protein biomarkers. Using pathway analysis, we observed that the 25 “activated” plasma proteins were present in several cancer pathways, including ‘Complement and coagulation cascades’, ‘Regulation of actin cytoskeleton’, and ‘Focal adhesion’, and that they match well with previously reported studies. Additional gene ontology analysis of the 25 proteins also showed that cellular metabolic process and response to external stimulus (especially proteolysis and acute inflammatory response) were enriched functional annotations of the proteins identified in the breast cancer plasma samples. In cross-validation using two additional proteomics studies, we obtained 86% and 83% similarity in the pathway-protein matrix between the first study and the two testing studies, which is much higher than the similarity we measured at the level of proteins.

Conclusions: We presented a ‘systems biology’ method to identify, characterize, analyze, and validate panel biomarkers in breast cancer proteomics data, which includes 1) t-statistics and a permutation process, 2) network, pathway, and function annotation analysis, and 3) cross-validation across multiple studies. Our results showed that the systems biology approach is essential to understanding the molecular mechanisms of panel protein biomarkers.
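The first step of the method above, a two-sample t-statistic combined with a permutation procedure, can be sketched as follows. This is only an illustration and not the authors' implementation: the abstract does not say which t-statistic variant or how many permutations were used, so the Welch form, the function names, and the default of 2,000 permutations are all assumptions.

```python
import random
from statistics import mean, variance


def t_statistic(a, b):
    """Welch two-sample t statistic (assumed variant; unequal variances)."""
    standard_error = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / standard_error


def permutation_p_value(cases, controls, n_perm=2000, seed=0):
    """Two-sided permutation p-value for one protein's abundance values.

    Group labels are shuffled n_perm times; the p-value is the fraction of
    shuffles whose |t| is at least as large as the observed |t|.
    """
    rng = random.Random(seed)
    observed = abs(t_statistic(cases, controls))
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        permuted = abs(t_statistic(pooled[:n_cases], pooled[n_cases:]))
        if permuted >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Made-up abundance values for a single hypothetical protein.
cancer = [5.1, 5.4, 5.9, 6.2, 5.7, 6.0]
healthy = [4.0, 4.2, 3.9, 4.4, 4.1, 4.3]
print(permutation_p_value(cancer, healthy))
```

In a real analysis this test would be repeated for every protein, and the resulting p-values would normally need a multiple-testing correction, which the sketch leaves out.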

Crossref

IUPUIScholarWorks

Springer - Publisher Connector

PubMed Central

HOMER: a human organ-specific molecular electronic repository

Author: Chen Jake Y
Zhang Fan
Publication venue: BioMed Central
Publication date: 01/01/2011

Springer - Publisher Connector

PubMed Central

HAPPI: an online database of comprehensive human annotated and predicted protein interactions

Author: Chen Jake Yue
Huan Tianxiao
Mamidipalli SudhaRani
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Human protein-protein interaction (PPI) data are the foundation for understanding molecular signalling networks and the functional roles of biomolecules. Several human PPI databases have become available; however, comparisons of these datasets have suggested limited data coverage and poor data quality. Ongoing collection and integration of human PPIs from different sources, both experimental and computational, can enable disease-specific network biology modelling in translational bioinformatics studies.

Results: We developed a new web-based resource, the Human Annotated and Predicted Protein Interaction (HAPPI) database, located at http://bio.informatics.iupui.edu/HAPPI/. The HAPPI database was created by extracting and integrating publicly available protein interaction databases, including HPRD, BIND, MINT, STRING, and OPHID, using database integration techniques. We designed a unified entity-relationship data model to resolve semantic-level differences among the diverse concepts involved in PPI data integration. We applied a unified scoring model to give each PPI a measure of its reliability, which places each PPI at one of five star rank levels from 1 to 5. We assessed the quality of PPIs contained in the new HAPPI database using evolutionarily conserved co-expression pairs, called "MetaGene" pairs, to measure the extent of overlap between MetaGene pairs and PPI pairs. While the overall quality of the HAPPI database across all star ranks is comparable to the overall quality of HPRD or IntNetDB, the subset of the HAPPI database with star ranks between 3 and 5 has a much higher average quality than all other human PPI databases. As of summer 2008, the database contains 142,956 non-redundant, medium- to high-confidence human protein interaction pairs among 10,592 human proteins. The HAPPI database web application also provides hyperlinked information on genes, pathways, protein domains, protein structure displays, and sequence feature maps for interactive exploration of PPI data in the database.

Conclusion: HAPPI is by far the most comprehensive public compilation of human protein interaction information. It enables its users to fully explore PPI data with the quality measures and annotated information necessary for emerging network biology studies.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

An integrated proteomics analysis of bone tissues in response to mechanical stimulation

Author: Chen Jake Y
Li Jiliang
Zhang Fan
Publication venue: BioMed Central
Publication date: 01/07/2010

Bone cells can sense physical forces and convert mechanical stimulation into biochemical signals that lead to the expression of mechanically sensitive genes and proteins. However, it is still poorly understood how genes and proteins in bone cells are orchestrated to respond to mechanical stimulation. In this research, we applied integrated proteomics, statistical, and network biology techniques to study proteome-level changes in bone tissue cells in response to two different conditions, normal loading and fatigue loading. We harvested ulna midshafts and isolated proteins from control, loaded, and fatigue-loaded rats. Using a label-free liquid chromatography tandem mass spectrometry (LC-MS/MS) experimental proteomics technique, we derived a comprehensive list of 1,058 proteins that are differentially expressed among normal loading, fatigue loading, and controls. By carefully developing protein selection filters and statistical models, we were able to identify 42 proteins representing 21 rat genes that were significantly associated with bone cells' response to quantitative changes between normal loading and fatigue loading conditions. We further applied network biology techniques by building a fatigue-loading-activated protein-protein interaction subnetwork involving 9 of the human-homolog counterparts of the 21 rat genes in a large connected network component. Our study shows that the combination of a decreased anti-apoptotic factor, Raf1, and an increased pro-apoptotic factor, PDCD8, results in a significant increase in the number of apoptotic osteocytes following fatigue loading. We believe controlling osteoblast differentiation/proliferation and osteocyte apoptosis could be promising directions for developing future therapeutic solutions for related bone diseases.

Crossref

IUPUIScholarWorks

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

PEPPI: a peptidomic database of human protein isoforms for proteomics experiments

Author: Chen Jake Y
Zhang Fan
Zhou Ao
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Protein isoform generation, which may derive from alternative splicing, genetic polymorphism, and posttranslational modification, is an essential source of molecular diversity in eukaryotic cells. Previous studies have shown that protein isoforms play critical roles in disease diagnosis, risk assessment, sub-typing, prognosis, and treatment outcome prediction. Understanding the types, presence, and abundance of different protein isoforms in different cellular and physiological conditions is a major task in functional proteomics, and may pave the way to molecular biomarker discovery for human diseases. In tandem mass spectrometry (MS/MS) based proteomics analysis, peptide peaks with exact matches to protein sequence records in the proteomics database may be identified with mass spectrometry (MS) search software. However, due to limited annotation and poor coverage of protein isoforms in proteomics databases, high-throughput identification of protein isoforms, particularly those arising from alternative splicing and genetic polymorphism, has not been possible.

Results: We therefore present the PEPtidomics Protein Isoform Database (PEPPI, http://bio.informatics.iupui.edu/peppi), a comprehensive database of computationally synthesized human peptides that can identify protein isoforms derived from either alternatively spliced mRNA transcripts or SNP variations. We collected genome, pre-mRNA alternative splicing, and SNP information from Ensembl. We synthesized in silico isoform transcripts that cover all exons and theoretically possible junctions of exons and introns, as well as all their variations derived from known SNPs. With three case studies, we further demonstrated that the database can help researchers discover and characterize new protein isoform biomarkers from experimental proteomics data.

Conclusions: We developed a new tool for the proteomics community to characterize protein isoforms from MS-based proteomics experiments. By cataloguing each peptide configuration in the PEPPI database, users can study genetic variations and alternative splicing events at the proteome level. They can also batch-download peptide sequences in FASTA format to search against MS/MS spectra derived from human samples. The database can help generate novel hypotheses on molecular risk factors and molecular mechanisms of complex diseases, leading to the identification of potentially highly specific protein isoform biomarkers.

IUPUIScholarWorks

Springer - Publisher Connector

PubMed Central

Pathway and network analysis in proteomics

Author: Chen Jake Yue
Hasan Mohammad Al
Wu Xiaogang
Publication venue: Elsevier BV
Publication date: 01/12/2014

Proteomics is inherently a systems science that studies not only measured proteins and their expression in a cell, but also the interplay of proteins, protein complexes, signaling pathways, and network modules. Proteomics data have accumulated rapidly in recent years. However, proteomics data are highly variable, with results sensitive to data preparation methods, sample condition, instrument types, and analytical methods. To address this challenge in proteomics data analysis, we review current tools being developed to incorporate biological function and network topological information. We categorize these tools into four types: tools with basic functional information and few topological features (e.g., GO category analysis), tools with rich functional information and few topological features (e.g., GSEA), tools with basic functional information and rich topological features (e.g., Cytoscape), and tools with rich functional information and rich topological features (e.g., PathwayExpress). We first review the potential application of these tools to proteomics; then we review tools that can achieve automated learning of pathway modules and features, and tools that help perform integrated network visual analytics.
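As a rough illustration of the first tool type (GO category analysis), over-representation of a functional category in a protein list is commonly scored with a one-sided hypergeometric test. The sketch below assumes that formulation; the abstract does not specify which test any particular tool uses, and the function name and example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_category, n_selected, n_overlap):
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_background: proteins in the reference set
    n_category:   background proteins annotated to the category
    n_selected:   proteins in the list of interest
    n_overlap:    selected proteins annotated to the category
    """
    upper = min(n_category, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_category, k) * comb(n_background - n_category, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Made-up counts: 2,000 background proteins, 40 in the category,
# 25 selected proteins, 5 of which carry the annotation.
print(enrichment_p_value(2000, 40, 25, 5))
```

This kind of test uses only category membership and ignores network topology, which is why it sits in the "basic functional information, few topological features" group of the classification above.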

Crossref

IUPUIScholarWorks

PubMed Central