72 research outputs found

Global alignment of pairwise protein interaction networks for maximal common conserved patterns

Author: Nagiza F Samatova
Wenhong Tian
Publication date: 01/01/2013

A number of tools for the alignment of protein-protein interaction (PPI) networks have laid the foundation for PPI network analysis. Most alignment tools focus on finding conserved interaction regions across PPI networks through either local or global mapping of similar sequences. Researchers are still trying to improve the speed, scalability, and accuracy of network alignment. In view of this, we introduce HopeMap, a fast connected-components-based algorithm for network alignment. Observing that the set of true orthologs across species is small compared to the total number of proteins in all species, we take a different approach based on a precompiled list of homologs identified by KO terms. Applying this approach to three species pairs, S. cerevisiae (yeast) and D. melanogaster (fly), E. coli K12 and S. typhimurium, and E. coli K12 and C. crescentus, we analyze all clusters identified in the alignment. The results are evaluated against up-to-date known gene annotations, gene ontology (GO), and KEGG ortholog groups (KO). Compared to existing tools, our approach is fast, with linear computational cost, is highly accurate in terms of KO and GO term specificity and sensitivity, and can be extended easily to multiple alignments.

CiteSeerX

Learning Contextual Embeddings for Knowledge Graph Completion

Author: Harenberg Steve
Moon Changsung
Samatova Nagiza F.
Slankas John
Publication venue: AIS Electronic Library (AISeL)
Publication date: 01/07/2017

Knowledge graphs capture entities and their relationships. However, many knowledge graphs are afflicted by missing data. Recently, embedding methods have been used to alleviate this issue via knowledge graph completion. However, most existing methods only consider the relationship in triples, even though contextual relation types, consisting of the surrounding relation types of a triple, can substantially improve prediction accuracy. Therefore, we propose a contextual embedding method that learns the embeddings of entities and predicates while taking contextual relation types into account. The main benefits of our approach are: (1) improved scalability via a reduced number of epochs needed to achieve comparable or better results with the same memory complexity, (2) higher prediction accuracy (an average of 14%) compared to the related algorithms, and (3) high accuracy for both missing entity and predicate predictions. The source code and the YAGO43k dataset of this paper are available at https://github.ncsu.edu/cmoon2/kg

AIS Electronic Library (AISeL)

BioDEAL: community generation of biological annotations

Author: Breimyer Paul
Green Nathan
Kumar Vinay
Samatova Nagiza F
Publication venue: BioMed Central
Publication date: 01/01/2009

Crossref

Springer - Publisher Connector

PubMed Central

WebBANC: Building Semantically-Rich Annotated Corpora from Web User Annotations of Minority Languages

Author: Breimyer Paul
Green Nathan
Kumar Vinay
Samatova Nagiza F
Publication date: 13/05/2009

Proceedings of the 17th Nordic Conference of Computational Linguistics NODALIDA 2009. Editors: Kristiina Jokinen and Eckhard Bick. NEALT Proceedings Series, Vol. 4 (2009), 48-56. © 2009 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/9206

DSpace at Tartu University Library

Toward Personalized Network Biomarkers in Alzheimer's Disease: Computing Individualized Genomic and Protein Crosstalk Maps

Author: Gonzalo Bello
Kanchana Padmanabhan
Katie Shpanskaya
Nagiza F. Samatova
P. Murali Doraiswamy
Publication venue: Frontiers Media SA
Publication date: 01/09/2017

Directory of Open Access Journals

Impact of Pretreated Switchgrass and Biomass Carbohydrates on Clostridium thermocellum ATCC 27405 Cellulosome Composition: A Quantitative Proteomic Analysis

Author: Hurst Gregory B.
Lankford Patricia K.
McKeown Catherine K.
Mielenz Jonathan R.
Pan Chongle
Raman Babu
Rodriguez Miguel
Samatova Nagiza F.
Publication venue: Public Library of Science
Publication date: 01/01/2009

Background: Economic feasibility and sustainability of lignocellulosic ethanol production requires the development of robust microorganisms that can efficiently degrade and convert plant biomass to ethanol. The anaerobic thermophilic bacterium Clostridium thermocellum is a candidate microorganism as it is capable of hydrolyzing cellulose and fermenting the hydrolysis products to ethanol and other metabolites. C. thermocellum achieves efficient cellulose hydrolysis using multiprotein extracellular enzymatic complexes, termed cellulosomes. Methodology/Principal Findings: In this study, we used quantitative proteomics (multidimensional LC-MS/MS and 15N-metabolic labeling) to measure relative changes in levels of cellulosomal subunit proteins (per CipA scaffoldin basis) when C. thermocellum ATCC 27405 was grown on a variety of carbon sources [dilute-acid pretreated switchgrass, cellobiose, amorphous cellulose, crystalline cellulose (Avicel) and combinations of crystalline cellulose with pectin or xylan or both]. Cellulosome samples isolated from cultures grown on these carbon sources were compared to 15N labeled cellulosome samples isolated from crystalline cellulose-grown cultures. In total from all samples, proteomic analysis identified 59 dockerin- and 8 cohesin-module containing components, including 16 previously undetected cellulosomal subunits. Many cellulosomal components showed differential protein abundance in the presence of non-cellulose substrates in the growt

CiteSeerX

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

A high-throughput de novo sequencing approach for shotgun proteomics using high-resolution tandem mass spectrometry

Author: Banfield Jillian F.
Carey Patricia A.
Hettich Robert L.
McDonald William H.
Pan Chongle
Park Byung H.
Samatova Nagiza F.
VerBerkmoes Nathan C.
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 05/03/2010

Background: High-resolution tandem mass spectra can now be readily acquired with hybrid instruments, such as LTQ-Orbitrap and LTQ-FT, in high-throughput shotgun proteomics workflows. The improved spectral quality enables more accurate de novo sequencing for identification of post-translational modifications and amino acid polymorphisms. Results: In this study, a new de novo sequencing algorithm, called Vonode, has been developed specifically for analysis of such high-resolution tandem mass spectra. To fully exploit the high mass accuracy of these spectra, a unique scoring system is proposed to evaluate sequence tags based primarily on mass accuracy information of fragment ions. Consensus sequence tags were inferred for 11,422 spectra with an average peptide length of 5.5 residues from a total of 40,297 input spectra acquired in a 24-hour proteomics measurement of Rhodopseudomonas palustris. The accuracy of inferred consensus sequence tags was 84%. According to our comparison, the performance of Vonode was shown to be superior to the PepNovo v2.0 algorithm, in terms of the number of de novo sequenced spectra and the sequencing accuracy. Conclusions: Here, we improved de novo sequencing performance by developing a new algorithm specifically for high-resolution tandem mass spectral data. The Vonode algorithm is freely available for download at http://compbio.ornl.gov/Vonode

University of Tennessee, Knoxville: Trace

eScholarship - University of California

Complex biomarker discovery in neuroimaging data: Finding a needle in a haystack

Author: Atluri Gowtham
Doraiswamy P. Murali
Fang Gang
Kumar Vipin
Lim Kelvin
MacDonald Angus
Padmanabhan Kanchana
Petrella Jeffrey R.
Samatova Nagiza F.
Steinbach Michael
Publication venue: The Authors. Published by Elsevier Inc.
Publication date: 07/08/2013

Neuropsychiatric disorders such as schizophrenia, bipolar disorder and Alzheimer's disease are major public health problems. However, despite decades of research, we currently have no validated prognostic or diagnostic tests that can be applied at an individual patient level. Many neuropsychiatric diseases are due to a combination of alterations that occur in a human brain rather than the result of localized lesions. While there is hope that newer imaging technologies such as functional and anatomic connectivity MRI or molecular imaging may offer breakthroughs, the single biomarkers that are discovered using these datasets are limited by their inability to capture the heterogeneity and complexity of most multifactorial brain disorders. Recently, complex biomarkers have been explored to address this limitation using neuroimaging data. In this manuscript we consider the nature of complex biomarkers being investigated in the recent literature and present techniques to find such biomarkers that have been developed in related areas of data mining, statistics, machine learning and bioinformatics.

Elsevier - Publisher Connector

PubMed Central