145 research outputs found

Prediction of peptides binding to MHC class I alleles by partial periodic pattern mining

Author: Meydan Cem
Otu Hasan
Sezerman Uğur
Publication venue: ODTU (Ortadoğu Teknik Üniversitesi)
Publication date: 16/04/2009

MHC (Major Histocompatibility Complex) is a key player in the immune response of an organism. It is important to be able to predict which antigenic peptides will bind to a specific MHC allele and which will not, creating possibilities for controlling immune response and for the applications of immunotherapy. However, a problem encountered in computational binding prediction methods for MHC class I is the presence of bulges and loops in the peptides, which change the total length. Most machine learning methods in use today require the sequences to be of the same length to successfully mine the binding motifs. We propose the use of time-based data mining methods in motif mining so that motifs can be mined position-independently. In addition, information from both binding and non-binding peptides is used, in contrast to other methods, which rely only on binding peptides. The prediction results are between 70% and 80% for the tested alleles.

Sabanci University Research Database

Interpreting the prevalence of regulatory SNPs in cancers and protein-coding SNPs among non-cancer diseases using GWAS association studies

Author: Khalid Zoya
Sezerman Uğur
Publication venue: 'Boston College University Libraries'
Publication date: 01/10/2014

Sabanci University Research Database

An entropy based heuristic model for predicting functional sub-type divisions of protein families

Author: Bakış Yasin
Sezerman Uğur
Yörükoğlu Deniz
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 08/07/2009

Multiple sequence alignments of protein families are often used for locating residues that are widely apart in the sequence and that are considered influential in determining the functional specificity of proteins towards various substrates, ligands, DNA and other proteins. In this paper, we propose an entropy-score based heuristic algorithm model for predicting functional sub-family divisions of protein families, given the multiple sequence alignment of the protein family as input, without any functional sub-type or key site information given for any protein sequence. Two of the test cases are reported in this paper. The first test case is the nucleotidyl cyclase protein family, consisting of guanylate and adenylate cyclases. The second test case is a dataset of proteins taken from six superfamilies in the Structure-Function Linkage Database (SFLD). Results from these test cases are reported in terms of sub-type divisions confirmed against phylogeny relations from former studies in the literature.
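Illustrative note: the abstract does not define its entropy score, so the following is only a minimal sketch of the generic building block, per-column Shannon entropy over a multiple sequence alignment. The function names and the toy alignment are hypothetical, and gap characters would simply be counted as one more symbol here, which is an assumption rather than the paper's method.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropies(alignment):
    """Entropy of every column of an alignment given as equal-length strings."""
    return [column_entropy(col) for col in zip(*alignment)]


# Hypothetical toy alignment: columns 1 and 3 are fully conserved,
# columns 2 and 4 each split evenly between two residues.
toy_alignment = ["ACDE", "ACDF", "AGDE", "AGDF"]
entropies = alignment_entropies(toy_alignment)
```

Low-entropy columns are conserved across the whole family, while columns that are conserved within candidate sub-groups but differ between them are the kind of signal a sub-type division heuristic could exploit.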

Sabanci University Research Database

Predicting sumoylation sites using support vector machines based on various sequence features, conformational flexibility and disorder

Author: Sezerman Uğur
Yavuz Ahmet Sinan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Background: Sumoylation, which is a reversible and dynamic post-translational modification, is one of the vital processes in a cell. Before a protein matures to perform its function, sumoylation may alter its localization, interactions, and possibly structural conformation. Aberrations in protein sumoylation have been linked with a variety of disorders and developmental anomalies. Experimental approaches to the identification of sumoylation sites may not be effective due to the dynamic nature of sumoylation, laborious experiments and their cost. Therefore, computational approaches may guide experimental identification of sumoylation sites and provide insights for further understanding of the sumoylation mechanism. Results: In this paper, the effectiveness of using various sequence properties in predicting sumoylation sites was investigated with statistical analyses and a machine learning approach employing support vector machines. These sequence properties were derived from windows of size 7 and include position-specific amino acid composition, hydrophobicity, estimated sub-window volumes, predicted disorder, and conformational flexibility. 5-fold cross-validation results on experimentally identified sumoylation sites revealed that our method successfully predicts sumoylation sites with a Matthews correlation coefficient, sensitivity, specificity, and accuracy equal to 0.66, 73%, 98%, and 97%, respectively. Additionally, we have shown that our method compares favorably to the existing prediction methods and a basic regular expression scanner. Conclusions: By using support vector machines, a new, robust method for sumoylation site prediction was introduced. In addition, the possible effects of predicted conformational flexibility and disorder on sumoylation site recognition were explored computationally, for the first time to our knowledge, as an additional parameter that could aid in sumoylation site prediction.
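Illustrative note: the four evaluation measures quoted in the abstract are standard functions of a binary confusion matrix. The sketch below shows how they are computed; the function name and the counts are hypothetical and are not the paper's data. It assumes both classes are present, so the sensitivity and specificity denominators are non-zero.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy and Matthews correlation coefficient."""
    sensitivity = tp / (tp + fn)  # fraction of true sites recovered
    specificity = tn / (tn + fp)  # fraction of non-sites correctly rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for an imbalanced problem (few positives, many negatives).
example = binary_metrics(tp=8, fp=2, tn=88, fn=2)
```

With imbalanced classes such as these, accuracy and specificity can be high while sensitivity is noticeably lower, which is why the Matthews correlation coefficient is often reported alongside them.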

Crossref

Springer - Publisher Connector

PubMed Central

Sabanci University Research Database

Prediction of peptides binding to MHC class I and II alleles by temporal motif mining

Author: Meydan Cem
Otu Hasan H.
Sezerman Uğur
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Background: MHC (Major Histocompatibility Complex) is a key player in the immune response of most vertebrates. The computational prediction of whether a given antigenic peptide will bind to a specific MHC allele is important in the development of vaccines for emerging pathogens, the creation of possibilities for controlling immune response, and the applications of immunotherapy. One of the problems that make this computational prediction difficult is the detection of the binding core region in peptides, coupled with the presence of bulges and loops causing variations in the total sequence length. Most machine learning methods require the sequences to be of the same length to successfully discover the binding motifs, ignoring the length variance in both the motif mining and prediction steps. To overcome this limitation, we propose the use of time-based motif mining methods that work position-independently. Results: The prediction method was tested on a benchmark set of 28 different alleles for MHC class I and 27 different alleles for MHC class II. The obtained results are comparable to the state-of-the-art methods for both MHC classes, surpassing the published results for some alleles. The average prediction AUC values are 0.897 for class I and 0.858 for class II. Conclusions: Temporal motif mining using partial periodic patterns can capture information about the sequences well enough to predict the binding of the peptides and is comparable to state-of-the-art methods in the literature. Unlike neural networks or matrix-based predictors, our proposed method does not depend on peptide length and can work with both short and long fragments. This advantage allows better use of the available training data and the prediction of peptides of uncommon lengths.
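Illustrative note: the AUC values quoted in the abstract measure how often a predictor scores a binding peptide above a non-binding one. A minimal sketch of that pairwise definition follows; the function name and scores are hypothetical, and this is not the paper's evaluation code.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (binder, non-binder) pairs ranked correctly.

    Ties count as half a correct pair. Runs in O(len(pos) * len(neg)),
    which is fine for a small illustration.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predictor scores for two binders and two non-binders.
auc = roc_auc([0.9, 0.4], [0.5, 0.1])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of binders from non-binders.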

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

Sabanci University Research Database

Molecular characterization of cDNA encoding resistance gene-like sequences in Buchloe dactyloides

Author: Budak Hikmet
Dweikat İsmail
Dweikat Ismail
Kasap Zeynep
Mahmood Abid
Sezerman Uğur
Shearman Robert C.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/11/2006

Current knowledge of resistance (R) genes and their use for genetic improvement in buffalograss (Buchloe dactyloides [Nutt.] Engelm.) lags behind that for most crop plants. This study was conducted to clone and characterize cDNA encoding R gene-like (RGL) sequences in buffalograss. This report is the first to clone and characterize buffalograss RGLs. Degenerate primers designed from the conserved motifs of known R genes were used to amplify RGLs, and fragments of the expected size were isolated and cloned. Sequence analysis of cDNA clones and analysis of putative translation products revealed that most encoded amino acid sequences shared the conserved motifs found in the cloned plant disease resistance genes RPS2, MLA6, L6, RPM1, and Xa1. These results indicated diversity of the R gene candidate sequences in buffalograss. Analysis of 5' rapid amplification of cDNA ends (RACE), applied to investigate the regions upstream of the RGLs, indicated that regulatory sequences such as the TATA box were conserved among the RGLs identified. The RGLs cloned in this study will further enhance our knowledge of the organization, function, and evolution of the R gene family in buffalograss. With the sequences of the primers and sizes of the markers provided, these RGL markers are readily available for use in genomics-assisted selection in buffalograss.

Sabanci University Research Database

DockPro: A VR-Based Tool for Protein-Protein Docking Problem

Author: Balcısoy Selim
Çakıcı Serdar
Sezerman Uğur
Sümengen Selçuk
Publication venue: 'American College of Medical Physics (ACMP)'
Publication date: 01/09/2008

Proteins are large molecules that are vital for all living organisms, and they are essential components of many industrial products. The process of binding a protein to another is called protein-protein docking. Many automated algorithms have been proposed to find docking configurations that might yield promising protein-protein complexes. However, these automated methods are likely to come up with false positives and have high computational costs. Consequently, Virtual Reality has been used to take advantage of the user's experience with the problem, and the proposed applications can be further improved. Haptic devices have been used for molecular docking problems, but they are inappropriate for protein-protein docking due to their workspace limitations. Instead of haptic rendering of forces, we provide a novel visual feedback for simulating the physicochemical forces of proteins. We propose an interactive 3D application, DockPro, which enables domain experts to come up with dockings of protein-protein couples by using magnetic trackers and gloves in front of a large display.

Sabanci University Research Database

Optimization of morphological data in numerical taxonomy analysis using genetic algorithms feature selection method

Author: Babaç M. Tekin
Bakış Yasin
Meydan Cem
Sezerman Uğur
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 08/07/2009

Studies in Numerical Taxonomy are carried out by measuring as many characters as possible. The workload on scientists and the labor needed to perform measurements increase proportionally with the number of variables (or characters) used in the study. However, some part of the data may be irrelevant or sometimes meaningless. In this study, we introduce an algorithm to obtain a subset of the data with the minimum number of characters that can represent the original data. Morphological characters were used in the optimization of data by the Genetic Algorithms Feature Selection method. The analyses were performed on an 18-character by 11-taxon data matrix with standardized continuous characters. The analyses resulted in a minimum set of 2 characters, which means the original tree based on the complete data can also be constructed from those two characters.

Sabanci University Research Database

The identification of pathway markers in intracranial aneurysm using genome-wide association data from two different populations

Author: Bakır-Güngör Burcu
Sezerman Uğur
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 06/03/2013

The identification of significant individual factors causing complex diseases is challenging in genome-wide association studies (GWAS) since each factor has only a modest effect on the disease development mechanism. In this study, we hypothesize that the biological pathways that are targeted by these individual factors show higher conservation within and across populations. To test this hypothesis, we searched for the disease related pathways on two intracranial aneurysm GWAS in European and Japanese case-control cohorts. Even though there were a few significantly conserved SNPs within and between populations, seven of the top ten affected pathways were found significant in both populations. The probability of random occurrence of such an event is 2.44E-36. We therefore claim that even though each individual has a unique combination of factors involved in the mechanism of disease development, most targeted pathways that need to be altered by these factors are, for the most part, the same. These pathways can serve as disease markers. Individuals, for example, can be scanned for factors affecting the genes in marker pathways. Hence, individual factors of disease development can be determined; and this knowledge can be exploited for drug development and personalized therapeutic applications. Here, we discuss the potential avenues of pathway markers in medicine and their translation to preventive and individualized health care

Public Library of Science (PLOS)

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

Sabanci University Research Database

FigShare