8 research outputs found

pSLIP: SVM based protein subcellular localization prediction using multiple physicochemical properties

Author: Chua Gek Huey
Krishnan Arun
Li Kuo-Bin
Sarda Deepak
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Protein subcellular localization is an important determinant of protein function and hence, reliable methods for prediction of localization are needed. A number of prediction algorithms have been developed based on amino acid compositions or on the N-terminal characteristics (signal peptides) of proteins. However, such approaches lead to a loss of contextual information. Moreover, where information about the physicochemical properties of amino acids has been used, the methods employed to exploit that information are less than optimal and could use the information more effectively. RESULTS: In this paper, we propose a new algorithm called pSLIP which uses Support Vector Machines (SVMs) in conjunction with multiple physicochemical properties of amino acids to predict protein subcellular localization in eukaryotes across six different locations, namely, chloroplast, cytoplasmic, extracellular, mitochondrial, nuclear and plasma membrane. The algorithm was applied to the dataset provided by Park and Kanehisa and we obtained prediction accuracies for the different classes ranging from 87.7% – 97.0% with an overall accuracy of 93.1%. CONCLUSION: This study presents a physicochemical property based protein localization prediction algorithm. Unlike other algorithms, contextual information is preserved by dividing the protein sequences into clusters. The prediction accuracy shows an improvement over other algorithms based on various types of amino acid composition (single, pair and gapped pair). We have also implemented a web server to predict protein localization across the six classes (available at )
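The idea of keeping contextual information by splitting a sequence into parts can be illustrated with a minimal sketch. The abstract does not say how the clusters are formed or which properties are used, so the following assumes equal-length contiguous segments and a single property, the Kyte-Doolittle hydropathy index; the function name `segment_features` is hypothetical. The resulting vectors are the kind of input an SVM could be trained on (training is not shown).

```python
# Kyte-Doolittle hydropathy index, used here as one example property
# (assumption: the abstract does not name the properties pSLIP uses).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def segment_features(sequence, n_segments=4, scale=HYDROPATHY):
    """Average a per-residue property over contiguous segments.

    Hypothetical helper: equal-length segments stand in for the
    clusters mentioned in the abstract.
    """
    # Drop characters that have no entry in the property scale.
    seq = [aa for aa in sequence.upper() if aa in scale]
    if len(seq) < n_segments:
        raise ValueError("sequence shorter than the number of segments")
    features = []
    for i in range(n_segments):
        start = i * len(seq) // n_segments
        end = (i + 1) * len(seq) // n_segments
        chunk = seq[start:end]
        features.append(sum(scale[aa] for aa in chunk) / len(chunk))
    return features


# Two sequences with identical amino acid composition but different order
# yield different feature vectors, which plain composition cannot capture.
front_loaded = segment_features("IIIIAAAA", n_segments=2)
back_loaded = segment_features("AAAAIIII", n_segments=2)
```

A whole-sequence composition vector would be identical for the two example sequences; the per-segment averages differ, which is the sense in which order information is retained.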

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Decision-tree instance-space decomposition with grouped gain-ratio

Author: Shahar Cohen
Lior Rokach
Oded Maimon
Publication venue: Elsevier BV

Crossref

Model-based classification for subcellular localization prediction of proteins

Author: Bulashevska Alla
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2005

KITopen

How to derive causal insights for digital commerce in China? A research commentary on computational social science methods

Author: KAUFFMAN Robert John
NALDI Maurizio
PHANG David C.W.
WANG Kanliang
WANG Qiu-hong
Publication venue: Elsevier BV
Publication date: 01/05/2019

Institutional Knowledge at Singapore Management University

A branching fuzzy-logic classifier for building optimization

Author: Lehar Matthew A., 1977-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2005

Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2005. Includes bibliographical references (p. 109-110). We present an input-output model that learns to emulate a complex building simulation of high dimensionality. Many multi-dimensional systems are dominated by the behavior of a small number of inputs over a limited range of input variation. Some also exhibit a tendency to respond relatively strongly to certain inputs over small ranges, and to other inputs over very large ranges of input variation. A branching linear discriminant can be used to isolate regions of local linearity in the input space, while also capturing the effects of scale. The quality of the classification may be improved by using a fuzzy preference relation to classify input configurations that are not well handled by the linear discriminant. By Matthew A. Lehar. Ph.D.

DSpace@MIT

Probabilistic models for mining imbalanced relational data

Author: Ghanem Amal Saleh
Publication venue: Curtin University
Publication date: 01/01/2009

Most data mining and pattern recognition techniques are designed for learning from flat data files with the assumption of equal populations per class. However, most real-world data are stored as rich relational databases that generally have imbalanced class distributions. For such domains, a rich relational technique is required to accurately model the different objects and relationships in the domain, which cannot be easily represented as a set of simple attributes, and at the same time handle the imbalanced class problem. Motivated by the significance of mining imbalanced relational databases, which represent the majority of real-world data, this thesis investigates learning techniques for mining imbalanced relational domains. It explores the use of probabilistic models in mining relational databases, in particular Probabilistic Relational Models (PRMs), which were proposed as an extension of attribute-based Bayesian Networks. The effectiveness of PRMs in mining real-world databases was explored by learning PRMs from a real-world university relational database. A visual data mining tool is also proposed to aid the interpretation of the learned PRMs. Despite the effectiveness of PRMs in relational learning, their performance as predictive models is significantly hindered by the imbalanced class problem, because PRMs share the assumption, common to other learning techniques, of relatively balanced class distributions in the training data. Therefore, this thesis proposes a number of models that build on the effectiveness of PRMs in relational learning and extend it to mining imbalanced relational domains. The first model examines the problem of mining imbalanced relational domains for a single two-class attribute. It enriches PRM learning with ensemble learning. The premise is that an ensemble of models attains better performance than a single model, since an instance misclassified by one model can often be correctly classified by the others. Based on this approach, another model is introduced to address the problem of mining multiple imbalanced attributes, in which it is important to predict several attributes rather than a single one. In this model, the ensemble bagging sampling approach is exploited to attain a single model for mining several attributes. Finally, the thesis outlines the problem of imbalanced multi-class classification and introduces a generalized framework to handle this problem for both relational and non-relational domains.
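The ensemble premise for imbalanced two-class data can be sketched as follows. This is not the thesis's PRM-based method: as a labeled assumption, a trivial one-feature nearest-centroid learner stands in for the PRM, and each ensemble member is trained on a class-balanced bootstrap sample. All function names are hypothetical.

```python
import random
from collections import Counter


def balanced_bootstrap(data, rng):
    """Draw a bootstrap sample with equally many examples per class."""
    by_class = {}
    for x, y in data:
        by_class.setdefault(y, []).append((x, y))
    # Sample every class down (with replacement) to the rarest class size.
    n = min(len(rows) for rows in by_class.values())
    sample = []
    for rows in by_class.values():
        sample.extend(rng.choice(rows) for _ in range(n))
    return sample


def train_centroid(sample):
    """Stand-in base learner (assumption): per-class mean of one feature."""
    sums, counts = Counter(), Counter()
    for x, y in sample:
        sums[y] += x
        counts[y] += 1
    return {y: sums[y] / counts[y] for y in counts}


def predict_centroid(model, x):
    """Predict the class whose centroid is closest to x."""
    return min(model, key=lambda y: abs(x - model[y]))


def bagged_predict(models, x):
    """Majority vote over the ensemble members."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Toy imbalanced data: 40 negatives, 3 positives (illustrative values only).
rng = random.Random(0)
data = [(v / 10, "neg") for v in range(40)]
data += [(5.0, "pos"), (5.5, "pos"), (6.0, "pos")]
models = [train_centroid(balanced_bootstrap(data, rng)) for _ in range(11)]
```

Balancing each bootstrap sample is one common way to keep the minority class from being swamped; the thesis may use a different sampling scheme.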

espace@Curtin

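Contribution à l'intégration des machines à vecteurs de support au sein des systèmes de reconnaissance de formes : application à la lecture automatique de l'écriture manuscrite [Contribution to the integration of support vector machines into pattern recognition systems: application to automatic handwriting reading]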

Author: Milgram Jonathan
Publication venue: École de technologie supérieure

In recent years, support vector machines (SVMs) have repeatedly demonstrated their superiority in terms of generalization. The objective of this doctoral thesis was therefore to isolate the main problems involved in integrating SVMs into pattern recognition systems, in particular automatic handwriting reading systems, and to offer some answers to them. We addressed the resolution of multi-class problems, the estimation of posterior class-membership probabilities, the acceleration of decision making, and finally the combination with a model-based classification approach, so as to handle both ambiguous data and outliers effectively.

Espace ÉTS

Combining Pairwise Classifiers with Stacking

Author: A. Klautau
C.-W. Hsu
D.H. Wolpert
E.L. Allwein
J. Fürnkranz
J.R. Quinlan
K.M. Ting
M. Moreira
S. Knerr
T. Hastie
T.G. Dietterich
W.W. Cohen
Publication venue: Springer
Publication date: 01/01/2003

Pairwise classification is a technique that deals with multi-class problems by converting them into a series of binary problems, one for each pair of classes. The predictions of the binary classifiers are typically combined into an overall prediction by voting and predicting the class that received the largest number of votes. In this paper we try to generalize the voting procedure by replacing it with a trainable classifier, i.e., we propose the use of a meta-level classifier that is trained to arbitrate among the conflicting predictions of the binary classifiers.
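The pairwise decomposition and the voting baseline described above can be sketched in a few lines. This is only an illustration under stated assumptions: the binary learner is a one-feature nearest-centroid rule chosen for brevity, the function names are hypothetical, and the trained meta-level classifier proposed in the paper is not implemented. The dictionary returned by `pairwise_predictions` is the kind of input such a meta-level classifier would receive in place of the vote.

```python
from collections import Counter
from itertools import combinations


def train_centroid(sample):
    """Stand-in binary learner (assumption): per-class mean of one feature."""
    sums, counts = Counter(), Counter()
    for x, y in sample:
        sums[y] += x
        counts[y] += 1
    return {y: sums[y] / counts[y] for y in counts}


def train_pairwise(data):
    """Train one binary model for each unordered pair of classes."""
    classes = sorted({y for _, y in data})
    models = {}
    for a, b in combinations(classes, 2):
        # Each binary problem only sees examples of its two classes.
        subset = [(x, y) for x, y in data if y in (a, b)]
        models[(a, b)] = train_centroid(subset)
    return models


def pairwise_predictions(models, x):
    """Return the winning class of every pairwise model for input x."""
    return {
        pair: min(cent, key=lambda y: abs(x - cent[y]))
        for pair, cent in models.items()
    }


def vote(models, x):
    """Combine the pairwise predictions by simple majority voting."""
    votes = Counter(pairwise_predictions(models, x).values())
    return votes.most_common(1)[0][0]


# Toy three-class data with a single numeric feature (illustrative only).
data = [(0, "a"), (1, "a"), (5, "b"), (6, "b"), (10, "c"), (11, "c")]
models = train_pairwise(data)
prediction = vote(models, 4.6)
```

With k classes this trains k(k-1)/2 binary models, three in the toy example. Ties in the vote are broken arbitrarily here, which is one of the situations a trained combiner could handle more deliberately.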

CiteSeerX

TUbiblio

Crossref