230 research outputs found
Boosting parallel perceptrons for label noise reduction in classification problems
The final publication is available at Springer via http://dx.doi.org/10.1007/11499305_60Proceedings of First International Work-Conference on the Interplay Between Natural and Artificial Computation, IWINAC 2005, Las Palmas, Canary Islands, Spain, June 15-18, 2005Boosting combines an ensemble of weak learners to construct a new weighted classifier that is often more accurate than any of its components. The construction of such learners, whose training sets depend on the performance of the previous members of the ensemble, is carried out by successively focusing on those patterns harder to classify. This fact deteriorates boosting’s results when dealing with malicious noise as, for instance, mislabeled training examples. In order to detect and avoid those noisy examples during the learning process, we propose the use of Parallel Perceptrons. Among other things, these novel machines allow to naturally define margins for hidden unit activations. We shall use these margins to detect which patterns may have an incorrect label and also which are safe, in the sense of being well represented in the training sample by many other similar patterns. As candidates for being noisy examples we shall reduce the weights of the former ones, and as a support for the overall detection procedure we shall augment the weights of the latter ones.With partial support of Spain’s CICyT, TIC 01–572, TIN 2004–0767
Balanced boosting with parallel perceptrons
The final publication is available at Springer via http://dx.doi.org/10.1007/11494669_26Proceedings of 8th International Work-Conference on Artificial Neural Networks, IWANN 2005, Vilanova i la Geltrú, Barcelona, Spain, June 8-10, 2005.Boosting constructs a weighted classifier out of possibly weak learners by successively concentrating on those patterns harder to classify. While giving excellent results in many problems, its performance can deteriorate in the presence of patterns with incorrect labels. In this work we shall use parallel perceptrons (PP), a novel approach to the classical committee machines, to detect whether a pattern’s label may not be correct and also whether it is redundant in the sense of being well represented in the training sample by many other similar patterns. Among other things, PP allow to naturally define margins for hidden unit activations, that we shall use to define the above pattern types. This pattern type classification allows a more nuanced approach to boosting. In particular, the procedure we shall propose, balanced boosting, uses it to modify boosting distribution updates. As we shall illustrate numerically, balanced boosting gives very good results on relatively hard classification problems, particularly in some that present a marked imbalance between class sizes.With partial support of Spain’s CICyT, TIC 01–572
Discriminative Methods for Multi-Labeled Classification
In this paper we present methods of enhancing existing discriminative classifiers for multi-labeled predictions. Discriminative methods like support vector machines perform very well for uni-labeled text classification tasks. Multi-labeled classification is a harder task subject to relatively less attention. In the multi-labeled setting, classes are often related to each other or part of a is-a hierarchy. We present a new technique for combining text features and features indicating relationships between classes, which can be used with any discriminative algorithm
A demand-driven approach for a multi-agent system in Supply Chain Management
This paper presents the architecture of a multi-agent decision support system for Supply Chain Management (SCM) which has been designed to compete in the TAC SCM game. The behaviour of the system is demand-driven and the agents plan, predict, and react dynamically to changes in the market. The main strength of the system lies in the ability of the Demand agent to predict customer winning bid prices - the highest prices the agent can offer customers and still obtain their orders. This paper investigates the effect of the ability to predict customer order prices on the overall performance of the system. Four strategies are proposed and compared for predicting such prices. The experimental results reveal which strategies are better and show that there is a correlation between the accuracy of the models' predictions and the overall system performance: the more accurate the prediction of customer order prices, the higher the profit. © 2010 Springer-Verlag Berlin Heidelberg
Motif Discovery through Predictive Modeling of Gene Regulation
We present MEDUSA, an integrative method for learning motif models of
transcription factor binding sites by incorporating promoter sequence and gene
expression data. We use a modern large-margin machine learning approach, based
on boosting, to enable feature selection from the high-dimensional search space
of candidate binding sequences while avoiding overfitting. At each iteration of
the algorithm, MEDUSA builds a motif model whose presence in the promoter
region of a gene, coupled with activity of a regulator in an experiment, is
predictive of differential expression. In this way, we learn motifs that are
functional and predictive of regulatory response rather than motifs that are
simply overrepresented in promoter sequences. Moreover, MEDUSA produces a model
of the transcriptional control logic that can predict the expression of any
gene in the organism, given the sequence of the promoter region of the target
gene and the expression state of a set of known or putative transcription
factors and signaling molecules. Each motif model is either a -length
sequence, a dimer, or a PSSM that is built by agglomerative probabilistic
clustering of sequences with similar boosting loss. By applying MEDUSA to a set
of environmental stress response expression data in yeast, we learn motifs
whose ability to predict differential expression of target genes outperforms
motifs from the TRANSFAC dataset and from a previously published candidate set
of PSSMs. We also show that MEDUSA retrieves many experimentally confirmed
binding sites associated with environmental stress response from the
literature.Comment: RECOMB 200
A multivariate approach to heavy flavour tagging with cascade training
This paper compares the performance of artificial neural networks and boosted
decision trees, with and without cascade training, for tagging b-jets in a
collider experiment. It is shown, using a Monte Carlo simulation of events, that for a b-tagging efficiency of 50%, the light jet
rejection power given by boosted decision trees without cascade training is
about 55% higher than that given by artificial neural networks. The cascade
training technique can improve the performance of boosted decision trees and
artificial neural networks at this b-tagging efficiency level by about 35% and
80% respectively. We conclude that the cascade trained boosted decision trees
method is the most promising technique for tagging heavy flavours at collider
experiments.Comment: 14 pages, 12 figures, revised versio
- …