Neural Network and Bioinformatic Methods for Predicting HIV-1 Protease Inhibitor Resistance
This article presents a new method for predicting viral resistance to seven protease inhibitors from the HIV-1 genotype, and for identifying the positions in the protease gene at which the specific nature of the mutation affects resistance. The neural network Analog ARTMAP predicts protease inhibitor resistance from viral genotypes. A feature selection method detects genetic positions that contribute to resistance both alone and through interactions with other positions. This method has identified positions 35, 37, 62, and 77, where traditional feature selection methods have not detected a contribution to resistance.
At several positions in the protease gene, mutations confer differing degrees of resistance, depending on the specific amino acid to which the sequence has mutated. To find these positions, an Amino Acid Space is introduced to represent genes in a vector space that captures the functional similarity between amino acid pairs. Feature selection identifies several new positions, including 36, 37, and 43, with amino acid-specific contributions to resistance. Analog ARTMAP networks applied to inputs that represent specific amino acids at these positions perform better than networks that use only mutation locations.
Air Force Office of Scientific Research (F49620-01-1-0423); National Geospatial-Intelligence Agency (NMA 201-01-1-2016); National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624)
Similarity networks for classification: a case study in the Horse Colic problem
This paper develops a two-layer neural network in which the neuron model computes a user-defined similarity function between inputs and weights. The neuron transfer function is formed by composing an adapted logistic function with the mean of the partial input-weight similarities. The resulting neuron model can deal directly with variables of potentially different nature (continuous, fuzzy, ordinal, categorical). There is also provision for missing values. The network is trained using a two-stage procedure very similar to that used to train a radial basis function (RBF) neural network. The network is compared to two types of RBF networks on a non-trivial dataset: the Horse Colic problem, taken as a case study and analyzed in detail.
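The neuron model described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the per-variable similarity functions, the `None` encoding for missing values, and the plain logistic with a `steepness` parameter (the abstract does not define its "adapted" logistic). It is not the authors' implementation.

```python
import math

MISSING = None  # assumption: missing values are encoded as None


def continuous_similarity(x, w, value_range):
    """Similarity in [0, 1] for continuous values, scaled by the variable's range."""
    return 1.0 - abs(x - w) / value_range


def categorical_similarity(x, w):
    """Overlap similarity: 1 if the categories match, 0 otherwise."""
    return 1.0 if x == w else 0.0


def logistic(z, steepness=4.0):
    """Plain logistic centred at 0.5 (a stand-in for the paper's adapted logistic)."""
    return 1.0 / (1.0 + math.exp(-steepness * (z - 0.5)))


def neuron_output(inputs, weights, kinds, ranges):
    """Logistic of the mean partial similarity, skipping missing inputs."""
    sims = []
    for x, w, kind, value_range in zip(inputs, weights, kinds, ranges):
        if x is MISSING:
            continue  # a missing value contributes no partial similarity
        if kind == "continuous":
            sims.append(continuous_similarity(x, w, value_range))
        else:
            sims.append(categorical_similarity(x, w))
    if not sims:
        return logistic(0.5)  # no information: neutral activation
    return logistic(sum(sims) / len(sims))


kinds = ["continuous", "categorical"]
ranges = [10.0, None]
weights = [38.5, "mild"]
print(neuron_output([38.5, "mild"], weights, kinds, ranges))
print(neuron_output([MISSING, "severe"], weights, kinds, ranges))
```

Averaging only over the observed variables is one simple way to handle missing values; the paper may treat them differently.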
Fuzzy clustering of ordinal time series based on two novel distances with economic applications
Time series clustering is a central machine learning task with applications
in many fields. While the majority of the methods focus on real-valued time
series, very few works consider series with discrete response. In this paper,
the problem of clustering ordinal time series is addressed. To this aim, two
novel distances between ordinal time series are introduced and used to
construct fuzzy clustering procedures. Both metrics are functions of the
estimated cumulative probabilities, thus automatically taking advantage of the
ordering inherent to the series' range. The resulting clustering algorithms are
computationally efficient and able to group series generated from similar
stochastic processes, reaching accurate results even though the series come
from a wide variety of models. Since the dynamics of the series may vary over
time, we adopt a fuzzy approach, thus enabling the procedures to assign
each series to several clusters with different membership degrees. An
extensive simulation study shows that the proposed methods outperform several
alternative procedures. Weighted versions of the clustering algorithms are also
presented and their advantages with respect to the original methods are
discussed. Two specific applications involving economic time series illustrate
the usefulness of the proposed approaches.
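A rough sketch of the two ingredients named in this abstract, a distance built from estimated cumulative probabilities and fuzzy membership degrees, might look as follows. It is an illustration under assumptions, not the paper's method: it uses only marginal cumulative probabilities, a Euclidean distance, and the standard fuzzy C-means membership formula with fuzziness parameter `m`; the two distances proposed by the authors are not specified in the abstract.

```python
def cumulative_probabilities(series, n_categories):
    """Estimate P(X_t <= j) for j = 0..n_categories-2 from a series coded 0..n_categories-1."""
    counts = [0] * n_categories
    for value in series:
        counts[value] += 1
    cumulative, running = [], 0
    for j in range(n_categories - 1):  # the last cumulative probability is always 1
        running += counts[j]
        cumulative.append(running / len(series))
    return cumulative


def cumulative_distance(a, b, n_categories):
    """Euclidean distance between the estimated cumulative probabilities of two series."""
    pa = cumulative_probabilities(a, n_categories)
    pb = cumulative_probabilities(b, n_categories)
    return sum((x - y) ** 2 for x, y in zip(pa, pb)) ** 0.5


def fuzzy_memberships(distances, m=2.0):
    """Membership degrees of one series given its distances to each cluster prototype."""
    zeros = [d == 0 for d in distances]
    if any(zeros):  # the series coincides with at least one prototype
        return [1.0 / sum(zeros) if z else 0.0 for z in zeros]
    inverse = [d ** (-2.0 / (m - 1.0)) for d in distances]
    total = sum(inverse)
    return [v / total for v in inverse]


low = [0, 0, 1, 0, 1, 0]      # hypothetical series concentrated on low categories
high = [2, 2, 1, 2, 2, 1]     # hypothetical series concentrated on high categories
mixed = [0, 1, 2, 1, 0, 2]
d = [cumulative_distance(mixed, low, 3), cumulative_distance(mixed, high, 3)]
print(fuzzy_memberships(d))
```

Because the cumulative probabilities accumulate over ordered categories, two series that differ by one adjacent category end up closer than two that differ by distant categories, which is how the ordering enters the distance.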
3rd Workshop in Symbolic Data Analysis: book of abstracts
This workshop is the third regular meeting of researchers interested in Symbolic Data Analysis. The main aim of the
event is to favor the meeting of people and the exchange of ideas from different fields - Mathematics, Statistics, Computer Science, Engineering, Economics, among others - that contribute to Symbolic Data Analysis.
A Lotting Method for Electronic Reverse Auctions
An increasing number of commercial companies are using online reverse auctions for their sourcing activities. In reverse auctions, multiple suppliers bid for a contract from a buyer for selling goods and/or services. Usually, the buyer has to procure multiple items, which are typically divided into lots for auctioning purposes. By steering the composition of the lots, a buyer can increase the attractiveness of its lots for the suppliers, which can then make more competitive offers, leading to larger savings for the procuring party. In this paper, a clustering-based heuristic lotting method is proposed for reverse auctions. Agglomerative clustering is used for determining the items that will be put in the same lot. A suitable metric is defined, which allows the procurer to incorporate various approaches to lotting. The proposed lotting method has been tested for the procurement activities of a consumer packaged goods company. The results indicate that the proposed strategy leads to 2-3% savings, while the procurement experts confirm that the lots determined by the proposed method are acceptable given the procurement goals.
Keywords: e-commerce; reverse auctions; hierarchical clustering; lotting; e-procurement
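The idea of grouping items into lots by agglomerative clustering can be sketched as below. The item attributes (`category`, `region`), the mismatch-based metric, and the choice of average linkage are all hypothetical; the abstract does not describe the metric the authors define.

```python
def item_distance(a, b):
    """Hypothetical metric: fraction of attributes on which two items differ."""
    keys = ("category", "region")
    return sum(a[k] != b[k] for k in keys) / len(keys)


def average_linkage(c1, c2, items):
    """Mean pairwise distance between the items of two clusters."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(item_distance(items[i], items[j]) for i, j in pairs) / len(pairs)


def make_lots(items, n_lots):
    """Merge the two closest clusters repeatedly until n_lots clusters remain."""
    if n_lots < 1:
        raise ValueError("n_lots must be at least 1")
    clusters = [[i] for i in range(len(items))]  # start with one item per cluster
    while len(clusters) > n_lots:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], items)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


items = [
    {"category": "paper", "region": "EU"},
    {"category": "paper", "region": "EU"},
    {"category": "ink", "region": "US"},
    {"category": "ink", "region": "US"},
]
print(make_lots(items, 2))  # lots are lists of item indices
```

Swapping in a different `item_distance` (for example one that weights supplier overlap or volume) is where a procurer's lotting preferences would enter in this sketch.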
A kernel-based framework for learning graded relations from data
Driven by a large number of potential applications in areas like
bioinformatics, information retrieval and social network analysis, the problem
setting of inferring relations between pairs of data objects has recently been
investigated quite intensively in the machine learning community. To this end,
current approaches typically consider datasets containing crisp relations, so
that standard classification methods can be adopted. However, relations between
objects like similarities and preferences are often expressed in a graded
manner in real-world applications. A general kernel-based framework for
learning relations from data is introduced here. It extends existing approaches
because both crisp and graded relations are considered, and it unifies existing
approaches because different types of graded relations can be modeled,
including symmetric and reciprocal relations. This framework establishes
important links between recent developments in fuzzy set theory and machine
learning. Its usefulness is demonstrated through various experiments on
synthetic and real-world data.
Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
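To make the notion of kernels over pairs of objects more concrete, here is a small sketch of one common construction: a product of object-level kernels that is symmetrised for symmetric relations and antisymmetrised for reciprocal ones. This construction and the linear `base_kernel` are assumptions chosen for illustration; the abstract does not state which kernels the framework uses.

```python
def base_kernel(x, y):
    """Linear kernel on feature vectors (an assumed choice of object kernel)."""
    return sum(xi * yi for xi, yi in zip(x, y))


def product_kernel(pair1, pair2):
    """Kernel between ordered pairs (a, b) and (c, d): k(a, c) * k(b, d)."""
    (a, b), (c, d) = pair1, pair2
    return base_kernel(a, c) * base_kernel(b, d)


def symmetric_kernel(pair1, pair2):
    """Invariant to swapping the two objects within a pair: suits symmetric relations."""
    (a, b), (c, d) = pair1, pair2
    return base_kernel(a, c) * base_kernel(b, d) + base_kernel(a, d) * base_kernel(b, c)


def reciprocal_kernel(pair1, pair2):
    """Changes sign when a pair is swapped: suits reciprocal (preference-like) relations."""
    (a, b), (c, d) = pair1, pair2
    return base_kernel(a, c) * base_kernel(b, d) - base_kernel(a, d) * base_kernel(b, c)


a, b = (1, 0), (0, 2)
c, d = (3, 1), (1, 1)
print(symmetric_kernel((a, b), (c, d)), symmetric_kernel((b, a), (c, d)))
print(reciprocal_kernel((a, b), (c, d)), reciprocal_kernel((b, a), (c, d)))
```

A model that is linear in the antisymmetric kernel satisfies f(a, b) = -f(b, a), so mapping its output through a function such as 0.5 + f/2 (clipped to [0, 1]) is one way to obtain a graded relation with Q(a, b) + Q(b, a) = 1.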