Theoretical Interpretations and Applications of Radial Basis Function Networks
Medical applications have usually treated Radial Basis Function Networks (RBFNs) simply as Artificial Neural Networks. However, RBFNs are Knowledge-Based Networks that can be interpreted in several ways: as Artificial Neural Networks, Regularization Networks, Support Vector Machines, Wavelet Networks, Fuzzy Controllers, Kernel Estimators, or Instance-Based Learners. A survey of these interpretations and of their corresponding learning algorithms is provided, as well as a brief survey of dynamic learning algorithms. The interpretations of RBFNs can suggest applications that are particularly interesting in medical domains.
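As a minimal sketch of the Regularization Network and Kernel Estimator readings mentioned above, a Gaussian RBFN can be fitted by placing one basis function on each training point and solving a ridge-regularized least-squares problem for the output weights. The function names, the width, the ridge value, and the toy sine data are assumptions made for illustration; none of them come from the surveyed paper.

```python
import numpy as np

def rbf_design(x, centers, width):
    """Gaussian basis activations: one row per input, one column per center."""
    d2 = (x[:, None] - centers[None, :]) ** 2
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_rbfn(x, y, centers, width, ridge=1e-8):
    """Solve for the output weights by ridge-regularized least squares."""
    phi = rbf_design(x, centers, width)
    a = phi.T @ phi + ridge * np.eye(len(centers))
    return np.linalg.solve(a, phi.T @ y)

def predict_rbfn(x, centers, width, weights):
    """Network output: weighted sum of the basis activations."""
    return rbf_design(x, centers, width) @ weights

# Toy 1-D regression problem (assumed data, for illustration only).
x_train = np.linspace(0.0, 1.0, 8)
y_train = np.sin(2 * np.pi * x_train)
centers = x_train.copy()  # instance-based choice: one center per sample
weights = fit_rbfn(x_train, y_train, centers, width=0.15)
y_fit = predict_rbfn(x_train, centers, 0.15, weights)
```

With one center per sample the network nearly interpolates the training data; using fewer centers or a larger ridge value trades that fit for smoothness.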
Basic statistics for probabilistic symbolic variables: a novel metric-based approach
In data mining, it is usual to describe a set of individuals using some
summaries (means, standard deviations, histograms, confidence intervals) that
generalize individual descriptions into a typology description. In this case,
data can be described by several values. In this paper, we propose an approach
for computing basic statistics for such data, and, in particular, for data
described by numerical multi-valued variables (interval, histograms, discrete
multi-valued descriptions). We propose to treat all numerical multi-valued
variables as distributional data, i.e. as individuals described by
distributions. To obtain new basic statistics for measuring the variability and
the association between such variables, we extend the classic measure of
inertia, calculated with the Euclidean distance, using the squared Wasserstein
distance defined between probability measures. The distance is a generalization
of the Wasserstein distance, that is a distance between quantile functions of
two distributions. Some properties of such a distance are shown. Among them, we
prove the Huygens theorem of decomposition of the inertia. We show the use of
the Wasserstein distance and of the basic statistics by presenting a
k-means-like clustering algorithm, for the clustering of a set of data
described by modal numerical variables (distributional variables), on a real
data set. Keywords: Wasserstein distance, inertia, dependence, distributional
data, modal variables. Comment: 19 pages, 3 figures
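To illustrate the distance between quantile functions described in this abstract, the sketch below computes the squared 2-Wasserstein distance between one-dimensional empirical samples of equal size, where sorting a sample gives its quantile function on a regular grid. It then checks a Huygens-type decomposition of the inertia numerically. The equal-size restriction, the function names, and the simulated data are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def wasserstein2_sq(a, b):
    """Squared 2-Wasserstein distance between two equal-size 1-D samples."""
    qa, qb = np.sort(a), np.sort(b)  # empirical quantile functions
    return float(np.mean((qa - qb) ** 2))

def barycenter(samples):
    """Pointwise average of the quantile functions (sorted samples)."""
    return np.mean(np.sort(samples, axis=1), axis=0)

# Five simulated distributions with shifted means (assumed data).
rng = np.random.default_rng(0)
samples = rng.normal(size=(5, 200)) + np.arange(5)[:, None]
center = barycenter(samples)
reference = np.sort(rng.normal(loc=3.0, size=200))

# Inertia about an arbitrary reference distribution, and its two parts.
total = sum(wasserstein2_sq(s, reference) for s in samples)
within = sum(wasserstein2_sq(s, center) for s in samples)
shift = len(samples) * wasserstein2_sq(center, reference)
```

Here `total` equals `within + shift` up to floating-point error, which is the property a k-means-like algorithm relies on when it updates cluster prototypes as barycenters.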
Classifying sequences by the optimized dissimilarity space embedding approach: a case study on the solubility analysis of the E. coli proteome
We evaluate a version of the recently-proposed classification system named
Optimized Dissimilarity Space Embedding (ODSE) that operates in the input space
of sequences of generic objects. The ODSE system was originally presented
as a classification system for patterns represented as labeled graphs. However,
since ODSE is founded on the dissimilarity space representation of the input
data, the classifier can be easily adapted to any input domain where it is
possible to define a meaningful dissimilarity measure. Here we demonstrate the
effectiveness of the ODSE classifier for sequences by considering an
application dealing with the recognition of the solubility degree of the
Escherichia coli proteome. Solubility, or analogously aggregation propensity,
is an important property of protein molecules, which is intimately related to
the mechanisms underlying the chemico-physical process of folding. Each protein
of our dataset is initially associated with a solubility degree and it is
represented as a sequence of symbols, denoting the 20 amino acid residues. The
computational results obtained here, which we stress were achieved
with no context-dependent tuning of the ODSE system, confirm the validity and
generality of the ODSE-based approach for structured data classification. Comment: 10 pages, 49 references
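The dissimilarity space representation that the abstract refers to can be sketched as follows: each sequence is mapped to the vector of its dissimilarities to a fixed set of prototypes, after which any vector-based classifier applies. This is only the basic embedding step, not the ODSE system itself, which also optimizes the representation. The choice of Levenshtein distance, the function names, and the short toy sequences are assumptions for illustration.

```python
def levenshtein(s, t):
    """Edit distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def embed(sequences, prototypes, dissimilarity=levenshtein):
    """Map each sequence to its vector of dissimilarities to the prototypes."""
    return [[dissimilarity(s, p) for p in prototypes] for s in sequences]

# Hypothetical amino-acid strings, invented for this example.
prototypes = ["MKTAY", "GAVLI"]
sequences = ["MKTAY", "MKTAV", "GAVLL"]
vectors = embed(sequences, prototypes)
```

Each row of `vectors` has one coordinate per prototype, so sequences of different lengths end up in the same fixed-dimensional space.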
3rd Workshop in Symbolic Data Analysis: book of abstracts
This workshop is the third regular meeting of researchers interested in Symbolic Data Analysis. The main aim of the event is to foster meetings between people and the exchange of ideas from the different fields (Mathematics, Statistics, Computer Science, Engineering, and Economics, among others) that contribute to Symbolic Data Analysis.
Fuzzy C-ordered medoids clustering of interval-valued data
Fuzzy clustering for interval-valued data helps us to find natural vague boundaries in such data. The
Fuzzy c-Medoids Clustering (FcMdC) method is one of the most popular clustering methods based on a
partitioning around medoids approach. However, one of the greatest disadvantages of this method is its
sensitivity to the presence of outliers in data. This paper introduces a new robust fuzzy clustering
method named Fuzzy c-Ordered-Medoids clustering for interval-valued data (FcOMdC-ID). Huber's
M-estimators and Yager's Ordered Weighted Averaging (OWA) operators are used in the proposed
method to make it robust to outliers. The described algorithm is compared with the fuzzy c-medoids
method in experiments performed on synthetic data with different types of outliers. A real application of the FcOMdC-ID is also provided.
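The two robustness ingredients named in this abstract can be sketched in isolation. An OWA operator applies its weights to the ranks of the sorted values rather than to their positions, so giving zero weight to the largest distances discounts outliers, and the Huber loss grows only linearly for large residuals. How FcOMdC-ID combines the two is not stated in the abstract, so the function names, weights, and toy distances below are assumptions for illustration only.

```python
import numpy as np

def owa(values, weights):
    """Yager's OWA: weights apply to ranks (largest value first)."""
    ordered = np.sort(np.asarray(values, dtype=float))[::-1]
    weights = np.asarray(weights, dtype=float)
    if ordered.shape != weights.shape or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must match values in length and sum to 1")
    return float(ordered @ weights)

def huber_loss(residual, delta=1.0):
    """Quadratic for small residuals, linear beyond the threshold delta."""
    r = np.abs(np.asarray(residual, dtype=float))
    return np.where(r <= delta, 0.5 * r ** 2, delta * (r - 0.5 * delta))

# Toy distances to a medoid; the last one plays the role of an outlier.
distances = [0.2, 0.4, 0.3, 9.0]
trimmed = owa(distances, [0.0, 1 / 3, 1 / 3, 1 / 3])  # ignore the largest
plain_mean = owa(distances, [0.25] * 4)               # ordinary average
```

Comparing `trimmed` with `plain_mean` shows how a single outlying distance dominates the ordinary average but not the ordered, reweighted one.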