SymScal: symbolic multidimensional scaling of interval dissimilarities
Multidimensional scaling aims at reconstructing dissimilarities
between pairs of objects by distances in a low-dimensional space.
However, in some cases the dissimilarity itself is unknown, but the
range of the dissimilarity is given. Such fuzzy data fall in the
wider class of symbolic data (Bock and Diday, 2000).
Denoeux and Masson (2000) have proposed to model an interval
dissimilarity by a range of the distance defined as the minimum and
maximum distance between two rectangles representing the objects. In
this paper, we provide a new algorithm called SymScal that is based
on iterative majorization. The advantage is that each iteration is
guaranteed to improve the solution until no improvement is possible.
In a simulation study, we investigate the quality of this
algorithm. We discuss the use of SymScal on empirical dissimilarity
intervals of sounds.
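The rectangle model admits a short illustration: in each dimension, the closest and farthest points of two axis-aligned rectangles are determined by the gap between their centres and by their half-widths. The sketch below is an illustrative helper (the representation by centres and half-widths is an assumption for this example, not SymScal's actual code):

```python
import math

def interval_distance(c1, r1, c2, r2):
    """Minimum and maximum Euclidean distance between two axis-aligned
    rectangles, each given by a centre c and per-dimension half-widths r.
    Illustrative sketch of the rectangle model, not SymScal itself."""
    lo = hi = 0.0
    for a, ra, b, rb in zip(c1, r1, c2, r2):
        gap = abs(a - b)
        lo += max(0.0, gap - (ra + rb)) ** 2   # closest faces (0 if overlapping)
        hi += (gap + ra + rb) ** 2             # farthest corners
    return math.sqrt(lo), math.sqrt(hi)
```

For overlapping rectangles the minimum distance is zero, as the `max(0.0, ...)` term makes explicit.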
On central tendency and dispersion measures for intervals and hypercubes
The uncertainty or variability of the data may be handled by recording, rather
than a single value for each observation, the interval of values in which it
may fall. This paper studies the derivation of basic descriptive statistics for
interval-valued datasets. We propose a geometrical approach to the
determination of summary statistics (central tendency and dispersion measures)
for interval-valued variables.
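One simple convention illustrates what such statistics can look like; the definitions below (element-wise mean interval, and a dispersion that combines midpoint variance with mean squared half-range) are assumptions chosen for this sketch, not necessarily the measures derived in the paper:

```python
def interval_mean(intervals):
    """Element-wise mean interval of a list of (low, high) pairs.
    An illustrative convention, not the paper's definition."""
    n = len(intervals)
    return (sum(a for a, b in intervals) / n,
            sum(b for a, b in intervals) / n)

def interval_spread(intervals):
    """Dispersion sketch: variance of the midpoints plus the mean squared
    half-range, so both location and interval width contribute."""
    n = len(intervals)
    mids = [(a + b) / 2 for a, b in intervals]
    halfs = [(b - a) / 2 for a, b in intervals]
    m = sum(mids) / n
    return (sum((x - m) ** 2 for x in mids) / n
            + sum(h * h for h in halfs) / n)
```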
Linear regression for numeric symbolic variables: an ordinary least squares approach based on Wasserstein Distance
In this paper we present a linear regression model for modal symbolic data.
The observed variables are histogram variables according to the definition
given in the framework of Symbolic Data Analysis and the parameters of the
model are estimated using the classic Least Squares method. An appropriate
metric is introduced in order to measure the error between the observed and the
predicted distributions. In particular, the Wasserstein distance is proposed.
Some properties of this metric are exploited to predict the response variable
as a direct linear combination of the independent histogram variables. Measures
of goodness of fit are discussed. An application to real data corroborates the
proposed method.
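In one dimension the 2-Wasserstein distance between two distributions is the L2 distance between their quantile functions, which is what makes a least-squares treatment of histogram variables tractable. The sketch below evaluates it numerically for histograms with uniform density within bins; this is a generic numerical illustration, not the closed-form expression used in the paper:

```python
def quantile(bins, weights, t):
    """Quantile function of a histogram: bins are (low, high) pairs,
    weights are positive and sum to 1; density is uniform within a bin."""
    cum = 0.0
    for (lo, hi), w in zip(bins, weights):
        if t <= cum + w:
            return lo + (hi - lo) * (t - cum) / w  # linear within the bin
        cum += w
    return bins[-1][1]

def wasserstein2_sq(bins1, w1, bins2, w2, n=2000):
    """Squared 2-Wasserstein distance, approximated by averaging the
    squared quantile difference over a midpoint grid on (0, 1)."""
    ts = [(k + 0.5) / n for k in range(n)]
    return sum((quantile(bins1, w1, t) - quantile(bins2, w2, t)) ** 2
               for t in ts) / n
```

For two unit-mass bins shifted by a constant, the quantile difference is that constant everywhere, so the squared distance is its square.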
Representing complex data using localized principal components with application to astronomical data
Often the relation between the variables constituting a multivariate data
space might be characterized by one or more of the terms: ``nonlinear'',
``branched'', ``disconnected'', ``bent'', ``curved'', ``heterogeneous'', or,
more generally, ``complex''. In these cases, simple principal component analysis
(PCA) as a tool for dimension reduction can fail badly. Of the many alternative
approaches proposed so far, local approximations of PCA are among the most
promising. This paper will give a short review of localized versions of PCA,
focusing on local principal curves and local partitioning algorithms.
Furthermore, we discuss projections other than the local principal components.
When performing local dimension reduction for regression or classification
problems it is important to focus not only on the manifold structure of the
covariates, but also on the response variable(s). Local principal components
only achieve the former, whereas localized regression approaches concentrate on
the latter. Local projection directions derived from the partial least squares
(PLS) algorithm offer an interesting trade-off between these two objectives. We
apply these methods to several real data sets. In particular, we consider
simulated astrophysical data from the future Galactic survey mission Gaia.
Comment: 25 pages. In "Principal Manifolds for Data Visualization and
Dimension Reduction", A. Gorban, B. Kegl, D. Wunsch, and A. Zinovyev (eds),
Lecture Notes in Computational Science and Engineering, Springer, 2007, pp.
180--204.
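The basic idea of localizing PCA can be sketched in a few lines: partition the data, then fit a principal direction per part. The sketch below assumes the partition (`labels`) is already given, e.g. by k-means; this is a deliberately minimal stand-in for the paper's local principal curves and partitioning algorithms, not their implementation:

```python
def leading_direction(points, iters=100):
    """Leading principal direction of a point cloud, found by power
    iteration on the scatter matrix X^T X (pure Python, no dependencies)."""
    n, d = len(points), len(points[0])
    mean = [sum(p[i] for p in points) / n for i in range(d)]
    X = [[p[i] - mean[i] for i in range(d)] for p in points]
    v = [1.0 / d ** 0.5] * d
    for _ in range(iters):
        s = [sum(row[j] * v[j] for j in range(d)) for row in X]        # X v
        w = [sum(X[r][i] * s[r] for r in range(n)) for i in range(d)]  # X^T X v
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # degenerate cluster: keep the current direction
            break
        v = [x / norm for x in w]
    return v

def local_pca(points, labels):
    """One leading principal direction per cluster: a minimal sketch of
    localized PCA over a precomputed partition."""
    return {k: leading_direction([p for p, l in zip(points, labels) if l == k])
            for k in set(labels)}
```

The point of the sketch is the failure mode it avoids: a global PCA of two orthogonal filaments averages their directions, while the per-cluster fit recovers each one.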
On the equivalence between hierarchical segmentations and ultrametric watersheds
We study hierarchical segmentation in the framework of edge-weighted graphs.
We define ultrametric watersheds as topological watersheds null on the minima.
We prove that there exists a bijection between the set of ultrametric
watersheds and the set of hierarchical segmentations. We end this paper by
showing how to use the proposed framework in practice on the example of
constrained connectivity; in particular, the framework allows such a hierarchy
to be computed following a classical watershed-based morphological scheme,
which provides an efficient algorithm for the whole hierarchy.
Comment: 19 pages, double-column
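A concrete example of an ultrametric on an edge-weighted graph is the single-linkage (minimax) distance: d(u, v) is the smallest weight threshold at which u and v lie in one connected component. The sketch below computes it with a union-find pass over sorted edges; it illustrates the hierarchy-as-ultrametric correspondence in its simplest form and is not the paper's watershed algorithm:

```python
def minimax_ultrametric(n, edges):
    """Single-linkage ultrametric on a graph with n vertices and
    (u, v, weight) edges: d[u][v] is the lowest threshold merging u and v."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    d = [[0.0 if i == j else float("inf") for j in range(n)] for i in range(n)]
    comps = {i: {i} for i in range(n)}
    for w, u, v in sorted((w, u, v) for u, v, w in edges):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        for a in comps[ru]:                # every cross pair merges at weight w
            for b in comps[rv]:
                d[a][b] = d[b][a] = float(w)
        parent[rv] = ru
        comps[ru] |= comps.pop(rv)
    return d
```

The resulting matrix satisfies the strong triangle inequality d(u, v) <= max(d(u, x), d(x, v)), the defining property of an ultrametric, and cutting it at any threshold yields one level of the corresponding hierarchical segmentation.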