3,573 research outputs found
Unsupervised Representation Learning with Minimax Distance Measures
We investigate the use of Minimax distances to extract in a nonparametric way
the features that capture the unknown underlying patterns and structures in the
data. We develop a general-purpose and computationally efficient framework to
employ Minimax distances with many machine learning methods that perform on
numerical data. We study both computing the pairwise Minimax distances for all
pairs of objects and as well as computing the Minimax distances of all the
objects to/from a fixed (test) object.
We first efficiently compute the pairwise Minimax distances between the
objects, using the equivalence of Minimax distances over a graph and over a
minimum spanning tree constructed on that. Then, we perform an embedding of the
pairwise Minimax distances into a new vector space, such that their squared
Euclidean distances in the new space equal to the pairwise Minimax distances in
the original space. We also study the case of having multiple pairwise Minimax
matrices, instead of a single one. Thereby, we propose an embedding via first
summing up the centered matrices and then performing an eigenvalue
decomposition to obtain the relevant features.
In the following, we study computing Minimax distances from a fixed (test)
object which can be used for instance in K-nearest neighbor search. Similar to
the case of all-pair pairwise Minimax distances, we develop an efficient and
general-purpose algorithm that is applicable with any arbitrary base distance
measure. Moreover, we investigate in detail the edges selected by the Minimax
distances and thereby explore the ability of Minimax distances in detecting
outlier objects.
Finally, for each setting, we perform several experiments to demonstrate the
effectiveness of our framework.Comment: 32 page
Multivariate Approaches to Classification in Extragalactic Astronomy
Clustering objects into synthetic groups is a natural activity of any
science. Astrophysics is not an exception and is now facing a deluge of data.
For galaxies, the one-century old Hubble classification and the Hubble tuning
fork are still largely in use, together with numerous mono-or bivariate
classifications most often made by eye. However, a classification must be
driven by the data, and sophisticated multivariate statistical tools are used
more and more often. In this paper we review these different approaches in
order to situate them in the general context of unsupervised and supervised
learning. We insist on the astrophysical outcomes of these studies to show that
multivariate analyses provide an obvious path toward a renewal of our
classification of galaxies and are invaluable tools to investigate the physics
and evolution of galaxies.Comment: Open Access paper.
http://www.frontiersin.org/milky\_way\_and\_galaxies/10.3389/fspas.2015.00003/abstract\>.
\<10.3389/fspas.2015.00003 \&g
- …