Spectral Dimensionality Reduction
In this paper, we study and put under a common framework a number of non-linear dimensionality reduction methods, such as Locally Linear Embedding, Isomap, Laplacian Eigenmaps and kernel PCA, which are based on performing an eigen-decomposition (hence the name 'spectral'). That framework also includes classical methods such as PCA and metric multidimensional scaling (MDS), as well as the data transformation step used in spectral clustering. We show that in all of these cases the learning algorithm estimates the principal eigenfunctions of an operator that depends on the unknown data density and on a kernel that is not necessarily positive semi-definite. This helps to generalize some of these algorithms so as to predict an embedding for out-of-sample examples without having to retrain the model. It also makes more transparent what these algorithms are minimizing on the empirical data and gives a corresponding notion of generalization error.
Keywords: non-parametric models, non-linear dimensionality reduction, kernel models
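The out-of-sample idea in the abstract above can be illustrated with a Nyström-style extension of a kernel eigen-decomposition. This is a minimal sketch, not the paper's algorithm: it assumes an uncentered Gaussian kernel (which is positive semi-definite, unlike the general case the abstract allows), and the function names `gaussian_kernel`, `fit_spectral_embedding` and `embed_new_points` are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def fit_spectral_embedding(X, n_components=2, sigma=1.0):
    """Eigen-decompose the training kernel matrix and keep the top components."""
    K = gaussian_kernel(X, X, sigma)
    eigvals, eigvecs = np.linalg.eigh(K)            # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    lam, V = eigvals[order], eigvecs[:, order]
    train_embedding = V * np.sqrt(lam)              # coordinate k of point i: sqrt(lam_k) * V[i, k]
    return lam, V, train_embedding

def embed_new_points(X_new, X_train, lam, V, sigma=1.0):
    """Nystrom-style extension: project kernel values against the training
    set onto the stored eigenvectors, with no retraining."""
    K_new = gaussian_kernel(X_new, X_train, sigma)
    return K_new @ V / np.sqrt(lam)

# Illustrative data (assumption: random points, not from the paper).
rng = np.random.default_rng(0)
X_train = rng.standard_normal((30, 3))
lam, V, train_embedding = fit_spectral_embedding(X_train, n_components=2, sigma=1.5)
new_embedding = embed_new_points(rng.standard_normal((4, 3)), X_train, lam, V, sigma=1.5)
```

Applying `embed_new_points` to the training points themselves reproduces `train_embedding` up to rounding, which is the consistency property that makes the extension usable for unseen points.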
Non-Redundant Spectral Dimensionality Reduction
Spectral dimensionality reduction algorithms are widely used in numerous
domains, including recognition, segmentation, tracking and visualization.
However, despite their popularity, these algorithms suffer from a major
limitation known as the "repeated Eigen-directions" phenomenon. That is, many
of the embedding coordinates they produce typically capture the same direction
along the data manifold. This leads to redundant and inefficient
representations that do not reveal the true intrinsic dimensionality of the
data. In this paper, we propose a general method for avoiding redundancy in
spectral algorithms. Our approach relies on replacing the orthogonality
constraints underlying those methods by unpredictability constraints.
Specifically, we require that each embedding coordinate be unpredictable (in
the statistical sense) from all previous ones. We prove that these constraints
necessarily prevent redundancy, and provide a simple technique to incorporate
them into existing methods. As we illustrate on challenging high-dimensional
scenarios, our approach produces significantly more informative and compact
representations, which improve visualization and classification tasks.
Spectral dimensionality reduction for HMMs
Hidden Markov Models (HMMs) can be accurately approximated using
co-occurrence frequencies of pairs and triples of observations by using a fast
spectral method in contrast to the usual slow methods like EM or Gibbs
sampling. We provide a new spectral method which significantly reduces the
number of model parameters that need to be estimated, and generates a sample
complexity that does not depend on the size of the observation vocabulary. We
present an elementary proof giving bounds on the relative accuracy of
probability estimates from our model. (Corollaries show our bounds can be
weakened to provide either L1 bounds or KL bounds which provide easier direct
comparisons to previous work.) Our theorem uses conditions that are checkable
from the data, instead of putting conditions on the unobservable Markov
transition matrix.
Dimensionality reduction and spectral properties of multilayer networks
Network representations are useful for describing the structure of a large
variety of complex systems. Although most studies of real-world networks
suppose that nodes are connected by only a single type of edge, most natural
and engineered systems include multiple subsystems and layers of connectivity.
This new paradigm has attracted a great deal of attention and one fundamental
challenge is to characterize multilayer networks both structurally and
dynamically. One way to address this question is to study the spectral
properties of such networks. Here, we apply the framework of graph quotients,
which occurs naturally in this context, and the associated eigenvalue
interlacing results, to the adjacency and Laplacian matrices of undirected
multilayer networks. Specifically, we describe relationships between the
eigenvalue spectra of multilayer networks and their two most natural quotients,
the network of layers and the aggregate network, and show the dynamical
implications of working with either of the two simplified representations. Our
work thus contributes in particular to the study of dynamical processes whose
critical properties are determined by the spectral properties of the underlying
network.
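The interlacing relationship described in the abstract above can be checked numerically on a toy example. This is a minimal sketch under stated assumptions: a two-layer multiplex with identity inter-layer coupling, and a symmetrized quotient P^T A P (P being the partition's indicator matrix with columns scaled to unit norm), which has the same eigenvalues as the usual average-connectivity quotient. All function names are hypothetical.

```python
import numpy as np

def random_undirected_adjacency(n, p, rng):
    """Symmetric 0/1 adjacency matrix with no self-loops."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)

def quotient_matrix(A, labels):
    """Symmetrized quotient of A with respect to the partition given by labels."""
    classes = np.unique(labels)
    S = (labels[:, None] == classes[None, :]).astype(float)  # indicator matrix
    P = S / np.sqrt(S.sum(axis=0))                           # orthonormal columns
    return P.T @ A @ P

def interlaces(full_eigs, quotient_eigs, tol=1e-9):
    """Cauchy interlacing for ascending eigenvalues:
    full[j] <= quotient[j] <= full[j + N - m] for j = 0..m-1."""
    N, m = len(full_eigs), len(quotient_eigs)
    return bool(np.all(full_eigs[:m] - tol <= quotient_eigs)
                and np.all(quotient_eigs <= full_eigs[N - m:] + tol))

rng = np.random.default_rng(0)
n = 6                                              # nodes per layer (assumption)
A1 = random_undirected_adjacency(n, 0.5, rng)
A2 = random_undirected_adjacency(n, 0.5, rng)
supra = np.block([[A1, np.eye(n)], [np.eye(n), A2]])  # supra-adjacency matrix

layer_labels = np.repeat([0, 1], n)                # partition by layer
node_labels = np.tile(np.arange(n), 2)             # partition by physical node

full_eigs = np.linalg.eigvalsh(supra)
layer_eigs = np.linalg.eigvalsh(quotient_matrix(supra, layer_labels))
aggregate_eigs = np.linalg.eigvalsh(quotient_matrix(supra, node_labels))
```

Partitioning by layer gives a 2 x 2 quotient standing in for the network of layers, and partitioning by node gives an n x n quotient standing in for the aggregate network; `interlaces(full_eigs, layer_eigs)` and `interlaces(full_eigs, aggregate_eigs)` should both hold.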
Application of spectral and spatial indices for specific class identification in Airborne Prism EXperiment (APEX) imaging spectrometer data for improved land cover classification
Hyperspectral remote sensing's ability to capture spectral information of targets in very narrow bandwidths gives rise to many applications. However, the major limitation to its applicability is its dimensionality, a problem known as the Hughes phenomenon. Traditional classification and image processing approaches fail to process data along many contiguous bands due to inadequate training samples. Another challenge for successful classification is the real-world scenario of mixed pixels, i.e. the presence of more than one class within a single pixel. An attempt has been made to deal with the problems of dimensionality and mixed pixels, with the objective of improving the accuracy of class identification. In this paper, we discuss the application of indices to cope with the dimensionality of the Airborne Prism EXperiment (APEX) hyperspectral Open Science Dataset (OSD) and to improve the classification accuracy using the Possibilistic c–Means (PCM) algorithm. Spectral and spatial indices were formulated to describe the information in the dataset in fewer dimensions. This reduced dimensionality is used for classification, attempting to improve the accuracy of determination of specific classes. Spectral indices are compiled from the spectral signatures of the target, and spatial indices have been defined using texture analysis over defined neighbourhoods. The classification of 20 classes of varying spatial distributions was considered in order to evaluate the applicability of spectral and spatial indices in the extraction of specific class information. The classification of the dataset was performed in two stages: spectral indices alone, and a combination of spectral and spatial indices, each as input for the PCM classifier.
In addition to the reduction of entropy, while considering a spectral-spatial indices approach, an overall classification accuracy of 80.50% was achieved, against 65% (spectral indices only) and 59.50% (optimally determined principal component
Making Laplacians commute
In this paper, we construct multimodal spectral geometry by finding a pair of
closest commuting operators (CCO) to a given pair of Laplacians. The CCOs are
jointly diagonalizable and hence have the same eigenbasis. Our construction
naturally extends classical data analysis tools based on spectral geometry,
such as diffusion maps and spectral clustering. We provide several synthetic
and real examples of applications in dimensionality reduction, shape analysis,
and clustering, demonstrating that our method better captures the inherent
structure of multi-modal data.
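The property the abstract above relies on, that commuting symmetric operators share an eigenbasis, can be illustrated as follows. This is a minimal sketch that only demonstrates the property on matrices built to commute; it does not implement the search for closest commuting operators, and the helper names are hypothetical.

```python
import numpy as np

def commutator_norm(A, B):
    """Frobenius norm of AB - BA; zero when A and B commute."""
    return np.linalg.norm(A @ B - B @ A)

def off_diagonal_norm(M):
    """Frobenius norm of the off-diagonal part of M."""
    return np.linalg.norm(M - np.diag(np.diag(M)))

rng = np.random.default_rng(0)
n = 5
# Assumption: two symmetric matrices constructed from one orthonormal basis U.
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = U @ np.diag(np.arange(1.0, n + 1)) @ U.T   # distinct eigenvalues 1..n
B = U @ np.diag(rng.random(n)) @ U.T

# Eigenvectors of A alone also diagonalize B, because the two commute
# and A has distinct eigenvalues.
_, V = np.linalg.eigh(A)
B_in_A_basis = V.T @ B @ V
```

Here `commutator_norm(A, B)` is zero up to rounding, and `B_in_A_basis` is diagonal up to rounding. Two Laplacians built from different modalities generally do not commute, which is the gap the closest-commuting-operators construction is meant to close.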