Detecting outlying subspaces for high-dimensional data: the new task, algorithms and performance

Author: Wang Hai
Zhang Ji
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/10/2006

In this paper, we identify a new task for studying the outlying degree (OD) of high-dimensional data, i.e., finding the subspaces (subsets of features) in which given points are outliers; these are called their outlying subspaces. Since state-of-the-art outlier detection techniques fail to handle this new problem, we propose a novel detection algorithm, called High-Dimension Outlying subspace Detection (HighDOD), to detect the outlying subspaces of high-dimensional data efficiently. The intuitive idea of HighDOD is to measure the OD of a point as the sum of distances between the point and its k nearest neighbors. Two heuristic pruning strategies are proposed to realize fast pruning in the subspace search, and an efficient dynamic subspace search method with a sample-based learning process has been implemented. Experimental results show that HighDOD is efficient and outperforms other search alternatives such as the naive top-down, bottom-up and random search methods, and that existing outlier detection methods cannot fulfill this new task effectively.
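The outlying degree described in the abstract above can be sketched in a few lines. This is a minimal illustration only, not the HighDOD algorithm: the function name `outlying_degree`, the use of Euclidean distance, and the toy data are assumptions, and the query point is assumed not to be a row of the data set (otherwise its zero distance to itself would be counted).

```python
import numpy as np

def outlying_degree(data, point, subspace, k):
    """Sum of Euclidean distances from `point` to its k nearest rows of
    `data`, measured only on the feature indices listed in `subspace`.
    (Hypothetical helper; the distance metric is an assumption.)"""
    cols = list(subspace)
    diffs = data[:, cols] - point[cols]
    dists = np.sqrt((diffs ** 2).sum(axis=1))
    return float(np.sort(dists)[:k].sum())

# Toy data: four points on the unit square, and a query point that is
# ordinary in feature 0 but far away in feature 1.
data = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
query = np.array([0.5, 10.0])

od_feature0 = outlying_degree(data, query, [0], k=2)
od_feature1 = outlying_degree(data, query, [1], k=2)
print(od_feature0, od_feature1)
```

In this toy case the OD is much larger in the subspace spanned by feature 1 than in the one spanned by feature 0, which is the sense in which a subspace can be "outlying" for a given point. The pruning strategies and the dynamic subspace search are not shown.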

University of Southern Queensland ePrints

Randomized hybrid linear modeling by local best-fit flats

Author: Lerman Gilad
Szlam Arthur
Wang Yi
Zhang Teng
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 05/05/2010

The hybrid linear modeling problem is to identify a set of d-dimensional affine sets in a D-dimensional Euclidean space. It arises, for example, in object tracking and structure from motion. The hybrid linear model can be considered the second simplest (behind linear) manifold model of data. In this paper we present a very simple geometric method for hybrid linear modeling based on selecting a set of local best-fit flats that minimize a global l1 error measure. The size of the local neighborhoods is determined automatically by Jones' l2 beta numbers; it is proven under certain geometric conditions that good local neighborhoods exist and are found by our method. We also demonstrate how to use this algorithm for fast determination of the number of affine subspaces. We give extensive experimental evidence demonstrating the state-of-the-art accuracy and speed of the algorithm on synthetic and real hybrid linear data.
Comment: To appear in the proceedings of CVPR 201
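To make the notion of a best-fit flat concrete, the sketch below fits a single d-dimensional affine flat to a set of points by SVD and sums the point-to-flat distances as an l1-style error. It covers only this one local step under stated assumptions: the helper names are hypothetical, and the beta-number neighborhood selection and the randomized choice among candidate flats are not shown.

```python
import numpy as np

def best_fit_flat(points, d):
    """Least-squares d-dimensional affine flat: returns the centroid and
    an orthonormal basis (d rows) of the flat's direction space."""
    center = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - center, full_matrices=False)
    return center, vt[:d]

def distances_to_flat(points, center, basis):
    """Euclidean distance of each point to the flat (center, basis)."""
    centered = points - center
    projected = centered @ basis.T @ basis
    return np.linalg.norm(centered - projected, axis=1)

# Toy neighborhood: 50 points lying exactly on a line in R^3.
rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, size=50)
line_points = np.stack([t, 2.0 * t + 1.0, np.zeros_like(t)], axis=1)

center, basis = best_fit_flat(line_points, d=1)
l1_error = distances_to_flat(line_points, center, basis).sum()
print(basis, l1_error)
```

Because the toy points lie exactly on a line, the summed distance is zero up to rounding, and the recovered direction is parallel to (1, 2, 0).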

arXiv.org e-Print Archive

Crossref

Fitness landscape of the cellular automata majority problem: View from the Olympus

Author: Altenberg
Andre
Barnett
Bastolla
Box
Breukelaar
Capcarrère
Chopard
Clergue
Collard
Crutchfield
Das
Fukś
Gacs
Hordijk
Huynen
Juillè
Juillé
Kimura
L. Vanneschi
Land
M. Tomassini
Madras
Mitchell
P. Collard
Packard
Quick
Reidys
Rosé
S. Verel
Sipper
Stadler
Van Nimwegen
Vanneschi
Verel
Weinberger
Wilke
Wolfram
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007

In this paper we study cellular automata (CAs) that perform the computational Majority task. This task is a good example of the phenomenon of emergence in complex systems. We investigate the reasons that make this particular fitness landscape a difficult one. The first goal is to study the landscape as such, so the study is ideally independent of the actual heuristics used to search the space. A second goal, however, is to understand the features that a good search technique for this particular problem space should possess. We statistically quantify in various ways the degree of difficulty of searching this landscape. Due to neutrality, investigations based on sampling techniques on the whole landscape are difficult to conduct, so we explore the landscape from the top. Although it has been proved that no CA can perform the task perfectly, several efficient CAs for this task have been found. Exploiting similarities between these CAs and symmetries in the landscape, we define the Olympus landscape, which is regarded as the "heavenly home" of the best local optima known (blok). Then we measure several properties of this subspace. Although it is easier to find relevant CAs in this subspace than in the overall landscape, there are structural reasons that prevent a searcher from finding overfitted CAs in the Olympus. Finally, we study the dynamics and performance of genetic algorithms on the Olympus in order to confirm our analysis and to find efficient CAs for the Majority problem at low computational cost.
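For readers unfamiliar with the Majority task, the sketch below simulates a one-dimensional binary CA on a ring and checks whether it settles on the majority value of the initial configuration. Everything here is an illustrative assumption: the abstract does not give the radius, lattice size, or fitness definition, the helper names are hypothetical, and the naive local-majority rule is used only as a simple candidate, not as one of the CAs studied in the paper.

```python
import numpy as np

def step(config, rule_table, radius):
    """One synchronous update with periodic boundaries. Each cell's
    neighborhood is read as a binary number (leftmost neighbor is the
    most significant bit) and looked up in rule_table."""
    idx = np.zeros(len(config), dtype=int)
    for offset in range(-radius, radius + 1):
        idx = (idx << 1) | np.roll(config, -offset)
    return rule_table[idx]

def classifies_correctly(config, rule_table, radius, max_steps):
    """True if, after max_steps updates, every cell equals the majority
    value of the initial configuration."""
    target = int(config.sum() * 2 > len(config))
    for _ in range(max_steps):
        config = step(config, rule_table, radius)
    return bool(np.all(config == target))

def local_majority_table(radius):
    """Rule table in which each cell adopts its neighborhood's majority."""
    size = 2 * radius + 1
    return np.array([int(bin(i).count("1") * 2 > size) for i in range(2 ** size)])

def fitness(rule_table, radius, n_cells, n_samples, max_steps, seed=0):
    """Fraction of random initial configurations classified correctly
    (an assumed fitness definition; use an odd n_cells to avoid ties)."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_samples):
        config = rng.integers(0, 2, size=n_cells)
        hits += classifies_correctly(config, rule_table, radius, max_steps)
    return hits / n_samples

table = local_majority_table(1)
blocks = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0])  # majority is 0
stuck = not classifies_correctly(blocks, table, radius=1, max_steps=20)
score = fitness(table, radius=1, n_cells=21, n_samples=50, max_steps=42)
print(stuck, score)
```

The block configuration is a fixed point of the local-majority rule, so the rule never reaches the all-zeros state there. This shows in miniature why purely local voting does not solve the task.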

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Crossref

HAL-UNICE

Serveur académique lausannois

HAL Descartes

Masking Strategies for Image Manifolds

Author: Dadkhahi Hamid
Duarte Marco F.
Publication venue
Publication date: 01/01/2016

We consider the problem of selecting an optimal mask for an image manifold, i.e., choosing a subset of the pixels of the image that preserves the manifold's geometric structure present in the original data. Such masking implements a form of compressive sensing through emerging imaging sensor platforms for which the power expense grows with the number of pixels acquired. Our goal is for the manifold learned from masked images to resemble its full-image counterpart as closely as possible. More precisely, we show that one can indeed accurately learn an image manifold without having to consider a large majority of the image pixels. In doing so, we consider two masking methods that preserve the local and global geometric structure of the manifold, respectively. In each case, the process of finding the optimal masking pattern can be cast as a binary integer program, which is computationally expensive but can be approximated by a fast greedy algorithm. Numerical experiments show that the relevant manifold structure is preserved through the data-dependent masking process, even for modest mask sizes.

arXiv.org e-Print Archive

Crossref

Computation of multiple eigenvalues and generalized eigenvectors for matrices dependent on parameters

Author: Anderson
Antoniou
Arnold
Berry
Burke
Demmel
Dobson
Edelman
Elmroth
Fairgrieve
Gantmacher
Golub
Heiss
Kirillov
Korsch
Kublanovskaya
Kågström
Latinne
Lewis
Lippert
Mailybaev
Moro
Ruhe
Seyranian
Traviesas
Vishik
Wilkinson
Publication venue: 'Wiley'
Publication date: 02/02/2005

The paper develops Newton's method for finding multiple eigenvalues with one Jordan block, and the corresponding generalized eigenvectors, for matrices dependent on parameters. It computes the nearest value of a parameter vector at which the matrix has a multiple eigenvalue of given multiplicity. The method also works in the whole matrix space (in the absence of parameters). The approach is based on the versal deformation theory for matrices. Numerical examples are given. An implementation of the method in MATLAB code is available.
Comment: 19 pages, 3 figures
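As a small worked illustration of the objects named in the abstract above, consider a hand-picked 2×2 matrix that depends on a single parameter p. The example is an assumption chosen for simplicity and is not taken from the paper. At p = 0 the two simple eigenvalues merge into a double eigenvalue with one Jordan block, and the eigenvector and generalized eigenvector form a Jordan chain.

```latex
A(p) = \begin{pmatrix} 0 & 1 \\ p & 0 \end{pmatrix}, \qquad
\det\bigl(A(p) - \lambda I\bigr) = \lambda^{2} - p, \qquad
\lambda_{\pm}(p) = \pm\sqrt{p}.

% At p = 0: double eigenvalue \lambda_0 = 0, and rank A(0) = 1,
% so there is a single Jordan block of size 2.
u_0 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad
u_1 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad
A(0)\,u_0 = \lambda_0 u_0, \qquad
A(0)\,u_1 = \lambda_0 u_1 + u_0 .
```

Starting from a nearby parameter value such as p = 0.01, where the eigenvalues are ±0.1, the nearest parameter value with a double eigenvalue is p = 0. Locating such a point is the kind of problem the abstract describes.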

arXiv.org e-Print Archive

Crossref