HD-Index: Pushing the Scalability-Accuracy Boundary for Approximate kNN Search in High-Dimensional Spaces
Nearest neighbor searching of large databases in high-dimensional spaces is
inherently difficult due to the curse of dimensionality. A flavor of
approximation is, therefore, necessary to practically solve the problem of
nearest neighbor search. In this paper, we propose a novel yet simple indexing
scheme, HD-Index, to solve the problem of approximate k-nearest neighbor
queries in massive high-dimensional databases. HD-Index consists of a set of
novel hierarchical structures called RDB-trees built on Hilbert keys of
database objects. The leaves of the RDB-trees store distances of database
objects to reference objects, thereby allowing efficient pruning using distance
filters. In addition to triangular inequality, we also use Ptolemaic inequality
to produce better lower bounds. Experiments on massive (up to billion scale)
high-dimensional (up to 1000+ dimensions) datasets show that HD-Index is
effective, efficient, and scalable.
Comment: PVLDB 11(8):906-919, 201
Multi-Scale Morphological Analysis of SDSS DR5 Survey using the Metric Space Technique
Following novel development and adaptation of the Metric Space Technique
(MST), a multi-scale morphological analysis of the Sloan Digital Sky Survey
(SDSS) Data Release 5 (DR5) was performed. The technique was adapted to perform
a space-scale morphological analysis by filtering the galaxy point
distributions with a smoothing Gaussian function, thus giving quantitative
structural information on all size scales between 5 and 250 Mpc. The analysis
was performed on a dozen slices of a volume of space containing many newly
measured galaxies from the SDSS DR5 survey. Using the MST, observational data
were compared to galaxy samples taken from N-body simulations with current best
estimates of cosmological parameters and from random catalogs. By using the
maximal ranking method among MST output functions we also develop a way to
quantify the overall similarity of the observed samples with the simulated
samples.
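The scale filtering described above can be sketched as follows. This is a minimal one-dimensional illustration, not the MST pipeline itself: galaxy positions are smoothed with a Gaussian of width sigma, producing one density field per scale on which morphological statistics could then be evaluated. All names here are illustrative.

```python
import math

def smoothed_density(positions, grid, sigma):
    # Gaussian-smoothed density field of a point distribution:
    #   rho_sigma(x) = sum_i N(x - x_i; sigma), evaluated on `grid`.
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((x - p) / sigma) ** 2)
                       for p in positions)
            for x in grid]

def multiscale(positions, grid, sigmas):
    # One smoothed field per smoothing scale; in an MST-style analysis the
    # morphological output functions would be computed on each field.
    return {s: smoothed_density(positions, grid, s) for s in sigmas}
```

Because each field integrates to the number of points, increasing sigma spreads the same mass over a wider region, revealing structure on progressively larger scales.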
Confocal microscopy of colloidal particles: towards reliable, optimum coordinates
Over the last decade, the light microscope has become increasingly useful as
a quantitative tool for studying colloidal systems. The ability to obtain
particle coordinates in bulk samples from micrographs is particularly
appealing. In this paper we review and extend methods for optimal image
formation of colloidal samples, which is vital for particle coordinates of the
highest accuracy, and for extracting the most reliable coordinates from these
images. We discuss in depth the accuracy of the coordinates, which is sensitive
to the details of the colloidal system and the imaging system. Moreover, this
accuracy can vary between particles, particularly in dense systems. We
introduce a previously unreported error estimate and use it to develop an
iterative method for finding particle coordinates. This individual-particle
accuracy assessment also allows comparison between particle locations obtained
from different experiments. Though aimed primarily at confocal microscopy
studies of colloidal systems, the methods outlined here should transfer readily
to many other feature extraction problems, especially where features may
overlap one another.
Comment: Accepted by Advances in Colloid and Interface Science
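An iterative coordinate-refinement step of the general kind described above can be sketched like this. This is not the paper's algorithm or its error estimate, just a common baseline: the particle position is repeatedly re-centred on the intensity-weighted centroid of a local window until the update falls below a tolerance. Function and parameter names are hypothetical.

```python
def refine_centre(image, x0, y0, radius, max_iter=20, tol=0.05):
    # Iteratively re-centre a particle estimate on the intensity-weighted
    # centroid of a window of half-width `radius` pixels; stop once the
    # update is below `tol` pixels. `image[y][x]` is a 2D intensity array.
    x, y = float(x0), float(y0)
    for _ in range(max_iter):
        xi, yi = int(round(x)), int(round(y))
        m = mx = my = 0.0
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                yy, xx = yi + dy, xi + dx
                if 0 <= yy < len(image) and 0 <= xx < len(image[0]):
                    w = image[yy][xx]
                    m += w
                    mx += w * (xi + dx)
                    my += w * (yi + dy)
        if m == 0:
            break  # empty window: nothing to centre on
        nx, ny = mx / m, my / m
        shift = ((nx - x) ** 2 + (ny - y) ** 2) ** 0.5
        x, y = nx, ny
        if shift < tol:
            break
    return x, y
```

Iterating matters because the first window is centred on the rough estimate, which biases the centroid; re-centring the window on each new estimate removes most of that bias, in the spirit of the iterative method the abstract describes.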
Efficient generation and optimization of stochastic template banks by a neighboring cell algorithm
Placing signal templates (grid points) as efficiently as possible to cover a
multi-dimensional parameter space is crucial in computing-intensive
matched-filtering searches for gravitational waves, but also in similar
searches in other fields of astronomy. To generate efficient coverings of
arbitrary parameter spaces, stochastic template banks have been advocated,
where templates are placed at random while rejecting those too close to others.
However, in this simple scheme, for each new random point its distance to every
template in the existing bank is computed. This rapidly increasing number of
distance computations can render the acceptance of new templates
computationally prohibitive, particularly for wide parameter spaces or in large
dimensions. This work presents a neighboring cell algorithm that can
dramatically improve the efficiency of constructing a stochastic template bank.
By dividing the parameter space into sub-volumes (cells), for an arbitrary
point an efficient hashing technique is exploited to obtain the index of its
enclosing cell along with the parameters of its neighboring templates. Hence
only distances to these neighboring templates in the bank are computed,
massively lowering the overall computing cost, as demonstrated in simple
examples. Furthermore, we propose a novel method based on this technique to
increase the fraction of covered parameter space solely by directed template
shifts, without adding any templates. As is demonstrated in examples, this
method can be highly effective.
Comment: PRD accepted
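The cell-hashing idea above can be sketched in a few lines. This is a simplified 2D toy with Euclidean distances in the unit square (the real application uses a metric on a multi-dimensional parameter space), and all names are illustrative: each accepted template is stored under the integer index of its enclosing cell, so a new proposal is only compared against templates in its own cell and the adjacent ones.

```python
import math
import random

def cell_of(point, cell_size):
    # Hash a point to the integer index tuple of its enclosing cell.
    return tuple(int(math.floor(c / cell_size)) for c in point)

def neighbours(idx):
    # The cell itself plus its adjacent cells (3^d cells; d = 2 here).
    i, j = idx
    return [(i + di, j + dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)]

def stochastic_bank(n_proposals, min_dist, cell_size, rng):
    # Place random templates in the unit square, rejecting any proposal
    # closer than `min_dist` to an already-accepted template. Distances
    # are computed only against templates in the neighbouring cells found
    # via the hash map, not against the whole bank.
    assert cell_size >= min_dist  # neighbour cells then cover the radius
    cells = {}  # cell index -> templates placed in that cell
    bank = []
    for _ in range(n_proposals):
        p = (rng.random(), rng.random())
        idx = cell_of(p, cell_size)
        too_close = any(
            math.dist(p, q) < min_dist
            for nb in neighbours(idx)
            for q in cells.get(nb, ())
        )
        if not too_close:
            bank.append(p)
            cells.setdefault(idx, []).append(p)
    return bank
```

With `cell_size >= min_dist`, every template within the rejection radius of a proposal is guaranteed to lie in one of the 3^d neighbouring cells, so the number of distance computations per proposal stays roughly constant instead of growing with the bank size.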
Traction force microscopy with optimized regularization and automated Bayesian parameter selection for comparing cells
Adherent cells exert traction forces on to their environment, which allows
them to migrate, to maintain tissue integrity, and to form complex
multicellular structures. This traction can be measured in a perturbation-free
manner with traction force microscopy (TFM). In TFM, traction is usually
calculated via the solution of a linear system, which is complicated by
undersampled input data, acquisition noise, and large condition numbers for
some methods. Therefore, standard TFM algorithms either employ data filtering
or regularization. However, these approaches require a manual selection of
filter or regularization parameters and consequently exhibit a substantial
degree of subjectivity. This shortcoming is particularly serious when cells
in different conditions are to be compared because optimal noise suppression
needs to be adapted for every situation, which invariably results in systematic
errors. Here, we systematically test the performance of new methods from
computer vision and Bayesian inference for solving the inverse problem in TFM.
We compare two classical schemes, L1- and L2-regularization, with three
previously untested schemes, namely Elastic Net regularization, Proximal
Gradient Lasso, and Proximal Gradient Elastic Net. Overall, we find that
Elastic Net regularization, which combines L1 and L2 regularization,
outperforms all other methods with regard to accuracy of traction
reconstruction. Next, we develop two methods, Bayesian L2 regularization and
Advanced Bayesian L2 regularization, for automatic, optimal L2 regularization.
Using artificial data and experimental data, we show that these methods enable
robust reconstruction of traction without requiring a difficult selection of
regularization parameters specifically for each data set. Thus, Bayesian
methods can mitigate the considerable uncertainty inherent in comparing
cellular traction forces.
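The elastic-net-regularised inverse problem discussed above can be sketched with a generic proximal-gradient (ISTA) solver. This is not the paper's implementation or its Bayesian parameter selection; it only shows the objective the abstract names, min over x of ½‖Gx − u‖² + λ₁‖x‖₁ + (λ₂/2)‖x‖², where G would be the elastic Green's-function matrix and u the measured displacements. Names and the step-size choice are assumptions.

```python
def matvec(A, x):
    # Matrix-vector product for a matrix given as a list of rows.
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def prox_elastic_net(v, step, lam1, lam2):
    # Proximal operator of lam1*||x||_1 + (lam2/2)*||x||^2:
    # soft-thresholding followed by a uniform shrinkage.
    out = []
    for vi in v:
        s = max(abs(vi) - step * lam1, 0.0) * (1.0 if vi >= 0 else -1.0)
        out.append(s / (1.0 + step * lam2))
    return out

def elastic_net_ista(G, u, lam1, lam2, step, n_iter=500):
    # Proximal-gradient (ISTA) iteration for the elastic-net-regularised
    # linear inverse problem:
    #   min_x 0.5*||G x - u||^2 + lam1*||x||_1 + (lam2/2)*||x||^2
    Gt = transpose(G)
    x = [0.0] * len(G[0])
    for _ in range(n_iter):
        r = [gi - ui for gi, ui in zip(matvec(G, x), u)]  # residual Gx - u
        grad = matvec(Gt, r)                              # gradient G^T r
        v = [xi - step * gi for xi, gi in zip(x, grad)]   # gradient step
        x = prox_elastic_net(v, step, lam1, lam2)         # proximal step
    return x
```

The L1 term sets small, noise-dominated traction components exactly to zero, while the L2 term keeps the solution stable for ill-conditioned G; combining the two is what the abstract credits for Elastic Net's superior reconstruction accuracy.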