950 research outputs found

Hashing for Similarity Search: A Survey

Authors: Ji Jianqiu; Shen Heng Tao; Song Jingkuan; Wang Jingdong
Publication date: 13/08/2014

Similarity search (nearest neighbor search) is the problem of finding, in a large database, the data items whose distances to a query item are the smallest. Various methods have been developed to address this problem, and recently much effort has been devoted to approximate search. In this paper, we present a survey of one of the main solutions, hashing, which has been widely studied since the pioneering work on locality sensitive hashing. We divide the hashing algorithms into two main categories: locality sensitive hashing, which designs hash functions without exploring the data distribution, and learning to hash, which learns hash functions according to the data distribution. We review them from various aspects, including hash function design, distance measure, and search scheme in the hash coding space.
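The abstract's first category, hash functions designed without looking at the data, can be illustrated with a minimal sketch. It assumes random-hyperplane hashing for cosine similarity, which is one well-known member of the locality sensitive hashing family; the abstract does not name a specific scheme, and the function names here (`make_hyperplanes`, `hash_code`, `hamming`, `build_index`) are hypothetical.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian directions; no training data is consulted."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_code(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(code_a, code_b):
    """Distance in the hash coding space: number of differing bits."""
    return sum(a != b for a, b in zip(code_a, code_b))


def build_index(items, hyperplanes):
    """Group item indices by hash code so a query only scans its own bucket."""
    buckets = {}
    for idx, vector in enumerate(items):
        buckets.setdefault(hash_code(vector, hyperplanes), []).append(idx)
    return buckets
```

Vectors separated by a small angle tend to agree on most bits, so Hamming distance between codes acts as a cheap proxy for angular distance. A learning-to-hash method would instead fit the projection directions to the data rather than sampling them at random.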

arXiv.org e-Print Archive

CiteSeerX

TR-2003009: A Hierarchical Projection Pursuit Clustering Algorithm

Authors: Haralick Robert M.; Miasnikov Alexei D.; Rome Jayson E.
Publication venue: CUNY Academic Works
Publication date: 01/01/2003

City University of New York

Integration of Constraints into Dimensionality Reduction Methods for Visualization

Author: Vu Viet Minh
Publication date: 01/12/2021

Repository of the University of Namur

Linear Dimensionality Reduction for Margin-Based Classification: High-Dimensional Data and Sensor Networks

Authors: Varshney Kush R.; Willsky Alan S.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/12/2010

Low-dimensional statistics of measurements play an important role in detection problems, including those encountered in sensor networks. In this work, we focus on learning low-dimensional linear statistics of high-dimensional measurement data, along with decision rules defined in the low-dimensional space, when the probability density of the measurements and class labels is not given but a training set of samples from this distribution is. We pose a joint optimization problem for linear dimensionality reduction and margin-based classification, and develop a coordinate descent algorithm on the Stiefel manifold for its solution. Although the coordinate descent is not guaranteed to find the globally optimal solution, its alternating structure crucially enables us to extend it to sensor networks with a message-passing approach requiring little communication. Linear dimensionality reduction prevents overfitting when learning from finite training data. In the sensor network setting, dimensionality reduction not only prevents overfitting but also reduces power consumption due to communication. The learned reduced-dimensional space and decision rule are shown to be consistent, and their Rademacher complexity is characterized. Experimental results are presented for a variety of datasets, including those from existing sensor networks, demonstrating the potential of our methodology in comparison with other dimensionality reduction approaches.

Sponsorship: National Science Foundation (U.S.), Graduate Research Fellowship Program; United States Army Research Office (MURI funded through ARO Grant W911NF-06-1-0076); United States Air Force Office of Scientific Research (Award FA9550-06-1-0324); Shell International Exploration and Production B.V.

DSpace@MIT

Crossref

TR-2003011: Data Modelling and Description: A Guide to Using the SYLModel Library

Authors: Haralick Robert M.; Miasnikov Alexei D.; Rome Jayson E.
Publication venue: CUNY Academic Works
Publication date: 01/01/2003

City University of New York

Hierarchical Quadratic Random Forest Classifier

Author: Fallah Faezeh
Publication date: 02/06/2023

In this paper, we proposed a hierarchical quadratic random forest classifier for classifying multiresolution samples extracted from multichannel data. This forest incorporated a penalized multivariate linear discriminant in each of its decision nodes and processed squared features to realize quadratic decision boundaries in the original feature space. The penalized discriminant was based on a multiclass sparse discriminant analysis, and the penalization was based on a group Lasso regularizer, which is intermediate between the Lasso and ridge regularizers. The classification probabilities estimated by this forest and the features learned by its decision nodes could be used standalone or to foster graph-based classifiers.
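The idea that a linear discriminant on squared features yields a quadratic boundary in the original space can be sketched as follows. This is only an illustration under stated assumptions: the weights are hand-picked (hypothetical) rather than learned by the sparse discriminant analysis the abstract describes, and `augment` and `linear_score` are illustrative names, not the paper's API.

```python
def augment(x):
    """Append the square of every feature to the original feature vector."""
    return list(x) + [v * v for v in x]


def linear_score(features, weights, bias):
    """Plain linear discriminant: a weighted sum of the features plus a bias."""
    return sum(w * f for w, f in zip(weights, features)) + bias


# Hypothetical hand-picked weights for 2-D inputs (x1, x2):
# score = 0*x1 + 0*x2 + 1*x1^2 + 1*x2^2 - 1, so the boundary score = 0
# is the unit circle, a quadratic curve in the original (x1, x2) space.
WEIGHTS = [0.0, 0.0, 1.0, 1.0]
BIAS = -1.0


def outside_unit_circle(x):
    """Linear rule on augmented features, quadratic rule on the raw ones."""
    return linear_score(augment(x), WEIGHTS, BIAS) > 0
```

The rule stays linear in the augmented features, so each decision node can keep a linear discriminant while still separating classes that no straight line in the original space could.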

arXiv.org e-Print Archive

Leaf Venation Networks

Author: Ronellenfitsch Henrik Michael
Publication date: 15/02/2016

Georg-August-University Göttingen