Minimum Average Deviance Estimation for Sufficient Dimension Reduction
Sufficient dimension reduction reduces the dimensionality of data while
preserving relevant regression information. In this article, we develop Minimum
Average Deviance Estimation (MADE) methodology for sufficient dimension
reduction. It extends the Minimum Average Variance Estimation (MAVE) approach
of Xia et al. (2002) from continuous responses to exponential family
distributions to include Binomial and Poisson responses. Local likelihood
regression is used to learn the form of the regression function from the data.
The main parameter of interest is a dimension reduction subspace which projects
the covariates to a lower dimension while preserving their relationship with
the outcome. To estimate this parameter within its natural space, we consider
an iterative algorithm where one step utilizes a Stiefel manifold optimizer. We
empirically evaluate the performance of three prediction methods, two that are
intrinsic to local likelihood estimation and one that is based on the
Nadaraya-Watson estimator. Initial results show that, as expected, MADE can outperform MAVE when there is a departure from the assumption of additive errors.
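To make the Stiefel-manifold step concrete, below is a minimal numpy sketch (with illustrative names such as stiefel_step, not the authors' implementation) of one such update: the Euclidean gradient is projected onto the tangent space of the Stiefel manifold and the result is retracted back via a QR decomposition, so the estimated basis of the dimension reduction subspace stays orthonormal.

```python
import numpy as np

def stiefel_step(B, egrad, step=0.1):
    """One descent step on the Stiefel manifold St(p, d).

    B     : (p, d) basis of the dimension reduction subspace, B.T @ B = I
    egrad : (p, d) Euclidean gradient of the loss (e.g. average deviance)
    """
    # Project the Euclidean gradient onto the tangent space at B:
    # grad = egrad - B @ sym(B.T @ egrad).
    sym = (B.T @ egrad + egrad.T @ B) / 2
    rgrad = egrad - B @ sym
    # Take a descent step, then retract onto the manifold via QR.
    Q, R = np.linalg.qr(B - step * rgrad)
    return Q * np.sign(np.diag(R))  # sign fix keeps the retraction continuous

# The update preserves orthonormality: B_new.T @ B_new = I.
rng = np.random.default_rng(0)
B, _ = np.linalg.qr(rng.standard_normal((10, 2)))
B_new = stiefel_step(B, rng.standard_normal((10, 2)))
assert np.allclose(B_new.T @ B_new, np.eye(2))
```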
Locality Preserving Projections for Grassmann manifold
Learning on the Grassmann manifold has become popular in many computer vision tasks for its strong capability to extract discriminative information from image sets and videos. However, such learning algorithms, particularly on a high-dimensional Grassmann manifold, involve significant computational cost, which seriously limits the applicability of learning on the Grassmann manifold in wider areas. In this research, we propose an unsupervised dimensionality reduction algorithm on the Grassmann manifold based on the Locality Preserving Projections (LPP) criterion. LPP is a commonly used dimensionality reduction algorithm for vector-valued data, aiming to preserve the local structure of the data in the dimension-reduced space. The strategy is to construct a mapping from a higher-dimensional Grassmann manifold to a lower-dimensional one with more discriminative capability. The proposed method can be optimized as a basic eigenvalue problem. Its performance is assessed on several classification and clustering tasks, and the experimental results show clear advantages over other Grassmann-based algorithms.
Comment: Accepted by IJCAI 201
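For reference, classical vector-space LPP (the criterion the paper generalizes, not its Grassmann variant) already reduces to a generalized eigenvalue problem; a minimal numpy/scipy sketch with illustrative hyperparameters is given below.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def lpp(X, n_components=2, n_neighbors=5, t=1.0):
    """Classical LPP: returns an (n_features, n_components) projection matrix."""
    n = X.shape[0]
    sqd = cdist(X, X, "sqeuclidean")
    # Heat-kernel affinities on a k-nearest-neighbour graph.
    W = np.zeros((n, n))
    nbrs = np.argsort(sqd, axis=1)[:, 1:n_neighbors + 1]
    for i in range(n):
        W[i, nbrs[i]] = np.exp(-sqd[i, nbrs[i]] / t)
    W = np.maximum(W, W.T)              # symmetrise the graph
    D = np.diag(W.sum(axis=1))
    L = D - W                           # graph Laplacian
    # Generalised eigenproblem X^T L X a = lambda X^T D X a; the smallest
    # eigenvalues give the locality-preserving projection directions.
    reg = 1e-8 * np.eye(X.shape[1])     # keeps the right-hand side definite
    _, A = eigh(X.T @ L @ X, X.T @ D @ X + reg)
    return A[:, :n_components]
```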
Effectiveness of landmark analysis for establishing locality in p2p networks
Locality to other nodes on a peer-to-peer overlay network can be established by means of a set of landmarks shared among the participating nodes. Each node independently collects a set of latency measures to the landmark nodes, which are used as a multi-dimensional feature vector. Each peer node uses the feature vector to generate a unique scalar index that is correlated with its topological locality. A popular dimensionality reduction technique is the space-filling Hilbert’s curve, as it possesses good locality-preserving properties. However, there exists little comparison between Hilbert’s curve and other techniques for dimensionality reduction. This work carries out a quantitative analysis of their properties. Linear and non-linear techniques for scaling the landmark vectors to a single dimension are investigated. Hilbert’s curve, Sammon’s mapping and Principal Component Analysis have been used to generate a 1-D space with locality-preserving properties. This work provides empirical evidence to support the use of Hilbert’s curve in the context of locality preservation when generating peer identifiers by means of landmark vector analysis. A comparative analysis is carried out with an artificial 2-D network model and with a realistic network topology model with a typical power-law distribution of node connectivity in the Internet. Nearest-neighbour analysis confirms Hilbert’s curve to be very effective in both artificial and realistic network topologies. Nevertheless, the results in the realistic network model show that there is scope for improvement, and better techniques to preserve locality information are required.
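To illustrate the mapping itself, the standard algorithm for converting 2-D grid coordinates to a 1-D Hilbert-curve index is sketched below; quantising a latency vector onto an integer grid first is an assumed preprocessing step, and the paper's own pipeline may differ.

```python
def hilbert_index(x, y, order=16):
    """Map integer grid coordinates (x, y) in [0, 2**order) to a 1-D
    Hilbert-curve index; nearby points tend to receive nearby indices."""
    d = 0
    s = 2 ** (order - 1)
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate the quadrant so consecutive cells stay adjacent.
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        s //= 2
    return d

# Example: 1-D indices for two adjacent grid cells.
print(hilbert_index(3, 4), hilbert_index(3, 5))
```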
A Process for Topic Modelling Via Word Embeddings
This work combines algorithms based on word embeddings, dimensionality
reduction, and clustering. The objective is to obtain topics from a set of
unclassified texts. The word embeddings are obtained with the BERT model, a neural network architecture widely used in NLP tasks. Due to the high dimensionality of these embeddings, a dimensionality reduction technique called UMAP is applied. This
method manages to reduce the dimensions while preserving part of the local and
global information of the original data. K-Means is used as the clustering
algorithm to obtain the topics. Then, the topics are evaluated using TF-IDF statistics, Topic Diversity, and Topic Coherence to assess the meaning of the words in each cluster. The results of the process show good values, so this topic-modelling process is a viable option for classifying or clustering texts without labels.
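A minimal sketch of such a pipeline using common open-source libraries might look as follows; the embedding model, the cluster count k, and the dimensionality settings are illustrative assumptions, not details taken from the paper.

```python
# pip install sentence-transformers umap-learn scikit-learn
from sentence_transformers import SentenceTransformer
import umap
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

texts = [...]  # placeholder: the unlabelled corpus, a list of strings

# 1. BERT-based embeddings (the model name here is illustrative).
embeddings = SentenceTransformer("all-MiniLM-L6-v2").encode(texts)

# 2. UMAP reduces dimensionality while keeping part of the local
#    and global structure of the original embedding space.
reduced = umap.UMAP(n_components=5, metric="cosine").fit_transform(embeddings)

# 3. K-Means assigns each text to one of k topics.
k = 10
labels = KMeans(n_clusters=k, n_init=10).fit_predict(reduced)

# 4. Rank candidate topic words by TF-IDF over the concatenated cluster texts.
topic_docs = [" ".join(t for t, c in zip(texts, labels) if c == i) for i in range(k)]
scores = TfidfVectorizer(stop_words="english").fit_transform(topic_docs)
```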