
    Minimum Average Deviance Estimation for Sufficient Dimension Reduction

    Sufficient dimension reduction reduces the dimensionality of data while preserving relevant regression information. In this article, we develop Minimum Average Deviance Estimation (MADE) methodology for sufficient dimension reduction. It extends the Minimum Average Variance Estimation (MAVE) approach of Xia et al. (2002) from continuous responses to exponential family distributions, including binomial and Poisson responses. Local likelihood regression is used to learn the form of the regression function from the data. The main parameter of interest is a dimension reduction subspace that projects the covariates to a lower dimension while preserving their relationship with the outcome. To estimate this parameter within its natural space, we consider an iterative algorithm in which one step utilizes a Stiefel manifold optimizer. We empirically evaluate the performance of three prediction methods: two that are intrinsic to local likelihood estimation and one based on the Nadaraya-Watson estimator. Initial results show that, as expected, MADE can outperform MAVE when there is a departure from the assumption of additive errors.
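    As a hedged illustration of one ingredient mentioned in the abstract, the sketch below (not the authors' code) shows a Nadaraya-Watson prediction step applied to covariates projected onto an estimated dimension-reduction subspace; the basis B, the Gaussian kernel, the bandwidth h, and the function name are assumptions made for illustration.

        import numpy as np

        def nadaraya_watson_predict(X_train, y_train, X_new, B, h=0.5):
            """Kernel-weighted average of y_train, with distances taken in the
            reduced space X @ B (B: assumed orthonormal subspace basis)."""
            Z_train, Z_new = X_train @ B, X_new @ B
            # Gaussian kernel weights on squared distances in the subspace.
            d2 = ((Z_new[:, None, :] - Z_train[None, :, :]) ** 2).sum(-1)
            W = np.exp(-d2 / (2 * h ** 2))
            return (W @ y_train) / W.sum(axis=1)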

    Locality Preserving Projections for Grassmann manifold

    Learning on the Grassmann manifold has become popular in many computer vision tasks, owing to its strong capability to extract discriminative information from image sets and videos. However, such learning algorithms, particularly on high-dimensional Grassmann manifolds, involve significantly high computational cost, which seriously limits their applicability in wider areas. In this research, we propose an unsupervised dimensionality reduction algorithm on the Grassmann manifold based on the Locality Preserving Projections (LPP) criterion. LPP is a commonly used dimensionality reduction algorithm for vector-valued data, aiming to preserve the local structure of data in the dimension-reduced space. The strategy is to construct a mapping from a higher-dimensional Grassmann manifold into a lower-dimensional one with more discriminative capability. The proposed method can be optimized as a basic eigenvalue problem. Its performance is assessed on several classification and clustering tasks, and the experimental results show clear advantages over other Grassmann-based algorithms.
    Comment: Accepted by IJCAI 201
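    For orientation, here is a minimal sketch of the classical vector-valued LPP criterion that the paper lifts to the Grassmann manifold, solved, as in the abstract, as a generalized eigenvalue problem. It uses only numpy/scipy; the neighbourhood size, heat-kernel scale, and regularization are assumptions, not the authors' settings.

        import numpy as np
        from scipy.linalg import eigh
        from scipy.spatial.distance import cdist

        def lpp(X, n_components=2, n_neighbors=5, t=1.0):
            """X: (n_samples, n_features) -> projection (n_features, n_components)."""
            n = X.shape[0]
            D2 = cdist(X, X, "sqeuclidean")
            W = np.exp(-D2 / t)                      # heat-kernel affinities
            idx = np.argsort(D2, axis=1)[:, 1:n_neighbors + 1]  # skip self
            mask = np.zeros_like(W, dtype=bool)
            mask[np.repeat(np.arange(n), n_neighbors), idx.ravel()] = True
            W = np.where(mask | mask.T, W, 0.0)      # symmetric kNN graph
            D = np.diag(W.sum(axis=1))
            L = D - W                                # graph Laplacian
            # Generalized eigenproblem X^T L X a = lambda X^T D X a;
            # the eigenvectors with smallest eigenvalues preserve local structure.
            A = X.T @ L @ X
            Bm = X.T @ D @ X + 1e-9 * np.eye(X.shape[1])  # regularized for stability
            _, vecs = eigh(A, Bm)
            return vecs[:, :n_components]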

    A Process for Topic Modelling Via Word Embeddings

    This work combines algorithms based on word embeddings, dimensionality reduction, and clustering, with the objective of obtaining topics from a set of unclassified texts. Word embeddings are obtained with the BERT model, a neural network architecture widely used in NLP tasks. Because of their high dimensionality, a dimensionality reduction technique called UMAP is applied; it reduces the number of dimensions while preserving part of the local and global structure of the original data. K-Means is used as the clustering algorithm to obtain the topics. The topics are then evaluated using TF-IDF statistics, Topic Diversity, and Topic Coherence to characterize the meaning of the words in each cluster. The results show good values, so this topic modelling process is a viable option for classifying or clustering unlabelled texts.
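    The pipeline described above maps naturally onto common Python tooling; the sketch below is one plausible rendering, assuming the sentence-transformers, umap-learn, and scikit-learn packages. The model name and cluster counts are illustrative assumptions, not the paper's settings.

        from sentence_transformers import SentenceTransformer
        import umap
        from sklearn.cluster import KMeans

        texts = ["..."]  # placeholder: replace with the unlabelled corpus

        # 1. BERT-based embeddings (model name is an assumption).
        embeddings = SentenceTransformer("all-MiniLM-L6-v2").encode(texts)

        # 2. UMAP reduces dimensionality while keeping local/global structure.
        reduced = umap.UMAP(n_components=5, metric="cosine").fit_transform(embeddings)

        # 3. K-Means groups the reduced vectors into candidate topics.
        labels = KMeans(n_clusters=10, n_init="auto").fit_predict(reduced)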