2,400 research outputs found

    Predicting Parameters in Deep Learning

    Full text link
    We demonstrate that there is significant redundancy in the parameterization of several deep learning models. Given only a few weight values for each feature it is possible to accurately predict the remaining values. Moreover, we show that not only can the parameter values be predicted, but many of them need not be learned at all. We train several different architectures by learning only a small number of weights and predicting the rest. In the best case we are able to predict more than 95% of the weights of a network without any drop in accuracy

    Regression on fixed-rank positive semidefinite matrices: a Riemannian approach

    Full text link
    The paper addresses the problem of learning a regression model parameterized by a fixed-rank positive semidefinite matrix. The focus is on the nonlinear nature of the search space and on scalability to high-dimensional problems. The mathematical developments rely on the theory of gradient descent algorithms adapted to the Riemannian geometry that underlies the set of fixed-rank positive semidefinite matrices. In contrast with previous contributions in the literature, no restrictions are imposed on the range space of the learned matrix. The resulting algorithms maintain a linear complexity in the problem size and enjoy important invariance properties. We apply the proposed algorithms to the problem of learning a distance function parameterized by a positive semidefinite matrix. Good performance is observed on classical benchmarks

    INFORMATION VISUALIZATION DESIGN FOR MULTIDIMENSIONAL DATA: INTEGRATING THE RANK-BY-FEATURE FRAMEWORK WITH HIERARCHICAL CLUSTERING

    Get PDF
    Interactive exploration of multidimensional data sets is challenging because: (1) it is difficult to comprehend patterns in more than three dimensions, and (2) current systems are often a patchwork of graphical and statistical methods leaving many researchers uncertain about how to explore their data in an orderly manner. This dissertation offers a set of principles and a novel rank-by-feature framework that could enable users to better understand multidimensional and multivariate data by systematically studying distributions in one (1D) or two dimensions (2D), and then discovering relationships, clusters, gaps, outliers, and other features. Users of this rank-by-feature framework can view graphical presentations (histograms, boxplots, and scatterplots), and then choose a feature detection criterion to rank 1D or 2D axis-parallel projections. By combining information visualization techniques (overview, coordination, and dynamic query) with summaries and statistical methods, users can systematically examine the most important 1D and 2D axis-parallel projections. This research provides a number of valuable contributions: Graphics, Ranking, and Interaction for Discovery (GRID) principles- a set of principles for exploratory analysis of multidimensional data, which are summarized as: (1) study 1D, study 2D, then find features (2) ranking guides insight, statistics confirm. GRID principles help users organize their discovery process in an orderly manner so as to produce more thorough analyses and extract deeper insights in any multidimensional data application. Rank-by-feature framework - a user interface framework based on the GRID principles. Interactive information visualization techniques are combined with statistical methods and data mining algorithms to enable users to orderly examine multidimensional data sets using 1D and 2D projections. The design and implementation of the Hierarchical Clustering Explorer (HCE), an information visualization tool available at www.cs.umd.edu/hcil/hce. HCE implements the rank-by-feature framework and supports interactive exploration of hierarchical clustering results to reveal one of the important features - clusters. Validation through case studies and user surveys: Case studies with motivated experts in three research fields and a user survey via emails to a wide range of HCE users demonstrated the efficacy of HCE and the rank-by-feature framework. These studies also revealed potential improvement opportunities in terms of design and implementation

    Doctor of Philosophy

    Get PDF
    dissertationWith the ever-increasing amount of available computing resources and sensing devices, a wide variety of high-dimensional datasets are being produced in numerous fields. The complexity and increasing popularity of these data have led to new challenges and opportunities in visualization. Since most display devices are limited to communication through two-dimensional (2D) images, many visualization methods rely on 2D projections to express high-dimensional information. Such a reduction of dimension leads to an explosion in the number of 2D representations required to visualize high-dimensional spaces, each giving a glimpse of the high-dimensional information. As a result, one of the most important challenges in visualizing high-dimensional datasets is the automatic filtration and summarization of the large exploration space consisting of all 2D projections. In this dissertation, a new type of algorithm is introduced to reduce the exploration space that identifies a small set of projections that capture the intrinsic structure of high-dimensional data. In addition, a general framework for summarizing the structure of quality measures in the space of all linear 2D projections is presented. However, identifying the representative or informative projections is only part of the challenge. Due to the high-dimensional nature of these datasets, obtaining insights and arriving at conclusions based solely on 2D representations are limited and prone to error. How to interpret the inaccuracies and resolve the ambiguity in the 2D projections is the other half of the puzzle. This dissertation introduces projection distortion error measures and interactive manipulation schemes that allow the understanding of high-dimensional structures via data manipulation in 2D projections
    • …
    corecore