14 research outputs found

    DiscoVars: A New Data Analysis Perspective -- Application in Variable Selection for Clustering

    Full text link
    We present a new data analysis perspective to determine variable importance regardless of the underlying learning task. Traditionally, variable selection is considered an important step in supervised learning for both classification and regression problems. The variable selection also becomes critical when costs associated with the data collection and storage are considerably high for cases like remote sensing. Therefore, we propose a new methodology to select important variables from the data by first creating dependency networks among all variables and then ranking them (i.e. nodes) by graph centrality measures. Selecting Top-nn variables according to preferred centrality measure will yield a strong candidate subset of variables for further learning tasks e.g. clustering. We present our tool as a Shiny app which is a user-friendly interface development environment. We also extend the user interface for two well-known unsupervised variable selection methods from literature for comparison reasons.Comment: 13 Pages, Technical Report, Verikar Softwar

    Improving the Efficacy of Context-Aware Applications

    Get PDF
    In this dissertation, we explore methods for enhancing the context-awareness capabilities of modern computers, including mobile devices, tablets, wearables, and traditional computers. Advancements include proposed methods for fusing information from multiple logical sensors, localizing nearby objects using depth sensors, and building models to better understand the content of 2D images. First, we propose a system called Unagi, designed to incorporate multiple logical sensors into a single framework that allows context-aware application developers to easily test new ideas and create novel experiences. Unagi is responsible for collecting data, extracting features, and building personalized models for each individual user. We demonstrate the utility of the system with two applications: adaptive notification filtering and a network content prefetcher. We also thoroughly evaluate the system with respect to predictive accuracy, temporal delay, and power consumption. Next, we discuss a set of techniques that can be used to accurately determine the location of objects near a user in 3D space using a mobile device equipped with both depth and inertial sensors. Using a novel chaining approach, we are able to locate objects farther away than the standard range of the depth sensor without compromising localization accuracy. Empirical testing shows our method is capable of localizing objects 30m from the user with an error of less than 10cm. Finally, we demonstrate a set of techniques that allow a multi-layer perceptron (MLP) to learn resolution-invariant representations of 2D images, including the proposal of an MCMC-based technique to improve the selection of pixels for mini-batches used for training. We also show that a deep convolutional encoder could be trained to output a resolution-independent representation in constant time, and we discuss several potential applications of this research, including image resampling, image compression, and security

    Benefitting from the Variables that Variable Selection Discards

    No full text
    In supervised learning variable selection is used to find a subset of the available inputs that accurately predict the output. This paper shows that some of the variables that variable selection discards can beneficially be used as extra outputs for inductive transfer. Using discarded input variables as extra outputs forces the model to learn mappings from the variables that were selected as inputs to these extra outputs. Inductive transfer makes what is learned by these mappings available to the model that is being trained on the main output, often resulting in improved performance on that main output
    corecore