22,626 research outputs found

    How to Solve Classification and Regression Problems on High-Dimensional Data with a Supervised Extension of Slow Feature Analysis

    Get PDF
    Supervised learning from high-dimensional data, e.g., multimedia data, is a challenging task. We propose an extension of slow feature analysis (SFA) for supervised dimensionality reduction called graph-based SFA (GSFA). The algorithm extracts a label-predictive low-dimensional set of features that can be post-processed by typical supervised algorithms to generate the final label or class estimation. GSFA is trained with a so-called training graph, in which the vertices are the samples and the edges represent similarities of the corresponding labels. A new weighted SFA optimization problem is introduced, generalizing the notion of slowness from sequences of samples to such training graphs. We show that GSFA computes an optimal solution to this problem in the considered function space, and propose several types of training graphs. For classification, the most straightforward graph yields features equivalent to those of (nonlinear) Fisher discriminant analysis. Emphasis is on regression, where four different graphs were evaluated experimentally with a subproblem of face detection on photographs. The method proposed is promising particularly when linear models are insufficient, as well as when feature selection is difficult

    From which world is your graph?

    Full text link
    Discovering statistical structure from links is a fundamental problem in the analysis of social networks. Choosing a misspecified model, or equivalently, an incorrect inference algorithm will result in an invalid analysis or even falsely uncover patterns that are in fact artifacts of the model. This work focuses on unifying two of the most widely used link-formation models: the stochastic blockmodel (SBM) and the small world (or latent space) model (SWM). Integrating techniques from kernel learning, spectral graph theory, and nonlinear dimensionality reduction, we develop the first statistically sound polynomial-time algorithm to discover latent patterns in sparse graphs for both models. When the network comes from an SBM, the algorithm outputs a block structure. When it is from an SWM, the algorithm outputs estimates of each node's latent position.Comment: To appear in NIPS 201

    The Data Big Bang and the Expanding Digital Universe: High-Dimensional, Complex and Massive Data Sets in an Inflationary Epoch

    Get PDF
    Recent and forthcoming advances in instrumentation, and giant new surveys, are creating astronomical data sets that are not amenable to the methods of analysis familiar to astronomers. Traditional methods are often inadequate not merely because of the size in bytes of the data sets, but also because of the complexity of modern data sets. Mathematical limitations of familiar algorithms and techniques in dealing with such data sets create a critical need for new paradigms for the representation, analysis and scientific visualization (as opposed to illustrative visualization) of heterogeneous, multiresolution data across application domains. Some of the problems presented by the new data sets have been addressed by other disciplines such as applied mathematics, statistics and machine learning and have been utilized by other sciences such as space-based geosciences. Unfortunately, valuable results pertaining to these problems are mostly to be found only in publications outside of astronomy. Here we offer brief overviews of a number of concepts, techniques and developments, some "old" and some new. These are generally unknown to most of the astronomical community, but are vital to the analysis and visualization of complex datasets and images. In order for astronomers to take advantage of the richness and complexity of the new era of data, and to be able to identify, adopt, and apply new solutions, the astronomical community needs a certain degree of awareness and understanding of the new concepts. One of the goals of this paper is to help bridge the gap between applied mathematics, artificial intelligence and computer science on the one side and astronomy on the other.Comment: 24 pages, 8 Figures, 1 Table. Accepted for publication: "Advances in Astronomy, special issue "Robotic Astronomy
    corecore