64,926 research outputs found

    A survey of kernel and spectral methods for clustering

    Get PDF
    Clustering algorithms are a useful tool to explore data structures and have been employed in many disciplines. The focus of this paper is the partitioning clustering problem with a special interest in two recent approaches: kernel and spectral methods. The aim of this paper is to present a survey of kernel and spectral clustering methods, two approaches able to produce nonlinear separating hypersurfaces between clusters. The presented kernel clustering methods are the kernel version of many classical clustering algorithms, e.g., K-means, SOM and neural gas. Spectral clustering arise from concepts in spectral graph theory and the clustering problem is configured as a graph cut problem where an appropriate objective function has to be optimized. An explicit proof of the fact that these two paradigms have the same objective is reported since it has been proven that these two seemingly different approaches have the same mathematical foundation. Besides, fuzzy kernel clustering methods are presented as extensions of kernel K-means clustering algorithm. (C) 2007 Pattem Recognition Society. Published by Elsevier Ltd. All rights reserved

    A Framework for Spatio-Temporal Data Analysis and Hypothesis Exploration

    Get PDF
    We present a general framework for pattern discovery and hypothesis exploration in spatio-temporal data sets that is based on delay-embedding. This is a remarkable method of nonlinear time-series analysis that allows the full phase-space behaviour of a system to be reconstructed from only a single observable (accessible variable). Recent extensions to the theory that focus on a probabilistic interpretation extend its scope and allow practical application to noisy, uncertain and high-dimensional systems. The framework uses these extensions to aid alignment of spatio-temporal sub-models (hypotheses) to empirical data - for example satellite images plus remote-sensing - and to explore modifications consistent with this alignment. The novel aspect of the work is a mechanism for linking global and local dynamics using a holistic spatio-temporal feedback loop. An example framework is devised for an urban based application, transit centric developments, and its utility is demonstrated with real data

    Statistical thinking: From Tukey to Vardi and beyond

    Get PDF
    Data miners (minors?) and neural networkers tend to eschew modelling, misled perhaps by misinterpretation of strongly expressed views of John Tukey. I discuss Vardi's views of these issues as well as other aspects of Vardi's work in emision tomography and in sampling bias.Comment: Published at http://dx.doi.org/10.1214/074921707000000210 in the IMS Lecture Notes Monograph Series (http://www.imstat.org/publications/lecnotes.htm) by the Institute of Mathematical Statistics (http://www.imstat.org

    Enhancing Undergraduate AI Courses through Machine Learning Projects

    Full text link
    It is generally recognized that an undergraduate introductory Artificial Intelligence course is challenging to teach. This is, in part, due to the diverse and seemingly disconnected core topics that are typically covered. The paper presents work funded by the National Science Foundation to address this problem and to enhance the student learning experience in the course. Our work involves the development of an adaptable framework for the presentation of core AI topics through a unifying theme of machine learning. A suite of hands-on semester-long projects are developed, each involving the design and implementation of a learning system that enhances a commonly-deployed application. The projects use machine learning as a unifying theme to tie together the core AI topics. In this paper, we will first provide an overview of our model and the projects being developed and will then present in some detail our experiences with one of the projects – Web User Profiling which we have used in our AI class

    From Frequency to Meaning: Vector Space Models of Semantics

    Full text link
    Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field
    corecore