52,033 research outputs found

    Are you going to the party: depends, who else is coming? [Learning hidden group dynamics via conditional latent tree models]

    Scalable probabilistic modeling and prediction in high-dimensional multivariate time series is a challenging problem, particularly for systems with hidden sources of dependence and/or homogeneity. Examples of such problems include dynamic social networks with co-evolving nodes and edges and dynamic student learning in online courses. Here, we address these problems through the discovery of hierarchical latent groups. We introduce a family of Conditional Latent Tree Models (CLTM), in which tree-structured latent variables incorporate the unknown groups. The latent tree itself is conditioned on observed covariates such as seasonality, historical activity, and node attributes. We propose a statistically efficient framework for learning both the hierarchical tree structure and the parameters of the CLTM. We demonstrate competitive performance on multiple real-world datasets from different domains. These include a dataset on students' attempts at answering questions in a psychology MOOC, Twitter users participating in an emergency management discussion and interacting with one another, and windsurfers interacting on a beach in Southern California. In addition, our modeling framework provides valuable and interpretable information about the hidden group structures and their effect on the evolution of the time series.

    Decoding the urban grid: or why cities are neither trees nor perfect grids

    In a previous paper (Figueiredo and Amorim, 2005), we introduced continuity lines, a compressed description that encapsulates topological and geometrical properties of urban grids. In this paper, we apply this technique to a large database of maps that includes cities from 22 countries. We explore how this representation encodes universal features of urban grids into networks and, at the same time, retrieves differences that reflect classes of cities. We then propose an emergent taxonomy for urban grids.

    Cayley Trees and Bethe Lattices, a concise analysis for mathematicians and physicists

    We critically review the concepts and applications of Cayley Trees and Bethe Lattices in statistical mechanics, in a tentative effort to remove widespread misuse of these simple yet important, and different, ideal graphs. We illustrate, in particular, two rigorous techniques for dealing with Bethe Lattices, based respectively on self-similarity and on the Kolmogorov consistency theorem, linking the latter with the Cavity and Belief Propagation methods, which are better known to the physics community. Comment: 10 pages, 2 figures

    Impact of Biases in Big Data

    The underlying paradigm of big data-driven machine learning reflects the desire to derive better conclusions simply by analyzing more data, without the need to look at theory and models. Is simply having more data always helpful? In 1936, The Literary Digest collected 2.3M filled-in questionnaires to predict the outcome of that year's US presidential election. The outcome of this big data prediction proved to be entirely wrong, whereas George Gallup needed only 3K handpicked people to make an accurate prediction. Generally, biases occur in machine learning whenever the distributions of the training set and the test set differ. In this work, we provide a review of different sorts of biases in (big) data sets in machine learning. We provide definitions and discussions of the most commonly appearing biases in machine learning: class imbalance and covariate shift. We also show how these biases can be quantified and corrected. This work is an introductory text for both researchers and practitioners to become more aware of this topic and thus to derive more reliable models for their learning problems.
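
    The abstract above mentions quantifying and correcting class imbalance. The following is a minimal sketch of one common approach, not the method described in the paper: it measures imbalance as the ratio of the largest to the smallest class count and corrects it with inverse-frequency class weights. The function names `imbalance_ratio` and `inverse_frequency_weights` and the toy label list are illustrative assumptions.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the most frequent class count to the least frequent one.

    A value of 1.0 means perfectly balanced classes (hypothetical helper).
    """
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def inverse_frequency_weights(labels):
    """Per-class weights n / (k * count), where n is the number of samples
    and k is the number of classes. With these weights every class
    contributes the same total weight to a weighted loss (hypothetical helper).
    """
    counts = Counter(labels)
    n = len(labels)
    k = len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


# Toy data (assumed for illustration): 90 negatives, 10 positives.
labels = [0] * 90 + [1] * 10
ratio = imbalance_ratio(labels)
weights = inverse_frequency_weights(labels)

# Total weight per class after reweighting; both classes end up equal.
total_weight = Counter()
for y in labels:
    total_weight[y] += weights[y]
```

    On the toy labels, the majority class is nine times larger than the minority class, and after reweighting each class carries the same total weight. Covariate shift would call for a different correction (for example, importance weighting by an estimated density ratio), which this sketch does not cover.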

    Surface networks

    © Copyright CASA, UCL. The desire to understand and exploit the structure of continuous surfaces is common to researchers in a range of disciplines. A few examples of the varied surfaces forming an integral part of modern subjects include terrain, population density, surface atmospheric pressure, physico-chemical surfaces, computer graphics, and metrological surfaces. The focus of the work here is a group of data structures called Surface Networks, which abstract 2-dimensional surfaces by storing only the most important (also called fundamental, critical or surface-specific) points and lines in the surfaces. Surface networks are intelligent and “natural” data structures because, unlike the DEM or TIN data structures, they store a surface as a framework of “surface” elements. This report presents an overview of previous work and of the ideas being developed by the authors of this report. The research on surface networks has fou