50,872 research outputs found

    Network Inference from Co-Occurrences

    Full text link
    The recovery of network structure from experimental data is a basic and fundamental problem. Unfortunately, experimental data often do not directly reveal structure due to inherent limitations such as imprecision in timing or other observation mechanisms. We consider the problem of inferring network structure in the form of a directed graph from co-occurrence observations. Each observation arises from a transmission made over the network and indicates which vertices carry the transmission without explicitly conveying their order in the path. Without order information, there are an exponential number of feasible graphs which agree with the observed data equally well. Yet, the basic physical principles underlying most networks strongly suggest that all feasible graphs are not equally likely. In particular, vertices that co-occur in many observations are probably closely connected. Previous approaches to this problem are based on ad hoc heuristics. We model the experimental observations as independent realizations of a random walk on the underlying graph, subjected to a random permutation which accounts for the lack of order information. Treating the permutations as missing data, we derive an exact expectation-maximization (EM) algorithm for estimating the random walk parameters. For long transmission paths the exact E-step may be computationally intractable, so we also describe an efficient Monte Carlo EM (MCEM) algorithm and derive conditions which ensure convergence of the MCEM algorithm with high probability. Simulations and experiments with Internet measurements demonstrate the promise of this approach.Comment: Submitted to IEEE Transactions on Information Theory. An extended version is available as University of Wisconsin Technical Report ECE-06-

    From Relational Data to Graphs: Inferring Significant Links using Generalized Hypergeometric Ensembles

    Full text link
    The inference of network topologies from relational data is an important problem in data analysis. Exemplary applications include the reconstruction of social ties from data on human interactions, the inference of gene co-expression networks from DNA microarray data, or the learning of semantic relationships based on co-occurrences of words in documents. Solving these problems requires techniques to infer significant links in noisy relational data. In this short paper, we propose a new statistical modeling framework to address this challenge. It builds on generalized hypergeometric ensembles, a class of generative stochastic models that give rise to analytically tractable probability spaces of directed, multi-edge graphs. We show how this framework can be used to assess the significance of links in noisy relational data. We illustrate our method in two data sets capturing spatio-temporal proximity relations between actors in a social system. The results show that our analytical framework provides a new approach to infer significant links from relational data, with interesting perspectives for the mining of data on social systems.Comment: 10 pages, 8 figures, accepted at SocInfo201

    Multi-Object Classification and Unsupervised Scene Understanding Using Deep Learning Features and Latent Tree Probabilistic Models

    Get PDF
    Deep learning has shown state-of-art classification performance on datasets such as ImageNet, which contain a single object in each image. However, multi-object classification is far more challenging. We present a unified framework which leverages the strengths of multiple machine learning methods, viz deep learning, probabilistic models and kernel methods to obtain state-of-art performance on Microsoft COCO, consisting of non-iconic images. We incorporate contextual information in natural images through a conditional latent tree probabilistic model (CLTM), where the object co-occurrences are conditioned on the extracted fc7 features from pre-trained Imagenet CNN as input. We learn the CLTM tree structure using conditional pairwise probabilities for object co-occurrences, estimated through kernel methods, and we learn its node and edge potentials by training a new 3-layer neural network, which takes fc7 features as input. Object classification is carried out via inference on the learnt conditional tree model, and we obtain significant gain in precision-recall and F-measures on MS-COCO, especially for difficult object categories. Moreover, the latent variables in the CLTM capture scene information: the images with top activations for a latent node have common themes such as being a grasslands or a food scene, and on on. In addition, we show that a simple k-means clustering of the inferred latent nodes alone significantly improves scene classification performance on the MIT-Indoor dataset, without the need for any retraining, and without using scene labels during training. Thus, we present a unified framework for multi-object classification and unsupervised scene understanding

    SciRecSys: A Recommendation System for Scientific Publication by Discovering Keyword Relationships

    Full text link
    In this work, we propose a new approach for discovering various relationships among keywords over the scientific publications based on a Markov Chain model. It is an important problem since keywords are the basic elements for representing abstract objects such as documents, user profiles, topics and many things else. Our model is very effective since it combines four important factors in scientific publications: content, publicity, impact and randomness. Particularly, a recommendation system (called SciRecSys) has been presented to support users to efficiently find out relevant articles
    • …
    corecore