27 research outputs found

    Hierarchical relational models for document networks

    We develop the relational topic model (RTM), a hierarchical model of both network structure and node attributes. We focus on document networks, where the attributes of each document are its words, that is, discrete observations taken from a fixed vocabulary. For each pair of documents, the RTM models their link as a binary random variable that is conditioned on their contents. The model can be used to summarize a network of documents, predict links between them, and predict words within them. We derive efficient inference and estimation algorithms based on variational methods that take advantage of sparsity and scale with the number of links. We evaluate the predictive performance of the RTM for large networks of scientific abstracts, web documents, and geographically tagged news. Comment: Published at http://dx.doi.org/10.1214/09-AOAS309 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Mixed membership stochastic blockmodels

    Observations consisting of measurements on relationships for pairs of objects arise in many settings, such as protein interaction and gene regulatory networks, collections of author-recipient email, and social networks. Analyzing such data with probabilistic models can be delicate because the simple exchangeability assumptions underlying many boilerplate models no longer hold. In this paper, we describe a latent variable model of such data called the mixed membership stochastic blockmodel. This model extends blockmodels for relational data to ones which capture mixed membership latent relational structure, thus providing an object-specific low-dimensional representation. We develop a general variational inference algorithm for fast approximate posterior inference. We explore applications to social and protein interaction networks. Comment: 46 pages, 14 figures, 3 tables

    Automatic information retrieval through text-mining

    Dissertation presented for obtaining the Master’s Degree in Electrical Engineering and Computer Science, at Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia. Nowadays, a large number of firms in the European Union are classified as Small and Medium Enterprises (SMEs), and they employ a great portion of the active workforce in Europe. Nonetheless, SMEs cannot afford to implement methods or tools to systematically adapt innovation as a part of their business process. Innovation is the engine of competitiveness in the globalized environment, especially in the current socio-economic situation. This thesis provides a platform that, when integrated with the ExtremeFactories (EF) project, helps SMEs become more competitive by means of a monitoring schedule functionality. The thesis presents a text-mining platform that can schedule information gathering through keywords. Several implementation choices were made in developing the platform; one that requires particular emphasis is the framework, Apache Lucene Core 2, which supplies an efficient text-mining tool and is used heavily for the purposes of the thesis

    Pervasive sensing to model political opinions in face-to-face networks

    Exposure and adoption of opinions in social networks are important questions in education, business, and government. We describe a novel application of pervasive computing based on using mobile phone sensors to measure and model the face-to-face interactions and subsequent opinion changes amongst undergraduates, during the 2008 US presidential election campaign. We find that self-reported political discussants have characteristic interaction patterns and can be predicted from sensor data. Mobile features can be used to estimate unique individual exposure to different opinions, and help discover surprising patterns of dynamic homophily related to external political events, such as election debates and election day. To our knowledge, this is the first time such dynamic homophily effects have been measured. Automatically estimated exposure explains individual opinions on election day. Finally, we report statistically significant differences in the daily activities of individuals that change political opinions versus those that do not, by modeling and discovering dominant activities using topic models. We find people who decrease their interest in politics are routinely exposed (face-to-face) to friends with little or no interest in politics. U.S. Army Research Laboratory (Cooperative Agreement No. W911NF-09-2-0053); United States. Air Force Office of Scientific Research (Award No. FA9550-10-1-0122); Swiss National Science Foundation

    A Survey of Probabilistic Models for Relational Data

    Transforming Graph Representations for Statistical Relational Learning

    Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since the choice of relational data representation for the nodes, links, and features can dramatically affect the capabilities of SRL algorithms, we survey approaches and opportunities for relational representation transformation designed to improve the performance of these algorithms. This leads us to introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. In particular, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey and compare competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed