90 research outputs found

    Overcoming uncertainty for within-network relational machine learning

    People increasingly communicate through email and social networks to maintain friendships and conduct business, as well as to share online content such as pictures, videos and products. Relational machine learning (RML) utilizes a set of observed attributes and network structure to predict corresponding labels for items; for example, to predict individuals engaged in securities fraud, we can utilize phone calls and workplace information to make joint predictions over the individuals. However, in large-scale and partially observed network domains, missing labels and edges can significantly impact standard relational machine learning methods by introducing bias into the learning and inference processes. In this thesis, we identify the effects on parameter estimation, correct the biases, and model the uncertainty of the missing data to improve predictive performance. In particular, we investigate this issue in a variety of modeling scenarios and prediction problems. First, we introduce the Transitive Chung-Lu (TCL) random graph model for modeling the conditional distribution of edges given a partially observed network. This model fits within a class of scalable generative graph models with scalable sampling processes that we generalize to model distributions of networks with correlated attribute variables via Attributed Graph Models. Second, we utilize TCL to incorporate edge probabilities into relational learning and inference models for partially observed network domains. As part of this work, we give a linear-time algorithm to perform variational inference over a squared network. We apply the resulting semi-supervised model, Probabilistic Relational EM (PR-EM), to the Active Exploration domain to iteratively locate positive examples in partially observed networks. Due to the sampling process, this domain exhibits extreme bias for learning and inference: we show that PR-EM operates with high accuracy despite the difficult domain.
Third, we investigate the performance of Relational EM methods for semi-supervised relational learning in partially labeled networks and find that fixed point estimates have considerable approximation errors during learning and inference. To solve this, we propose two stochastic methods, Relational Stochastic EM and Relational Data Augmentation, for semi-supervised relational learning and demonstrate that these approaches improve over the Relational EM method. Fourth, we improve on existing semi-supervised learning methods by imposing hard constraints on the inference steps, allowing semi-supervised methods to learn using better approximations during learning and inference for partially labeled networks. In particular, we find that we can correct for the approximated parameter learning errors during the collective inference step by imposing a Maximum Entropy constraint. We find that this correction allows us to utilize a better approximation over the unlabeled data. In addition, we prove that, given an allowable error, this method adds only a constant overhead to the original collective inference method. Overall, all of the methods presented in this thesis have provable subquadratic runtimes. We demonstrate each on large-scale networks, in some cases including networks with millions of vertices and/or edges. Across all these approaches, we show that incorporating the uncertainty into the modeling process improves modeling and predictive performance.
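
For readers unfamiliar with the Chung-Lu family that the abstract above builds on, the following is a minimal sketch of the plain (non-transitive) Chung-Lu idea, in which each edge endpoint is drawn with probability proportional to a node's target degree. It is not the thesis's TCL model, which additionally models transitive closure; the function name `chung_lu_edges` and its parameters are hypothetical and chosen only for illustration.

```python
import random


def chung_lu_edges(degrees, seed=0):
    """Sample an undirected edge set whose expected degrees roughly follow `degrees`.

    Assumption: this is the basic Chung-Lu sampler, not the Transitive
    Chung-Lu (TCL) model described in the abstract.
    """
    rng = random.Random(seed)
    nodes = list(range(len(degrees)))
    num_draws = sum(degrees) // 2  # one draw per target edge
    edges = set()
    for _ in range(num_draws):
        # Draw both endpoints proportionally to their target degree.
        u, v = rng.choices(nodes, weights=degrees, k=2)
        if u != v:  # self-loops are dropped; duplicate edges collapse in the set
            edges.add((min(u, v), max(u, v)))
    return edges


if __name__ == "__main__":
    print(sorted(chung_lu_edges([3, 2, 2, 1])))
```

Because each draw costs time independent of the number of node pairs, the sampler runs in time proportional to the number of edges rather than quadratic in the number of nodes, which is the kind of scalability the abstract refers to.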

    Modeling Heterogeneous Peer Assortment Effects using Latent Class Pseudo-Maximum Likelihood Exponential Random Graph Models

    This thesis develops a class of models for inference on networks called Sender/Receiver Latent Class Exponential Random Graph Models (SRLCERGMs). This class of models extends the existing Exponential Random Graph Modeling framework to allow analysts to model unobserved heterogeneity in the effects of nodal covariates and network features. Simulations across a variety of conditions are presented to evaluate the performance of this technique, and an empirical example regarding substance use among adolescents is also presented. Implications for the analysis of social networks in psychological science are discussed. Master of Arts

    Hinge-Loss Markov Random Fields and Probabilistic Soft Logic: A Scalable Approach to Structured Prediction

    A fundamental challenge in developing impactful artificial intelligence technologies is balancing the ability to model rich, structured domains with the ability to scale to big data. Many important problem areas are both richly structured and large scale, from social and biological networks, to knowledge graphs and the Web, to images, video, and natural language. In this thesis I introduce two new formalisms for modeling structured data, distinguished from previous approaches by their ability to both capture rich structure and scale to big data. The first, hinge-loss Markov random fields (HL-MRFs), is a new kind of probabilistic graphical model that generalizes different approaches to convex inference. I unite three views of inference from the randomized algorithms, probabilistic graphical models, and fuzzy logic communities, showing that all three views lead to the same inference objective. I then derive HL-MRFs by generalizing this unified objective. The second new formalism, probabilistic soft logic (PSL), is a probabilistic programming language that makes HL-MRFs easy to define, refine, and reuse for relational data. PSL uses a syntax based on first-order logic to compactly specify complex models. I next introduce an algorithm for inferring most-probable variable assignments (MAP inference) for HL-MRFs that is extremely scalable, much more so than commercially available software, because it uses message passing to leverage the sparse dependency structures common in inference tasks. I then show how to learn the parameters of HL-MRFs using a number of learning objectives. The learned HL-MRFs are as accurate as traditional, discrete models, but much more scalable. To enable HL-MRFs and PSL to capture even richer dependencies, I then extend learning to support latent variables, i.e., variables without training labels. 
To overcome the bottleneck of repeated inferences required during learning, I introduce paired-dual learning, which interleaves inference and parameter updates. Paired-dual learning learns accurate models and is also scalable, often completing before traditional methods make even one parameter update. Together, these algorithms enable HL-MRFs and PSL to model rich, structured data at scales not previously possible.
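
As a rough illustration of the kind of potential an HL-MRF is built from, the sketch below computes the hinge-loss "distance to satisfaction" of a logical rule under the Łukasiewicz relaxation, with truth values in [0, 1]. The function names and the weight parameter are hypothetical and used only for illustration; this is not PSL's actual API or syntax.

```python
def rule_distance_to_satisfaction(body, head):
    """Hinge-loss relaxation of the rule body_1 & ... & body_n -> head.

    `body` is a list of truth values in [0, 1]; `head` is a truth value in [0, 1].
    Returns 0.0 when the rule is satisfied, and grows linearly as it is violated.
    """
    return max(0.0, sum(body) - (len(body) - 1) - head)


def hinge_potential(linear_value, weight=1.0, squared=False):
    """Weighted hinge-loss potential: weight * max(0, value) ** p, with p in {1, 2}."""
    hinge = max(0.0, linear_value)
    return weight * (hinge ** 2 if squared else hinge)


if __name__ == "__main__":
    # Hypothetical rule: Friends(a, b) & Smokes(a) -> Smokes(b)
    print(rule_distance_to_satisfaction([1.0, 1.0], 0.0))  # fully violated
    print(rule_distance_to_satisfaction([1.0, 0.5], 0.5))  # satisfied
```

Each potential is a maximum of zero and a linear function of the variables, so a weighted sum of such potentials is convex in the continuous truth values. This is what makes MAP inference a convex optimization problem, in line with the scalability claims in the abstract.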

    Networks of innovation: measuring, modelling and enhancing innovation in surgery

    The rate of innovation occurring in surgery is beyond our systemic capacity to quantify, with several methodological and practical challenges. The existing paucity of surgical innovation metrics presents a global healthcare problem, especially as surgical innovations become increasingly costly at a time when healthcare provision is experiencing a radical transformation driven by pressures to reduce costs, an ageing population with ever-increasing healthcare needs and patients with growing expectations. This thesis aims to devise a novel, quantitative, network-based framework that will permit modelling and measuring surgical innovation to add the most value to patient care. It involves the systematic, graphical and analytical assessment of surgical innovation in a way that has never been done before. This is based on successful models previously applied in industry, combined with advanced analytical techniques derived from social science (network analysis). In doing so, it offers an exciting new perspective and opportunity for understanding how the innovation process originates and evolves in surgery and how it can be measured in terms of value and virality, a priority for the NHS, RCS, Imperial and the wider surgical community. The ability to measure value and rank innovations is expected to play a fundamental role in guiding policy, strategically directing surgical research funding, and uncovering innovation barriers and catalysts. This will ensure participation at the forefront of novel surgical technology and lay the scientific foundations for the development of improved healthcare models and services to enhance the quality of healthcare delivered. Open Access

    (how) do CEO turnover and succession matter?

    Business exit has implications for a firm’s corporate strategy. Two types of exit events are distinguished: those that involve strategic change and those that are status quo-preserving. This study investigates the impact of CEO turnover and succession on strategic versus status quo-preserving business exits. Based on a sample of CEO turnover and succession events and subsequent business exits of German corporations from different industries, our results suggest that neither voluntary nor involuntary CEO turnover is relevant to business exit. In contrast, outsider succession significantly affects the likelihood of strategic business exit, while a corporation’s performance does not moderate this relationship.