
114 research outputs found

Overcoming uncertainty for within-network relational machine learning

Author: Pfeiffer Joseph J.
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2015

People increasingly communicate through email and social networks to maintain friendships and conduct business, as well as share online content such as pictures, videos and products. Relational machine learning (RML) utilizes a set of observed attributes and network structure to predict corresponding labels for items; for example, to predict individuals engaged in securities fraud, we can utilize phone calls and workplace information to make joint predictions over the individuals. However, in large-scale and partially observed network domains, missing labels and edges can significantly impact standard relational machine learning methods by introducing bias into the learning and inference processes. In this thesis, we identify the effects on parameter estimation, correct the biases, and model the uncertainty of the missing data to improve predictive performance. In particular, we investigate this issue on a variety of modeling scenarios and prediction problems. First, we introduce the Transitive Chung Lu (TCL) random graph model for modeling the conditional distribution of edges given a partially observed network. This model fits within a class of scalable generative graph models with scalable sampling processes that we generalize to model distributions of networks with correlated attribute variables via Attributed Graph Models. Second, we utilize TCL to incorporate edge probabilities into relational learning and inference models for partially observed network domains. As part of this work, we give a linear time algorithm to perform variational inference over a squared network. We apply the resulting semi-supervised model, Probabilistic Relational EM (PR-EM), to the Active Exploration domain to iteratively locate positive examples in partially observed networks. Due to the sampling process, this domain exhibits extreme bias for learning and inference: we show that PR-EM operates with high accuracy despite the difficult domain.
Third, we investigate the performance of Relational EM methods for semi-supervised relational learning in partially labeled networks and find that fixed point estimates have considerable approximation errors during learning and inference. To solve this, we propose two stochastic methods, Relational Stochastic EM and Relational Data Augmentation, for semi-supervised relational learning and demonstrate that these approaches improve over the Relational EM method. Fourth, we improve on existing semi-supervised learning methods by imposing hard constraints on the inference steps, allowing semi-supervised methods to learn using better approximations during learning and inference for partially labeled networks. In particular, we find that we can correct for the approximated parameter learning errors during the collective inference step by imposing a Maximum Entropy constraint. We find that this correction allows us to utilize a better approximation over the unlabeled data. In addition, we prove that, given an allowable error, this method adds only a constant overhead to the original collective inference method. Overall, all of the methods presented in this thesis have provable subquadratic runtimes. We demonstrate each on large-scale networks, in some cases including networks with millions of vertices and/or edges. Across all these approaches, we show that incorporating the uncertainty into the modeling process improves modeling and predictive performance.
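For readers unfamiliar with the random graph family this abstract builds on, the following is a minimal sketch of the basic Chung-Lu edge sampler, in which each endpoint of an edge is drawn with probability proportional to a node's target degree. It is not the thesis's Transitive Chung Lu model, which is not specified in the abstract; the function name `chung_lu_edges` and its parameters are assumptions made for illustration.

```python
import random


def chung_lu_edges(degrees, seed=0):
    """Sample an undirected edge set with the basic Chung-Lu procedure.

    Hypothetical helper for illustration only: this is the plain Chung-Lu
    sampler, not the Transitive Chung Lu (TCL) model from the thesis.
    """
    rng = random.Random(seed)
    nodes = list(range(len(degrees)))
    num_draws = sum(degrees) // 2  # one draw per target edge
    edges = set()
    for _ in range(num_draws):
        # Both endpoints are drawn proportionally to the target degrees.
        u, v = rng.choices(nodes, weights=degrees, k=2)
        if u != v:  # discard self-loops; duplicate edges collapse in the set
            edges.add((min(u, v), max(u, v)))
    return edges


if __name__ == "__main__":
    print(sorted(chung_lu_edges([3, 2, 2, 1])))
```

Because self-loops and repeated edges are discarded, the sampled graph can have fewer edges than half the degree sum; each draw costs time independent of the number of node pairs, which is what makes this family of samplers scale to large networks.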

Purdue E-Pubs

Choosing between Auctions and Negotiations in Online B2B Markets for IT Services: The Effect of Prior Relationships and Performance

Author: Heck H.W.G.M. van
Koppius O.R.
Radkevitch U.L.

The choice of contract allocation mechanism in procurement affects such aspects of transactions as information exchange between buyer and supplier, supplier competition, pricing and, eventually, performance. In this study we investigate the buyer's choice between reverse auctions and bilateral negotiations as an allocation mechanism for IT services contracts. Prior studies into allocation mechanism choice focused on factors pertaining to the discrete exchange situation, such as contract complexity or availability of suppliers. We broaden the research by focusing on buyers' past exchange relationships with vendors. Based on the literature on the economics of contracting and agency theory, we hypothesize that prior repeat interaction with vendors favors the use of negotiations over auctions in the next transaction, while the need to explore the marketplace due to the buyer's inexperience or dissatisfaction with the vendor's performance in the most recent project leads to the use of auctions instead of negotiations. We find support for these hypotheses in a longitudinal dataset of 2,081 IT projects realized by 91 repeat buyers at a leading online services marketplace over a period of eight years. Taken together, the results show that analyzing B2B auctions and negotiations should move beyond analyzing discrete instances and instead analyze them in the context of the individual firm's history and supplier strategy. Keywords: outsourcing; IT services; online marketplace; reverse auctions

Research Papers in Economics


The analysis of social network data: an exciting frontier for statisticians

Author: O'Malley James James
Publication venue: 'Wiley'
Publication date: 30/04/2013

The catalyst for this paper is the recent interest in the relationship between social networks and an individual's health, which has arisen following a series of papers by Nicholas Christakis and James Fowler on person-to-person spread of health behaviors. In this issue, they provide a detailed explanation of their methods that offers insights, justifications, and responses to criticisms [1]. In this paper, we introduce some of the key statistical methods used in social network analysis and indicate where those used by Christakis and Fowler (CF) fit into the general framework. The intent is to provide the background necessary for readers to be able to make their own evaluation of the work by CF and understand the challenges of research involving social networks. We entertain possible solutions to some of the difficulties encountered in accounting for confounding effects in analyses of peer effects and provide comments on the contributions of CF.

Harvard University - DASH

Scalable Text and Link Analysis with Mixed-Topic Link Models

Author: Chang J.
Cohn D.
Getoor L.
Gopalan P.
Gruber A.
Lu Q.
Yang T.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 28/03/2013

Many data sets contain rich information about objects, as well as pairwise relations between them. For instance, in networks of websites, scientific papers, and other documents, each node has content consisting of a collection of words, as well as hyperlinks or citations to other nodes. In order to perform inference on such data sets, and make predictions and recommendations, it is useful to have models that are able to capture the processes which generate the text at each node and the links between them. In this paper, we combine classic ideas in topic modeling with a variant of the mixed-membership block model recently developed in the statistical physics community. The resulting model has the advantage that its parameters, including the mixture of topics of each document and the resulting overlapping communities, can be inferred with a simple and scalable expectation-maximization algorithm. We test our model on three data sets, performing unsupervised topic classification and link prediction. For both tasks, our model outperforms several existing state-of-the-art methods, achieving higher accuracy with significantly less computation, analyzing a data set with 1.3 million words and 44 thousand links in a few minutes. Comment: 11 pages, 4 figures

arXiv.org e-Print Archive

CiteSeerX

Crossref

An Introduction to Conditional Random Fields

Author: McCallum Andrew
Sutton Charles
Publication venue: 'Now Publishers'
Publication date: 01/01/2012

CiteSeerX

Crossref

Edinburgh Research Explorer

Predicting Semantic Relations using Global Graph Properties

Author: Eisenstein Jacob
Pinter Yuval
Publication date: 01/01/2018

Semantic graphs, such as WordNet, are resources which curate natural language on two distinguishable layers. On the local level, individual relations between synsets (semantic building blocks) such as hypernymy and meronymy enhance our understanding of the words used to express their meanings. Globally, analysis of graph-theoretic properties of the entire net sheds light on the structure of human language as a whole. In this paper, we combine global and local properties of semantic graphs through the framework of Max-Margin Markov Graph Models (M3GM), a novel extension of the Exponential Random Graph Model (ERGM) that scales to large multi-relational graphs. We demonstrate how such global modeling improves performance on the local task of predicting semantic relations between synsets, yielding new state-of-the-art results on the WN18RR dataset, a challenging version of WordNet link prediction in which "easy" reciprocal cases are removed. In addition, the M3GM model identifies multirelational motifs that are characteristic of well-formed lexical semantic ontologies. Comment: EMNLP 201

arXiv.org e-Print Archive

Crossref