Unsupervised Prediction Aggregation

Abstract

Consider the scenario where votes from multiple experts utilizing different data modalities or modeling assumptions are available for a given prediction task. The task of combining these signals with the goal of obtaining a better prediction is ubiquitous in Information Retrieval (IR), Natural Language Processing (NLP), and many other areas. In IR, for instance, meta-search aims to combine the outputs of multiple search engines to produce a better ranking. In NLP, aggregation of the outputs of systems producing natural language translations [7], syntactic dependency parses [8], identifications of intended word meanings [1], and others has received considerable recent attention. Most existing learning approaches to aggregation address the supervised setting. However, for complex prediction tasks such as these, data annotation is a very labor-intensive and time-consuming process. In this line of work, we first derive a mathematical and algorithmic framework for learning to combine predictions from multiple signals without supervision. In particular, we use the extended Mallows formalism (e.g. [5, 4]) for modeling aggregation, and derive an unsupervised learning procedure for estimating the model parameters [2]. While direct application of the learning framework can be computationally expensive in general, we propose alternatives to keep learning and inference tractable.
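To make the aggregation model concrete, the following is a minimal sketch of inference under an extended Mallows model over rankings: given expert rankings and per-expert concentration parameters (the `thetas` below are hypothetical stand-ins for the parameters the unsupervised procedure would estimate), the consensus ranking is the one minimizing the weighted Kendall tau distance to the experts. The exhaustive search shown here is for illustration at toy scale only; the paper's point is precisely that tractable alternatives are needed.

```python
from itertools import permutations

def kendall_tau(pi, sigma):
    """Number of discordant item pairs between two rankings (lists of items)."""
    pos_pi = {item: i for i, item in enumerate(pi)}
    pos_sigma = {item: i for i, item in enumerate(sigma)}
    items = list(pi)
    d = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            a, b = items[i], items[j]
            # A pair is discordant if the two rankings order it oppositely.
            if (pos_pi[a] - pos_pi[b]) * (pos_sigma[a] - pos_sigma[b]) < 0:
                d += 1
    return d

def aggregate(expert_rankings, thetas):
    """Mode of an extended Mallows posterior: the permutation minimizing
    the theta-weighted sum of Kendall tau distances to the expert votes.
    Exhaustive enumeration -- feasible only for a handful of items."""
    items = expert_rankings[0]
    best, best_cost = None, float("inf")
    for cand in permutations(items):
        cost = sum(t * kendall_tau(cand, r)
                   for t, r in zip(thetas, expert_rankings))
        if cost < best_cost:
            best, best_cost = list(cand), cost
    return best
```

With equal weights this reduces to the Kemeny consensus of the expert rankings; increasing one expert's theta pulls the consensus toward that expert's ordering.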