
    OpinionRank: Extracting Ground Truth Labels from Unreliable Expert Opinions with Graph-Based Spectral Ranking

    As larger and more comprehensive datasets become standard in contemporary machine learning, it becomes increasingly difficult to obtain reliable, trustworthy label information with which to train sophisticated models. To address this problem, crowdsourcing has emerged as a popular, inexpensive, and efficient data mining solution for performing distributed label collection. However, crowdsourced annotations are inherently untrustworthy, as the labels are provided by anonymous volunteers who may have varying, unreliable expertise. Worse yet, some participants on commonly used platforms such as Amazon Mechanical Turk may be adversarial and provide intentionally incorrect label information without the end user's knowledge. We discuss three conventional models of the label generation process, describing their parameterizations and the model-based approaches used to solve them. We then propose OpinionRank, a model-free, interpretable, graph-based spectral algorithm for integrating crowdsourced annotations into reliable labels for performing supervised or semi-supervised learning. Our experiments show that OpinionRank performs favorably when compared against more highly parameterized algorithms. We also show that OpinionRank is scalable to very large datasets and numbers of label sources, and requires considerably fewer computational resources than previous approaches.

    Estimating parties’ policy positions through voting advice applications: Some methodological considerations

    The past few years have seen the advent and proliferation of Voting Advice (or Aid) Applications (VAAs), which offer voting advice by calculating the ideological congruence between citizens and political actors. Although VAA data have often been used to test empirical questions regarding voting behaviour and political participation, we know little about the approaches used by VAAs to estimate the positions of political parties. This article presents the most common aspects of the VAA approach and examines some methodological issues regarding the phrasing of statements, the format of response scales, the reliability of coding statements into response scales, and the reliability and validity of scaling items into dimensions. The article argues that VAAs have a lot of potential but that there is also considerable room for methodological improvement, and therefore concludes with some recommendations for designing VAAs.

    Teaching Neural Networks to Detect the Authors of Texts Using Lexical Descriptors

    This paper proposes a means of using an artificial neural network to distinguish the authors of paragraphs. Once the network has been trained, its hidden layer activations are recorded as a representation of the average number of words and the average number of characters per word in an author's paragraphs. This stored information can then be used to identify the texts written by particular authors. The computational task is solved by dividing it into a number of computationally simple tasks and then combining the solutions to those tasks. Computational simplicity is achieved by distributing the learning task among a number of experts, which in turn divides the input space into a set of subspaces. The combination of these experts is said to constitute a committee machine. In essence, it fuses the knowledge acquired by the experts to arrive at an overall decision that is supposedly superior to that attainable by any one of them acting alone. In this way, we succeeded in distinguishing the paragraphs authored by Ivo Andrić from those authored by Mehmed Meša Selimović.