
    Machine Learning

    Machine Learning can be defined in various ways, but broadly it is the scientific domain concerned with the design and development of theoretical and implementation tools for building systems that exhibit some human-like intelligent behavior. More specifically, machine learning addresses the ability of such systems to improve automatically through experience.
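    As a toy illustration of "improving automatically through experience", the sketch below trains a classifier on progressively larger samples and reports held-out accuracy. The dataset and model are arbitrary choices made for brevity, not taken from the abstract.

    ```python
    # Held-out accuracy as a function of training set size: more
    # "experience" (labeled examples) typically yields a better model.
    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_digits(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    for n in (50, 200, 800, len(X_tr)):
        model = LogisticRegression(max_iter=2000).fit(X_tr[:n], y_tr[:n])
        print(n, round(model.score(X_te, y_te), 3))
    ```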

    Keyword Assisted Topic Models

    For a long time, many social scientists have conducted content analysis by using their substantive knowledge and manually coding documents. In recent years, however, fully automated content analysis based on probabilistic topic models has become increasingly popular because of its scalability. Unfortunately, applied researchers find that these models often fail to yield topics of substantive interest, inadvertently creating multiple topics with similar content and combining different themes into a single topic. In this paper, we empirically demonstrate that providing topic models with a small number of keywords can substantially improve their performance. The proposed keyword assisted topic model (keyATM) offers an important advantage: the specification of keywords requires researchers to label topics prior to fitting a model to the data. This contrasts with the widespread practice of post-hoc topic interpretation and adjustment, which compromises the objectivity of empirical findings. In our applications, we find that the keyATM provides more interpretable results, has better document classification performance, and is less sensitive to the number of topics than standard topic models. Finally, we show that the keyATM can also incorporate covariates and model time trends. An open-source software package is available for implementing the proposed methodology.
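    The core keyword-seeding idea can be sketched independently of the authors' implementation. The toy collapsed Gibbs sampler below is a plain seeded-LDA illustration, not the keyATM model itself; the documents, keywords, and hyperparameters are all invented. Placing extra prior mass on researcher-chosen keywords is what labels the topics before fitting.

    ```python
    # Seeded LDA via collapsed Gibbs sampling: keywords add prior mass
    # to chosen topic-word entries, so topics are labeled up front.
    import numpy as np

    docs = [["tax", "budget", "deficit"], ["war", "troops", "budget"],
            ["election", "vote", "tax"], ["troops", "war", "vote"]]
    vocab = sorted({w for d in docs for w in d})
    w2i = {w: i for i, w in enumerate(vocab)}
    K, V, alpha, beta, boost = 2, len(vocab), 0.1, 0.01, 1.0

    # Topic 0 = economy, topic 1 = conflict, fixed before fitting.
    keywords = {0: ["tax", "budget"], 1: ["war", "troops"]}
    eta = np.full((K, V), beta)
    for k, words in keywords.items():
        for w in words:
            eta[k, w2i[w]] += boost          # seeded prior mass

    rng = np.random.default_rng(0)
    z = [[rng.integers(K) for _ in d] for d in docs]
    ndk = np.zeros((len(docs), K)); nkw = np.zeros((K, V)); nk = np.zeros(K)
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]; ndk[d, k] += 1; nkw[k, w2i[w]] += 1; nk[k] += 1

    for _ in range(200):                      # collapsed Gibbs sweeps
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, v = z[d][i], w2i[w]
                ndk[d, k] -= 1; nkw[k, v] -= 1; nk[k] -= 1
                p = (ndk[d] + alpha) * (nkw[:, v] + eta[:, v]) / (nk + eta.sum(1))
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k; ndk[d, k] += 1; nkw[k, v] += 1; nk[k] += 1

    for k in range(K):                        # top words per pre-labeled topic
        print(k, [vocab[v] for v in np.argsort(-(nkw[k] + eta[k]))[:3]])
    ```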

    Election Data Visualisation

    Visualisations of election data produced by the mass media, other organisations and even individuals are becoming increasingly available across a wide variety of platforms and in many different forms. As more data become available digitally, and as computer hardware and software improve, these visualisations have become more ambitious in scope and more user-friendly. Research has shown that visualising data is an extremely powerful method of communicating information to specialists and non-specialists alike; this amounts to a democratisation of access to political and electoral data. To some extent, political science lags behind the progress made in the field of data visualisation: much academic output remains committed to the paper format, and much of the data presentation takes the form of simple text and tables. In the digital and information age there is a danger that political science will fall behind. This thesis reports on a number of case studies in which election data were visualised in order to clarify their structure and present their meaning. The first case study demonstrates the value of data visualisation to the research process itself, facilitating the understanding of effects produced by different ways of estimating missing data. A second study used visualisation to explain complex aspects of voting systems to the wider public. Three further case studies demonstrate the value of collaboration between political scientists and others with a range of skills spanning data management, software engineering, broadcasting and graphic design. These studies also illustrate some of the problems encountered when trying to distil complex data into a form that non-expert users can easily view and interpret. Most importantly, they suggest that when the balance of skills is right, visualisation is both viable and necessary for communicating information on elections.
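    As a minimal example of the kind of presentation the thesis argues for over plain text and tables, the sketch below renders vote shares as a bar chart. The parties and figures are hypothetical, invented for illustration.

    ```python
    # Plot invented constituency vote shares as a horizontal bar chart.
    import matplotlib.pyplot as plt

    parties = ["Party A", "Party B", "Party C", "Other"]
    share = [41.2, 33.5, 18.1, 7.2]           # hypothetical vote shares (%)

    fig, ax = plt.subplots(figsize=(6, 2.5))
    ax.barh(parties, share)
    ax.invert_yaxis()                          # largest party on top
    ax.set_xlabel("Vote share (%)")
    ax.set_title("Hypothetical constituency result")
    plt.tight_layout()
    plt.show()
    ```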

    Doctor of Philosophy

    Machine learning is the science of building predictive models from data that automatically improve based on past experience. To learn these models, traditional learning algorithms require labeled data. They also require that the entire dataset fit in the memory of a single machine. Labeled data are available, or can be acquired, for small and moderately sized datasets, but curating large datasets can be prohibitively expensive. Similarly, massive datasets are usually too large to fit into the memory of a single machine, so an alternative is to distribute the dataset over multiple machines. Distributed learning, however, poses new challenges, as most existing machine learning techniques are inherently sequential. Additionally, these distributed approaches have to be designed with the various resource limitations of real-world settings in mind, prime among them inter-machine communication. With the advent of big datasets, machine learning algorithms face new challenges: their design is no longer limited to minimizing some loss function but must additionally consider other resources that become critical when learning at scale. In this thesis, we explore different models and measures for learning under resource budgets. What budgetary constraints are posed by modern datasets? Can we reuse or combine existing machine learning paradigms to address these challenges at scale? How do the cost metrics change when we shift to distributed models for learning? These are some of the questions investigated in this thesis; their answers hold the key to addressing some of the challenges faced when learning on massive datasets. In the first part of this thesis, we present three budgeted scenarios that deal with scarcity of labeled data and limited computational resources, the goal being to leverage transfer of information from related domains to learn under budgetary constraints. Our proposed techniques comprise semi-supervised transfer, online transfer and active transfer. In the second part, we study distributed learning with limited communication: we present initial sampling-based results and propose communication protocols for learning distributed linear classifiers.
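    One way to make the communication-limited setting concrete is one-shot parameter averaging: each machine trains a linear classifier on its own shard and ships only its weight vector to a coordinator. The sketch below uses synthetic data and illustrates the communication-budget theme in general, not the specific protocols proposed in the thesis.

    ```python
    # Distributed linear classification with one round of communication:
    # each "machine" fits locally; only the weights travel.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import SGDClassifier

    X, y = make_classification(n_samples=6000, n_features=20, random_state=0)
    shards = np.array_split(np.arange(len(X)), 3)   # 3 machines, no data moves

    coefs, intercepts = [], []
    for idx in shards:
        clf = SGDClassifier(loss="log_loss", random_state=0).fit(X[idx], y[idx])
        coefs.append(clf.coef_); intercepts.append(clf.intercept_)

    merged = SGDClassifier(loss="log_loss")         # coordinator's averaged model
    merged.coef_ = np.mean(coefs, axis=0)
    merged.intercept_ = np.mean(intercepts, axis=0)
    merged.classes_ = np.array([0, 1])
    print("accuracy:", merged.score(X, y))          # total traffic: 3 weight vectors
    ```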

    Holistic interpretation of visual data based on topology: semantic segmentation of architectural facades

    The work presented in this dissertation is a step towards effectively incorporating contextual knowledge into the task of semantic segmentation. To date, the use of context has been confined to the genre of the scene, with few exceptions in the field; research has instead been directed towards enhancing appearance descriptors. While this is unarguably important, recent studies show that computer vision has reached a near-human level of performance when relying on these descriptors, provided objects have stable, distinctive surface properties and imaging conditions are proper. When these conditions are not met, humans exploit their knowledge of the intrinsic geometric layout of the scene to make local decisions, and computer vision lags behind when it comes to this asset. For this reason, we aim to bridge the gap by presenting algorithms for semantic segmentation of building facades that make use of topological aspects of the scene. We provide a classification scheme that carries out segmentation and recognition simultaneously: the algorithm solves a single optimization function and yields a semantic interpretation of facades, relying on the modeling power of probabilistic graphs and efficient discrete combinatorial optimization tools. We also tackle the same problem of semantic facade segmentation with a neural network approach, attaining accuracy figures on par with the state-of-the-art in a fully automated pipeline: pixelwise classifications are first obtained via Convolutional Neural Networks (CNNs) and are then structurally validated through a cascade of Restricted Boltzmann Machines (RBMs) and a Multi-Layer Perceptron (MLP) that regenerates the most likely layout. Finally, in the domain of architectural modeling, we address geometric multi-model fitting: we introduce a novel guided sampling algorithm based on Minimum Spanning Trees (MSTs), which surpasses other propagation techniques in robustness to noise. We make a number of additional contributions, such as a measure of model deviation that captures variations among fitted models.
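    The MST-guided sampling idea admits a compact sketch: points joined by a minimum-spanning-tree edge are spatially coherent, so they seed model hypotheses better than uniformly random pairs. The toy line-fitting example below illustrates the principle only; the data are invented and this is not the dissertation's algorithm.

    ```python
    # Guided sampling of line hypotheses from MST edges.
    import numpy as np
    from scipy.spatial.distance import cdist
    from scipy.sparse.csgraph import minimum_spanning_tree

    rng = np.random.default_rng(0)
    t = rng.uniform(0, 1, 40)                 # two noisy line segments
    pts = np.vstack([np.c_[t[:20], 2 * t[:20]],
                     np.c_[t[20:], 1 - t[20:]]])
    pts += rng.normal(0, 0.01, pts.shape)

    mst = minimum_spanning_tree(cdist(pts, pts)).tocoo()
    edges = np.column_stack([mst.row, mst.col])

    for i, j in rng.permutation(edges)[:10]:  # MST edges: coherent point pairs
        p, q = pts[i], pts[j]
        d = q - p
        n = np.array([-d[1], d[0]])           # normal of the line through p, q
        n /= np.linalg.norm(n)
        resid = np.abs(pts @ n - n @ p)       # distance of all points to the line
        print("inliers:", int((resid < 0.05).sum()))
    ```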

    Philosophy and the practice of Bayesian statistics

    A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just the philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework.
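    The model checking the paper defends can be made concrete with a posterior predictive check: fit a model, simulate replicated data from the posterior, and compare a test statistic to its observed value. The sketch below is an invented example, not one from the paper; it deliberately fits a misspecified normal model to heavy-tailed data so the check has something to find.

    ```python
    # Posterior predictive check of a normal model on heavy-tailed data.
    import numpy as np

    rng = np.random.default_rng(1)
    y = rng.standard_t(df=2, size=50)        # heavy-tailed "observed" data
    n, sigma = len(y), 1.0                   # model wrongly assumes N(mu, 1)

    # Posterior for mu under a flat prior: N(mean(y), sigma^2 / n).
    mu_post = rng.normal(y.mean(), sigma / np.sqrt(n), size=4000)

    # Replicated datasets, compared on the tail statistic max|y|.
    y_rep = rng.normal(mu_post[:, None], sigma, size=(4000, n))
    T_rep = np.abs(y_rep).max(axis=1)
    T_obs = np.abs(y).max()
    # An extreme p-value flags misfit that confirmation theory alone misses.
    print("posterior predictive p-value:", (T_rep >= T_obs).mean())
    ```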

    Political Machines: Machine Learning for Understanding Social Machines

    This thesis investigates human-algorithm interactions in sociotechnological ecosystems. Specifically, it applies machine learning and statistical methods to uncover political dimensions of algorithmic influence in social media platforms and automated decision-making systems. Based on the results, the study discusses the legal, political and ethical consequences of algorithmic implementations.