360 research outputs found
Dynamic Bayesian Combination of Multiple Imperfect Classifiers
Classifier combination methods need to make best use of the outputs of
multiple, imperfect classifiers to enable higher accuracy classifications. In
many situations, such as when human decisions need to be combined, the base
decisions can vary enormously in reliability. A Bayesian approach to such
uncertain combination allows us to infer the differences in performance between
individuals and to incorporate any available prior knowledge about their
abilities when training data is sparse. In this paper we explore Bayesian
classifier combination, using the computationally efficient framework of
variational Bayesian inference. We apply the approach to real data from a large
citizen science project, Galaxy Zoo Supernovae, and show that our method far
outperforms other established approaches to imperfect decision combination. We
go on to analyse the putative community structure of the decision makers, based
on their inferred decision making strategies, and show that natural groupings
are formed. Finally we present a dynamic Bayesian classifier combination
approach and investigate the changes in base classifier performance over time.Comment: 35 pages, 12 figure
Benefits of a high-performance computing cluster for calibrating brain-computer interface technology
Machine learning in astronomy
The search to find answers to the deepest questions we have about the Universe has fueled the collection of data for ever larger volumes of our cosmos. The field of supernova cosmology, for example, is seeing continuous development with upcoming surveys set to produce a vast amount of data that will require new statistical inference and machine learning techniques for processing and analysis. Distinguishing between real objects and artefacts is one of the first steps in any transient science pipeline and, currently, is still carried out by humans - often leading to hand scanners having to sort hundreds or thousands of images per night. This is a time-consuming activity introducing human biases that are extremely hard to characterise. To succeed in the objectives of future transient surveys, the successful substitution of human hand scanners with machine learning techniques for the purpose of this artefact-transient classification therefore represents a vital frontier. In this thesis we test various machine learning algorithms and show that many of them can match the human hand scanner performance in classifying transient difference g, r and i-band imaging data from the SDSS-II SN Survey into real objects and artefacts. Using principal component analysis and linear discriminant analysis, we construct a grand total of 56 feature sets with which to train, optimise and test a Minimum Error Classifier (MEC), a naive Bayes classifier, a k-Nearest Neighbours (kNN) algorithm, a Support Vector Machine (SVM) and the SkyNet artificial neural network
Galaxy Zoo: Morphological Classification and Citizen Science
We provide a brief overview of the Galaxy Zoo and Zooniverse projects,
including a short discussion of the history of, and motivation for, these
projects as well as reviewing the science these innovative internet-based
citizen science projects have produced so far. We briefly describe the method
of applying en-masse human pattern recognition capabilities to complex data in
data-intensive research. We also provide a discussion of the lessons learned
from developing and running these community--based projects including thoughts
on future applications of this methodology. This review is intended to give the
reader a quick and simple introduction to the Zooniverse.Comment: 11 pages, 1 figure; to be published in Advances in Machine Learning
and Data Mining for Astronom
Virtual Astronomy, Information Technology, and the New Scientific Methodology
All sciences, including astronomy, are now entering the era of information abundance. The exponentially increasing volume and complexity of modern data sets promises to transform the scientific practice, but also poses a number of common technological challenges. The Virtual Observatory concept is the astronomical community's response to these challenges: it aims to harness the progress in information technology in the service of astronomy, and at the same time provide a valuable testbed for information technology and applied computer science. Challenges broadly fall into two categories: data handling (or "data farming"), including issues such as archives, intelligent storage, databases, interoperability, fast networks, etc., and data mining, data understanding, and knowledge discovery, which include issues such as automated clustering and classification, multivariate correlation searches, pattern recognition, visualization in highly hyperdimensional parameter spaces, etc., as well as various applications of machine learning in these contexts. Such techniques are forming a methodological foundation for science with massive and complex data sets in general, and are likely to have a much broather impact on the modern society, commerce, information economy, security, etc. There is a powerful emerging synergy between the
computationally enabled science and the science-driven computing, which will drive the progress in science, scholarship, and many other venues in the 21st century
- …