Logarithmic Opinion Pools for Conditional Random Fields
Institute for Communicating and Collaborative Systems
Since their recent introduction, conditional random fields
(CRFs) have been successfully applied to a multitude of structured
labelling tasks in many different domains. Examples include natural
language processing (NLP), bioinformatics and computer vision. Within
NLP itself we have seen many different application areas, like named
entity recognition, shallow parsing, information extraction from
research papers and language modelling. Most of this work has
demonstrated the need, directly or indirectly, to employ some form of
regularisation when applying CRFs in order to overcome the tendency
for these models to overfit. To date a popular method for
regularising CRFs has been to fit a Gaussian prior distribution over
the model parameters.
In this thesis we explore other methods of CRF regularisation,
investigating their properties and comparing their effectiveness. We
apply our ideas to sequence labelling problems in NLP, specifically
part-of-speech tagging and named entity recognition.
We start with an analysis of conventional approaches to CRF
regularisation, and investigate possible extensions to such
approaches. In particular, we consider choices of prior distribution
other than the Gaussian, including the Laplacian and Hyperbolic; we
look at the effect of regularising different features separately, to
differing degrees, and explore how we may define an appropriate level
of regularisation for each feature; we investigate the effect of
allowing the mean of a prior distribution to take on non-zero values;
and we look at the impact of relaxing the feature expectation
constraints satisfied by a standard CRF, leading to a modified CRF
model we call the inequality CRF. Our analysis leads to the
general conclusion that although there is some capacity for
improvement of conventional regularisation through modification and
extension, this is quite limited. Conventional regularisation with a
prior is in general hampered by the need to fit a hyperparameter or
set of hyperparameters, which can be an expensive process.
We then approach the CRF overfitting problem from a different
perspective. Specifically, we introduce a form of CRF ensemble called
a logarithmic opinion pool (LOP), where CRF distributions are
combined under a weighted product. We show how a LOP has theoretical
properties which provide a framework for designing new overfitting
reduction schemes in terms of diverse models, and demonstrate how such
diverse models may be constructed in a number of different ways.
Specifically, we show that by constructing CRF models from manually
crafted partitions of a feature set and combining them with equal
weight under a LOP, we may obtain an ensemble that significantly
outperforms a standard CRF trained on the entire feature set, and is
competitive in performance to a standard CRF regularised with a
Gaussian prior. The great advantage of the LOP approach is that, unlike
the Gaussian prior method, it does not require us to search a
hyperparameter space.
Having demonstrated the success of LOPs in the simple case, we then
move on to consider more complex uses of the framework. In
particular, we investigate whether it is possible to further improve
the LOP ensemble by allowing parameters in different models to
interact during training in such a way that diversity between the
models is encouraged.
Lastly, we show how the LOP approach may be used as a remedy for a
problem from which standard CRFs can sometimes suffer. In certain
situations, negative effects may be introduced to a CRF by the
inclusion of highly discriminative features. An example of this is
provided by gazetteer features, which encode a word's presence in a
gazetteer. We show how LOPs may be used to reduce these negative
effects, and so provide some insight into how gazetteer features may
be more effectively handled in CRFs, and log-linear models in
general.
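As an illustration of the weighted product described in the abstract above, here is a minimal sketch of a logarithmic opinion pool. It is not the thesis's implementation: it assumes each model supplies a strictly positive distribution over a small shared label set for a single token, whereas a CRF defines distributions over whole label sequences. The function name `log_opinion_pool` and the example labels are hypothetical.

```python
import math


def log_opinion_pool(distributions, weights=None):
    """Combine discrete distributions by a normalised weighted product.

    distributions: list of dicts mapping each label to a probability > 0.
    weights: one non-negative weight per distribution (assumed to sum to 1);
             defaults to equal weights.
    """
    n = len(distributions)
    if weights is None:
        weights = [1.0 / n] * n  # equal-weight pool

    labels = list(distributions[0].keys())

    # Weighted product in log space: sum_a w_a * log p_a(y)
    log_scores = {
        y: sum(w * math.log(p[y]) for w, p in zip(weights, distributions))
        for y in labels
    }

    # Normalise so the pooled scores form a distribution
    # (subtract the maximum first for numerical stability).
    top = max(log_scores.values())
    z = sum(math.exp(s - top) for s in log_scores.values())
    return {y: math.exp(s - top) / z for y, s in log_scores.items()}


# Hypothetical example: two models that disagree on a token's tag.
model_a = {"NOUN": 0.8, "VERB": 0.2}
model_b = {"NOUN": 0.5, "VERB": 0.5}
pooled = log_opinion_pool([model_a, model_b])
```

With equal weights the pool is the normalised geometric mean of the component distributions, so a label needs reasonable support from every model to score highly.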
A literature review on the use of expert opinion in probabilistic risk analysis
Risk assessment is part of the decision-making process in many disciplines, such as engineering, public health, environment, program management, regulatory policy, and finance. There has been considerable debate over the philosophical and methodological treatment of risk in the past few decades, ranging from its definition and classification to methods of its assessment. Probabilistic risk analysis (PRA) specifically deals with events that have a low probability of occurring but highly unfavorable consequences. Expert judgment is often a critical source of information in PRA, since empirical data on the variables of interest are rarely available. The author reviews the literature on the use of expert opinion in PRA, in particular on the approaches to eliciting and aggregating experts' assessments. The literature suggests that the methods by which expert opinions are collected and combined have a significant effect on the resulting estimates. The author discusses two types of approaches to eliciting and aggregating expert judgments, behavioral and mathematical, with the emphasis on the latter. It is generally agreed that mathematical approaches tend to yield more accurate estimates than behavioral approaches. After a short description of behavioral approaches, the author discusses mathematical approaches in detail, presenting three aggregation models: non-Bayesian axiomatic models, Bayesian models, and psychological scaling models. She also discusses issues of stochastic dependence.
Health Monitoring & Evaluation, ICT Policy and Strategies, Public Health Promotion, Enterprise Development & Reform, Statistical & Mathematical Sciences, Science Education, Scientific Research & Science Parks
Dynamic Bayesian Predictive Synthesis in Time Series Forecasting
We discuss model and forecast combination in time series forecasting. A
foundational Bayesian perspective based on agent opinion analysis theory
defines a new framework for density forecast combination, and encompasses
several existing forecast pooling methods. We develop a novel class of dynamic
latent factor models for time series forecast synthesis; simulation-based
computation enables implementation. These models can dynamically adapt to
time-varying biases, miscalibration and inter-dependencies among multiple
models or forecasters. A macroeconomic forecasting study highlights the dynamic
relationships among synthesized forecast densities, as well as the potential
for improved forecast accuracy at multiple horizons.
Bayesian Calibration of Generalized Pools of Predictive Distributions
Decision-makers often consult different experts to build reliable forecasts on variables
of interest. Combining more opinions and calibrating them to maximize the forecast accuracy is
consequently a crucial issue in several economic problems. This paper applies a Bayesian beta mixture
model to derive a combined and calibrated density function using random calibration functionals
and random combination weights. In particular, it compares the application of linear, harmonic and
logarithmic pooling in the Bayesian combination approach. The three combination schemes, i.e.,
linear, harmonic and logarithmic, are studied in simulation examples with multimodal densities and
an empirical application with a large database of stock data. All of the experiments show that in
a beta mixture calibration framework, the three combination schemes are substantially equivalent,
achieving calibration, and no clear preference for one of them appears. The financial application
shows that the linear pooling together with beta mixture calibration achieves the best results in terms
of calibrated forecast.
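The three combination schemes named in the abstract above can be sketched as follows. This is an illustrative sketch only: it assumes discrete, strictly positive probability vectors and fixed weights, and it omits the beta mixture calibration and the random combination weights that the paper uses. The function name `pool` and the example numbers are hypothetical.

```python
import math


def pool(probs, weights, scheme):
    """Combine probability vectors under a linear, harmonic or logarithmic pool.

    probs: list of equal-length lists of strictly positive probabilities.
    weights: one non-negative weight per vector (assumed to sum to 1).
    scheme: "linear", "harmonic" or "logarithmic".
    """
    k = len(probs[0])
    if scheme == "linear":
        # Weighted arithmetic mean
        raw = [sum(w * p[i] for w, p in zip(weights, probs)) for i in range(k)]
    elif scheme == "harmonic":
        # Weighted harmonic mean
        raw = [1.0 / sum(w / p[i] for w, p in zip(weights, probs)) for i in range(k)]
    elif scheme == "logarithmic":
        # Weighted geometric mean
        raw = [
            math.exp(sum(w * math.log(p[i]) for w, p in zip(weights, probs)))
            for i in range(k)
        ]
    else:
        raise ValueError(f"unknown scheme: {scheme}")

    # Renormalise; this is a no-op for the linear pool when weights sum to 1.
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical example: two experts, two outcomes, equal weights.
experts = [[0.8, 0.2], [0.5, 0.5]]
w = [0.5, 0.5]
linear = pool(experts, w, "linear")
harmonic = pool(experts, w, "harmonic")
logarithmic = pool(experts, w, "logarithmic")
```

All three are weighted means of the component probabilities (arithmetic, harmonic and geometric respectively), followed by renormalisation where the result no longer sums to one.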
AdaBoost.MRF: boosted Markov random forests and application to multilevel activity recognition
Activity recognition is an important issue in building intelligent monitoring systems. We address the recognition of multilevel activities in this paper via a conditional Markov random field (MRF), known as the dynamic conditional random field (DCRF). Parameter estimation in general MRFs using maximum likelihood is known to be computationally challenging (except for extreme cases), and thus we propose an efficient boosting-based algorithm AdaBoost.MRF for this task. Distinct from most existing work, our algorithm can handle hidden variables (missing labels) and is particularly attractive for smarthouse domains where reliable labels are often sparsely observed. Furthermore, our method works exclusively on trees and thus is guaranteed to converge. We apply the AdaBoost.MRF algorithm to a home video surveillance application and demonstrate its efficacy.
Bayesian Nonparametric Calibration and Combination of Predictive Distributions
We introduce a Bayesian approach to predictive density calibration and
combination that accounts for parameter uncertainty and model set
incompleteness through the use of random calibration functionals and random
combination weights. Building on the work of Ranjan, R. and Gneiting, T. (2010)
and Gneiting, T. and Ranjan, R. (2013), we use infinite beta mixtures for the
calibration. The proposed Bayesian nonparametric approach takes advantage of
the flexibility of Dirichlet process mixtures to achieve any continuous
deformation of linearly combined predictive distributions. The inference
procedure is based on Gibbs sampling and allows accounting for uncertainty in
the number of mixture components, mixture weights, and calibration parameters.
The weak posterior consistency of the Bayesian nonparametric calibration is
provided under suitable conditions for unknown true density. We study the
methodology in simulation examples with fat tails and multimodal densities and
apply it to density forecasts of daily S&P returns and daily maximum wind speed
at the Frankfurt airport.
Comment: arXiv admin note: text overlap with arXiv:1305.2026 by other author