A dynamic programming approach for generalized nearly isotonic optimization
Shape restricted statistical estimation problems have been extensively
studied, with many important practical applications in signal processing,
bioinformatics, and machine learning. In this paper, we propose and study a
generalized nearly isotonic optimization (GNIO) model, which recovers, as
special cases, many classic problems in shape constrained statistical
regression, such as isotonic regression, nearly isotonic regression and
unimodal regression problems. We develop an efficient and easy-to-implement
dynamic programming algorithm for solving the proposed model, whose recursive
structure is carefully uncovered and exploited. For important special cases of
the GNIO model, implementation details and an analysis of the optimal running
time of our algorithm are discussed. Numerical experiments on both simulated
and real data sets, including comparisons between our approach and the powerful
commercial solver Gurobi on these special GNIO problems, are presented to
demonstrate the high efficiency and robustness of the proposed algorithm in
solving large-scale GNIO problems.
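As a concrete instance of the shape-restricted problems that GNIO recovers, least-squares isotonic regression can be solved by the classic pool-adjacent-violators algorithm. The sketch below is that textbook routine, not the paper's dynamic programming algorithm:

```python
# Pool-adjacent-violators algorithm (PAVA) for least-squares isotonic
# regression, one of the special cases the GNIO model recovers.

def isotonic_regression(y):
    """Return the nondecreasing fit minimizing sum((x_i - y_i)^2)."""
    # Each block stores [sum of values, count]; adjacent blocks are
    # merged whenever their means would violate monotonicity.
    blocks = []
    for v in y:
        blocks.append([v, 1])
        # Merge while the previous block's mean exceeds the new one's
        # (compared via cross-multiplication to avoid division).
        while len(blocks) > 1 and \
                blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fit = []
    for total, count in blocks:
        fit.extend([total / count] * count)
    return fit
```

For example, `isotonic_regression([1, 3, 2, 4])` pools the violating pair (3, 2) into their mean 2.5, yielding the monotone fit `[1.0, 2.5, 2.5, 4.0]`.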
Obtaining Accurate Probabilities Using Classifier Calibration
Learning probabilistic classification and prediction models that generate accurate probabilities is essential in many prediction and decision-making tasks in machine learning and data mining. One way to achieve this goal is to post-process the output of classification models to obtain more accurate probabilities. These post-processing methods are often referred to as calibration methods in the machine learning literature.
This thesis describes a suite of parametric and non-parametric methods for calibrating the output of classification and prediction models. In order to evaluate the calibration performance of a classifier, we introduce two new calibration measures that are intuitive statistics of the calibration
curves. We present extensive experimental results on both simulated and real datasets to evaluate the performance of the proposed methods compared with commonly used calibration methods in the literature. In particular, in terms of binary classifier calibration, our experimental results
show that the proposed methods are able to improve the calibration power of classifiers while retaining their discrimination performance. Our theoretical findings show that by using a simple non-parametric calibration method, it is possible to improve the calibration performance of a classifier
without sacrificing discrimination capability. The methods are also computationally tractable for large-scale datasets as they run in O(N log N) time, where N is the number of samples.
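To make the idea of non-parametric post-processing concrete, here is a minimal histogram-binning calibrator: raw scores are grouped into equal-width bins, and each bin's calibrated probability is its empirical fraction of positives. The function names and the ten-bin default are illustrative, not the thesis's exact estimators:

```python
# Minimal sketch of histogram binning, a simple non-parametric
# calibration method; names and bin count are illustrative.

def fit_histogram_binning(scores, labels, n_bins=10):
    """Map raw classifier scores in [0, 1] to empirical frequencies."""
    bin_pos = [0] * n_bins   # positives per bin
    bin_tot = [0] * n_bins   # samples per bin
    for s, y in zip(scores, labels):
        b = min(int(s * n_bins), n_bins - 1)
        bin_pos[b] += y
        bin_tot[b] += 1
    # Calibrated probability per bin: fraction of positives, falling
    # back to the bin midpoint when a bin received no samples.
    return [bin_pos[b] / bin_tot[b] if bin_tot[b] else (b + 0.5) / n_bins
            for b in range(n_bins)]

def calibrate(score, bin_probs):
    """Look up the calibrated probability for a new raw score."""
    b = min(int(score * len(bin_probs)), len(bin_probs) - 1)
    return bin_probs[b]
```

Fitting on held-out data and replacing each raw score with its bin frequency leaves the ranking within bins unchanged, which is why such methods can improve calibration with little cost to discrimination.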
In this thesis we also introduce a novel framework to derive calibrated probabilities of causal relationships from observational data. The framework consists of three main components: (1) an approximate method for generating initial probability estimates of the edge types for each pair
of variables, (2) the availability of a relatively small number of the causal relationships in the network for which the truth status is known, which we call a calibration training set, and (3) a calibration method for using the approximate probability estimates and the calibration training set
to generate calibrated probabilities for the many remaining pairs of variables. Our experiments on a range of simulated data support that the proposed approach improves the calibration of edge predictions. The results also support that the approach often improves the precision and recall of those predictions.
Non-convex Optimization for Machine Learning
A vast majority of machine learning algorithms train their models and perform
inference by solving optimization problems. In order to capture the learning
and prediction problems accurately, structural constraints such as sparsity or
low rank are frequently imposed or else the objective itself is designed to be
a non-convex function. This is especially true of algorithms that operate in
high-dimensional spaces or that train non-linear models such as tensor models
and deep networks.
The freedom to express the learning problem as a non-convex optimization
problem gives immense modeling power to the algorithm designer, but often such
problems are NP-hard to solve. A popular workaround to this has been to relax
non-convex problems to convex ones and use traditional methods to solve the
(convex) relaxed optimization problems. However, this approach may be lossy and
nevertheless presents significant challenges for large scale optimization.
On the other hand, direct approaches to non-convex optimization have met with
resounding success in several domains and remain the methods of choice for the
practitioner, as they frequently outperform relaxation-based techniques;
popular heuristics include projected gradient descent and alternating
minimization. However, these are often poorly understood in terms of their
convergence and other properties.
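As an illustration of the first heuristic mentioned above, projected gradient descent onto a sparsity constraint (often called iterative hard thresholding) can be sketched as follows; the step-size rule and iteration budget are illustrative assumptions, not a prescription from the monograph:

```python
import numpy as np

# Projected gradient descent for sparse least squares: a gradient step
# on ||Ax - b||^2 followed by projection onto {x : ||x||_0 <= k},
# i.e. keeping only the k largest-magnitude entries.

def iht(A, b, k, step=None, iters=200):
    """Approximately minimize ||Ax - b||^2 subject to ||x||_0 <= k."""
    n = A.shape[1]
    if step is None:
        # A safe step size based on the spectral norm of A.
        step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(n)
    for _ in range(iters):
        x = x - step * A.T @ (A @ x - b)       # gradient step
        keep = np.argsort(np.abs(x))[-k:]      # projection: k largest
        mask = np.zeros(n, dtype=bool)
        mask[keep] = True
        x[~mask] = 0.0
    return x
```

The projection step is what makes the problem non-convex: the sparsity ball is a union of subspaces, yet the projection itself is trivially computed by sorting, which is exactly the kind of simple procedure whose convergence the monograph analyzes.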
This monograph presents a selection of recent advances that bridge a
long-standing gap in our understanding of these heuristics. The monograph will
lead the reader through several widely used non-convex optimization techniques,
as well as applications thereof. The goal of this monograph is both to
introduce the rich literature in this area and to equip the reader with
the tools and techniques needed to analyze these simple procedures for
non-convex problems.
Comment: The official publication is available from now publishers via
http://dx.doi.org/10.1561/220000005
Learning with Submodular Functions: A Convex Optimization Perspective
Submodular functions are relevant to machine learning for at least two reasons: (1) some problems may be expressed directly as the optimization of submodular functions and (2) the Lovász extension of submodular functions provides a useful set of regularization functions for supervised and unsupervised learning. In this monograph, we present the theory of submodular functions from a convex analysis perspective, presenting tight links between certain polyhedra, combinatorial optimization and convex optimization problems. In particular, we show how submodular function minimization is equivalent to solving a wide variety of convex optimization problems. This allows the derivation of new efficient algorithms for approximate and exact submodular function minimization with theoretical guarantees and good practical performance. By listing many examples of submodular functions, we review various applications to machine learning, such as clustering, experimental design, sensor placement, graphical model structure learning or subset selection, as well as a family of structured sparsity-inducing norms that can be derived and used from submodular functions.
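The Lovász extension mentioned above has a simple greedy evaluation: sort the coordinates of the input point in decreasing order and accumulate marginal gains of the set function along the resulting chain of sets. A minimal sketch, assuming `f` is given as a callable on frozensets with `f(∅) = 0`:

```python
# Greedy evaluation of the Lovász extension of a set function f on
# the ground set {0, ..., n-1}, where n = len(w).

def lovasz_extension(f, w):
    """Evaluate the Lovász extension of f at the point w."""
    order = sorted(range(len(w)), key=lambda i: -w[i])  # decreasing w
    value, prefix = 0.0, set()
    prev = f(frozenset())
    for i in order:
        prefix.add(i)
        cur = f(frozenset(prefix))
        value += w[i] * (cur - prev)   # marginal gain times weight
        prev = cur
    return value
```

For the modular function f(S) = |S| this recovers the sum of the coordinates, and for the submodular coverage function f(S) = min(|S|, 1) it recovers the maximum coordinate (on nonnegative inputs), two sanity checks consistent with the general theory.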
Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain
The present paper explores the technical efficiency of four hotels from Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking is established for these four hotel units located in Portugal using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiencies in the estimation process, enabling investigation of the main causes of inefficiency. Several suggestions for efficiency improvement are offered for each hotel studied.
Novel Methods for Efficient Changepoint Detection
This thesis introduces several novel computationally efficient methods for offline and online changepoint detection. The first part of the thesis considers the challenge of detecting abrupt changes in scenarios where there is some autocorrelated noise or where the mean fluctuates locally between the changes. In such situations, existing implementations can lead to substantial overestimation of the number of changes. In response to this challenge, we introduce DeCAFS, an efficient dynamic programming algorithm to deal with such scenarios. DeCAFS models local fluctuations as a random walk process and autocorrelated noise as an AR(1) process. Through theory and empirical studies we demonstrate that this approach has greater power at detecting abrupt changes than existing approaches. The second part of the thesis considers a practical, computational challenge that can arise with online changepoint detection within the real-time domain. We introduce a new procedure, called FOCuS, a fast online changepoint detection algorithm based on the simple Page-CUSUM sequential likelihood ratio test. FOCuS enables the online changepoint detection problem to be solved sequentially in time, through an efficient dynamic programming recursion. In particular, we establish that FOCuS outperforms current state-of-the-art algorithms both in terms of efficiency and statistical power, and can be readily extended to more general scenarios. The final part of the thesis extends ideas from the nonparametric changepoint detection literature to the online setting. Specifically, a novel algorithm, NUNC, is introduced to perform online detection of changes in the distribution of real-time data. We explore the properties of two variants of this algorithm using both simulated and real data examples.
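For orientation, the Page-CUSUM sequential likelihood-ratio test that FOCuS builds on can be sketched as follows. This is the classic fixed-shift test for an upward mean change in unit-variance data, not the FOCuS recursion itself, and the assumed shift size and threshold are illustrative:

```python
# One-sided Page-CUSUM test: detect an upward shift of assumed size
# `delta` in the mean of unit-variance data with pre-change mean 0.

def page_cusum(stream, delta=1.0, threshold=10.0):
    """Return the first index at which the CUSUM statistic crosses
    `threshold`, or None if it never does."""
    s = 0.0
    for t, x in enumerate(stream):
        # Likelihood-ratio increment for mean 0 vs mean delta; the
        # max with 0 restarts the statistic, as in Page's test.
        s = max(0.0, s + delta * (x - delta / 2.0))
        if s > threshold:
            return t
    return None
```

The fixed `delta` is precisely the limitation FOCuS addresses: its dynamic programming recursion effectively runs this test for all postulated shift sizes simultaneously.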