
    Achievability of Asymptotic Minimax Regret in Online and Batch Prediction

    The normalized maximum likelihood model achieves the minimax coding (log-loss) regret for data of fixed sample size n. However, it is a batch strategy, i.e., it requires that n be known in advance. Furthermore, it is computationally infeasible for most statistical models, and several computationally feasible alternative strategies have been devised. We characterize the achievability of asymptotic minimaxity by batch strategies (i.e., strategies that depend on n) as well as online strategies (i.e., strategies independent of n). On one hand, we conjecture that for a large class of models, no online strategy can be asymptotically minimax. We prove that this holds under a slightly stronger definition of asymptotic minimaxity. Our numerical experiments support the conjecture about non-achievability by so-called last-step minimax algorithms, which are independent of n. On the other hand, we show that in the multinomial model, a Bayes mixture defined by the conjugate Dirichlet prior with a simple dependency on n achieves asymptotic minimaxity for all sequences, thus providing a simpler asymptotic minimax strategy compared to earlier work by Xie and Barron. The numerical results also demonstrate superior finite-sample behavior by a number of novel batch and online algorithms.
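
    The two-outcome (Bernoulli) special case admits a compact illustration of the quantities above. The sketch below computes the exact NML normalizer, whose logarithm is the minimax regret, and compares it with the regret of the Krichevsky-Trofimov mixture, i.e., the Bayes mixture under the Jeffreys Dirichlet(1/2, 1/2) prior. This is a minimal baseline illustration; the n-dependent prior from the abstract is not reproduced here.

```python
import math

def nml_log_normalizer(n: int) -> float:
    """log of the NML normalizer C_n = sum_k C(n,k) (k/n)^k ((n-k)/n)^(n-k)
    for the Bernoulli model; log C_n equals the minimax regret at sample size n."""
    total = 0.0
    for k in range(n + 1):
        # maximum-likelihood term, with the 0^0 = 1 convention
        log_ml = (k * math.log(k / n) if k else 0.0) \
               + ((n - k) * math.log((n - k) / n) if n - k else 0.0)
        log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        total += math.exp(log_binom + log_ml)
    return math.log(total)

def kt_regret(xs) -> float:
    """Log-loss regret of the Krichevsky-Trofimov sequential code
    (Dirichlet(1/2) Bayes mixture) on a binary sequence, relative to the ML fit."""
    n, ones = len(xs), sum(xs)
    log_kt, a, b = 0.0, 0, 0  # a, b: counts of ones / zeros so far
    for x in xs:
        p_one = (a + 0.5) / (a + b + 1.0)
        log_kt += math.log(p_one if x else 1.0 - p_one)
        a, b = a + x, b + (1 - x)
    log_ml = (ones * math.log(ones / n) if ones else 0.0) \
           + ((n - ones) * math.log((n - ones) / n) if n - ones else 0.0)
    return log_ml - log_kt

n = 100
print("minimax (NML) regret:", nml_log_normalizer(n))
print("KT regret on all-ones sequence:", kt_regret([1] * n))
```

    The all-ones sequence is a standard worst case for the Jeffreys mixture: its KT regret exceeds the minimax value, which is what motivates n-dependent modifications of the prior.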

    Inference And Learning: Computational Difficulty And Efficiency

    In this thesis, we mainly investigate two collections of problems: statistical network inference and model selection in regression. The common feature shared by these two types of problems is that they typically exhibit an interesting phenomenon in terms of computational difficulty and efficiency. For statistical network inference, our goal is to infer the network structure based on a noisy observation of the network. Statistically, we model the network as generated from the underlying structure in the presence of noise, for example, the planted submatrix model (for bipartite weighted graphs), the stochastic block model, and the Watts-Strogatz model. As the relative amount of "signal-to-noise" varies, the problems exhibit different stages of computational difficulty. On the theoretical side, we investigate these stages by characterizing the transition thresholds on the "signal-to-noise" ratio for the aforementioned models. On the methodological side, we provide new computationally efficient procedures to reconstruct the network structure for each model. For model selection in regression, our goal is to learn a "good" model from a certain model class based on the observed data sequences (feature and response pairs), when the model can be misspecified. More concretely, we study two model selection problems: learning from general classes of functions based on i.i.d. data with minimal assumptions, and selecting from the sparse linear model class based on possibly adversarially chosen data in a sequential fashion. We develop new theoretical and algorithmic tools beyond empirical risk minimization to study these problems from a learning-theory point of view.
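
    As a concrete instance of the network-inference setup, the sketch below samples a two-community stochastic block model and recovers the planted partition by spectral clustering on the adjacency matrix. Spectral recovery is a standard baseline, not necessarily the procedure developed in the thesis, and the parameters p_in and p_out are illustrative choices.

```python
import numpy as np

def sample_sbm(n: int, p_in: float, p_out: float, rng) -> tuple[np.ndarray, np.ndarray]:
    """Sample a symmetric two-community SBM on n nodes (first half is community 0)."""
    labels = np.repeat([0, 1], n // 2)
    probs = np.where(labels[:, None] == labels[None, :], p_in, p_out)
    upper = np.triu(rng.random((n, n)) < probs, 1)
    return (upper | upper.T).astype(float), labels

def spectral_partition(adj: np.ndarray) -> np.ndarray:
    """Split nodes by the sign of the second-largest adjacency eigenvector."""
    _, vecs = np.linalg.eigh(adj)  # eigh sorts eigenvalues ascending
    return (vecs[:, -2] > 0).astype(int)

rng = np.random.default_rng(0)
adj, labels = sample_sbm(400, p_in=0.10, p_out=0.02, rng=rng)
pred = spectral_partition(adj)
# the planted labels are only identified up to a global swap
acc = max(np.mean(pred == labels), np.mean(pred != labels))
print(f"agreement with planted partition: {acc:.3f}")
```

    Shrinking the gap between p_in and p_out weakens the "signal-to-noise" ratio, which is exactly the regime where the computational-difficulty thresholds discussed in the abstract become relevant.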

    Two studies in resource-efficient inference: structural testing of networks, and selective classification

    Inference systems incur costs arising from information acquisition, and from the communication and computational costs of executing complex models. This dissertation proposes, in two distinct themes, systems-level methods to reduce these costs without affecting the accuracy of inference, by using ancillary low-cost methods to cheaply address most queries while reserving resource-heavy methods for 'difficult' instances. The first theme concerns testing methods in the structural inference of networks and graphical models, the proposal being that one first cheaply tests whether the structure underlying a dataset differs from a reference structure, and only estimates the new structure if this difference is large. This study focuses on theoretically establishing separations between the costs of testing and learning, to determine when such a strategy has benefits. For two canonical models, the Ising model and the stochastic block model, fundamental limits are derived on the costs of one- and two-sample goodness-of-fit tests by determining information-theoretic lower bounds and developing matching tests. A biphasic behaviour in the costs of testing is demonstrated: there is a critical size scale such that detecting differences smaller than this scale is nearly as expensive as recovering the structure, while detecting larger differences has vanishing cost relative to recovery. The second theme concerns using selective classification (SC), or classification with an option to abstain, to control inference-time costs in the machine learning framework. The proposal is to learn a low-complexity selective classifier that abstains only on hard instances, and to execute more expensive methods upon abstention. Herein, a novel SC formulation with a focus on high accuracy is developed, and used to obtain both theoretical characterisations and a scheme for learning selective classifiers based on optimising a collection of class-wise decoupled one-sided risks. This scheme attains strong empirical performance and admits efficient implementation, leading to an effective SC methodology. Finally, SC is studied in the online learning setting with feedback provided only upon abstention, modelling the practical lack of reliable labels without expensive feature collection, and a Pareto-optimal low-error scheme is described.
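
    To make the selective-classification idea concrete, here is a minimal Chow-style sketch: a cheap classifier predicts only when its confidence clears a threshold and abstains otherwise, so a heavier model need only handle the abstained instances. The threshold rule is the classical baseline, not the class-wise one-sided-risk scheme from the dissertation, and the toy score model is purely illustrative.

```python
import numpy as np

def selective_predict(probs: np.ndarray, threshold: float) -> np.ndarray:
    """Predict the argmax class when max probability >= threshold, else -1 (abstain)."""
    conf = probs.max(axis=1)
    preds = probs.argmax(axis=1)
    return np.where(conf >= threshold, preds, -1)

def coverage_and_risk(preds: np.ndarray, labels: np.ndarray):
    """Coverage: fraction of accepted points; risk: error rate among accepted points."""
    accepted = preds != -1
    coverage = accepted.mean()
    risk = (preds[accepted] != labels[accepted]).mean() if accepted.any() else 0.0
    return coverage, risk

# Toy scores: confident on easy points, noisy on hard ones (illustrative data).
rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=1000)
p1 = np.clip(0.5 + (labels - 0.5) * 0.6 + rng.normal(0, 0.35, size=1000), 0.01, 0.99)
probs = np.stack([1 - p1, p1], axis=1)

for t in (0.5, 0.7, 0.9):
    cov, risk = coverage_and_risk(selective_predict(probs, t), labels)
    print(f"threshold={t:.1f}  coverage={cov:.2f}  selective risk={risk:.3f}")
```

    Raising the threshold trades coverage for selective accuracy; the points handed off at high thresholds are exactly the 'difficult' instances on which the expensive model is worth invoking.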

    Adaptivity in Online and Statistical Learning

    Many modern machine learning algorithms, though successful, are still based on heuristics. In a typical application, such heuristics may manifest in the choice of a specific neural network structure, its number of parameters, or the learning rate during training. Relying on these heuristics is not ideal from a computational perspective (often involving multiple runs of the algorithm) and can also lead to over-fitting in some cases. This motivates the following question: for which machine learning tasks/settings do there exist efficient algorithms that automatically adapt to the best parameters? Characterizing the settings where this is the case, and designing corresponding (parameter-free) algorithms within the online learning framework, constitutes one of this thesis' primary goals. Towards this end, we develop algorithms for constrained and unconstrained online convex optimization that can automatically adapt to various parameters of interest, such as the Lipschitz constant, the curvature of the sequence of losses, and the norm of the comparator. We also derive new performance lower bounds characterizing the limits of adaptivity for algorithms in these settings. Part of systematizing the choice of machine learning methods also involves having "certificates" for the performance of algorithms. In the statistical learning setting, this translates to having (tight) generalization bounds. Adaptivity can manifest here through data-dependent bounds that become small whenever the problem is "easy". In this thesis, we provide such data-dependent bounds for the expected loss (the standard risk measure) and other risk measures. We also explore how such bounds can be used in the context of risk-monotonicity.
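
    For a flavor of such adaptivity, the sketch below runs online gradient descent with the standard AdaGrad-norm step size eta_t = D / sqrt(sum of squared gradient norms), which adapts to the observed gradient magnitudes without knowing the Lipschitz constant in advance. This is a textbook adaptive scheme rather than the thesis's own algorithms, and the diameter D (a bound on the comparator norm) is an assumed parameter; fully parameter-free methods remove even that.

```python
import numpy as np

def adagrad_norm_ogd(grad_fn, x0: np.ndarray, diameter: float, rounds: int) -> np.ndarray:
    """Online gradient descent with step size eta_t = diameter / sqrt(sum ||g_s||^2),
    iterates projected onto the Euclidean ball of radius `diameter`."""
    x = x0.copy()
    g_sq_sum = 0.0
    iterates = [x.copy()]
    for _ in range(rounds):
        g = grad_fn(x)
        g_sq_sum += float(g @ g)
        eta = diameter / (np.sqrt(g_sq_sum) + 1e-12)
        x = x - eta * g
        norm = np.linalg.norm(x)  # project back onto the ball if needed
        if norm > diameter:
            x *= diameter / norm
        iterates.append(x.copy())
    return np.mean(iterates, axis=0)  # average iterate

# Toy loss stream: f_t(x) = ||x - z_t||^2 with noisy targets z_t (illustrative).
rng = np.random.default_rng(2)
target = np.array([0.3, -0.2])
avg = adagrad_norm_ogd(
    grad_fn=lambda x: 2 * (x - (target + rng.normal(0, 0.1, size=2))),
    x0=np.zeros(2), diameter=1.0, rounds=2000,
)
print("averaged iterate:", avg, "target:", target)
```

    The step size shrinks exactly as fast as the accumulated gradient energy grows, which is what yields regret guarantees that scale with the observed gradients rather than with a worst-case Lipschitz bound fixed in advance.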

    Safety and Reliability - Safe Societies in a Changing World

    The contributions cover a wide range of methodologies and application areas for safety and reliability that contribute to safe societies in a changing world. These methodologies and applications include:
    - foundations of risk and reliability assessment and management
    - mathematical methods in reliability and safety
    - risk assessment
    - risk management
    - system reliability
    - uncertainty analysis
    - digitalization and big data
    - prognostics and system health management
    - occupational safety
    - accident and incident modeling
    - maintenance modeling and applications
    - simulation for safety and reliability analysis
    - dynamic risk and barrier management
    - organizational factors and safety culture
    - human factors and human reliability
    - resilience engineering
    - structural reliability
    - natural hazards
    - security
    - economic analysis in risk management