7,565 research outputs found

    A Short Introduction to Model Selection, Kolmogorov Complexity and Minimum Description Length (MDL)

    Full text link
    The concept of overfitting in model selection is explained and demonstrated with an example. After providing some background information on information theory and Kolmogorov complexity, we provide a short explanation of Minimum Description Length and error minimization. We conclude with a discussion of the typical features of overfitting in model selection.Comment: 20 pages, Chapter 1 of The Paradox of Overfitting, Master's thesis, Rijksuniversiteit Groningen, 200

    Half-integrality, LP-branching and FPT Algorithms

    Full text link
    A recent trend in parameterized algorithms is the application of polytope tools (specifically, LP-branching) to FPT algorithms (e.g., Cygan et al., 2011; Narayanaswamy et al., 2012). However, although interesting results have been achieved, the methods require the underlying polytope to have very restrictive properties (half-integrality and persistence), which are known only for few problems (essentially Vertex Cover (Nemhauser and Trotter, 1975) and Node Multiway Cut (Garg et al., 1994)). Taking a slightly different approach, we view half-integrality as a \emph{discrete} relaxation of a problem, e.g., a relaxation of the search space from {0,1}V\{0,1\}^V to {0,1/2,1}V\{0,1/2,1\}^V such that the new problem admits a polynomial-time exact solution. Using tools from CSP (in particular Thapper and \v{Z}ivn\'y, 2012) to study the existence of such relaxations, we provide a much broader class of half-integral polytopes with the required properties, unifying and extending previously known cases. In addition to the insight into problems with half-integral relaxations, our results yield a range of new and improved FPT algorithms, including an O∗(∣Σ∣2k)O^*(|\Sigma|^{2k})-time algorithm for node-deletion Unique Label Cover with label set Σ\Sigma and an O∗(4k)O^*(4^k)-time algorithm for Group Feedback Vertex Set, including the setting where the group is only given by oracle access. All these significantly improve on previous results. The latter result also implies the first single-exponential time FPT algorithm for Subset Feedback Vertex Set, answering an open question of Cygan et al. (2012). Additionally, we propose a network flow-based approach to solve some cases of the relaxation problem. This gives the first linear-time FPT algorithm to edge-deletion Unique Label Cover.Comment: Added results on linear-time FPT algorithms (not present in SODA paper

    A Computable Economist’s Perspective on Computational Complexity

    Get PDF
    A computable economist's view of the world of computational complexity theory is described. This means the model of computation underpinning theories of computational complexity plays a central role. The emergence of computational complexity theories from diverse traditions is emphasised. The unifications that emerged in the modern era was codified by means of the notions of efficiency of computations, non-deterministic computations, completeness, reducibility and verifiability - all three of the latter concepts had their origins on what may be called 'Post's Program of Research for Higher Recursion Theory'. Approximations, computations and constructions are also emphasised. The recent real model of computation as a basis for studying computational complexity in the domain of the reals is also presented and discussed, albeit critically. A brief sceptical section on algorithmic complexity theory is included in an appendix

    Quantifying uncertainties on excursion sets under a Gaussian random field prior

    Get PDF
    We focus on the problem of estimating and quantifying uncertainties on the excursion set of a function under a limited evaluation budget. We adopt a Bayesian approach where the objective function is assumed to be a realization of a Gaussian random field. In this setting, the posterior distribution on the objective function gives rise to a posterior distribution on excursion sets. Several approaches exist to summarize the distribution of such sets based on random closed set theory. While the recently proposed Vorob'ev approach exploits analytical formulae, further notions of variability require Monte Carlo estimators relying on Gaussian random field conditional simulations. In the present work we propose a method to choose Monte Carlo simulation points and obtain quasi-realizations of the conditional field at fine designs through affine predictors. The points are chosen optimally in the sense that they minimize the posterior expected distance in measure between the excursion set and its reconstruction. The proposed method reduces the computational costs due to Monte Carlo simulations and enables the computation of quasi-realizations on fine designs in large dimensions. We apply this reconstruction approach to obtain realizations of an excursion set on a fine grid which allow us to give a new measure of uncertainty based on the distance transform of the excursion set. Finally we present a safety engineering test case where the simulation method is employed to compute a Monte Carlo estimate of a contour line

    Fluid Model Checking

    Full text link
    In this paper we investigate a potential use of fluid approximation techniques in the context of stochastic model checking of CSL formulae. We focus on properties describing the behaviour of a single agent in a (large) population of agents, exploiting a limit result known also as fast simulation. In particular, we will approximate the behaviour of a single agent with a time-inhomogeneous CTMC which depends on the environment and on the other agents only through the solution of the fluid differential equation. We will prove the asymptotic correctness of our approach in terms of satisfiability of CSL formulae and of reachability probabilities. We will also present a procedure to model check time-inhomogeneous CTMC against CSL formulae

    Normalized Web Distance and Word Similarity

    Get PDF
    There is a great deal of work in cognitive psychology, linguistics, and computer science, about using word (or phrase) frequencies in context in text corpora to develop measures for word similarity or word association, going back to at least the 1960s. The goal of this chapter is to introduce the normalizedis a general way to tap the amorphous low-grade knowledge available for free on the Internet, typed in by local users aiming at personal gratification of diverse objectives, and yet globally achieving what is effectively the largest semantic electronic database in the world. Moreover, this database is available for all by using any search engine that can return aggregate page-count estimates for a large range of search-queries. In the paper introducing the NWD it was called `normalized Google distance (NGD),' but since Google doesn't allow computer searches anymore, we opt for the more neutral and descriptive NWD. web distance (NWD) method to determine similarity between words and phrases. ItComment: Latex, 20 pages, 7 figures, to appear in: Handbook of Natural Language Processing, Second Edition, Nitin Indurkhya and Fred J. Damerau Eds., CRC Press, Taylor and Francis Group, Boca Raton, FL, 2010, ISBN 978-142008592

    Death and Suicide in Universal Artificial Intelligence

    Full text link
    Reinforcement learning (RL) is a general paradigm for studying intelligent behaviour, with applications ranging from artificial intelligence to psychology and economics. AIXI is a universal solution to the RL problem; it can learn any computable environment. A technical subtlety of AIXI is that it is defined using a mixture over semimeasures that need not sum to 1, rather than over proper probability measures. In this work we argue that the shortfall of a semimeasure can naturally be interpreted as the agent's estimate of the probability of its death. We formally define death for generally intelligent agents like AIXI, and prove a number of related theorems about their behaviour. Notable discoveries include that agent behaviour can change radically under positive linear transformations of the reward signal (from suicidal to dogmatically self-preserving), and that the agent's posterior belief that it will survive increases over time.Comment: Conference: Artificial General Intelligence (AGI) 2016 13 pages, 2 figure
    • …
    corecore