
    Model Diagnostics meets Forecast Evaluation: Goodness-of-Fit, Calibration, and Related Topics

    Principled forecast evaluation and model diagnostics are vital in fitting probabilistic models and forecasting outcomes of interest. A common principle is that fitted or predicted distributions ought to be calibrated, ideally in the sense that the outcome is indistinguishable from a random draw from the posited distribution. Much of this thesis is centered on calibration properties of various types of forecasts. In the first part of the thesis, a simple algorithm for exact multinomial goodness-of-fit tests is proposed. The algorithm computes exact p-values based on various test statistics, such as the log-likelihood ratio and Pearson's chi-square. A thorough analysis shows improvement on extant methods. However, the runtime of the algorithm grows exponentially in the number of categories and hence its use is limited. In the second part, a framework rooted in probability theory is developed, which gives rise to hierarchies of calibration, and applies to both predictive distributions and stand-alone point forecasts. Based on a general notion of conditional T-calibration, the thesis introduces population versions of T-reliability diagrams and revisits a score decomposition into measures of miscalibration, discrimination, and uncertainty. Stable and efficient estimators of T-reliability diagrams and score components arise via nonparametric isotonic regression and the pool-adjacent-violators algorithm. For in-sample model diagnostics, a universal coefficient of determination is introduced that nests and reinterprets the classical R² in least squares regression. In the third part, probabilistic top lists are proposed as a novel type of prediction in classification, which bridges the gap between single-class predictions and predictive distributions. The probabilistic top list functional is elicited by strictly consistent evaluation metrics, based on symmetric proper scoring rules, which admit comparison of various types of predictions.
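    To make the notion of an exact multinomial goodness-of-fit p-value concrete, here is a minimal brute-force sketch in Python: it enumerates every possible count vector and sums the probabilities of those at least as extreme as the observed one under the log-likelihood ratio statistic. This only illustrates the quantity being computed, not the thesis's algorithm; names such as exact_p_value and compositions are purely illustrative, and the full enumeration blows up combinatorially.

```python
# Illustrative brute-force exact multinomial goodness-of-fit p-value.
from math import lgamma, log, exp

def log_multinomial_pmf(counts, probs):
    """Log-probability of a count vector under the hypothesised multinomial."""
    n = sum(counts)
    out = lgamma(n + 1)
    for c, p in zip(counts, probs):
        out -= lgamma(c + 1)
        out += c * log(p) if c > 0 else 0.0
    return out

def log_likelihood_ratio(counts, probs):
    """The G-statistic (likelihood ratio test statistic)."""
    n = sum(counts)
    return 2.0 * sum(c * log(c / (n * p)) for c, p in zip(counts, probs) if c > 0)

def compositions(n, k):
    """All vectors of k non-negative integers summing to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest

def exact_p_value(observed, probs, statistic=log_likelihood_ratio):
    """Exact p-value: total probability of outcomes at least as extreme as observed."""
    t_obs = statistic(observed, probs)
    p_val = 0.0
    for outcome in compositions(sum(observed), len(observed)):
        if statistic(outcome, probs) >= t_obs - 1e-12:  # tolerance to keep ties
            p_val += exp(log_multinomial_pmf(outcome, probs))
    return p_val

# Example: are counts (5, 1, 6) consistent with equal category probabilities?
print(exact_p_value((5, 1, 6), (1/3, 1/3, 1/3)))
```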

    Foundations for programming and implementing effect handlers

    First-class control operators provide programmers with an expressive and efficient means for manipulating control through reification of the current control state as a first-class object, enabling programmers to implement their own computational effects and control idioms as shareable libraries. Effect handlers provide a particularly structured approach to programming with first-class control by naming control-reifying operations and separating them from their handling. This thesis is composed of three strands of work in which I develop operational foundations for programming and implementing effect handlers as well as explore the expressive power of effect handlers. The first strand develops a fine-grain call-by-value core calculus of a statically typed programming language with a structural notion of effect types, as opposed to the nominal notion of effect types that dominates the literature. With the structural approach, effects need not be declared before use. The usual safety properties of statically typed programming are retained by making crucial use of row polymorphism to build and track effect signatures. The calculus features three forms of handlers: deep, shallow, and parameterised. They each offer a different approach to manipulating the control state of programs. Traditional deep handlers are defined by folds over computation trees, and are the original construct proposed by Plotkin and Pretnar. Shallow handlers are defined by case splits (rather than folds) over computation trees. Parameterised handlers are deep handlers extended with a state value that is threaded through the folds over computation trees. To demonstrate the usefulness of effects and handlers as a practical programming abstraction, I implement the essence of a small UNIX-style operating system complete with multi-user environment, time-sharing, and file I/O. The second strand studies continuation passing style (CPS) and abstract machine semantics, which are foundational techniques that admit a unified basis for implementing deep, shallow, and parameterised effect handlers in the same environment. The CPS translation is obtained through a series of refinements of a basic first-order CPS translation for a fine-grain call-by-value language into an untyped language. Each refinement moves toward a more intensional representation of continuations, eventually arriving at the notion of generalised continuation, which admits simultaneous support for deep, shallow, and parameterised handlers. The initial refinement adds support for deep handlers by representing stacks of continuations and handlers as a curried sequence of arguments. The image of the resulting translation is not properly tail-recursive, meaning some function application terms do not appear in tail position. To rectify this, the CPS translation is refined once more to obtain an uncurried representation of stacks of continuations and handlers. Finally, the translation is made higher-order in order to contract administrative redexes at translation time. The generalised continuation representation is used to construct an abstract machine that provides simultaneous support for deep, shallow, and parameterised effect handlers. The third strand explores the expressiveness of effect handlers.
    First, I show that the deep, shallow, and parameterised notions of handlers are interdefinable by way of typed macro-expressiveness, which provides a syntactic notion of expressiveness that affirms the existence of encodings between handlers, but provides no information about the computational content of the encodings. Second, using the semantic notion of expressiveness, I show that for a class of programs a programming language with first-class control (e.g. effect handlers) admits asymptotically faster implementations than are possible in a language without first-class control.
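    As a rough intuition for the distinction between deep and shallow handlers drawn above, the following untyped Python sketch models computation trees with Return and Op nodes and handles them either by a fold (deep) or by a single case split (shallow). It is a toy illustration under my own naming (Return, Op, deep_handle, shallow_handle), not the typed, row-polymorphic calculus developed in the thesis.

```python
# Toy computation trees and handlers, for intuition only.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Return:
    value: Any

@dataclass
class Op:
    label: str
    payload: Any
    k: Callable[[Any], Any]   # continuation: operation result -> rest of the tree

def deep_handle(tree, clauses, ret=lambda v: Return(v)):
    """Deep handler: a fold over the tree; resumptions re-install the handler."""
    if isinstance(tree, Return):
        return ret(tree.value)
    if tree.label in clauses:
        resume = lambda x: deep_handle(tree.k(x), clauses, ret)
        return clauses[tree.label](tree.payload, resume)
    # Unhandled operation: forward it, continuing to handle the remainder.
    return Op(tree.label, tree.payload,
              lambda x: deep_handle(tree.k(x), clauses, ret))

def shallow_handle(tree, clauses, ret=lambda v: Return(v)):
    """Shallow handler: a case split; resumptions expose the raw continuation."""
    if isinstance(tree, Return):
        return ret(tree.value)
    if tree.label in clauses:
        return clauses[tree.label](tree.payload, tree.k)   # handler not re-installed
    return Op(tree.label, tree.payload,
              lambda x: shallow_handle(tree.k(x), clauses, ret))

# A computation that performs "ask" twice and adds the answers.
prog = Op("ask", None, lambda x: Op("ask", None, lambda y: Return(x + y)))
# A deep handler answering every "ask" with 21 handles both occurrences.
print(deep_handle(prog, {"ask": lambda _, resume: resume(21)}))   # Return(value=42)
```

    A parameterised handler would, in the same spirit, thread an extra state value through the recursive calls of deep_handle.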

    A productive response to legacy system petrification

    Requirements change. The requirements of a legacy information system change, often in unanticipated ways, and at a more rapid pace than the rate at which the information system itself can be evolved to support them. The capabilities of a legacy system progressively fall further and further behind their evolving requirements, in a degrading process termed petrification. As systems petrify, they deliver diminishing business value, hamper business effectiveness, and drain organisational resources. In addressing legacy systems, the first challenge is to understand how to shed their resistance to tracking requirements change. The second challenge is to ensure that a newly adaptable system never again petrifies into a change-resistant legacy system. This thesis addresses both challenges. The approach outlined herein is underpinned by an agile migration process, termed Productive Migration, that homes in upon the specific causes of petrification within each particular legacy system and provides guidance upon how to address them. That guidance comes in part from a personalised catalogue of petrifying patterns, which capture recurring themes underlying petrification. These steer us to the problems actually present in a given legacy system, and lead us to suitable antidote productive patterns via which we can deal with those problems one by one. To prevent newly adaptable systems from again degrading into legacy systems, we appeal to a follow-on process, termed Productive Evolution, which embraces and keeps pace with change rather than resisting and falling behind it. Productive Evolution teaches us to be vigilant against signs of system petrification and helps us to nip them in the bud. The aim is to nurture systems that remain supportive of the business, that are adaptable in step with ongoing requirements change, and that continue to retain their value as significant business assets.

    Anytime algorithms for ROBDD symmetry detection and approximation

    Reduced Ordered Binary Decision Diagrams (ROBDDs) provide a dense and memory-efficient representation of Boolean functions. When ROBDDs are applied in logic synthesis, the problem arises of detecting both classical and generalised symmetries. The state of the art in symmetry detection is represented by Mishchenko's algorithm. Mishchenko showed how to detect symmetries in ROBDDs without the need for checking equivalence of all co-factor pairs. This work resulted in a practical algorithm for detecting all classical symmetries in an ROBDD in O(|G|³) set operations, where |G| is the number of nodes in the ROBDD. Mishchenko and his colleagues subsequently extended the algorithm to find generalised symmetries. The extended algorithm retains the same asymptotic complexity for each type of generalised symmetry. Both the classical and generalised symmetry detection algorithms are monolithic in the sense that they only return a meaningful answer when they are left to run to completion. In this thesis we present efficient anytime algorithms for detecting both classical and generalised symmetries that output pairs of symmetric variables until a prescribed time bound is exceeded. These anytime algorithms are complete in that, given sufficient time, they are guaranteed to find all symmetric pairs. Theoretically these algorithms reside in O(n³ + n|G| + |G|³) and O(n³ + n²|G| + |G|³) respectively, where n is the number of variables, so that in practice the advantage of anytime generality is not gained at the expense of efficiency. In fact, the anytime approach requires only very modest data structure support and offers unique opportunities for optimisation, so the resulting algorithms are very efficient. The thesis continues by considering another class of anytime algorithms for ROBDDs that is motivated by the dearth of work on approximating ROBDDs. The need for approximation arises because many ROBDD operations result in an ROBDD whose size is quadratic in the size of the inputs. Furthermore, if ROBDDs are used in abstract interpretation, the running time of the analysis is related not only to the complexity of the individual ROBDD operations but also to the number of operations applied. The number of operations is, in turn, constrained by the number of times a Boolean function can be weakened before stability is achieved. This thesis proposes a widening that can be used both to constrain the size of an ROBDD and to ensure that the number of times it is weakened is bounded by some given constant. The widening can be used to either systematically approximate an ROBDD from above (i.e. derive a weaker function) or below (i.e. infer a stronger function). The thesis also considers how randomised techniques may be deployed to improve the speed of computing an approximation by avoiding potentially expensive ROBDD manipulation.
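    For readers unfamiliar with the classical symmetry condition being detected, the short Python sketch below checks it by brute-force cofactor comparison on a truth table: f is symmetric in a variable pair exactly when swapping the two variables' values never changes f. This is only a didactic illustration; it works on explicit truth tables rather than ROBDDs and has none of the complexity guarantees discussed above, though the generator does show the anytime flavour of emitting symmetric pairs as they are found.

```python
# Classical symmetry of a Boolean function, checked by truth-table enumeration.
from itertools import product

def symmetric_in(f, n, i, j):
    """True iff the n-variable Boolean function f is symmetric in variables i and j,
    i.e. the cofactors f[x_i=1, x_j=0] and f[x_i=0, x_j=1] coincide."""
    for bits in product([0, 1], repeat=n):
        a = list(bits); a[i], a[j] = 1, 0
        b = list(bits); b[i], b[j] = 0, 1
        if f(*a) != f(*b):
            return False
    return True

def all_symmetric_pairs(f, n):
    """Yield classically symmetric variable pairs one by one as they are found."""
    for i in range(n):
        for j in range(i + 1, n):
            if symmetric_in(f, n, i, j):
                yield (i, j)

# Example: x0*x1 + x2 is symmetric in (x0, x1) but not in the other pairs.
f = lambda x0, x1, x2: (x0 and x1) or x2
print(list(all_symmetric_pairs(f, 3)))   # [(0, 1)]
```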

    Computational analysis of single-cell dynamics: protein localisation, cell cycle, and metabolic adaptation

    Cells need to be able to adapt quickly to changes in nutrient availability in their environment in order to survive. Budding yeasts constitute a convenient model to study how eukaryotic cells respond to sudden environmental change because of their fast growth and relative simplicity. Many of the intracellular changes needed for adaptation are spatial and transient; they can be captured experimentally using fluorescence time-lapse microscopy. These data are limited when only used for observation, and become most powerful when they can be used to extract quantitative, dynamic, single-cell information. In this thesis we describe an analysis framework heavily based on deep learning methods that allows us to quantitatively describe different aspects of cells' response to a new environment from microscopy data. Chapter 2 describes a start-to-finish pipeline for data access and preprocessing, cell segmentation, volume and growth rate estimation, and lineage extraction. We provide benchmarks of run time and describe how to speed up analysis using parallelisation. We then show how this pipeline can be extended with custom processing functions, and how it can be used for real-time analysis of microscopy experiments. In Chapter 3 we develop a method for predicting the location of the vacuole and nucleus from bright field images. We combine this method with cell segmentation to quantify the timing of three aspects of the cells' response to a sudden nutrient shift: a transient change in transcription factor nuclear localisation, a change in instantaneous growth rate, and the reorganisation of the plasma membrane through the endocytosis of certain membrane proteins. In particular, we quantify the relative timing of these processes and show that there is a consistent lag between the perception of the stress at the level of gene expression and the reorganisation of the cell membrane. In Chapter 4 we evaluate several methods to obtain cell cycle phase information in a label-free manner. We begin by using the outputs of cell segmentation to predict cytokinesis with high accuracy. We then predict cell cycle phase at a higher granularity directly from bright field images. We show that bright field images contain information about the cell cycle which is not visible by eye. We use these methods to quantify the relationship between cell cycle phase length and growth rate. Finally, in Chapter 5 we look beyond microscopy to the bigger picture. We sketch an abstract description of how, at a genome scale, cells might choose a strategy for adapting to a nutrient shift based on limited, noisy, and local information. Starting from a constraint-based model of metabolism, we propose an algorithm to navigate through metabolic space using only a lossy encoding of the full metabolic network. We show how this navigation can be used to adapt to a changing environment, and how its results differ from the global optimisation usually applied to metabolic models.
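    As a pointer to what a constraint-based model of metabolism looks like in practice, here is a minimal flux balance analysis sketch on a hypothetical three-reaction network, solved as a linear programme with scipy.optimize.linprog. The network, bounds, and objective are invented for illustration and have no connection to the genome-scale model used in the thesis.

```python
# Minimal flux balance analysis on a toy network:
#   R1: uptake -> A,  R2: A -> B,  R3: B -> biomass (objective).
import numpy as np
from scipy.optimize import linprog

# Stoichiometric matrix S: rows = metabolites (A, B), columns = reactions (R1, R2, R3).
S = np.array([
    [1, -1,  0],   # A: produced by uptake, consumed by A -> B
    [0,  1, -1],   # B: produced by A -> B, consumed by the biomass reaction
])
bounds = [(0, 10), (0, 10), (0, 10)]   # flux bounds for each reaction
c = np.array([0, 0, -1])               # maximise biomass flux (linprog minimises)

# Steady state S v = 0, fluxes within bounds, objective maximised.
res = linprog(c, A_eq=S, b_eq=np.zeros(2), bounds=bounds, method="highs")
print("optimal fluxes:", res.x)        # expected: [10. 10. 10.]
print("biomass flux:", -res.fun)       # expected: 10.0
```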

    Studies of b Hadron decays to Charmonium, the LHCb upgrade and operation

    Precise measurements of CP violation provide stringent tests of the Standard Model in the search for signs of new physics. Using LHC proton-proton collision data collected by the LHCb detector during 2015 and 2016 at a centre-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 1.9 fb⁻¹, the latest measurement of the CP-violating phase φs is presented, using B⁰s → J/ψφ decays. The machine-learning-based data selection, data-driven corrections to simulated event samples, and the control of systematic effects using dedicated samples are discussed. The values φs = −0.083 ± 0.041 ± 0.006 rad, ΔΓs = 0.077 ± 0.008 ± 0.003 ps⁻¹ (the decay width difference between the light and the heavy mass eigenstates in the B⁰s system) and Γs − Γd = −0.0041 ± 0.0024 ± 0.0015 ps⁻¹ (the difference of the average B⁰s and B⁰d meson decay widths) are obtained, yielding the world's most precise determination of these quantities. Furthermore, the efforts and contributions towards the LHCb Upgrade are presented: the quality assurance and testing of the LHCb RICH Upgrade components, and the redesign and upgrade of the fully online software trigger (the LHCb HLT Upgrade). Regarding the former, an original implementation of a parallelised, robust, and highly available automation system is introduced. In connection with the latter, a novel neural network architecture and optimisation methods are laid out, enabling complex machine learning to be performed in a low-latency, high-throughput environment. These directly influence the future deployment of the experiment and its data-collection and analysis capabilities, and are thus essential for future, more precise and stringent research.