
    Statistical methods for cricket team selection : a thesis presented in partial fulfilment of the requirement of the degree of Master of Applied Statistics at Massey University

    Cricket generates a large amount of data for both batsmen and bowlers. Methods for using this data to select a cricket team are examined. Utilising the assumption that an individual's natural ability is expressed via performance outputs, this thesis seeks to describe and understand the underlying statistical processes of player performance. Randomness is tested for and then the distributional properties of the data are sought. This information is then used to monitor the estimate of natural ability via widely accepted control methods, such as Shewhart control charts, CUSUM, EWMA and multivariate versions of these procedures. To accommodate the distribution presented by batting scores, a new control chart based on quartiles is also studied. Further, ranking and selection procedures employ the estimates of individual ability to select the best individuals and note the probability of correct selection. Major contributions of this study include:
    a) Development of performance measures for cricket.
    b) A 2-dimensional runs test, with further applicability outside cricket.
    c) Statistical interpretation specific to cricket:
        • Outliers are very important
        • Form is autocorrelation
        • Zone rules for cricket are needed to detect good/poor performance
        • Relatively short nominal ARLs
    d) A control chart based on quantiles to preserve outlier influences in a non-parametric procedure.
    e) The recommendation of appropriate tools for monitoring batting, bowling and all-rounder performance and also choosing man of the match.
    f) Discrimination between different types of bowlers using the consistency of their performance measures.
    g) Evaluation of the members of a team relative to potential contenders.
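
    As an illustration of the monitoring idea in the abstract above, the following is a minimal sketch of a textbook EWMA control chart applied to a sequence of batting scores. It is not the thesis's own procedure (in particular, it is not the quantile-based chart it proposes). The function name `ewma_chart`, the smoothing weight `lam`, the limit width `L` and the use of a separate baseline sample are assumptions made for the example, and the limits rely on the usual approximate-normality argument that skewed batting scores may not satisfy.

```python
from math import sqrt
from statistics import mean, stdev


def ewma_chart(scores, baseline, lam=0.2, L=3.0):
    """Standard EWMA chart (hypothetical helper, not from the thesis).

    baseline: in-control scores used to estimate the centre line and sigma.
    scores:   new scores to monitor.
    Returns a list of (ewma, lower_limit, upper_limit, signal) tuples.
    """
    mu = mean(baseline)      # centre line: estimated "natural ability"
    sigma = stdev(baseline)  # sample standard deviation of in-control scores
    z = mu                   # EWMA statistic starts at the centre line
    points = []
    for t, x in enumerate(scores, start=1):
        z = lam * x + (1 - lam) * z
        # Time-varying limits: mu +/- L * sigma * sqrt(lam/(2-lam) * (1-(1-lam)^(2t)))
        width = L * sigma * sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
        points.append((z, mu - width, mu + width, abs(z - mu) > width))
    return points


if __name__ == "__main__":
    baseline = [30, 25, 35, 28, 32, 30, 27, 33]  # made-up scores
    for z, lo, hi, signal in ewma_chart([31, 45, 60, 58], baseline):
        print(f"ewma={z:6.2f} limits=({lo:6.2f}, {hi:6.2f}) signal={signal}")
```

    A small `lam` gives more weight to history and so reacts to sustained shifts in form rather than to a single large score. That is one reason a chart of this kind would discount the outliers the thesis considers important.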

    Advances in ranking and selection: variance estimation and constraints

    In this thesis, we first show that the performance of ranking and selection (R&S) procedures in steady-state simulations depends highly on the quality of the variance estimates that are used. We study the performance of R&S procedures using three variance estimators (overlapping area, overlapping Cramér-von Mises, and overlapping modified jackknifed Durbin-Watson estimators) that show better long-run performance than other estimators previously used in conjunction with R&S procedures for steady-state simulations. We devote additional study to the development of the new overlapping modified jackknifed Durbin-Watson estimator and demonstrate some of its useful properties. Next, we consider the problem of finding the best simulated system under a primary performance measure, while also satisfying stochastic constraints on secondary performance measures, known as constrained ranking and selection. We first present a new framework that allows certain systems to become dormant, halting sampling for those systems as the procedure continues. We also develop general procedures for constrained R&S that guarantee a nominal probability of correct selection, under any number of constraints and correlation across systems. In addition, we address new topics critical to the efficiency of these procedures, namely the allocation of error between feasibility check and selection, the use of common random numbers, and the cost of switching between simulated systems. Ph.D. Committee Co-chairs: Sigrun Andradottir, Dave Goldsman and Seong-Hee Kim; Committee Members: Shabbir Ahmed and Brani Vidakovi

    Mostly Harmless Simulations? Using Monte Carlo Studies for Estimator Selection

    We consider two recent suggestions for how to perform an empirically motivated Monte Carlo study to help select a treatment effect estimator under unconfoundedness. We show theoretically that neither is likely to be informative except under restrictive conditions that are unlikely to be satisfied in many contexts. To test empirical relevance, we also apply the approaches to a real-world setting where estimator performance is known. Both approaches are worse than random at selecting estimators which minimise absolute bias. They are better when selecting estimators that minimise mean squared error. However, using a simple bootstrap is at least as good and often better. For now, researchers would be best advised to use a range of estimators and compare estimates for robustness.

    Matching Code and Law: Achieving Algorithmic Fairness with Optimal Transport

    Increasingly, discrimination by algorithms is perceived as a societal and legal problem. As a response, a number of criteria for implementing algorithmic fairness in machine learning have been developed in the literature. This paper proposes the Continuous Fairness Algorithm (CFA$\theta$) which enables a continuous interpolation between different fairness definitions. More specifically, we make three main contributions to the existing literature. First, our approach allows the decision maker to continuously vary between specific concepts of individual and group fairness. As a consequence, the algorithm enables the decision maker to adopt intermediate "worldviews" on the degree of discrimination encoded in algorithmic processes, adding nuance to the extreme cases of "we're all equal" (WAE) and "what you see is what you get" (WYSIWYG) proposed so far in the literature. Second, we use optimal transport theory, and specifically the concept of the barycenter, to maximize decision maker utility under the chosen fairness constraints. Third, the algorithm is able to handle cases of intersectionality, i.e., of multi-dimensional discrimination of certain groups on grounds of several criteria. We discuss three main examples (credit applications; college admissions; insurance contracts) and map out the legal and policy implications of our approach. The explicit formalization of the trade-off between individual and group fairness allows this post-processing approach to be tailored to different situational contexts in which one or the other fairness criterion may take precedence. Finally, we evaluate our model experimentally. Comment: Vastly extended new version, now including computational experiment
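
    To make the interpolation idea in the abstract above concrete, here is a minimal one-dimensional sketch. It assumes that each group's raw scores are moved toward the Wasserstein barycenter of the group score distributions, which in one dimension is the size-weighted average of the group quantile functions. The names `quantile` and `interpolate_scores`, the rank-based quantile convention and the linear mixing with `theta` are illustrative assumptions, not the paper's published algorithm.

```python
def quantile(sorted_vals, q):
    """Linearly interpolated empirical quantile of a sorted list, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def interpolate_scores(groups, theta):
    """Move each raw score a fraction theta toward the 1-D barycenter.

    groups: dict mapping a group name to a list of raw scores.
    theta:  0.0 keeps raw scores (the WYSIWYG end), 1.0 gives every group
            the same score distribution (the WAE end).
    """
    total = sum(len(v) for v in groups.values())
    sorted_groups = {g: sorted(v) for g, v in groups.items()}
    adjusted = {}
    for g, scores in groups.items():
        n = len(scores)
        order = sorted(range(n), key=lambda i: scores[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            q = rank / (n - 1) if n > 1 else 0.5  # within-group quantile level
            # Barycenter quantile: size-weighted mean of the group quantiles.
            bary = sum(len(sv) / total * quantile(sv, q)
                       for sv in sorted_groups.values())
            out[i] = (1 - theta) * scores[i] + theta * bary
        adjusted[g] = out
    return adjusted


if __name__ == "__main__":
    raw = {"a": [1, 2, 3], "b": [3, 4, 5]}  # made-up scores
    print(interpolate_scores(raw, 0.5))
```

    Because each score keeps its within-group rank, the ordering of individuals inside a group is preserved for every `theta`, while the gap between group distributions shrinks as `theta` approaches 1.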

    Decision Making in the Presence of Subjective Stochastic Constraints

    Constrained Ranking and Selection considers optimizing a primary performance measure over a finite set of alternatives subject to constraints on secondary performance measures. When the constraints are stochastic, the corresponding performance measures should be estimated by simulation. When the constraints are subjective, the decision maker is willing to consider multiple constraint threshold values. In this thesis, we consider three problem formulations when subjective stochastic constraints are present. In Chapter 2, we consider the problem of finding a set of feasible or near-feasible systems among a finite number of simulated systems in the presence of subjective stochastic constraints. A decision maker may want to test multiple constraint threshold values for the feasibility check, or she may want to determine how a set of feasible systems changes as constraints become more strict with the objective of pruning systems or finding the system with the best performance. We present indifference-zone procedures that recycle observations for the feasibility check and provide an overall probability of correct decision for all threshold values. Our numerical experiments show that the proposed procedures perform well in reducing the required number of observations relative to four alternative procedures (that either restart feasibility check from scratch with respect to each set of thresholds or with the Bonferroni inequality applied in a conservative way) while providing a statistical guarantee on the probability of correct decision. Chapter 3 considers the problem of finding a system with the best primary performance measure among a finite number of simulated systems in the presence of subjective stochastic constraints on secondary performance measures. When no feasible system exists, the decision maker may be willing to relax some constraint thresholds. We take multiple threshold values for each constraint as a user's input and propose indifference-zone procedures that perform the phases of feasibility check and selection-of-the-best sequentially or simultaneously. We prove that the proposed procedures yield the best system in the most desirable feasible region possible with at least a pre-specified probability. Our experimental results show that our procedures perform well with respect to the number of observations required to make a decision, as compared with straightforward procedures that repeatedly solve the problem for each set of constraint thresholds. In Chapter 4, we consider the problem of finding a portfolio of systems with the best primary performance measure among finitely many simulated systems as stochastic constraints on secondary performance measures are relaxed. By finding a portfolio of the best systems under a variety of constraint thresholds, the decision maker can identify a robust solution with respect to the constraints or consider the trade-off between the primary performance measure and the level of feasibility of the secondary performance measures. We propose indifference-zone procedures that perform the phases of feasibility check and selection-of-the-best sequentially and simultaneously, and prove that the proposed procedures identify the portfolio of the best systems with at least a pre-specified probability. Our proposed procedures show a significant reduction in the required number of observations compared with straightforward procedures that repeatedly identify the best system with respect to each set of constraint thresholds. Ph.D

    Automated Model Selection with AMSF in a production process of the automotive industry

    Machine learning, statistics and knowledge engineering provide a broad variety of supervised learning algorithms for classification. In this paper we introduce the Automated Model Selection Framework (AMSF), which presents automatic and semi-automatic methods to select classifiers. To achieve this, we split up the selection process into three distinct phases. Two of those select algorithms by static rules which are derived from a manually created knowledge base. At this stage of AMSF the user can choose between different rankers in the third phase. Currently, we use instance-based learning and a scoring scheme for ranking the classifiers. After evaluation of different rankers we will recommend the most successful to the user by default. Besides describing the architecture and design issues, we additionally point out the versatile ways AMSF is applied in a production process of the automotive industry.

    A Methodology for the Selection of Multi-Criteria Decision Analysis Methods in Real Estate and Land Management Processes

    Real estate and land management are characterised by a complex, elaborate combination of technical, regulatory and governmental factors. In Europe, Public Administrators must address the complex decision-making problems that need to be resolved, while also acting in consideration of the expectations of the different stakeholders involved in settlement transformation. In complex situations (e.g., with different aspects to be considered and multilevel actors involved), decision-making processes are often used to solve multidisciplinary and multidimensional analyses, which support the choices of those who are making the decision. Multi-Criteria Decision Analysis (MCDA) methods are included among the examination and evaluation techniques considered useful by the European Community. Such analyses and techniques are performed using methods which aim to reach a synthesis of the various forms of input data needed to define decision-making problems of a similar complexity. Thus, one or more of the conclusions reached allow for informed, well thought-out, strategic decisions. According to the technical literature on MCDA, numerous methods are applicable in different decision-making situations; however, advice for selecting the most appropriate method for the specific field of application and problem has not been thoroughly investigated. In land and real estate management, numerous queries regarding evaluations often arise. In brief, the objective of this paper is to outline a procedure with which to select the method best suited to the specific queries of evaluation which commonly arise while addressing decision-making problems, in particular those of land and real estate management, which represent the so-called “settlement sector”. The procedure will follow a theoretical-methodological approach by formulating a taxonomy of the endogenous and exogenous variables of the multi-criteria analysis method