    An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities

    We describe an extension of Earley's parser for stochastic context-free grammars that computes the following quantities given a stochastic context-free grammar and an input string: a) probabilities of successive prefixes being generated by the grammar; b) probabilities of substrings being generated by the nonterminals, including the entire string being generated by the grammar; c) most likely (Viterbi) parse of the string; d) posterior expected number of applications of each grammar production, as required for reestimating rule probabilities. (a) and (b) are computed incrementally in a single left-to-right pass over the input. Our algorithm compares favorably to standard bottom-up parsing methods for SCFGs in that it works efficiently on sparse grammars by making use of Earley's top-down control structure. It can process any context-free rule format without conversion to some normal form, and combines computations for (a) through (d) in a single algorithm. Finally, the algorithm has simple extensions for processing partially bracketed inputs, and for finding partial parses and their likelihoods on ungrammatical inputs.
    Comment: 45 pages. Slightly shortened version to appear in Computational Linguistics 2

    An Explanatory Study on the Non-Parametric Multivariate T2 Control Chart

    Most control charts assume that observations are normally distributed. When the distribution is not normal, one can use non-parametric control charts such as the sign control chart. A deficiency of such charts is the loss of information caused by replacing an observation with its sign or rank. Furthermore, because the T2 chart statistics are correlated, the T2 chart does not achieve the desired performance. A non-parametric bootstrap algorithm can help to calculate control chart parameters from the original observations without any assumption about the distribution. In this paper, a bootstrap multivariate control chart based on Hotelling’s T2 statistic is first presented. Its performance is then compared, in a simulation study, with that of a parametric Hotelling’s T2 multivariate control chart, a multivariate sign control chart, and a multivariate Wilcoxon control chart. Finally, the bootstrap multivariate control chart is applied in an empirical example to study a sugar production process.
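
    The idea of a bootstrap control limit for a T2 chart can be sketched as follows. This is a minimal illustration and not the paper's own procedure: it assumes bivariate data (so the 2x2 covariance inverse can be written out by hand) and a percentile-style bootstrap in which the upper control limit is the average of the (1 - alpha) quantiles of resampled T2 values. The function names `t2_statistics` and `bootstrap_ucl` and all parameter defaults are hypothetical.

```python
import random
from statistics import mean


def t2_statistics(data):
    """T2 distance of each bivariate point from the sample mean (assumed setup)."""
    n = len(data)
    mx = mean(x for x, _ in data)
    my = mean(y for _, y in data)
    # Sample variances and covariance (divisor n - 1).
    sxx = sum((x - mx) ** 2 for x, _ in data) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in data) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    det = sxx * syy - sxy ** 2
    # d^T S^{-1} d with the 2x2 inverse written out explicitly.
    return [
        (syy * (x - mx) ** 2
         - 2 * sxy * (x - mx) * (y - my)
         + sxx * (y - my) ** 2) / det
        for x, y in data
    ]


def bootstrap_ucl(t2_values, alpha=0.05, n_boot=1000, seed=0):
    """Average of the (1 - alpha) quantiles over bootstrap resamples of T2 values."""
    rng = random.Random(seed)
    n = len(t2_values)
    k = min(n - 1, int((1 - alpha) * n))  # index of the empirical quantile
    quantiles = []
    for _ in range(n_boot):
        resample = sorted(rng.choices(t2_values, k=n))
        quantiles.append(resample[k])
    return mean(quantiles)
```

    A new observation would then be flagged when its T2 value exceeds the returned limit. No distributional assumption enters the limit itself, which is the property the abstract emphasises.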

    An Abstract Machine for Unification Grammars

    This work describes the design and implementation of an abstract machine, Amalia, for the linguistic formalism ALE, which is based on typed feature structures. This formalism is one of the most widely accepted in computational linguistics and has been used for designing grammars in various linguistic theories, most notably HPSG. Amalia is composed of data structures and a set of instructions, augmented by a compiler from the grammatical formalism to the abstract instructions, and a (portable) interpreter of the abstract instructions. The effect of each instruction is defined using a low-level language that can be executed on ordinary hardware. The advantages of the abstract machine approach are twofold. From a theoretical point of view, the abstract machine gives a well-defined operational semantics to the grammatical formalism. This ensures that grammars specified using our system are endowed with a well-defined meaning. It makes it possible, for example, to formally verify the correctness of a compiler for HPSG, given an independent definition. From a practical point of view, Amalia is the first system that employs a direct compilation scheme for unification grammars that are based on typed feature structures. The use of Amalia results in much improved performance over existing systems. In order to test the machine on a realistic application, we have developed a small-scale, HPSG-based grammar for a fragment of the Hebrew language, using Amalia as the development platform. This is the first application of HPSG to a Semitic language.
    Comment: Doctoral Thesis, 96 pages, many postscript figures, uses pstricks, pst-node, psfig, fullname and a macros fil

    Characterisation and optimal design of a new double sampling c chart

    This paper proposes a new double sampling scheme for the c control chart (DS-c), designed to improve the performance of the c chart or to reduce the inspection cost. The mathematical expressions required for an exact evaluation of the ARL and ASN are derived. Further, a bi-objective genetic algorithm is implemented to obtain the optimal design of the DS-c scheme. This optimisation aims to simultaneously minimise the type II error probability and the ASN while guaranteeing a desired level for the type I error probability. A performance comparison between the double sampling (DS), fixed parameters (FP), variable sample size (VSS) and exponentially weighted moving average (EWMA) schemes for the c chart is carried out. The comparison shows that the DS-c scheme yields a significant reduction in the out-of-control ARL with a lower ASN than FP, and a better ARL profile than VSS and EWMA.
    Campuzano, MJ.; Carrión García, A.; Mosquera, J. (2019). Characterisation and optimal design of a new double sampling c chart. European J of Industrial Engineering. 13(6):775-793. https://doi.org/10.1504/EJIE.2019.104312
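
    For orientation, the ARL of the fixed-parameter (FP) baseline can be computed directly from the Poisson distribution. The sketch below is not the DS-c evaluation derived in the paper. It assumes a c chart with limits at c0 ± width·sqrt(c0), a lower limit truncated at zero, and a signal whenever the count falls outside the limits. The names `poisson_cdf` and `c_chart_arl` are hypothetical.

```python
import math


def poisson_cdf(k, lam):
    """P(C <= k) for C ~ Poisson(lam); returns 0 when k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total


def c_chart_arl(c0, c1, width=3.0):
    """ARL of an FP c chart designed for mean c0 when the true mean is c1."""
    ucl = c0 + width * math.sqrt(c0)
    lcl = max(0.0, c0 - width * math.sqrt(c0))
    # Probability that a count stays inside [lcl, ucl], i.e. no signal.
    # With c1 != c0 this is the type II error probability.
    p_inside = poisson_cdf(math.floor(ucl), c1) - poisson_cdf(math.ceil(lcl) - 1, c1)
    return 1.0 / (1.0 - p_inside)
```

    Calling `c_chart_arl(4, 4)` gives the in-control ARL, and `c_chart_arl(4, 8)` gives the much shorter out-of-control ARL after the mean doubles.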

    Making Maps Of The Cosmic Microwave Background: The MAXIMA Example

    This work describes Cosmic Microwave Background (CMB) data analysis algorithms and their implementations, developed to produce a pixelized map of the sky and a corresponding pixel-pixel noise correlation matrix from time ordered data for a CMB mapping experiment. We discuss in turn algorithms for estimating noise properties from the time ordered data, techniques for manipulating the time ordered data, and a number of variants of the maximum likelihood map-making procedure. We pay particular attention to issues pertinent to real CMB data, and present ways of incorporating them within the framework of maximum likelihood map-making. Making a map of the sky is shown to be not only an intermediate step rendering an image of the sky, but also an important diagnostic stage, when tests for and/or removal of systematic effects can efficiently be performed. The case under study is the MAXIMA data set. However, the methods discussed are expected to be applicable to the analysis of other current and forthcoming CMB experiments.
    Comment: Replaced to match the published version, only minor change
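
    The standard maximum likelihood map estimate is m = (A^T N^-1 A)^-1 A^T N^-1 d, where d is the time ordered data, A the pointing matrix and N the time-domain noise covariance. The pixel-pixel noise covariance is (A^T N^-1 A)^-1. The sketch below covers only the simplest special case and is not the MAXIMA pipeline: it assumes uncorrelated (white) noise, so that N is diagonal, and a pointing matrix with exactly one pixel per sample. Under those assumptions the estimate reduces to an inverse-variance weighted average per pixel. The name `make_map` and its arguments are hypothetical.

```python
def make_map(tod, pointing, sigma, n_pix):
    """Maximum likelihood map for white noise (diagonal N), one pixel per sample.

    tod      -- time ordered data samples d
    pointing -- pixel index observed by each sample (the rows of A)
    sigma    -- noise standard deviation of each sample
    Returns (sky, variance); unobserved pixels are None.
    """
    weighted_sum = [0.0] * n_pix  # A^T N^-1 d
    weight = [0.0] * n_pix        # diagonal of A^T N^-1 A
    for d, p, s in zip(tod, pointing, sigma):
        w = 1.0 / (s * s)
        weighted_sum[p] += w * d
        weight[p] += w
    sky = [ws / w if w > 0 else None for ws, w in zip(weighted_sum, weight)]
    variance = [1.0 / w if w > 0 else None for w in weight]
    return sky, variance
```

    Real CMB noise is correlated in time, in which case N is not diagonal and the linear system has to be solved with the full N^-1. The diagonal case is shown only to make the structure of the estimator concrete.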

    Bivariate modified hotelling’s T2 charts using bootstrap data

    Conventional Hotelling’s T2 charts are evidently inefficient when the data are disorganized and contain outliers. This study therefore proposes a novel alternative robust Hotelling’s T2 chart approach. Based on a robust scale estimator, this approach uses the Hodges-Lehmann vector and the associated covariance matrix in place of the arithmetic mean vector and the usual covariance matrix, respectively. The performance of the proposed chart was examined using simulated bivariate bootstrap datasets under two conditions, namely independent variables and dependent variables. The robustness of the modified chart was then assessed by computing the probabilities of outlier detection and of false alarms. The results show that the proposed charts outperformed the conventional ones in all the cases tested.
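
    The Hodges-Lehmann location estimate that replaces the arithmetic mean can be illustrated briefly. The sketch below uses the common one-sample definition, the median of all Walsh averages (x_i + x_j) / 2 with i <= j, applied to each variable separately. Whether the paper uses exactly this variant is an assumption, and the names `hodges_lehmann` and `hodges_lehmann_vector` are hypothetical.

```python
from itertools import combinations_with_replacement
from statistics import median


def hodges_lehmann(xs):
    """Median of all Walsh averages (x_i + x_j) / 2 with i <= j."""
    walsh = [(a + b) / 2 for a, b in combinations_with_replacement(xs, 2)]
    return median(walsh)


def hodges_lehmann_vector(rows):
    """Componentwise Hodges-Lehmann estimate for multivariate observations."""
    return [hodges_lehmann(column) for column in zip(*rows)]
```

    For the sample 1, 2, 3, 100 the arithmetic mean is 26.5 while `hodges_lehmann` returns 2.75, which shows why the estimator is attractive when outliers are present.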