An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities
We describe an extension of Earley's parser for stochastic context-free
grammars that computes the following quantities given a stochastic context-free
grammar and an input string: a) probabilities of successive prefixes being
generated by the grammar; b) probabilities of substrings being generated by the
nonterminals, including the entire string being generated by the grammar; c)
most likely (Viterbi) parse of the string; d) posterior expected number of
applications of each grammar production, as required for reestimating rule
probabilities. (a) and (b) are computed incrementally in a single left-to-right
pass over the input. Our algorithm compares favorably to standard bottom-up
parsing methods for SCFGs in that it works efficiently on sparse grammars by
making use of Earley's top-down control structure. It can process any
context-free rule format without conversion to some normal form, and combines
computations for (a) through (d) in a single algorithm. Finally, the algorithm
has simple extensions for processing partially bracketed inputs, and for
finding partial parses and their likelihoods on ungrammatical inputs.
Comment: 45 pages. Slightly shortened version to appear in Computational
Linguistics 2
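Quantity (b) in the abstract above, the probability that a nonterminal generates a substring, can be illustrated with the standard bottom-up inside algorithm. The sketch below is not the Earley-based method the abstract describes: it assumes a grammar in Chomsky normal form, which that method does not require, and the grammar encoding (`lexical`, `binary`) and the toy rules are hypothetical.

```python
from collections import defaultdict

def inside_probabilities(words, lexical, binary):
    """Bottom-up inside algorithm for an SCFG in Chomsky normal form.

    lexical: {(A, word): prob} for rules A -> word   (assumed encoding)
    binary:  {(A, B, C): prob} for rules A -> B C    (assumed encoding)
    Returns a table with inside[i, j, A] = P(A generates words[i:j]).
    """
    n = len(words)
    inside = defaultdict(float)
    # Spans of length 1 come from the lexical rules.
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                inside[i, i + 1, a] += p
    # Longer spans sum over every split point and every binary rule.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, b, c), p in binary.items():
                    left = inside.get((i, k, b), 0.0)
                    right = inside.get((k, j, c), 0.0)
                    if left and right:
                        inside[i, j, a] += p * left * right
    return inside

# Toy grammar: S -> S S [0.3] | 'a' [0.7]
table = inside_probabilities(["a", "a", "a"],
                             {("S", "a"): 0.7},
                             {("S", "S", "S"): 0.3})
string_probability = table[0, 3, "S"]  # sums over both parses of "a a a"
```

The entry for the full span and the start symbol is the probability of the entire string; prefix probabilities (quantity a) need the additional machinery the paper develops.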
An Explanatory Study on the Non-Parametric Multivariate T2 Control Chart
Most control charts assume that observations are normally distributed. When the distribution is not normal, one can use non-parametric control charts such as the sign control chart. A deficiency of such charts is the loss of information caused by replacing an observation with its sign or rank. Furthermore, because the chart statistics of T2 are correlated, the T2 chart does not deliver the desired performance. A non-parametric bootstrap algorithm can help to calculate control chart parameters from the original observations without any assumption about the distribution. In this paper, a bootstrap multivariate control chart based on Hotelling’s T2 statistic is first presented. Its performance is then compared, in a simulation study, with that of a parametric Hotelling’s T2 multivariate control chart, a multivariate sign control chart, and a multivariate Wilcoxon control chart. Finally, the bootstrap multivariate control chart is applied in an empirical example to study the process of sugar production.
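A bootstrap control limit of the kind this abstract describes might be sketched as follows. The abstract does not give the authors' exact procedure, so this is an assumption-laden illustration: it is restricted to the bivariate case to keep the matrix inverse explicit, it sets the limit to the average of the bootstrap (1 - alpha) quantiles of the in-control T2 values, and the names `fit_t2` and `bootstrap_ucl` are hypothetical.

```python
import math
import random

def fit_t2(points):
    """Return a function giving Hotelling's T2 of a bivariate point
    against the sample mean and covariance of `points`."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2

    def t2(point):
        dx, dy = point[0] - mx, point[1] - my
        # d' S^-1 d written out with the closed-form 2x2 inverse.
        return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

    return t2

def bootstrap_ucl(t2_values, alpha=0.05, n_boot=500, seed=0):
    """Average the (1 - alpha) quantile over bootstrap resamples of T2."""
    rng = random.Random(seed)
    n = len(t2_values)
    rank = min(n, math.ceil((1 - alpha) * n)) - 1  # order-statistic index
    quantiles = []
    for _ in range(n_boot):
        resample = sorted(rng.choices(t2_values, k=n))  # with replacement
        quantiles.append(resample[rank])
    return sum(quantiles) / n_boot
```

In use, one would fit `t2` on in-control data, compute the limit from the resulting T2 values, and flag any new observation whose T2 exceeds that limit. No distributional assumption enters the limit itself.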
An Abstract Machine for Unification Grammars
This work describes the design and implementation of an abstract machine,
Amalia, for the linguistic formalism ALE, which is based on typed feature
structures. This formalism is one of the most widely accepted in computational
linguistics and has been used for designing grammars in various linguistic
theories, most notably HPSG. Amalia is composed of data structures and a set of
instructions, augmented by a compiler from the grammatical formalism to the
abstract instructions, and a (portable) interpreter of the abstract
instructions. The effect of each instruction is defined using a low-level
language that can be executed on ordinary hardware.
The advantages of the abstract machine approach are twofold. From a
theoretical point of view, the abstract machine gives a well-defined
operational semantics to the grammatical formalism. This ensures that grammars
specified using our system are endowed with well-defined meaning. It makes it
possible, for example, to formally verify the correctness of a compiler for HPSG, given
an independent definition. From a practical point of view, Amalia is the first
system that employs a direct compilation scheme for unification grammars that
are based on typed feature structures. The use of Amalia results in much
improved performance over existing systems.
In order to test the machine on a realistic application, we have developed a
small-scale, HPSG-based grammar for a fragment of the Hebrew language, using
Amalia as the development platform. This is the first application of HPSG to a
Semitic language.
Comment: Doctoral Thesis, 96 pages, many postscript figures, uses pstricks,
pst-node, psfig, fullname and a macros fil
Characterisation and optimal design of a new double sampling c chart
This paper proposes a new double sampling scheme for the c control chart (DS-c), designed to improve the performance of the c chart or to reduce the inspection cost. The mathematical expressions required for an exact evaluation of the ARL and the ASN are derived. Further, a bi-objective genetic algorithm is implemented to obtain the optimal design of the DS-c scheme. This optimisation aims to simultaneously minimise the type II error probability and the ASN while guaranteeing a desired level for the type I error probability. A performance comparison between the double sampling (DS), fixed parameters (FP), variable sample size (VSS) and exponentially weighted moving average (EWMA) schemes for the c chart is carried out. The comparison shows that the DS-c scheme achieves a significant reduction of the out-of-control ARL with a lower ASN than FP, and a better ARL profile than VSS and EWMA.
Campuzano, MJ.; Carrión García, A.; Mosquera, J. (2019). Characterisation and optimal design of a new double sampling c chart. European J of Industrial Engineering. 13(6):775-793. https://doi.org/10.1504/EJIE.2019.104312
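The ARL and ASN mentioned in this abstract can be evaluated exactly for a generic double sampling rule on Poisson counts. The sketch below does not reproduce the paper's expressions, which the abstract does not give. It assumes a common DS decision rule: accept if the first count is at most a warning limit `wl`, signal if it exceeds `ucl1`, and otherwise take a second sample and signal if the combined count exceeds `ucl2`. All parameter names and values are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_cdf(k, lam):
    if k < 0:
        return 0.0
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

def ds_c_performance(c, n1, n2, wl, ucl1, ucl2):
    """ARL and ASN of an assumed double sampling rule for defect counts.

    c is the mean number of defects per inspection unit; the first and
    second samples contain n1 and n2 units respectively.
    """
    lam1, lam2 = n1 * c, n2 * c
    p_accept_first = poisson_cdf(wl, lam1)
    # Probability that the first count falls in the warning zone.
    p_second = poisson_cdf(ucl1, lam1) - p_accept_first
    # Accept at the second stage when the combined count stays within ucl2.
    p_accept_second = sum(
        poisson_pmf(d, lam1) * poisson_cdf(ucl2 - d, lam2)
        for d in range(wl + 1, ucl1 + 1)
    )
    p_signal = 1.0 - (p_accept_first + p_accept_second)
    arl = 1.0 / p_signal          # run length is geometric
    asn = n1 + n2 * p_second      # second sample taken only in the warning zone
    return arl, asn

arl_in, asn_in = ds_c_performance(c=2.0, n1=1, n2=2, wl=4, ucl1=7, ucl2=12)
arl_out, asn_out = ds_c_performance(c=4.0, n1=1, n2=2, wl=4, ucl1=7, ucl2=12)
```

Comparing the in-control and shifted cases shows the trade-off the optimisation targets: the limits should keep the in-control ARL long, the out-of-control ARL short, and the ASN low.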
BitPar
Statistical parse
Making Maps Of The Cosmic Microwave Background: The MAXIMA Example
This work describes Cosmic Microwave Background (CMB) data analysis
algorithms and their implementations, developed to produce a pixelized map of
the sky and a corresponding pixel-pixel noise correlation matrix from time
ordered data for a CMB mapping experiment. We discuss in turn algorithms for
estimating noise properties from the time ordered data, techniques for
manipulating the time ordered data, and a number of variants of the maximum
likelihood map-making procedure. We pay particular attention to issues
pertinent to real CMB data, and present ways of incorporating them within the
framework of maximum likelihood map-making. Making a map of the sky is shown to
be not only an intermediate step rendering an image of the sky, but also an
important diagnostic stage, when tests for and/or removal of systematic effects
can efficiently be performed. The case under study is the MAXIMA data set.
However, the methods discussed are expected to be applicable to the analysis of
other current and forthcoming CMB experiments.
Comment: Replaced to match the published version, only minor change
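In its simplest special case, maximum likelihood map-making reduces to inverse-variance weighted binning. The sketch below assumes uncorrelated noise in the time ordered data and a pointing that assigns each sample to exactly one pixel, so the pixel-pixel noise matrix comes out diagonal. The abstract concerns the general correlated case, which this sketch does not cover, and the function and argument names are hypothetical.

```python
def bin_map(tod, pointing, noise_var, n_pix):
    """Maximum likelihood map for uncorrelated time-stream noise.

    tod:       time ordered samples
    pointing:  pixel index observed by each sample
    noise_var: noise variance of each sample
    Returns (sky, pixel_var); unobserved pixels are None.
    """
    weight = [0.0] * n_pix
    weighted_sum = [0.0] * n_pix
    for d, p, v in zip(tod, pointing, noise_var):
        weight[p] += 1.0 / v        # accumulates A^T N^-1 A (diagonal here)
        weighted_sum[p] += d / v    # accumulates A^T N^-1 d
    sky = [s / w if w > 0 else None for s, w in zip(weighted_sum, weight)]
    pixel_var = [1.0 / w if w > 0 else None for w in weight]
    return sky, pixel_var

sky, pixel_var = bin_map(tod=[1.0, 3.0, 5.0], pointing=[0, 0, 1],
                         noise_var=[1.0, 1.0, 2.0], n_pix=3)
```

With correlated noise the same two accumulations become a full linear system, which is where the noise-estimation and solver variants discussed in the paper come in.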
Bivariate modified Hotelling’s T2 charts using bootstrap data
Conventional Hotelling’s T2 charts are evidently inefficient when the data are disorganized and contain outliers, so this study proposes a novel, robust alternative to them. With the robust scale estimator, this approach uses the Hodges-Lehmann vector and the corresponding covariance matrix in place of the arithmetic mean vector and the covariance matrix, respectively. The performance of the proposed chart was examined using simulated bivariate bootstrap datasets under two conditions, namely independent variables and dependent variables. The robustness of the modified chart was then assessed by computing the probability of outlier detection and the probability of false alarms. In all cases tested, the proposed charts outperformed the conventional ones.
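The Hodges-Lehmann location estimate used here in place of the mean can be sketched as the median of pairwise averages, applied to each coordinate. This assumes the convention that pairs each observation with itself as well as with the others (some definitions use distinct pairs only), and the function names are hypothetical. The abstract does not specify the robust covariance estimate, so that part is left out.

```python
import statistics

def hodges_lehmann(values):
    """Median of the Walsh averages (x_i + x_j) / 2 over all i <= j."""
    n = len(values)
    walsh = [(values[i] + values[j]) / 2
             for i in range(n) for j in range(i, n)]
    return statistics.median(walsh)

def hodges_lehmann_vector(points):
    """Apply the estimator to each coordinate of a list of points."""
    return [hodges_lehmann(list(column)) for column in zip(*points)]

# A single gross outlier moves the mean far more than this estimate.
sample = [1, 2, 3, 100]
robust_center = hodges_lehmann(sample)
plain_mean = statistics.mean(sample)
```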