Complexity of Grammar Induction for Quantum Types
Most categorical models of meaning use a functor from the syntactic category
to the semantic category. When semantic information is available, the problem
of grammar induction can therefore be defined as finding preimages of the
semantic types under this forgetful functor, lifting the information flow from
the semantic level to a valid reduction at the syntactic level. We study the
complexity of grammar induction, and show that for a variety of type systems,
including pivotal and compact closed categories, the grammar induction problem
is NP-complete. Our approach could be extended to linguistic type systems such
as autonomous or bi-closed categories.
Comment: In Proceedings QPL 2014, arXiv:1412.810
Bayesian Structural Inference for Hidden Processes
We introduce a Bayesian approach to discovering patterns in structurally
complex processes. The proposed method of Bayesian Structural Inference (BSI)
relies on a set of candidate unifilar HMM (uHMM) topologies for inference of
process structure from a data series. We employ a recently developed exact
enumeration of topological epsilon-machines. (A sequel then removes the
topological restriction.) This subset of the uHMM topologies has the added
benefit that inferred models are guaranteed to be epsilon-machines,
irrespective of estimated transition probabilities. Properties of
epsilon-machines and uHMMs allow for the derivation of analytic expressions for
estimating transition probabilities, inferring start states, and comparing the
posterior probability of candidate model topologies, despite the process's internal
structure being only indirectly present in data. We demonstrate BSI's
effectiveness in estimating a process's randomness, as reflected by the Shannon
entropy rate, and its structure, as quantified by the statistical complexity.
We also compare using the posterior distribution over candidate models and the
single, maximum a posteriori model for point estimation and show that the
former more accurately reflects uncertainty in estimated values. We apply BSI
to in-class examples of finite- and infinite-order Markov processes, as well as to
an out-of-class, infinite-state hidden process.
Comment: 20 pages, 11 figures, 1 table; supplementary materials, 15 pages, 11
figures, 6 tables; http://csc.ucdavis.edu/~cmg/compmech/pubs/bsihp.ht
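The two quantities named in this abstract, the Shannon entropy rate and the statistical complexity, can be illustrated for a known unifilar HMM. The sketch below is not the BSI method itself: it assumes the topology and transition probabilities are already given, and it uses the two-state Golden Mean process as a hypothetical example. The dictionary layout and the function names are illustrative choices, not taken from the paper.

```python
import math

# Hypothetical example: the Golden Mean process as a two-state unifilar HMM.
# Layout (an assumption): transitions[state][symbol] = (probability, next_state)
GOLDEN_MEAN = {
    "A": {"1": (0.5, "A"), "0": (0.5, "B")},
    "B": {"1": (1.0, "A")},
}


def stationary_distribution(transitions, iterations=1000):
    """Approximate the stationary state distribution by power iteration."""
    states = list(transitions)
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(iterations):
        nxt = {s: 0.0 for s in states}
        for s, edges in transitions.items():
            for prob, target in edges.values():
                nxt[target] += pi[s] * prob
        pi = nxt
    return pi


def entropy_rate(transitions, pi):
    """Entropy rate in bits per symbol; this formula relies on unifilarity."""
    return -sum(
        pi[s] * prob * math.log2(prob)
        for s, edges in transitions.items()
        for prob, _ in edges.values()
        if prob > 0
    )


def statistical_complexity(pi):
    """Shannon entropy (in bits) of the stationary state distribution."""
    return -sum(p * math.log2(p) for p in pi.values() if p > 0)


pi = stationary_distribution(GOLDEN_MEAN)
print(entropy_rate(GOLDEN_MEAN, pi), statistical_complexity(pi))
```

Because each state and emitted symbol determine the next state uniquely, the entropy rate reduces to a state-weighted average of the per-state symbol entropies, which is what makes closed-form estimates of this kind possible for uHMMs.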
Extracting information from S-curves of language change
It is well accepted that the adoption of innovations is described by S-curves
(slow start, accelerating period, and slow end). In this paper, we analyze how
much information on the dynamics of innovation spreading can be obtained from a
quantitative description of S-curves. We focus on the adoption of linguistic
innovations for which detailed databases of written texts from the last 200
years allow for an unprecedented statistical precision. Combining data analysis
with simulations of simple models (e.g., the Bass dynamics on complex networks)
we identify signatures of endogenous and exogenous factors in the S-curves of
adoption. We propose a measure to quantify the strength of these factors and
three different methods to estimate it from S-curves. We obtain cases in which
the exogenous factors are dominant (in the adoption of German orthographic
reforms and of one irregular verb) and cases in which endogenous factors are
dominant (in the adoption of conventions for romanization of Russian names and
in the regularization of most studied verbs). These results show that the shape
of the S-curve is not universal and contains information on the adoption mechanism.
(Published in J. R. Soc. Interface, vol. 11, no. 101 (2014) 1044; DOI:
http://dx.doi.org/10.1098/rsif.2014.1044)
Comment: 9 pages, 5 figures, Supplementary Material is available at
http://dx.doi.org/10.6084/m9.figshare.122178
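To see how exogenous and endogenous factors can leave different signatures in an adoption curve, consider the well-mixed (mean-field) Bass model, d(rho)/dt = (a + b*rho)(1 - rho). This is only a sketch: the abstract refers to Bass dynamics on complex networks, whereas the code below ignores network structure. The parameter names `a` (exogenous strength) and `b` (endogenous strength) and the function name are assumptions made for illustration, and the paper's proposed measure is not reproduced here.

```python
import math


def bass_fraction(t, a, b):
    """Adopter fraction rho(t) for d(rho)/dt = (a + b*rho)*(1 - rho), rho(0) = 0.

    a: exogenous (external influence) rate, must be > 0 (assumed name)
    b: endogenous (imitation) rate, must be >= 0 (assumed name)
    """
    decay = math.exp(-(a + b) * t)
    return (1.0 - decay) / (1.0 + (b / a) * decay)


# Two illustrative parameter choices (hypothetical values, not fitted to data).
times = [0.5 * k for k in range(0, 41)]
exogenous_curve = [bass_fraction(t, a=0.5, b=0.0) for t in times]
endogenous_curve = [bass_fraction(t, a=0.001, b=1.0) for t in times]

for t, ex, en in zip(times[::8], exogenous_curve[::8], endogenous_curve[::8]):
    print(f"t={t:5.1f}  exogenous={ex:.3f}  endogenous={en:.3f}")
```

With `b = 0` the solution reduces to `1 - exp(-a*t)`, which rises fastest at the start and has no slow initial phase, while a small `a` with a large `b` gives a nearly symmetric, logistic-like curve. This difference in shape is the kind of information the abstract argues can be read off an S-curve.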