Inference, Learning, and Population Size: Projectivity for SRL Models
A subtle difference between propositional and relational data is that in many
relational models, marginal probabilities depend on the population or domain
size. This paper connects the dependence on population size to the classic
notion of projectivity from statistical theory: Projectivity implies that
relational predictions are robust with respect to changes in domain size. We
discuss projectivity for a number of common SRL systems, and identify syntactic
fragments that are guaranteed to yield projective models. The syntactic
conditions are restrictive, which suggests that projectivity is difficult to
achieve in SRL and that care must be taken when working with different domain
sizes.
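To make the domain-size dependence concrete, here is a minimal sketch (our own toy example, not a model from the paper) of a Markov-logic-style relational model in which coupling between ground atoms makes the marginal of a single atom shift as the domain grows; the weight w and the "chain" feature R(x,y) ∧ R(y,z) are illustrative choices.

```python
# Toy relational model over a binary relation R on a domain of size n:
# P(world) is proportional to exp(w * #{(x,y,z) : R(x,y) and R(y,z)}).
# Because atoms are coupled through the chain feature, the marginal of
# R(0,1) depends on n, i.e. the model is not projective.
import itertools
import math

def marginal_R01(n, w=1.0):
    """Brute-force marginal P(R(0,1) = 1) over all 2^(n*n) worlds."""
    atoms = [(x, y) for x in range(n) for y in range(n)]
    total = 0.0
    hit = 0.0
    for world in itertools.product([0, 1], repeat=len(atoms)):
        R = dict(zip(atoms, world))
        # feature count: triples (x, y, z) with R(x,y) and R(y,z) both true
        k = sum(R[(x, y)] * R[(y, z)]
                for x in range(n) for y in range(n) for z in range(n))
        weight = math.exp(w * k)
        total += weight
        if R[(0, 1)]:
            hit += weight

    return hit / total

m2 = marginal_R01(2)
m3 = marginal_R01(3)
# The marginal of the single ground atom R(0,1) changes with domain size.
print(m2, m3)
```

Brute-force enumeration is only feasible for tiny domains, but it suffices to show that predictions about one atom are not robust to adding individuals, which is exactly the failure of projectivity the abstract describes.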
Local Exchangeability
Exchangeability---in which the distribution of an infinite sequence is
invariant to reorderings of its elements---implies the existence of a simple
conditional independence structure that may be leveraged in the design of
probabilistic models, efficient inference algorithms, and randomization-based
testing procedures. In practice, however, this assumption is too strong an
idealization; the distribution typically fails to be exactly invariant to
permutations, and de Finetti's representation theorem does not apply. Thus
there is a need for a distributional assumption that is both weak enough to
hold in practice and strong enough to guarantee a useful underlying
representation. We introduce a relaxed notion of local exchangeability, under
which swapping data associated with nearby covariates causes a bounded change
in the distribution.
We prove that locally exchangeable processes correspond to independent
observations from an underlying measure-valued stochastic process. We thereby
show that de Finetti's theorem is robust to perturbation and provide further
justification for the Bayesian modelling approach. Using this probabilistic
result, we develop three novel statistical procedures for (1) estimating the
underlying process via local empirical measures, (2) testing via local
randomization, and (3) estimating the canonical premetric of local
exchangeability. These three procedures extend the applicability of previous
exchangeability-based methods without sacrificing rigorous statistical
guarantees. The paper concludes with examples of popular statistical models
that exhibit local exchangeability.
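A minimal sketch of the local-randomization idea (our own illustration, not the paper's procedure): if responses at nearby covariates are approximately exchangeable, a null distribution for a test statistic can be built by randomly swapping responses between adjacent covariate locations. The data, statistic, and swap scheme below are all illustrative assumptions.

```python
# Local randomization test sketch: swap responses whose covariates are
# adjacent (hence close), and compare the observed statistic against the
# distribution of the statistic over many such locally-swapped datasets.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: sorted covariates on [0, 1], responses with a smooth trend.
x = np.sort(rng.uniform(0.0, 1.0, size=100))
y = np.sin(2 * np.pi * x) + 0.3 * rng.normal(size=x.size)

def statistic(y):
    # Test statistic: mean absolute difference between consecutive responses.
    return np.mean(np.abs(np.diff(y)))

def local_swap(y, n_swaps, rng):
    """Randomly swap responses at adjacent (nearby) covariate positions."""
    y = y.copy()
    for _ in range(n_swaps):
        i = rng.integers(0, y.size - 1)
        y[i], y[i + 1] = y[i + 1], y[i]
    return y

obs = statistic(y)
null = np.array([statistic(local_swap(y, 50, rng)) for _ in range(500)])
p_value = (1 + np.sum(null >= obs)) / (1 + null.size)
print(p_value)
```

Because only nearby responses are exchanged, the perturbation to the distribution is bounded in the sense the abstract describes, which is what justifies using the swapped datasets as an approximate null.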
The Multi-round Process Matrix
We develop an extension of the process matrix (PM) framework for correlations
between quantum operations with no causal order that allows multiple rounds of
information exchange for each party compatibly with the assumption of
well-defined causal order of events locally. We characterise the higher-order
process describing such correlations, which we name the multi-round process
matrix (MPM), and formulate a notion of causal nonseparability for it that
extends the one for standard PMs. We show that in the multi-round case there
are novel manifestations of causal nonseparability that are not captured by a
naive application of the standard PM formalism: we exhibit an instance of an
operator that is both a valid PM and a valid MPM, but is causally separable in
the first case and can violate causal inequalities in the second case due to
the possibility of using a side channel.

Comment: 24 pages with 6 figures, various improvements and corrections,
accepted in Quantum.
Max-stable random sup-measures with comonotonic tail dependence
Several objects in the Extremes literature are special instances of
max-stable random sup-measures. This perspective opens connections to the
theory of random sets and the theory of risk measures and makes it possible to
extend corresponding notions and results from the literature with streamlined
proofs. In particular, it clarifies the role of Choquet random sup-measures and
their stochastic dominance property. Key tools are the LePage representation of
a max-stable random sup-measure and the dual representation of its tail
dependence functional. Properties such as complete randomness, continuity,
separability, coupling, continuous choice, invariance and transformations are
also analysed.

Comment: 28 pages, 1 figure.
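For orientation, the LePage (de Haan) series representation mentioned as a key tool can be sketched as follows; this is our own summary of the standard extreme-value form for an α-Fréchet max-stable process, not a formula taken from the paper.

```latex
% LePage series: \Gamma_i are the arrival times of a unit-rate Poisson
% process on (0, \infty), and Y_i are i.i.d. copies of a nonnegative
% spectral process, independent of (\Gamma_i).
X(t) \;=\; \bigvee_{i \ge 1} \Gamma_i^{-1/\alpha}\, Y_i(t), \qquad t \in T.
```

The paper's representation is stated at the level of random sup-measures rather than processes, but the structure, a supremum over Poisson points scaled by transformed arrival times, is the same.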
Intertemporal discrete choice
The discounted logit is widely used to estimate time preferences from field and laboratory experiments. Despite its popularity, it exhibits the "problem of the scale": choice probabilities depend on the scale of the value function. When applied to intertemporal choice, the problem of the scale implies that logit probabilities are sensitive to the temporal distance between the choice and the outcomes. This violates an intuitive stationarity requirement even when future values are discounted geometrically. As a consequence, patterns of choice that follow from the structure of the logit may be mistakenly attributed to non-stationary discounting. We solve this problem by introducing the discounted Luce rule. It retains the flexibility and simplicity of the logit while satisfying stationarity. We characterize the model in two settings: dated outcomes and consumption streams. Relaxations of stationarity yield observable restrictions characterizing hyperbolic and quasi-hyperbolic discounting. Lastly, we discuss an extension of the model to recursive stochastic choice with present bias.
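The problem of the scale can be seen in a few lines of arithmetic. The sketch below is our own numerical illustration (the utilities and discount factor are arbitrary choices): under the discounted logit, the probability of choosing between two outcomes delayed by a common amount t drifts with t, while under a discounted Luce rule the common discount factor cancels in the ratio, so the choice probability is stationary.

```python
# Compare the discounted logit with the discounted Luce rule for two
# options x, y that are both dated at time t and discounted geometrically.
import math

def logit_choice(u_x, u_y, delta, t):
    """P(choose x) under the discounted logit: exponentials of scaled values."""
    vx, vy = delta**t * u_x, delta**t * u_y
    return math.exp(vx) / (math.exp(vx) + math.exp(vy))

def luce_choice(u_x, u_y, delta, t):
    """P(choose x) under the discounted Luce rule (positive utilities)."""
    vx, vy = delta**t * u_x, delta**t * u_y
    return vx / (vx + vy)

u_x, u_y, delta = 2.0, 1.0, 0.9
p_logit_now = logit_choice(u_x, u_y, delta, 0)
p_logit_later = logit_choice(u_x, u_y, delta, 5)
p_luce_now = luce_choice(u_x, u_y, delta, 0)
p_luce_later = luce_choice(u_x, u_y, delta, 5)

# Logit probabilities drift toward 1/2 as the common delay grows, even
# though discounting is geometric; Luce probabilities stay at 2/3.
print(p_logit_now, p_logit_later, p_luce_now, p_luce_later)
```

The drift toward 1/2 under the logit is exactly the non-stationarity the abstract warns could be mistaken for hyperbolic-style discounting.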
Separability as a modeling paradigm in large probabilistic models
Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 185-191).

Many interesting stochastic models can be formulated as finite-state vector Markov processes, with a state characterized by the values of a collection of random variables. In general, such models suffer from the curse of dimensionality: the size of the state space grows exponentially with the number of underlying random variables, thereby precluding conventional modeling and analysis. A potential cure to this curse is to work with models that allow the propagation of partial information, e.g. marginal distributions, expectations, higher moments, or cross-correlations, as derived from the joint distribution of the network state. This thesis develops and rigorously investigates the notion of separability, associated with structure in probabilistic models that permits exact propagation of partial information. We show that when partial information can be propagated exactly, it can be done so linearly. The matrices for propagating such partial information share many valuable spectral relationships with the underlying transition matrix of the Markov chain. Separability can be understood from the perspective of subspace invariance in linear systems, though it relates to invariance in a non-standard way. We analyze the asymptotic generality, as the number of random variables becomes large, of some special cases of separability that permit the propagation of marginal distributions. Within this discussion of separability, we introduce the generalized influence model, which incorporates as special cases two prominent models permitting the propagation of marginal distributions: the influence model and Markov chains on permutations (the symmetric group).

The thesis proposes a potentially tractable solution to learning informative model parameters, and illustrates many advantageous properties of the estimator under the assumption of separability. Lastly, we illustrate separability in the general setting without any notion of time-homogeneity, and discuss potential benefits for inference in special cases.

by William J. Richoux. Ph.D.
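A minimal sketch of one separable special case (our own example, not a construction from the thesis): for two independent Markov chains, the joint transition matrix is a Kronecker product, and each chain's marginal distribution propagates exactly and linearly through its own small transition matrix, with no need to track the exponentially larger joint state.

```python
# Two independent 2-state Markov chains. The joint 4-state chain has
# transition matrix kron(A, B); marginals of chain 1 can be propagated
# either through the joint chain or directly through A, with equal results.
import numpy as np

A = np.array([[0.9, 0.1],
              [0.4, 0.6]])   # chain 1 transition matrix (rows sum to 1)
B = np.array([[0.7, 0.3],
              [0.2, 0.8]])   # chain 2 transition matrix

joint_T = np.kron(A, B)        # transition matrix of the joint 4-state chain

pi_a = np.array([0.5, 0.5])    # initial marginal of chain 1
pi_b = np.array([0.1, 0.9])    # initial marginal of chain 2
joint_pi = np.kron(pi_a, pi_b)  # joint distribution under independence

# Propagate the joint distribution one step, then re-marginalize chain 1 ...
joint_next = joint_pi @ joint_T
marg_a_from_joint = joint_next.reshape(2, 2).sum(axis=1)

# ... and compare with propagating the small marginal directly through A.
marg_a_direct = pi_a @ A
print(marg_a_from_joint, marg_a_direct)  # identical up to float error
```

Independence makes this example trivially separable; the thesis's contribution concerns models, such as the influence model, where marginals still propagate linearly even though the components interact.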
Sparsity-Cognizant Total Least-Squares for Perturbed Compressive Sampling
Solving linear regression problems based on the total least-squares (TLS)
criterion has well-documented merits in various applications, where
perturbations appear both in the data vector as well as in the regression
matrix. However, existing TLS approaches do not account for sparsity possibly
present in the unknown vector of regression coefficients. On the other hand,
sparsity is the key attribute exploited by modern compressive sampling and
variable selection approaches to linear regression, which include noise in the
data, but do not account for perturbations in the regression matrix. The
present paper fills this gap by formulating and solving TLS optimization
problems under sparsity constraints. Near-optimum and reduced-complexity
suboptimum sparse (S-) TLS algorithms are developed to address the perturbed
compressive sampling (and the related dictionary learning) challenge, when
there is a mismatch between the true and adopted bases over which the unknown
vector is sparse. The novel S-TLS schemes also allow for perturbations in the
regression matrix of the least-absolute selection and shrinkage selection
operator (Lasso), and endow TLS approaches with ability to cope with sparse,
under-determined "errors-in-variables" models. Interesting generalizations can
further exploit prior knowledge on the perturbations to obtain novel weighted
and structured S-TLS solvers. Analysis and simulations demonstrate the
practical impact of S-TLS in calibrating the mismatch effects of contemporary
grid-based approaches to cognitive radio sensing, and robust
direction-of-arrival estimation using antenna arrays.

Comment: 30 pages, 10 figures, submitted to IEEE Transactions on Signal
Processing.
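The flavor of the optimization can be sketched with a simple alternating scheme. This is NOT the paper's S-TLS algorithm, only our own hedged illustration of the objective it describes: a least-squares fit with a sparsity penalty on the coefficients and a Frobenius-norm penalty on a perturbation E of the regression matrix, minimized by alternating a closed-form E-update with an ISTA (soft-thresholding) step.

```python
# Alternating sketch for min over (x, E) of
#   0.5*||y - (A + E) x||^2 + 0.5*||E||_F^2 + lam*||x||_1
import numpy as np

rng = np.random.default_rng(1)
n, p = 40, 60
A_true = rng.normal(size=(n, p))
x_true = np.zeros(p)
x_true[[3, 17, 42]] = [1.5, -2.0, 1.0]
A = A_true + 0.05 * rng.normal(size=(n, p))   # perturbed regression matrix
y = A_true @ x_true + 0.05 * rng.normal(size=n)

lam = 0.1

def objective(x, E):
    return (0.5 * np.sum((y - (A + E) @ x) ** 2)
            + 0.5 * np.sum(E ** 2) + lam * np.sum(np.abs(x)))

x = np.zeros(p)
E = np.zeros((n, p))
for _ in range(500):
    # (i) E-update: minimizing 0.5*||r - E x||^2 + 0.5*||E||_F^2 over E
    # has the rank-one closed form E = r x^T / (1 + ||x||^2).
    r = y - A @ x
    E = np.outer(r, x) / (1.0 + x @ x)
    # (ii) one ISTA step on the lasso subproblem with matrix M = A + E.
    M = A + E
    step = 1.0 / np.linalg.norm(M, 2) ** 2    # 1/L, keeps the step monotone
    z = x - step * (M.T @ (M @ x - y))
    x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)

print(objective(x, E))  # well below the objective 0.5*||y||^2 at x = 0
```

Both steps decrease the joint objective, so the scheme is at least a descent method; the paper's near-optimum and reduced-complexity algorithms, and its weighted and structured variants, go well beyond this sketch.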