Rescaling, thinning or complementing? On goodness-of-fit procedures for point process models and Generalized Linear Models
Generalized Linear Models (GLMs) are an increasingly popular framework for
modeling neural spike trains. They have been linked to the theory of stochastic
point processes and researchers have used this relation to assess
goodness-of-fit using methods from point-process theory, e.g. the
time-rescaling theorem. However, high neural firing rates or coarse
discretization lead to a breakdown of the assumptions necessary for this
connection. Here, we show how goodness-of-fit tests from point-process theory
can still be applied to GLMs by constructing equivalent surrogate point
processes out of time-series observations. Furthermore, two additional tests
based on thinning and complementing point processes are introduced. They
augment the instruments available for checking model adequacy of point
processes as well as discretized models.

Comment: 9 pages, to appear in NIPS 2010 (Neural Information Processing Systems); corrected a missing reference.
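As a rough illustration of the point-process machinery these tests build on, the sketch below applies the time-rescaling theorem as a goodness-of-fit check: if the fitted conditional intensity is correct, the rescaled inter-spike intervals are i.i.d. Exp(1), so their exponential CDF transform is Uniform(0,1). The names `spike_times`, `grid` and `lam` are illustrative assumptions, not the paper's surrogate-process construction.

```python
# Minimal sketch of a time-rescaling goodness-of-fit test.
# Assumes a fitted conditional intensity `lam` evaluated on a fine time grid.
import numpy as np
from scipy.stats import kstest

def time_rescaling_ks(spike_times, grid, lam):
    """KS test based on the time-rescaling theorem.

    Under a correct model, tau_k = integral of lam between consecutive
    spikes is i.i.d. Exp(1), so z_k = 1 - exp(-tau_k) is i.i.d. Uniform(0,1).
    """
    # Cumulative integral of the intensity over the grid
    Lam = np.concatenate([[0.0], np.cumsum(lam[:-1] * np.diff(grid))])
    # Rescaled spike times: Lambda evaluated at each spike
    Lam_at_spikes = np.interp(spike_times, grid, Lam)
    taus = np.diff(Lam_at_spikes)   # rescaled inter-spike intervals
    z = 1.0 - np.exp(-taus)         # should be Uniform(0,1) under H0
    return kstest(z, "uniform")

# Toy check with a homogeneous Poisson process (rate 5 Hz over 100 s):
rng = np.random.default_rng(0)
spikes = np.sort(rng.uniform(0, 100, rng.poisson(500)))
grid = np.linspace(0, 100, 10001)
lam = np.full_like(grid, 5.0)
print(time_rescaling_ks(spikes, grid, lam))
```

Since the toy data are generated from the stated intensity, the test should fail to reject here; a misspecified `lam` would drive the KS statistic up.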
Bayesian matching of unlabeled marked point sets using random fields, with an application to molecular alignment
Statistical methodology is proposed for comparing unlabeled marked point
sets, with an application to aligning steroid molecules in chemoinformatics.
Methods from statistical shape analysis are combined with techniques for
predicting random fields in spatial statistics in order to define a suitable
measure of similarity between two marked point sets. Bayesian modeling of the
predicted field overlap between pairs of point sets is proposed, and posterior
inference of the alignment is carried out using Markov chain Monte Carlo
simulation. By representing the fields in reproducing kernel Hilbert spaces,
the degree of overlap can be computed without expensive numerical integration.
Superimposing entire fields rather than the configuration matrices of point
coordinates thereby avoids the problem that there is usually no clear
one-to-one correspondence between the points. In addition, mask parameters are
introduced in the model, so that partial matching of the marked point sets can
be carried out. We also propose an adaptation of the generalized Procrustes
analysis algorithm for the simultaneous alignment of multiple point sets. The
methodology is illustrated with a simulation study and then applied to a data
set of 31 steroid molecules, where the relationship between shape and binding
activity to the corticosteroid binding globulin receptor is explored.

Comment: Published at http://dx.doi.org/10.1214/11-AOAS486 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
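The computational trick the abstract mentions, evaluating field overlap without numerical integration, can be illustrated with a minimal sketch: when each field is a kernel expansion over its points, the reproducing property reduces the inner product to a double sum of kernel evaluations. The Gaussian kernel, the bandwidth `sigma` and the variable names are illustrative assumptions, not the paper's exact model.

```python
# Minimal sketch: closed-form overlap between two marked point sets whose
# fields are kernel expansions in a common reproducing kernel Hilbert space.
import numpy as np

def rkhs_overlap(X, mx, Y, my, sigma=1.0):
    """Inner product <f, g>_H of f(.) = sum_i mx_i k(., X_i) and
    g(.) = sum_j my_j k(., Y_j) with Gaussian kernel k.

    By the reproducing property this equals sum_ij mx_i my_j k(X_i, Y_j),
    so no numerical integration over space is needed.
    """
    # Pairwise squared distances between the two configurations
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2.0 * sigma**2))
    return mx @ K @ my

# Toy example: two 3-point configurations in R^3 with scalar marks
X = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
Y = X + 0.1                 # slightly perturbed copy
mx = my = np.ones(3)
print(rkhs_overlap(X, mx, Y, my, sigma=0.5))
```

Because the overlap depends only on pairwise kernel evaluations, no point-to-point correspondence between the two sets is ever required.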
Mutation supply and the repeatability of selection for antibiotic resistance
Whether evolution can be predicted is a key question in evolutionary biology.
Here we set out to better understand the repeatability of evolution. We
explored experimentally the effect of mutation supply and the strength of
selective pressure on the repeatability of selection from standing genetic
variation. Different sizes of mutant libraries of an antibiotic resistance
gene, TEM-1 β-lactamase in Escherichia coli, were subjected to different
antibiotic concentrations. We determined whether populations went extinct or
survived, and sequenced the TEM gene of the surviving populations. The
distribution of mutations per allele in our mutant libraries, generated by
error-prone PCR, followed a Poisson distribution. Extinction patterns could be
explained by a simple stochastic model that assumed the sampling of beneficial
mutations was key for survival. In most surviving populations, alleles
containing at least one known large-effect beneficial mutation were present.
These genotype data also support a model which only invokes sampling effects to
describe the occurrence of alleles containing large-effect driver mutations.
Hence, evolution is largely predictable given cursory knowledge of mutational
fitness effects, the mutation rate and population size. There were no clear
trends in the repeatability of selected mutants when we considered all
mutations present. However, when only known large-effect mutations were
considered, the outcome of selection is less repeatable for large libraries, in
contrast to expectations. Furthermore, we show experimentally that alleles
carrying multiple mutations selected from large libraries confer higher
resistance levels relative to alleles with only a known large-effect mutation,
suggesting that the scarcity of high-resistance alleles carrying multiple
mutations may contribute to the decrease in repeatability at large library
sizes.

Comment: 31 pages, 9 figures.
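A minimal sketch of the kind of sampling-only survival model the abstract describes, assuming mutation counts per allele are Poisson distributed and that a library survives iff it samples at least one allele carrying a beneficial mutation. The parameter `p_beneficial` (per-mutation chance of being beneficial at a given antibiotic concentration) and all values below are illustrative, not estimates from the study.

```python
# Sampling-only survival model: survival requires drawing at least one
# allele with a beneficial mutation from the mutant library.
import numpy as np

def survival_probability(library_size, lmbda, p_beneficial):
    # Mutations per allele ~ Poisson(lmbda); thinning by p_beneficial gives
    # beneficial mutations per allele ~ Poisson(lmbda * p_beneficial).
    p_allele = 1.0 - np.exp(-lmbda * p_beneficial)
    # P(a library of N independent draws contains >= 1 such allele)
    return 1.0 - (1.0 - p_allele) ** library_size

for N in (10, 100, 1000, 10000):
    print(N, survival_probability(N, lmbda=2.0, p_beneficial=0.001))
```

The model captures the abstract's point that extinction versus survival is driven largely by whether the library happens to sample a beneficial mutation at all, which scales predictably with library size.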
Failure dynamics of the global risk network
Risks threatening modern societies form an intricately interconnected network
that often underlies crisis situations. Yet, little is known about how risk
materializations in distinct domains influence each other. Here we present an
approach in which expert assessments of risk likelihoods and influences
underlie a quantitative model of the global risk network dynamics. The modeled
risks range from environmental to economic and technological and include
difficult-to-quantify risks, such as geopolitical or social ones. Using maximum
likelihood estimation, we find the optimal model parameters and demonstrate
that the model including network effects significantly outperforms the
alternatives, uncovering the full value of the expert-collected data. We analyze the model
dynamics and study its resilience and stability. Our findings include such risk
properties as contagion potential, persistence, roles in cascades of failures
and the identity of risks most detrimental to system stability. The model
provides quantitative means for measuring the adverse effects of risk
interdependence and the materialization of risks in the network.
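A hedged sketch of a discrete-time contagion model in the spirit of this abstract: each risk can materialize spontaneously or under pressure from active neighbors, and active risks recover stochastically. The functional form and the parameters `p_int`, `p_ext` and `p_rec` are assumptions for illustration; the paper's fitted model and expert data are not reproduced here.

```python
# Discrete-time contagion dynamics on a risk network.
import numpy as np

def simulate(adj, p_int, p_ext, p_rec, steps, rng):
    """Simulate binary risk states on a network given by adjacency matrix adj."""
    n = adj.shape[0]
    state = np.zeros(n, dtype=bool)      # all risks initially dormant
    history = []
    for _ in range(steps):
        pressure = adj @ state           # number of active neighbors
        # Activation: spontaneous OR at least one contagion success
        p_act = 1.0 - (1.0 - p_int) * (1.0 - p_ext) ** pressure
        fire = rng.random(n) < np.where(state, 0.0, p_act)
        recover = rng.random(n) < np.where(state, p_rec, 0.0)
        state = (state | fire) & ~recover
        history.append(state.copy())
    return np.array(history)

rng = np.random.default_rng(1)
adj = (rng.random((20, 20)) < 0.15).astype(int)
np.fill_diagonal(adj, 0)
hist = simulate(adj, p_int=0.02, p_ext=0.05, p_rec=0.3, steps=200, rng=rng)
print("mean fraction of active risks:", hist.mean())
```

Setting `p_ext` to zero removes the network effects, which is the comparison the maximum likelihood analysis in the abstract turns on.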
Representation in Econometrics: A Historical Perspective
Measurement forms the substance of econometrics. This chapter outlines the history of econometrics from a measurement perspective: how have measurement errors been dealt with, and how, from a methodological standpoint, did econometrics evolve so as to represent theory more adequately in relation to data? The evolution is organized in terms of four phases: 'theory and measurement', 'measurement and theory', 'measurement with theory' and 'measurement without theory'. The question of how measurement research has helped advance knowledge is discussed in the light of this history.

Keywords: Econometrics, History, Measurement error
Data-Driven Model Reduction for the Bayesian Solution of Inverse Problems
One of the major challenges in the Bayesian solution of inverse problems
governed by partial differential equations (PDEs) is the computational cost of
repeatedly evaluating numerical PDE models, as required by Markov chain Monte
Carlo (MCMC) methods for posterior sampling. This paper proposes a data-driven
projection-based model reduction technique to reduce this computational cost.
The proposed technique has two distinctive features. First, the model reduction
strategy is tailored to inverse problems: the snapshots used to construct the
reduced-order model are computed adaptively from the posterior distribution.
Posterior exploration and model reduction are thus pursued simultaneously.
Second, to avoid repeated evaluations of the full-scale numerical model as in a
standard MCMC method, we couple the full-scale model and the reduced-order
model together in the MCMC algorithm. This maintains accurate inference while
reducing its overall computational cost. In numerical experiments considering
steady-state flow in a porous medium, the data-driven reduced-order model
achieves better accuracy than a reduced-order model constructed using the
classical approach. It also improves posterior sampling efficiency by several
orders of magnitude compared to a standard MCMC method.
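Two generic ingredients behind this approach can be sketched under simplifying assumptions: a proper orthogonal decomposition (POD) basis built from full-model snapshots, and a delayed-acceptance Metropolis step in which the cheap reduced-order model screens proposals before the full model is evaluated. `full_log_post` and `rom_log_post` are placeholders for log-posteriors under the full PDE model and the projected reduced model, not the paper's implementation.

```python
# Sketch of (1) POD basis construction and (2) a two-stage MCMC step that
# couples a full model with a reduced-order model (ROM).
import numpy as np

def pod_basis(snapshots, tol=1e-6):
    """POD basis from a matrix whose columns are full-model snapshots."""
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(energy, 1.0 - tol)) + 1
    return U[:, :r]          # reduced basis; the ROM solves in its span

def two_stage_mcmc(theta0, full_log_post, rom_log_post, prop_std, n_iter, rng):
    """Delayed-acceptance Metropolis: the ROM screens proposals so the
    expensive full model is evaluated only for proposals the ROM accepts."""
    theta, lp_full, lp_rom = theta0, full_log_post(theta0), rom_log_post(theta0)
    chain = []
    for _ in range(n_iter):
        prop = theta + prop_std * rng.standard_normal(theta.shape)
        lp_rom_p = rom_log_post(prop)
        # Stage 1: accept/reject using the reduced-order model only
        if np.log(rng.random()) < lp_rom_p - lp_rom:
            lp_full_p = full_log_post(prop)
            # Stage 2: correct with the full model (preserves the exact posterior)
            if np.log(rng.random()) < (lp_full_p - lp_full) - (lp_rom_p - lp_rom):
                theta, lp_full, lp_rom = prop, lp_full_p, lp_rom_p
        chain.append(theta.copy())
    return np.array(chain)

# Toy check: ROM as a slightly biased Gaussian approximation of the target
rng = np.random.default_rng(2)
full = lambda th: -0.5 * np.sum(th**2)
rom = lambda th: -0.5 * np.sum((th - 0.1)**2)
chain = two_stage_mcmc(np.zeros(2), full, rom, 0.5, 2000, rng)
print("posterior mean estimate:", chain.mean(axis=0))
```

The stage-2 correction keeps the chain exact with respect to the full posterior, so the ROM only affects efficiency, not the answer; the paper's adaptive snapshot selection would refresh the basis as the chain explores the posterior.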