Rejoinder
Rejoinder of "Statistical Inference: The Big Picture" by R. E. Kass
[arXiv:1106.2895] Comment: Published at http://dx.doi.org/10.1214/11-STS337REJ
in Statistical Science (http://www.imstat.org/sts/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
Implicit Acquisition of User Models in Cooperative Advisory Systems
User modelling systems to date have relied heavily on user models that were hand-crafted for use in a particular situation. Recently, attention has focused on the feasibility of general user models: models that can be transferred from one situation to another with little or no modification. Such a general user model could be implemented as a modular component easily integrated into diverse systems. This paper addresses one class of general user models, those general with respect to the underlying domain of the application. In particular, a domain-independent user modelling module for cooperative advisory systems is discussed.
A major problem in building user models is the difficulty of acquiring information about the user. Traditional approaches have relied heavily on information that is pre-encoded by the system designer. For a user model to be domain independent, knowledge acquisition must instead be done implicitly, i.e., knowledge about the user must be acquired during the user's interaction with the system.
The research proposed in this paper focuses on domain-independent implicit user model acquisition techniques for cooperative advisory systems. These techniques have been formalized as a set of model acquisition rules that will serve as the basis for the implementation of the model acquisition portion of a general user modelling module. The acquisition rules were developed by studying a large number of conversations between advice-seekers and an expert, and they are capable of supporting most of the modelling requirements of the expert in these conversations. Future work includes implementing these acquisition rules in a general user modelling module to test their effectiveness and domain independence.
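The paper's rule set is not reproduced in this abstract; purely as a hypothetical illustration of the trigger/update shape that implicit acquisition rules can take, here is a minimal Python sketch (every rule name, predicate, and domain term below is invented, not taken from the paper):

    # Hypothetical sketch: each rule watches the dialogue and updates the
    # user model without asking the user directly. All names are invented.
    from dataclasses import dataclass, field

    @dataclass
    class UserModel:
        beliefs: set = field(default_factory=set)
        goals: set = field(default_factory=set)

    def rule_mentions_concept(utterance, model):
        # If the user correctly uses a domain term, infer familiarity with it.
        for term in ("mutual fund", "interest rate"):
            if term in utterance.lower():
                model.beliefs.add(("knows", term))

    def rule_asks_how(utterance, model):
        # A "how do I ..." question reveals a goal, but not a plan to achieve it.
        text = utterance.lower()
        if text.startswith("how do i "):
            model.goals.add(text[len("how do i "):].rstrip("?"))

    ACQUISITION_RULES = [rule_mentions_concept, rule_asks_how]

    model = UserModel()
    for utterance in ["How do I open an IRA?", "My mutual fund lost value."]:
        for rule in ACQUISITION_RULES:
            rule(utterance, model)
    print(model)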
Information In The Non-Stationary Case
Information estimates such as the "direct method" of Strong et al. (1998)
sidestep the difficult problem of estimating the joint distribution of response
and stimulus by instead estimating the difference between the marginal and
conditional entropies of the response. While this is an effective estimation
strategy, it tempts the practitioner to ignore the role of the stimulus and the
meaning of mutual information. We show here that, as the number of trials
increases indefinitely, the direct (or "plug-in") estimate of marginal
entropy converges (with probability 1) to the entropy of the time-averaged
conditional distribution of the response, and the direct estimate of the
conditional entropy converges to the time-averaged entropy of the conditional
distribution of the response. Under joint stationarity and ergodicity of the
response and stimulus, the difference of these quantities converges to the
mutual information. When the stimulus is deterministic or non-stationary, the
direct estimate of information no longer estimates mutual information (which is
itself no longer meaningful), but it remains a measure of the variability of
the response distribution across time.
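As a concrete illustration of the estimator discussed above, the following minimal Python sketch computes the plug-in ("direct") estimate: the marginal entropy from responses pooled over trials and time bins, the conditional entropy as the time average of the across-trial entropies, and the information as their difference. Using single-bin spike indicators as the response alphabet, rather than the multi-bin words of Strong et al., is a simplifying assumption of the sketch.

    import numpy as np

    def plugin_entropy(symbols):
        # Empirical ("plug-in") entropy, in bits, of a 1-D array of discrete symbols.
        _, counts = np.unique(symbols, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    def direct_information(words):
        # words: (n_trials, n_bins) array of discrete response symbols.
        # Marginal entropy of the time-averaged response distribution:
        h_marginal = plugin_entropy(words.ravel())
        # Time-averaged entropy of the conditional (across-trial) distributions:
        h_conditional = np.mean([plugin_entropy(words[:, t])
                                 for t in range(words.shape[1])])
        return h_marginal - h_conditional

    # Simulated example: repeated trials of an inhomogeneous Bernoulli response.
    rng = np.random.default_rng(0)
    rate = 0.2 + 0.15 * np.sin(np.linspace(0, 2 * np.pi, 200))
    words = (rng.random((1000, 200)) < rate).astype(int)
    print(direct_information(words))  # bits per bin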
An Implementation of Bayesian Adaptive Regression Splines (BARS) in C with S and R Wrappers
BARS (DiMatteo, Genovese, and Kass 2001) uses the powerful reversible-jump MCMC engine to perform spline-based generalized nonparametric regression. It has been shown to work well in terms of having small mean-squared error in many examples (smaller than known competitors), as well as producing visually appealing fits that are smooth (filtering out high-frequency noise) while adapting to sudden changes (retaining high-frequency signal). However, BARS is computationally intensive. The original implementation in S was too slow to be practical in certain situations, and was found to handle some data sets incorrectly. We have implemented BARS in C for the normal and Poisson cases, the latter being important in neurophysiological and other point-process applications. The C implementation includes all needed subroutines for fitting Poisson regression, manipulating B-splines (using code created by Bates and Venables), and finding starting values for Poisson regression (using code for density estimation created by Kooperberg). The code utilizes only freely-available external libraries (LAPACK and BLAS) and is otherwise self-contained. We have also provided wrappers so that BARS can be used easily within S or R.
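The wrapper API itself is not shown in this abstract, so the following is background only: a minimal Python/SciPy sketch of fixed-knot cubic B-spline regression for the normal case, the building block that BARS generalizes by treating the number and placement of knots as unknown and sampling them with reversible-jump MCMC. The knot locations and simulated data are arbitrary assumptions of the sketch.

    import numpy as np
    from scipy.interpolate import BSpline

    def bspline_design(x, interior_knots, k=3):
        # Clamped knot vector: boundary knots repeated k+1 times.
        t = np.r_[np.repeat(x.min(), k + 1), interior_knots,
                  np.repeat(x.max(), k + 1)]
        return BSpline.design_matrix(x, t, k).toarray()

    rng = np.random.default_rng(1)
    x = np.sort(rng.uniform(0.0, 1.0, 300))
    y = np.sin(8 * x) + 0.1 * rng.standard_normal(300)
    B = bspline_design(x, np.linspace(0.1, 0.9, 8))  # fixed knots; BARS samples these
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)     # normal-case least-squares fit
    fit = B @ coef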
Assessment of synchrony in multiple neural spike trains using loglinear point process models
Neural spike trains, which are sequences of very brief jumps in voltage
across the cell membrane, were one of the motivating applications for the
development of point process methodology. Early work required the assumption of
stationarity, but contemporary experiments often use time-varying stimuli and
produce time-varying neural responses. More recently, many statistical methods
have been developed for nonstationary neural point process data. There has also
been much interest in identifying synchrony, meaning events across two or more
neurons that are nearly simultaneous at the time scale of the recordings. A
natural statistical approach is to discretize time, using short time bins, and
to introduce loglinear models for dependency among neurons, but previous use of
loglinear modeling technology has assumed stationarity. We introduce a succinct
yet powerful class of time-varying loglinear models by (a) allowing
individual-neuron effects (main effects) to involve time-varying intensities;
(b) allowing the individual-neuron effects to involve autocovariation effects
(history effects) due to past spiking; (c) assuming excess synchrony effects
(interaction effects) do not depend on history; and (d) assuming all effects
vary smoothly across time. Comment: Published at
http://dx.doi.org/10.1214/10-AOAS429 in the Annals of Applied Statistics
(http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics
(http://www.imstat.org).
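To make the model class concrete, here is a minimal Python sketch of a time-varying loglinear model for one pair of binned spike trains, fit via the standard multinomial-Poisson trick: the four joint-outcome counts in each time bin follow a Poisson loglinear model with smooth time-varying main effects (a polynomial basis, an assumption of the sketch) and a constant interaction term, in the spirit of assumptions (a), (c), and (d) above. History effects are omitted for brevity.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n_trials, n_bins = 400, 100
    t = np.linspace(0.0, 1.0, n_bins)

    # Simulate two neurons with smoothly varying rates plus injected synchrony.
    p1 = 0.10 + 0.05 * np.sin(2 * np.pi * t)
    p2 = 0.12 + 0.04 * np.cos(2 * np.pi * t)
    x1 = rng.random((n_trials, n_bins)) < p1
    x2 = rng.random((n_trials, n_bins)) < p2
    common = rng.random((n_trials, n_bins)) < 0.02  # excess synchronous spikes
    x1, x2 = x1 | common, x2 | common

    # Multinomial-Poisson trick: one Poisson cell count per (bin, joint outcome).
    basis = np.vander(t, 4, increasing=True)        # smooth polynomial basis in time
    rows, counts = [], []
    for ti in range(n_bins):
        for a in (0, 1):
            for b in (0, 1):
                counts.append(np.sum((x1[:, ti] == a) & (x2[:, ti] == b)))
                Bt = basis[ti]
                # baseline(t) + a*main1(t) + b*main2(t) + constant a*b interaction
                rows.append(np.concatenate([Bt, a * Bt, b * Bt, [a * b]]))
    fit = sm.GLM(np.array(counts), np.array(rows),
                 family=sm.families.Poisson()).fit()
    print("log excess-synchrony effect:", fit.params[-1])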
Statistical Inference: The Big Picture
Statistics has moved beyond the frequentist-Bayesian controversies of the
past. Where does this leave our ability to interpret results? I suggest that a
philosophy compatible with statistical practice, labeled here statistical
pragmatism, serves as a foundation for inference. Statistical pragmatism is
inclusive and emphasizes the assumptions that connect statistical models with
observed data. I argue that introductory courses often mischaracterize the
process of statistical inference and I propose an alternative "big picture"
depiction. Comment: Published at http://dx.doi.org/10.1214/10-STS337 in
Statistical Science (http://www.imstat.org/sts/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
The Need for User Models in Generating Expert System Explanations
An explanation facility is an important component of an expert system, but current systems for the most part have neglected the importance of tailoring a system's explanations to the user. This paper explores the role of user modeling in generating expert system explanations, arguing that individualized user models are essential to produce good explanations when system users vary in their knowledge of the domain, or in their goals, plans, and preferences. To make this argument, explanation, and good explanation in particular, is characterized, leading to a presentation of how knowledge about the user affects the various aspects of a good explanation. Individualized user models are not only important; they are also practical to obtain. A method for acquiring a model of the user's beliefs implicitly, by eavesdropping on the interaction between user and system, is presented, along with examples of how this information can be used to tailor an explanation.
Approximate Methods for State-Space Models
State-space models provide an important body of techniques for analyzing
time-series, but their use requires estimating unobserved states. The optimal
estimate of the state is its conditional expectation given the observation
histories, and computing this expectation is hard when there are
nonlinearities. Existing filtering methods, including sequential Monte Carlo,
tend to be either inaccurate or slow. In this paper, we study a nonlinear
filter for nonlinear/non-Gaussian state-space models, which uses Laplace's
method, an asymptotic series expansion, to approximate the state's conditional
mean and variance, together with a Gaussian conditional distribution. This {\em
Laplace-Gaussian filter} (LGF) gives fast, recursive, deterministic state
estimates, with an error which is set by the stochastic characteristics of the
model and is, we show, stable over time. We illustrate the estimation ability
of the LGF by applying it to the problem of neural decoding and compare it to
sequential Monte Carlo both in simulations and with real data. We find that the
LGF can deliver superior results in a small fraction of the computing time.
Comment: 31 pages, 4 figures. Different pagination from the journal version due
to incompatible style files, but same content; the supplemental file for the
journal appears here as appendices B–E.
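To illustrate the recursion, here is a minimal Python sketch of a first-order Laplace-Gaussian filter for a scalar AR(1) state observed through Poisson counts, a common neural-decoding setup; the model form, parameter values, and simulated data are assumptions of the sketch, not the paper's. Each update replaces the exact posterior with a Gaussian centered at the mode, found by Newton's method, with variance given by the negative inverse Hessian.

    import numpy as np

    def laplace_gaussian_filter(y, a, q, c, d, m0, v0, n_newton=10):
        # State model:       x_t = a*x_{t-1} + e_t,  e_t ~ N(0, q)
        # Observation model: y_t | x_t ~ Poisson(exp(c*x_t + d))
        means, variances = [], []
        m, v = m0, v0
        for yt in y:
            mp, vp = a * m, a * a * v + q          # predict step
            x = mp                                 # Newton search for the mode
            for _ in range(n_newton):
                lam = np.exp(c * x + d)
                grad = yt * c - c * lam - (x - mp) / vp
                hess = -c * c * lam - 1.0 / vp
                x -= grad / hess
            m, v = x, -1.0 / hess                  # Gaussian (Laplace) update
            means.append(m)
            variances.append(v)
        return np.array(means), np.array(variances)

    # Simulate a latent trajectory and decode it from the counts.
    rng = np.random.default_rng(0)
    a, q, c, d = 0.98, 0.05, 1.0, 1.0
    x = np.zeros(200)
    for i in range(1, 200):
        x[i] = a * x[i - 1] + rng.normal(0, np.sqrt(q))
    y = rng.poisson(np.exp(c * x + d))
    m, v = laplace_gaussian_filter(y, a, q, c, d, 0.0, 1.0)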
False discovery rate regression: an application to neural synchrony detection in primary visual cortex
Many approaches for multiple testing begin with the assumption that all tests
in a given study should be combined into a global false-discovery-rate
analysis. But this may be inappropriate for many of today's large-scale
screening problems, where auxiliary information about each test is often
available, and where a combined analysis can lead to poorly calibrated error
rates within different subsets of the experiment. To address this issue, we
introduce an approach called false-discovery-rate regression that directly uses
this auxiliary information to inform the outcome of each test. The method can
be motivated by a two-groups model in which covariates are allowed to influence
the local false discovery rate, or equivalently, the posterior probability that
a given observation is a signal. This poses many subtle issues at the interface
between inference and computation, and we investigate several variations of the
overall approach. Simulation evidence suggests that: (1) when covariate effects
are present, FDR regression improves power for a fixed false-discovery rate;
and (2) when covariate effects are absent, the method is robust, in the sense
that it does not lead to inflated error rates. We apply the method to neural
recordings from primary visual cortex. The goal is to detect pairs of neurons
that exhibit fine-time-scale interactions, in the sense that they fire together
more often than expected by chance. Our method detects roughly 50% more
synchronous pairs than a standard FDR-controlling analysis does. The companion R
package FDRreg implements all the methods described in the paper.
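As a rough illustration of the two-groups model with covariate-dependent mixing weights described above (not a reimplementation of FDRreg), here is a minimal Python EM sketch. It assumes a standard normal null and, purely for simplicity, a known N(3, 1) alternative; the actual method also estimates the alternative density.

    import numpy as np
    from scipy.stats import norm

    def sigmoid(u):
        return 1.0 / (1.0 + np.exp(-u))

    def fdr_regression_em(z, X, n_em=50, n_newton=5):
        # Two-groups model: z_i ~ w(x_i)*f1(z) + (1 - w(x_i))*f0(z),
        # with w(x) = sigmoid(X @ beta). f1 is fixed here as an assumption.
        f0 = norm.pdf(z, 0.0, 1.0)
        f1 = norm.pdf(z, 3.0, 1.0)
        beta = np.zeros(X.shape[1])
        for _ in range(n_em):
            w = sigmoid(X @ beta)
            gamma = w * f1 / (w * f1 + (1.0 - w) * f0)  # E-step: P(signal | z, x)
            for _ in range(n_newton):                   # M-step: logistic fit to gamma
                p = sigmoid(X @ beta)
                grad = X.T @ (gamma - p)
                hess = -(X * (p * (1 - p))[:, None]).T @ X \
                       - 1e-8 * np.eye(X.shape[1])
                beta -= np.linalg.solve(hess, grad)
        return 1.0 - gamma, beta                        # local fdr, covariate effects

    # Simulate: signals become more common as the covariate u grows.
    rng = np.random.default_rng(0)
    n = 5000
    u = rng.uniform(-2, 2, n)
    is_signal = rng.random(n) < sigmoid(-1.0 + 1.5 * u)
    z = np.where(is_signal, rng.normal(3.0, 1.0, n), rng.normal(0.0, 1.0, n))
    X = np.column_stack([np.ones(n), u])
    local_fdr, beta = fdr_regression_em(z, X)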