Bibliographic Analysis on Research Publications using Authors, Categorical Labels and the Citation Network
Bibliographic analysis considers the author's research areas, the citation
network and the paper content among other things. In this paper, we combine
these three in a topic model that produces a bibliographic model of authors,
topics and documents, using a nonparametric extension of a combination of the
Poisson mixed-topic link model and the author-topic model. This gives rise to
the Citation Network Topic Model (CNTM). We propose a novel and efficient
inference algorithm for the CNTM to explore subsets of research publications
from CiteSeerX. The publication datasets are organised into three corpora,
totalling about 168k publications with about 62k authors. The queried
datasets are made available online. On three publicly available corpora in
addition to the queried datasets, our proposed model demonstrates improved
performance in both model fitting and document clustering, compared to several
baselines. Moreover, our model allows extraction of additional useful knowledge
from the corpora, such as the visualisation of the author-topics network.
Additionally, we propose a simple method to incorporate supervision into topic
modelling to achieve further improvement on the clustering task.
Comment: Preprint for the journal Machine Learning
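The CNTM itself is not reproduced here; as a rough illustration of the Poisson-link idea that the abstract combines with an author-topic model, the following sketch (with hypothetical topic proportions `theta` and rate scale `scale`, not values from the paper) draws citation counts whose expected rate grows with the topic similarity of two documents.

```python
import numpy as np

rng = np.random.default_rng(0)

n_docs, n_topics = 5, 3
# Hypothetical per-document topic proportions (rows sum to 1),
# standing in for output of an author-topic style model.
theta = rng.dirichlet(np.ones(n_topics), size=n_docs)

# Poisson link: the citation count from doc i to doc j is Poisson
# with a rate driven by topic similarity -- a simplified stand-in
# for the Poisson mixed-topic link component of the CNTM.
scale = 2.0  # hypothetical overall citation-rate scale
rates = scale * theta @ theta.T
np.fill_diagonal(rates, 0.0)  # no self-citations

citations = rng.poisson(rates)
print(citations.shape)  # (5, 5)
```

Documents that concentrate mass on the same topics get larger off-diagonal rates, which is the mechanism that lets the citation network inform topic inference.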
Lidar waveform based analysis of depth images constructed using sparse single-photon data
This paper presents a new Bayesian model and algorithm used for depth and
intensity profiling using full waveforms from the time-correlated single photon
counting (TCSPC) measurement in the limit of very low photon counts. The model
proposed represents each Lidar waveform as a combination of a known impulse
response, weighted by the target intensity, and an unknown constant background,
corrupted by Poisson noise. Prior knowledge about the problem is embedded in a
hierarchical model that describes the dependence structure between the model
parameters and their constraints. In particular, a gamma Markov random field
(MRF) is used to model the joint distribution of the target intensity, and a
second MRF is used to model the distribution of the target depth, which are
both expected to exhibit significant spatial correlations. An adaptive Markov
chain Monte Carlo algorithm is then proposed to compute the Bayesian estimates
of interest and perform Bayesian inference. This algorithm is equipped with a
stochastic optimization adaptation mechanism that automatically adjusts the
parameters of the MRFs by maximum marginal likelihood estimation. Finally, the
benefits of the proposed methodology are demonstrated through a series of
experiments using real data.
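The observation model described above can be sketched for a single pixel: each time bin's photon count is Poisson-distributed around a known impulse response (here a hypothetical Gaussian pulse, not the paper's measured instrument response) shifted to the target depth, scaled by intensity, plus a constant background. The sketch below recovers depth by a crude grid scan of the Poisson log-likelihood; the paper's adaptive MCMC with MRF priors is far more sophisticated.

```python
import numpy as np

def waveform_rate(t, intensity, depth, background, sigma=2.0):
    """Expected counts per bin: a known (here Gaussian, hypothetical)
    impulse response shifted to the target depth, scaled by the target
    intensity, plus a constant background."""
    impulse = np.exp(-0.5 * ((t - depth) / sigma) ** 2)
    return intensity * impulse + background

def poisson_loglik(counts, rate):
    """Poisson log-likelihood, up to the count-dependent constant."""
    return np.sum(counts * np.log(rate) - rate)

t = np.arange(100)
rng = np.random.default_rng(1)
true_rate = waveform_rate(t, intensity=5.0, depth=40.0, background=0.1)
counts = rng.poisson(true_rate)  # sparse single-photon counts

# The likelihood peaks near the true depth even with few photons.
depths = np.arange(100)
ll = [poisson_loglik(counts, waveform_rate(t, 5.0, d, 0.1)) for d in depths]
print(depths[int(np.argmax(ll))])
```

In the very-low-photon regime this per-pixel likelihood becomes uninformative, which is exactly where the gamma-MRF spatial priors of the full model help.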
Bayesian Learning and Predictability in a Stochastic Nonlinear Dynamical Model
Bayesian inference methods are applied within a Bayesian hierarchical
modelling framework to the problems of joint state and parameter estimation,
and of state forecasting. We explore and demonstrate the ideas in the context
of a simple nonlinear marine biogeochemical model. A novel approach is proposed
to the formulation of the stochastic process model, in which ecophysiological
properties of plankton communities are represented by autoregressive stochastic
processes. This approach captures the effects of changes in plankton
communities over time, and it allows the incorporation of literature metadata
on individual species into prior distributions for process model parameters.
The approach is applied to a case study at Ocean Station Papa, using Particle
Markov chain Monte Carlo computational techniques. The results suggest that, by
drawing on objective prior information, it is possible to extract useful
information about model state and a subset of parameters, and even to make
useful long-term forecasts, based on sparse and noisy observations.
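The autoregressive representation of ecophysiological properties mentioned above can be sketched in a few lines; everything here (the AR(1) form, the mean, persistence, and noise values) is illustrative, with the prior mean standing in for literature metadata on individual species.

```python
import numpy as np

def ar1_path(n, mean, phi, sigma, rng):
    """Simulate an AR(1) process x_t = mean + phi*(x_{t-1} - mean) + noise,
    a simple stand-in for an autoregressive, time-varying
    ecophysiological parameter (e.g. a plankton growth rate)."""
    x = np.empty(n)
    x[0] = mean
    for t in range(1, n):
        x[t] = mean + phi * (x[t - 1] - mean) + sigma * rng.standard_normal()
    return x

rng = np.random.default_rng(2)
# Hypothetical prior mean, as might be elicited from literature metadata.
growth_rate = ar1_path(n=365, mean=0.5, phi=0.95, sigma=0.02, rng=rng)
print(growth_rate.shape)  # (365,)
```

Letting the parameter wander around a literature-informed mean is what captures gradual changes in plankton community composition while keeping the prior anchored to objective information.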
Bayesian Bounds on Parameter Estimation Accuracy for Compact Coalescing Binary Gravitational Wave Signals
A global network of laser interferometric gravitational wave detectors is
projected to be in operation by around the turn of the century. Here, the noisy
output of a single instrument is examined. A gravitational wave is assumed to
have been detected in the data and we deal with the subsequent problem of
parameter estimation. Specifically, we investigate theoretical lower bounds on
the minimum mean-square errors associated with measuring the parameters of the
inspiral waveform generated by an orbiting system of neutron stars/black holes.
Three theoretical lower bounds on parameter estimation accuracy are considered:
the Cramér-Rao bound (CRB); the Weiss-Weinstein bound (WWB); and the Ziv-Zakai
bound (ZZB). We obtain the WWB and ZZB for the Newtonian form of the coalescing
binary waveform, and compare them with published CRB and numerical Monte-Carlo
results. At large SNR, we find that the theoretical bounds are all identical
and are attained by the Monte-Carlo results. As SNR gradually drops below 10,
the WWB and ZZB are both found to provide increasingly tighter lower bounds
than the CRB. However, at these levels of moderate SNR, there is a significant
departure between all the bounds and the numerical Monte-Carlo results.
Comment: 17 pages (LaTeX), 4 figures. Submitted to Physical Review
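Of the three bounds, the CRB is the simplest to sketch: for a deterministic signal in white Gaussian noise, the Fisher information is the squared parameter-derivative of the signal summed over samples, divided by the noise variance, and the CRB is its inverse. The sketch below uses a toy sinusoid rather than the Newtonian inspiral waveform of the paper; all numbers are illustrative.

```python
import numpy as np

def signal(t, f):
    """Toy stand-in for a waveform template parameterised by f
    (the paper uses the Newtonian-form inspiral waveform)."""
    return np.sin(2 * np.pi * f * t)

t = np.linspace(0.0, 1.0, 1000)
sigma = 0.5   # noise standard deviation per sample (hypothetical)
f0 = 10.0     # hypothetical true parameter value
df = 1e-6

# Fisher information for a deterministic signal in white Gaussian noise:
# I(f) = (1/sigma^2) * sum_t (ds/df)^2, derivative taken numerically.
ds_df = (signal(t, f0 + df) - signal(t, f0 - df)) / (2 * df)
fisher = np.sum(ds_df ** 2) / sigma ** 2

crb = 1.0 / fisher  # Cramér-Rao lower bound on Var(f_hat)
print(crb > 0)  # True
```

The WWB and ZZB improve on this at low SNR precisely because they do not rely on the local-curvature picture behind the Fisher information.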
Philosophy and the practice of Bayesian statistics
A substantial school in the philosophy of science identifies Bayesian
inference with inductive inference and even rationality as such, and seems to
be strengthened by the rise and practical success of Bayesian statistics. We
argue that the most successful forms of Bayesian statistics do not actually
support that particular philosophy but rather accord much better with
sophisticated forms of hypothetico-deductivism. We examine the actual role
played by prior distributions in Bayesian models, and the crucial aspects of
model checking and model revision, which fall outside the scope of Bayesian
confirmation theory. We draw on the literature on the consistency of Bayesian
updating and also on our experience of applied work in social science.
Clarity about these matters should benefit not just philosophy of science,
but also statistical practice. At best, the inductivist view has encouraged
researchers to fit and compare models without checking them; at worst,
theorists have actively discouraged practitioners from performing model
checking because it does not fit into their framework.
Comment: 36 pages, 5 figures. v2: Fixed typo in caption of figure 1. v3:
Further typo fixes. v4: Revised in response to referee
Harold Jeffreys's Theory of Probability Revisited
Published exactly seventy years ago, Jeffreys's Theory of Probability (1939)
has had a unique impact on the Bayesian community and is now considered to be
one of the main classics in Bayesian Statistics as well as the initiator of the
objective Bayes school. In particular, its advances on the derivation of
noninformative priors as well as on the scaling of Bayes factors have had a
lasting impact on the field. However, the book reflects the characteristics of
the time, especially in terms of mathematical rigor. In this paper we point out
the fundamental aspects of this reference work, especially the thorough
coverage of testing problems and the construction of both estimation and
testing noninformative priors based on functional divergences. Our major aim
here is to help modern readers navigate this difficult text and concentrate
on passages that are still relevant today.
Comment: This paper is commented on in [arXiv:1001.2967], [arXiv:1001.2968],
[arXiv:1001.2970], [arXiv:1001.2975], [arXiv:1001.2985], and [arXiv:1001.3073];
rejoinder in [arXiv:0909.1008]. Published in Statistical Science
(http://www.imstat.org/sts/) at http://dx.doi.org/10.1214/09-STS284 by the
Institute of Mathematical Statistics (http://www.imstat.org).
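One of Jeffreys's lasting contributions mentioned above, the derivation of noninformative priors, has a standard textbook instance: the Jeffreys prior is proportional to the square root of the Fisher information, which for a Bernoulli parameter gives the Beta(1/2, 1/2) density with normalising constant pi. The numerical check below is illustrative only.

```python
import numpy as np

def fisher_bernoulli(theta):
    """Fisher information of a single Bernoulli(theta) observation."""
    return 1.0 / (theta * (1.0 - theta))

# Jeffreys's prior is proportional to sqrt(I(theta)); for the Bernoulli
# model this is theta^(-1/2) * (1-theta)^(-1/2), i.e. Beta(1/2, 1/2)
# up to its normalising constant pi. Integrate numerically to check
# (endpoints trimmed to avoid the integrable singularities).
theta = np.linspace(1e-6, 1 - 1e-6, 2_000_001)
unnormalised = np.sqrt(fisher_bernoulli(theta))
normaliser = np.sum(
    (unnormalised[1:] + unnormalised[:-1]) / 2 * np.diff(theta)
)
print(round(normaliser, 2))
```

The recovered constant matches B(1/2, 1/2) = pi up to the truncation at the endpoints, which is the functional-divergence flavour of construction the review highlights.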