
    Bibliographic Analysis on Research Publications using Authors, Categorical Labels and the Citation Network

    Bibliographic analysis considers authors' research areas, the citation network, and the paper content, among other things. In this paper, we combine these three in a topic model that produces a bibliographic model of authors, topics and documents, using a nonparametric extension of a combination of the Poisson mixed-topic link model and the author-topic model. This gives rise to the Citation Network Topic Model (CNTM). We propose a novel and efficient inference algorithm for the CNTM and use it to explore subsets of research publications from CiteSeerX. The publication datasets are organised into three corpora, totalling about 168k publications by about 62k authors. The queried datasets are made available online. On three publicly available corpora, in addition to the queried datasets, our proposed model demonstrates improved performance in both model fitting and document clustering compared to several baselines. Moreover, our model allows the extraction of additional useful knowledge from the corpora, such as a visualisation of the author-topic network. Additionally, we propose a simple method for incorporating supervision into topic modelling to achieve further improvement on the clustering task. Comment: Preprint for the journal Machine Learning.
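
    The generative idea in this abstract (author-specific topic mixtures feeding both the words of a document and a Poisson link model over citations) can be illustrated with a small parametric toy in Python. This is a hedged sketch, not the CNTM itself: the fixed topic count K, symmetric Dirichlet priors, and one-author-per-document simplification are all assumptions made here, whereas the actual model is nonparametric and handles multiple authors.

```python
# Toy generative sketch of a citation-network topic model (illustrative
# assumptions: fixed K topics, symmetric Dirichlet priors, one author per
# document; the CNTM itself is nonparametric and multi-author).
import numpy as np

rng = np.random.default_rng(0)
K, V, A, D, N = 5, 1000, 20, 50, 100   # topics, vocab size, authors, docs, words/doc

phi = rng.dirichlet(np.full(V, 0.01), size=K)           # topic-word distributions
theta_author = rng.dirichlet(np.full(K, 0.1), size=A)   # author-topic mixtures
authors = rng.integers(0, A, size=D)                    # one author per document

docs, doc_topic = [], np.zeros((D, K))
for d in range(D):
    z = rng.choice(K, size=N, p=theta_author[authors[d]])        # topic per word
    docs.append(np.array([rng.choice(V, p=phi[k]) for k in z]))  # word tokens
    doc_topic[d] = np.bincount(z, minlength=K) / N               # empirical mixture

# Poisson link model: expected citation counts grow with topic similarity.
citations = rng.poisson(2.0 * doc_topic @ doc_topic.T)
print(citations.sum(), "citation links sampled between", D, "documents")
```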

    Lidar waveform based analysis of depth images constructed using sparse single-photon data

    This paper presents a new Bayesian model and algorithm for depth and intensity profiling using full waveforms from time-correlated single photon counting (TCSPC) measurements in the limit of very low photon counts. The proposed model represents each Lidar waveform as a combination of a known impulse response, weighted by the target intensity, and an unknown constant background, corrupted by Poisson noise. Prior knowledge about the problem is embedded in a hierarchical model that describes the dependence structure between the model parameters and their constraints. In particular, a gamma Markov random field (MRF) is used to model the joint distribution of the target intensity, and a second MRF is used to model the distribution of the target depth; both are expected to exhibit significant spatial correlations. An adaptive Markov chain Monte Carlo algorithm is then proposed to perform Bayesian inference and compute the estimates of interest. This algorithm is equipped with a stochastic optimisation adaptation mechanism that automatically adjusts the MRF parameters by maximum marginal likelihood estimation. Finally, the benefits of the proposed methodology are demonstrated through a series of experiments using real data.
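
    The observation model in this abstract is concrete enough to sketch: photon counts in each TCSPC timing bin are Poisson with a rate given by the target intensity times a known, depth-shifted impulse response, plus a constant background. Below is a minimal Python sketch of that per-pixel likelihood; the Gaussian pulse shape and all parameter values are illustrative assumptions, not the paper's calibrated system response.

```python
# Per-pixel TCSPC observation model from the abstract: bin counts are
# Poisson with rate = intensity * (known impulse response shifted to the
# target depth) + constant background. The Gaussian pulse shape and all
# parameter values below are illustrative assumptions.
import numpy as np
from scipy.stats import poisson

T = 300                 # number of timing bins
bins = np.arange(T)

def impulse_response(t0, width=6.0):
    """Known system response centred on the return time t0 (assumed Gaussian)."""
    return np.exp(-0.5 * ((bins - t0) / width) ** 2)

def waveform_rate(t0, intensity, background):
    return intensity * impulse_response(t0) + background

def log_likelihood(counts, t0, intensity, background):
    """Poisson log-likelihood of one pixel's photon-count histogram."""
    return poisson.logpmf(counts, waveform_rate(t0, intensity, background)).sum()

# Simulate a sparse single-photon histogram and evaluate its likelihood.
rng = np.random.default_rng(1)
counts = rng.poisson(waveform_rate(t0=120, intensity=0.8, background=0.01))
print("photons:", counts.sum(), " log-lik:", log_likelihood(counts, 120, 0.8, 0.01))
```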

    Bayesian Learning and Predictability in a Stochastic Nonlinear Dynamical Model

    Bayesian inference methods are applied within a Bayesian hierarchical modelling framework to the problems of joint state and parameter estimation, and of state forecasting. We explore and demonstrate these ideas in the context of a simple nonlinear marine biogeochemical model. A novel approach is proposed for the formulation of the stochastic process model, in which ecophysiological properties of plankton communities are represented by autoregressive stochastic processes. This approach captures the effects of changes in plankton communities over time, and it allows the incorporation of literature metadata on individual species into prior distributions for the process model parameters. The approach is applied to a case study at Ocean Station Papa, using particle Markov chain Monte Carlo computational techniques. The results suggest that, by drawing on objective prior information, it is possible to extract useful information about the model state and a subset of the parameters, and even to make useful long-term forecasts, based on sparse and noisy observations.
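
    The core computational ingredient mentioned here, a particle filter run inside particle MCMC, can be sketched briefly. The toy model below (logistic growth with an AR(1) time-varying growth rate, mirroring the abstract's autoregressive parameter processes) is an illustrative stand-in for the paper's biogeochemical model, not the model itself; it returns the log marginal likelihood estimate that particle MCMC would use in its acceptance ratio.

```python
# Minimal bootstrap particle filter for a toy nonlinear state-space model:
# logistic growth with an AR(1) time-varying growth rate and log-normal
# observation noise. All model choices here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)

def particle_filter(y, n_particles=500, rho=0.9, sigma_r=0.05, sigma_obs=0.2):
    """Return a log marginal likelihood estimate for observations y."""
    x = np.full(n_particles, 1.0)           # biomass state particles
    r = rng.normal(0.5, 0.1, n_particles)   # AR(1) growth-rate particles
    log_ml = 0.0
    for yt in y:
        # Propagate: AR(1) update of the growth rate, then a logistic growth step.
        r = 0.5 + rho * (r - 0.5) + sigma_r * rng.normal(size=n_particles)
        x = np.clip(x + r * x * (1.0 - x) + 0.02 * rng.normal(size=n_particles),
                    1e-6, None)
        # Weight particles by the log-normal observation density of yt given x.
        logw = -0.5 * ((np.log(yt) - np.log(x)) / sigma_obs) ** 2
        w = np.exp(logw - logw.max())
        log_ml += logw.max() + np.log(w.mean()) \
                  - np.log(yt * sigma_obs * np.sqrt(2 * np.pi))
        # Multinomial resampling.
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        x, r = x[idx], r[idx]
    return log_ml

y = np.exp(rng.normal(np.log(0.8), 0.2, size=30))  # synthetic observations
print(particle_filter(y))
```

    Particle MCMC then wraps a Metropolis-Hastings loop around this estimator, proposing static parameters and using the estimated marginal likelihoods in the acceptance ratio.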

    Bayesian Bounds on Parameter Estimation Accuracy for Compact Coalescing Binary Gravitational Wave Signals

    A global network of laser interferometric gravitational wave detectors is projected to be in operation by around the turn of the century. Here, the noisy output of a single instrument is examined. A gravitational wave is assumed to have been detected in the data, and we deal with the subsequent problem of parameter estimation. Specifically, we investigate theoretical lower bounds on the minimum mean-square errors associated with measuring the parameters of the inspiral waveform generated by an orbiting system of neutron stars or black holes. Three theoretical lower bounds on parameter estimation accuracy are considered: the Cramer-Rao bound (CRB), the Weiss-Weinstein bound (WWB), and the Ziv-Zakai bound (ZZB). We obtain the WWB and ZZB for the Newtonian form of the coalescing binary waveform and compare them with published CRB and numerical Monte Carlo results. At large SNR, we find that the theoretical bounds are all identical and are attained by the Monte Carlo results. As the SNR gradually drops below 10, the WWB and ZZB are found to provide increasingly tighter lower bounds than the CRB. However, at these levels of moderate SNR, there is a significant departure between all the bounds and the numerical Monte Carlo results. Comment: 17 pages (LaTeX), 4 figures. Submitted to Physical Review.
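
    For orientation, the first of the three bounds, the Cramer-Rao bound, takes the following standard form for a signal h(t; theta) in additive Gaussian noise; this is a textbook statement in the noise-weighted inner-product notation conventional in the gravitational-wave literature, not a formula quoted from the paper.

```latex
% Cramer-Rao bound: the covariance of any unbiased estimator \hat{\theta}
% dominates the inverse Fisher information matrix \Gamma, where h(t;\theta)
% is the waveform and S_n(f) the one-sided noise spectral density.
\begin{align}
  \operatorname{Cov}(\hat{\theta}) &\succeq \Gamma^{-1}, &
  \Gamma_{ij} &= \big(\partial_{\theta_i} h \,\big|\, \partial_{\theta_j} h\big), &
  (a \mid b) &= 4 \operatorname{Re} \int_0^{\infty}
      \frac{\tilde{a}(f)\,\tilde{b}^{*}(f)}{S_n(f)} \,\mathrm{d}f .
\end{align}
```

    The WWB and ZZB are Bayesian bounds that additionally account for prior information and for ambiguities in the likelihood surface, which is why they remain informative at low SNR where the CRB becomes loose.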

    Philosophy and the practice of Bayesian statistics

    A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework. Comment: 36 pages, 5 figures. v2: Fixed typo in caption of figure 1. v3: Further typo fixes. v4: Revised in response to referee
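
    One common form of the model checking this abstract emphasises is the posterior predictive check. The following minimal Python sketch shows the mechanics; the normal model, the crude posterior approximation, and the tail statistic are all illustrative assumptions rather than anything prescribed by the paper.

```python
# Minimal posterior predictive check: simulate replicated data under the
# fitted model and ask whether a test statistic of the real data looks
# extreme. All modelling choices here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)
y = rng.standard_t(df=3, size=100)      # "real" data: heavier tails than the model

# Approximate posterior draws for a normal model (sigma fixed at its
# sample estimate for brevity; a full treatment would sample it too).
n, s = len(y), y.std(ddof=1)
mu_draws = y.mean() + s / np.sqrt(n) * rng.standard_normal(4000)

# Replicate data under the fitted model and compare a tail statistic.
stat = lambda d: np.max(np.abs(d))
stat_rep = np.array([stat(rng.normal(mu, s, n)) for mu in mu_draws])
p_value = np.mean(stat_rep >= stat(y))  # posterior predictive p-value
print(f"posterior predictive p-value: {p_value:.3f}")  # near 0 => misfit
```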

    Harold Jeffreys's Theory of Probability Revisited

    Published exactly seventy years ago, Jeffreys's Theory of Probability (1939) has had a unique impact on the Bayesian community and is now considered one of the main classics of Bayesian statistics, as well as the initiator of the objective Bayes school. In particular, its advances on the derivation of noninformative priors and on the scaling of Bayes factors have had a lasting impact on the field. However, the book reflects the characteristics of its time, especially in terms of mathematical rigour. In this paper we point out the fundamental aspects of this reference work, especially its thorough coverage of testing problems and its construction of noninformative priors for both estimation and testing based on functional divergences. Our major aim here is to help modern readers navigate this difficult text and concentrate on the passages that are still relevant today. Comment: This paper is commented on in [arXiv:1001.2967], [arXiv:1001.2968], [arXiv:1001.2970], [arXiv:1001.2975], [arXiv:1001.2985] and [arXiv:1001.3073], with a rejoinder in [arXiv:0909.1008]. Published at http://dx.doi.org/10.1214/09-STS284 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
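
    One of the book's best-known contributions to noninformative priors, referred to above, is what is now called the Jeffreys prior, obtained from the Fisher information and invariant under reparametrisation; it is stated here in modern notation rather than as a quotation from the book.

```latex
% Jeffreys prior: proportional to the root of the Fisher information
% determinant, hence invariant under smooth reparametrisation of \theta.
\begin{align}
  \pi(\theta) &\propto \sqrt{\det I(\theta)}, &
  I(\theta)_{ij} &= \mathbb{E}_{\theta}\!\left[
      \frac{\partial \log f(X \mid \theta)}{\partial \theta_i}
      \frac{\partial \log f(X \mid \theta)}{\partial \theta_j}
    \right].
\end{align}
```

    For a Bernoulli(theta) likelihood, for example, this yields pi(theta) proportional to theta^(-1/2)(1-theta)^(-1/2), i.e. the Beta(1/2, 1/2) prior.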