209 research outputs found

    A Widely Applicable Bayesian Information Criterion

    Full text link
    A statistical model or a learning machine is called regular if the map taking a parameter to a probability distribution is one-to-one and if its Fisher information matrix is always positive definite. If otherwise, it is called singular. In regular statistical models, the Bayes free energy, which is defined by the minus logarithm of Bayes marginal likelihood, can be asymptotically approximated by the Schwarz Bayes information criterion (BIC), whereas in singular models such approximation does not hold. Recently, it was proved that the Bayes free energy of a singular model is asymptotically given by a generalized formula using a birational invariant, the real log canonical threshold (RLCT), instead of half the number of parameters in BIC. Theoretical values of RLCTs in several statistical models are now being discovered based on algebraic geometrical methodology. However, it has been difficult to estimate the Bayes free energy using only training samples, because an RLCT depends on an unknown true distribution. In the present paper, we define a widely applicable Bayesian information criterion (WBIC) by the average log likelihood function over the posterior distribution with the inverse temperature 1/logn1/\log n, where nn is the number of training samples. We mathematically prove that WBIC has the same asymptotic expansion as the Bayes free energy, even if a statistical model is singular for and unrealizable by a statistical model. Since WBIC can be numerically calculated without any information about a true distribution, it is a generalized version of BIC onto singular statistical models.Comment: 30 page

    Betting and Belief: Prediction Markets and Attribution of Climate Change

    Full text link
    Despite much scientific evidence, a large fraction of the American public doubts that greenhouse gases are causing global warming. We present a simulation model as a computational test-bed for climate prediction markets. Traders adapt their beliefs about future temperatures based on the profits of other traders in their social network. We simulate two alternative climate futures, in which global temperatures are primarily driven either by carbon dioxide or by solar irradiance. These represent, respectively, the scientific consensus and a hypothesis advanced by prominent skeptics. We conduct sensitivity analyses to determine how a variety of factors describing both the market and the physical climate may affect traders' beliefs about the cause of global climate change. Market participation causes most traders to converge quickly toward believing the "true" climate model, suggesting that a climate market could be useful for building public consensus.Comment: All code and data for the model is available at http://johnjnay.com/predMarket/. Forthcoming in Proceedings of the 2016 Winter Simulation Conference. IEEE Pres

    Modeling of the parties' vote share distributions

    Full text link
    Competition between varying ideas, people and institutions fuels the dynamics of socio-economic systems. Numerous analyses of the empirical data extracted from different financial markets have established a consistent set of stylized facts describing statistical signatures of the competition in the financial markets. Having an established and consistent set of stylized facts helps to set clear goals for theoretical models to achieve. Despite similar abundance of empirical analyses in sociophysics, there is no consistent set of stylized facts describing the opinion dynamics. In this contribution we consider the parties' vote share distributions observed during the Lithuanian parliamentary elections. We show that most of the time empirical vote share distributions could be well fitted by numerous different distributions. While discussing this peculiarity we provide arguments, including a simple agent-based model, on why the beta distribution could be the best choice to fit the parties' vote share distributions.Comment: 12 pages, 7 figure

    ArviZ a unified library for exploratory analysis of Bayesian models in Python

    Get PDF
    ArviZ is a Python package for exploratory analysis of Bayesian models. ArviZ aims to be a package that integrates seamlessly with established probabilistic programming languages like PyStan, PyMC, Edward, emcee, Pyro and easily integrated with novel or bespoke Bayesian analyses. Where the aim of the probabilistic programming languages is to make it easy to build and solve Bayesian models, the aim of the ArviZ library is to make it easy to process and analyze the results from the Bayesian models.Fil: Kumar, Ravin. No especifíca;Fil: Carroll, Colin. No especifíca;Fil: Hartikainen, Ari. Aalto University; FinlandiaFil: Martín, Osvaldo Antonio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Luis. Instituto de Matemática Aplicada de San Luis "Prof. Ezio Marchi". Universidad Nacional de San Luis. Facultad de Ciencias Físico, Matemáticas y Naturales. Instituto de Matemática Aplicada de San Luis "Prof. Ezio Marchi"; Argentin

    Malware in the Future? Forecasting of Analyst Detection of Cyber Events

    Full text link
    There have been extensive efforts in government, academia, and industry to anticipate, forecast, and mitigate cyber attacks. A common approach is time-series forecasting of cyber attacks based on data from network telescopes, honeypots, and automated intrusion detection/prevention systems. This research has uncovered key insights such as systematicity in cyber attacks. Here, we propose an alternate perspective of this problem by performing forecasting of attacks that are analyst-detected and -verified occurrences of malware. We call these instances of malware cyber event data. Specifically, our dataset was analyst-detected incidents from a large operational Computer Security Service Provider (CSSP) for the U.S. Department of Defense, which rarely relies only on automated systems. Our data set consists of weekly counts of cyber events over approximately seven years. Since all cyber events were validated by analysts, our dataset is unlikely to have false positives which are often endemic in other sources of data. Further, the higher-quality data could be used for a number for resource allocation, estimation of security resources, and the development of effective risk-management strategies. We used a Bayesian State Space Model for forecasting and found that events one week ahead could be predicted. To quantify bursts, we used a Markov model. Our findings of systematicity in analyst-detected cyber attacks are consistent with previous work using other sources. The advanced information provided by a forecast may help with threat awareness by providing a probable value and range for future cyber events one week ahead. Other potential applications for cyber event forecasting include proactive allocation of resources and capabilities for cyber defense (e.g., analyst staffing and sensor configuration) in CSSPs. Enhanced threat awareness may improve cybersecurity.Comment: Revised version resubmitted to journa
    corecore