A Widely Applicable Bayesian Information Criterion
A statistical model or a learning machine is called regular if the map taking
a parameter to a probability distribution is one-to-one and if its Fisher
information matrix is always positive definite. Otherwise, it is called
singular. In regular statistical models, the Bayes free energy, which is
defined as the negative logarithm of the Bayes marginal likelihood, can be
asymptotically approximated by the Schwarz Bayes information criterion (BIC),
whereas in singular models such an approximation does not hold.
Recently, it was proved that the Bayes free energy of a singular model is
asymptotically given by a generalized formula using a birational invariant, the
real log canonical threshold (RLCT), instead of half the number of parameters
in BIC. Theoretical values of RLCTs in several statistical models are now being
discovered based on algebraic geometrical methodology. However, it has been
difficult to estimate the Bayes free energy using only training samples,
because an RLCT depends on an unknown true distribution.
In the present paper, we define a widely applicable Bayesian information
criterion (WBIC) by the average log likelihood function over the posterior
distribution with the inverse temperature 1/log n, where n is the number
of training samples. We mathematically prove that WBIC has the same asymptotic
expansion as the Bayes free energy, even if the true distribution is singular for
and unrealizable by the statistical model. Since WBIC can be numerically
calculated without any information about a true distribution, it is a
generalized version of BIC onto singular statistical models.
Comment: 30 pages
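As a rough illustration of the WBIC definition above (the posterior average of the negative log-likelihood at inverse temperature 1/log n), here is a minimal sketch for a toy one-parameter model. The N(mu, 1) likelihood, the N(0, 10^2) prior, the synthetic data, and the grid-based integration are all assumptions chosen for illustration; they are not taken from the paper, and the function names are hypothetical.

```python
import math
import random


def neg_log_lik(mu, data):
    """Negative log-likelihood of the data under an assumed N(mu, 1) model."""
    n = len(data)
    return 0.5 * n * math.log(2 * math.pi) + 0.5 * sum((x - mu) ** 2 for x in data)


def log_prior(mu, tau=10.0):
    """Log density of an assumed N(0, tau^2) prior on mu."""
    return -0.5 * math.log(2 * math.pi * tau ** 2) - 0.5 * (mu / tau) ** 2


def wbic(data, grid):
    """Average the negative log-likelihood over the tempered posterior.

    The posterior is tempered with inverse temperature beta = 1 / log n and
    approximated on a uniform grid, so the grid spacing cancels out.
    """
    beta = 1.0 / math.log(len(data))
    nll = [neg_log_lik(mu, data) for mu in grid]
    # Unnormalised log weights of the tempered posterior at each grid point.
    log_w = [-beta * e + log_prior(mu) for e, mu in zip(nll, grid)]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    return sum(wi * e for wi, e in zip(w, nll)) / sum(w)


random.seed(0)
data = [random.gauss(1.0, 1.0) for _ in range(200)]  # synthetic sample
grid = [-5.0 + 10.0 * i / 4000 for i in range(4001)]  # uniform grid on [-5, 5]
print("WBIC estimate:", wbic(data, grid))
```

In realistic models the tempered posterior would typically be sampled with MCMC rather than integrated on a grid; the grid is used here only because the toy model has a single parameter.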
Betting and Belief: Prediction Markets and Attribution of Climate Change
Despite much scientific evidence, a large fraction of the American public
doubts that greenhouse gases are causing global warming. We present a
simulation model as a computational test-bed for climate prediction markets.
Traders adapt their beliefs about future temperatures based on the profits of
other traders in their social network. We simulate two alternative climate
futures, in which global temperatures are primarily driven either by carbon
dioxide or by solar irradiance. These represent, respectively, the scientific
consensus and a hypothesis advanced by prominent skeptics. We conduct
sensitivity analyses to determine how a variety of factors describing both the
market and the physical climate may affect traders' beliefs about the cause of
global climate change. Market participation causes most traders to converge
quickly toward believing the "true" climate model, suggesting that a climate
market could be useful for building public consensus.
Comment: All code and data for the model are available at
http://johnjnay.com/predMarket/. Forthcoming in Proceedings of the 2016
Winter Simulation Conference. IEEE Press
Modeling of the parties' vote share distributions
Competition between varying ideas, people and institutions fuels the dynamics
of socio-economic systems. Numerous analyses of the empirical data extracted
from different financial markets have established a consistent set of stylized
facts describing statistical signatures of the competition in the financial
markets. Having an established and consistent set of stylized facts helps to
set clear goals for theoretical models to achieve. Despite a similar abundance of
empirical analyses in sociophysics, there is no consistent set of stylized
facts describing the opinion dynamics. In this contribution we consider the
parties' vote share distributions observed during the Lithuanian parliamentary
elections. We show that most of the time empirical vote share distributions
could be well fitted by numerous different distributions. While discussing this
peculiarity we provide arguments, including a simple agent-based model, on why
the beta distribution could be the best choice to fit the parties' vote share
distributions.
Comment: 12 pages, 7 figures
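To make the idea of fitting a beta distribution to vote shares concrete, here is a minimal method-of-moments sketch. The abstract does not say which fitting procedure the authors used, so the method, the function name `fit_beta_moments`, and the synthetic shares (drawn from a Beta(2, 8) distribution rather than taken from the Lithuanian election data) are assumptions for illustration only.

```python
import random
import statistics


def fit_beta_moments(shares):
    """Estimate beta distribution parameters by matching mean and variance.

    Uses the standard identities mean = a / (a + b) and
    variance = mean * (1 - mean) / (a + b + 1).
    """
    m = statistics.fmean(shares)
    v = statistics.variance(shares)
    common = m * (1 - m) / v - 1  # equals a + b when the data are beta-like
    if common <= 0:
        raise ValueError("variance too large for a beta distribution")
    return m * common, (1 - m) * common


random.seed(1)
# Synthetic vote shares in (0, 1); not real election data.
shares = [random.betavariate(2.0, 8.0) for _ in range(5000)]
alpha, beta = fit_beta_moments(shares)
print(f"alpha = {alpha:.2f}, beta = {beta:.2f}")
```

Comparing such a fit against other candidate distributions would need a goodness-of-fit measure, which is outside the scope of this sketch.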
ArviZ a unified library for exploratory analysis of Bayesian models in Python
ArviZ is a Python package for exploratory analysis of Bayesian models. ArviZ aims to be a package that integrates seamlessly with established probabilistic programming languages like PyStan, PyMC, Edward, emcee, and Pyro, and that can be easily integrated with novel or bespoke Bayesian analyses. Where the aim of the probabilistic programming languages is to make it easy to build and solve Bayesian models, the aim of the ArviZ library is to make it easy to process and analyze the results from the Bayesian models.
Affiliation: Kumar, Ravin. Not specified
Affiliation: Carroll, Colin. Not specified
Affiliation: Hartikainen, Ari. Aalto University; Finland
Affiliation: Martín, Osvaldo Antonio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Luis. Instituto de Matemática Aplicada de San Luis "Prof. Ezio Marchi". Universidad Nacional de San Luis. Facultad de Ciencias Físico, Matemáticas y Naturales. Instituto de Matemática Aplicada de San Luis "Prof. Ezio Marchi"; Argentina
Malware in the Future? Forecasting of Analyst Detection of Cyber Events
There have been extensive efforts in government, academia, and industry to
anticipate, forecast, and mitigate cyber attacks. A common approach is
time-series forecasting of cyber attacks based on data from network telescopes,
honeypots, and automated intrusion detection/prevention systems. This research
has uncovered key insights such as systematicity in cyber attacks. Here, we
propose an alternative perspective on this problem by forecasting
attacks that are analyst-detected and -verified occurrences of malware. We call
these instances of malware cyber event data. Specifically, our dataset comprises
analyst-detected incidents from a large operational Computer Security Service
Provider (CSSP) for the U.S. Department of Defense, which rarely relies only on
automated systems. Our data set consists of weekly counts of cyber events over
approximately seven years. Since all cyber events were validated by analysts,
our dataset is unlikely to have false positives which are often endemic in
other sources of data. Further, the higher-quality data could be used for a
number of purposes, including resource allocation, estimation of security resources, and the
development of effective risk-management strategies. We used a Bayesian State
Space Model for forecasting and found that events one week ahead could be
predicted. To quantify bursts, we used a Markov model. Our findings of
systematicity in analyst-detected cyber attacks are consistent with previous
work using other sources. The advance information provided by a forecast may
help with threat awareness by providing a probable value and range for future
cyber events one week ahead. Other potential applications for cyber event
forecasting include proactive allocation of resources and capabilities for
cyber defense (e.g., analyst staffing and sensor configuration) in CSSPs.
Enhanced threat awareness may improve cybersecurity.
Comment: Revised version resubmitted to journal
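The abstract does not specify the form of the Bayesian State Space Model. As one simple possibility, the sketch below uses a Gaussian local-level model filtered with a Kalman recursion to produce a one-week-ahead forecast with a probable value and range. The model choice, the fixed noise variances, the function name `local_level_forecasts`, and the weekly counts are all hypothetical and are not taken from the paper.

```python
import math


def local_level_forecasts(y, obs_var, state_var, m0=0.0, p0=1e6):
    """Return one-step-ahead forecast (mean, variance) pairs.

    Assumed model: level_t = level_{t-1} + state noise, y_t = level_t + obs noise.
    The list has len(y) + 1 entries; the last one forecasts the next unseen week.
    """
    m, p = m0, p0  # posterior mean and variance of the level (diffuse start)
    forecasts = []
    for obs in y:
        # Predict: the level is a random walk, so only its variance grows.
        p_pred = p + state_var
        f_mean, f_var = m, p_pred + obs_var
        forecasts.append((f_mean, f_var))
        # Update: Kalman correction using the newly observed count.
        gain = p_pred / f_var
        m = m + gain * (obs - f_mean)
        p = (1 - gain) * p_pred
    forecasts.append((m, p + state_var + obs_var))
    return forecasts


weekly_counts = [12, 15, 11, 14, 18, 17, 16, 20]  # made-up weekly event counts
fc = local_level_forecasts(weekly_counts, obs_var=4.0, state_var=1.0)
mean, var = fc[-1]
half_width = 1.96 * math.sqrt(var)  # approximate 95% range under Gaussian noise
print(f"next week: {mean:.1f} (range {mean - half_width:.1f} to {mean + half_width:.1f})")
```

A full Bayesian treatment would also place priors on the two variances and infer them from the data, and a count-specific observation model may suit weekly counts better than Gaussian noise.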