Semi-Markov Graph Dynamics
In this paper, we outline a model of graph (or network) dynamics based on two
ingredients. The first ingredient is a Markov chain on the space of possible
graphs. The second ingredient is a semi-Markov counting process of renewal
type. The model consists in subordinating the Markov chain to the semi-Markov
counting process. In simple words, this means that the chain transitions occur
at random time instants called epochs. The model is quite rich and its possible
connections with algebraic geometry are briefly discussed. Moreover, for the
sake of simplicity, we focus on the space of undirected graphs with a fixed
number of nodes. However, in an example, we present an interbank market model
where it is meaningful to use directed graphs or even weighted graphs. Comment: 25 pages, 4 figures, submitted to PLoS ONE
Markov basis and Groebner basis of Segre-Veronese configuration for testing independence in group-wise selections
We consider testing independence in group-wise selections with some
restrictions on combinations of choices. We present models for frequency data
of selections for which it is easy to perform conditional tests by Markov chain
Monte Carlo (MCMC) methods. When the restrictions on the combinations can be
described in terms of a Segre-Veronese configuration, an explicit form of a
Gr\"obner basis consisting of moves of degree two is readily available for
performing a Markov chain. We illustrate our setting with the National Center
Test for university entrance examinations in Japan. We also apply our method to
testing independence hypotheses involving genotypes at more than one locus or
haplotypes of alleles on the same chromosome. Comment: 25 pages, 5 figures
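The idea of running a Markov chain with degree-two moves can be sketched in the simplest setting, a two-way table under independence, where the basic moves add the pattern +1/-1, -1/+1 on a 2x2 minor (this is the classical basic-move walk, not the Segre-Veronese basis of the paper):

```python
import random

def mcmc_step(table, rng):
    """One step of a walk on the fiber of tables with fixed margins,
    using a degree-two move: add +1 -1 / -1 +1 on a random 2x2 minor,
    which preserves all row and column sums. Moves that would create a
    negative cell are rejected (the chain stays put)."""
    I, J = len(table), len(table[0])
    r1, r2 = rng.sample(range(I), 2)
    c1, c2 = rng.sample(range(J), 2)
    sign = rng.choice((1, -1))
    new = [row[:] for row in table]
    new[r1][c1] += sign; new[r1][c2] -= sign
    new[r2][c1] -= sign; new[r2][c2] += sign
    if min(new[r1][c1], new[r1][c2], new[r2][c1], new[r2][c2]) < 0:
        return table
    return new

rng = random.Random(1)
table = [[3, 1, 4], [2, 5, 0]]
row_sums = [sum(r) for r in table]
col_sums = [sum(c) for c in zip(*table)]
for _ in range(1000):
    table = mcmc_step(table, rng)
```

For a proper conditional test one would add a Metropolis-Hastings acceptance ratio targeting the hypergeometric distribution on the fiber; the sketch above only shows that the moves connect tables without changing the sufficient statistics.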
Efficient and exact sampling of simple graphs with given arbitrary degree sequence
Uniform sampling from graphical realizations of a given degree sequence is a
fundamental component in simulation-based measurements of network observables,
with applications ranging from epidemics, through social networks to Internet
modeling. Existing graph sampling methods are either link-swap based
(Markov-Chain Monte Carlo algorithms) or stub-matching based (the Configuration
Model). Both types are ill-controlled, with typically unknown mixing times for
link-swap methods and uncontrolled rejections for the Configuration Model. Here
we propose an efficient, polynomial time algorithm that generates statistically
independent graph samples with a given, arbitrary, degree sequence. The
algorithm provides a weight associated with each sample, allowing the
observable to be measured either uniformly over the graph ensemble, or,
alternatively, with a desired distribution. Unlike other algorithms, this
method always produces a sample, without back-tracking or rejections. Using a
central limit theorem-based reasoning, we argue, that for large N, and for
degree sequences admitting many realizations, the sample weights are expected
to have a lognormal distribution. As examples, we apply our algorithm to
generate networks with degree sequences drawn from power-law distributions and
from binomial distributions. Comment: 8 pages, 3 figures
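Any direct construction of this kind must repeatedly verify that a residual degree sequence is still realizable by a simple graph. A standard way to check this, sketched below (this is the classical Erdos-Gallai test, not the authors' full weighted-sampling algorithm), is:

```python
def is_graphical(degrees):
    """Erdos-Gallai test: a non-increasing sequence d_1 >= ... >= d_n of
    nonnegative integers is the degree sequence of a simple graph iff
    sum(d) is even and, for every k,
        sum_{i<=k} d_i  <=  k(k-1) + sum_{i>k} min(d_i, k).
    """
    d = sorted(degrees, reverse=True)
    if not d:
        return True
    if sum(d) % 2 != 0 or d[0] >= len(d) or d[-1] < 0:
        return False
    n = len(d)
    for k in range(1, n + 1):
        lhs = sum(d[:k])
        rhs = k * (k - 1) + sum(min(x, k) for x in d[k:])
        if lhs > rhs:
            return False
    return True
```

For example, (3, 3, 2, 2, 2) is graphical (a 5-cycle plus one chord realizes it), while any sequence with odd total degree is not.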
Statistical auditing and randomness test of lotto k/N-type games
One of the most popular lottery games worldwide is the so-called ``lotto
k/N''. It considers N numbers 1,2,...,N from which k are drawn randomly,
without replacement. A player selects k or more numbers and the first prize is
shared amongst those players whose selected numbers match all of the k randomly
drawn. Exact rules may vary in different countries.
In this paper, mean values and covariances for the random variables
representing the numbers drawn from this kind of game are presented, with the
aim of using them to audit statistically the consistency of a given sample of
historical results with theoretical values coming from a hypergeometric
statistical model. The method can be adapted to test pseudorandom number
generators. Comment: 10 pages, no figures
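For sampling without replacement from {1, ..., N}, each drawn number is marginally uniform, giving mean (N+1)/2, variance (N^2-1)/12, and covariance -(N+1)/12 between two distinct draws. The sketch below checks the mean by Monte Carlo (the values N = 42, k = 6 are illustrative, not from the paper):

```python
import random

def lotto_moments(N):
    """Theoretical moments for a single number drawn in lotto k/N:
    under sampling without replacement from 1..N, each drawn number is
    marginally uniform on {1, ..., N}."""
    mean = (N + 1) / 2
    var = (N * N - 1) / 12
    cov = -(N + 1) / 12   # covariance between two distinct drawn numbers
    return mean, var, cov

def empirical_mean(N, k, n_draws, seed=0):
    """Monte Carlo estimate of the mean of a drawn number."""
    rng = random.Random(seed)
    total = count = 0
    for _ in range(n_draws):
        for x in rng.sample(range(1, N + 1), k):
            total += x
            count += 1
    return total / count

mean, var, cov = lotto_moments(42)
est = empirical_mean(42, 6, 20000)
```

A statistical audit in the spirit of the paper would compare such empirical moments from the historical draw record against these hypergeometric-model values.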
Two factor saturated designs: cycles, Gini index and state polytopes
In this paper we analyze and characterize the saturated fractions of two-factor designs under the simple effect model. Using linear algebra, we define a criterion to check whether a given fraction is saturated or not. We also compute the number of saturated fractions, providing an alternative proof of Cayley's formula. Finally, we show how, given a list of saturated fractions, the Gini indices of their margins and the associated state polytopes can be used to classify them.
Optimal Estimation of Ion-Channel Kinetics from Macroscopic Currents
Markov modeling provides an effective approach for modeling ion channel kinetics. There are several search algorithms for global fitting of macroscopic or single-channel currents across different experimental conditions. Here we present a particle swarm optimization (PSO)-based approach which, when used in combination with golden section search (GSS), can fit macroscopic voltage responses with a high degree of accuracy (errors within 1%) and a reasonable amount of computation time (less than 10 hours for 20 free parameters) on a desktop computer. We also describe a method for initial value estimation of the model parameters, which appears to favor identification of the global optimum and can further reduce the computational cost. The PSO-GSS algorithm is applicable to kinetic models of arbitrary topology and size and is compatible with common stimulation protocols, providing a convenient approach for establishing kinetic models at the macroscopic level.
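The global-search component can be illustrated with a minimal PSO sketch (not the authors' PSO-GSS implementation; the coefficients, swarm size, and toy objective are illustrative assumptions):

```python
import random

def pso_minimize(f, dim, bounds, n_particles=30, n_iters=200, seed=0):
    """Minimal particle swarm optimization: each particle remembers its
    personal best position and is attracted toward both that and the
    swarm-wide best, with standard inertia/cognitive/social weights."""
    rng = random.Random(seed)
    w, c1, c2 = 0.72, 1.49, 1.49
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# toy objective: shifted sphere with minimum at (1, 1, 1)
best, best_val = pso_minimize(lambda x: sum((xi - 1.0) ** 2 for xi in x),
                              dim=3, bounds=(-5.0, 5.0))
```

In the paper's setting the objective would be the misfit between simulated and recorded macroscopic currents, with GSS refining one-dimensional line searches; the sketch only demonstrates the swarm dynamics on a smooth test function.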
Connections between Classical and Parametric Network Entropies
This paper explores relationships between classical and parametric measures of graph (or network) complexity. Classical measures are based on vertex decompositions induced by equivalence relations. Parametric measures, on the other hand, are constructed by using information functions to assign probabilities to the vertices. The inequalities established in this paper relating classical and parametric measures lay a foundation for a systematic classification of entropy-based measures of graph complexity.
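The two families of measures can be contrasted on a toy example (an illustrative sketch, not the paper's definitions in full generality): a classical entropy computed from a vertex partition, here by degree, versus a parametric entropy computed from an information function, here f(v) = deg(v).

```python
import math
from collections import Counter

def classical_entropy(partition_sizes):
    """Classical graph entropy of a vertex partition: Shannon entropy of
    the class-size distribution |V_i| / |V|."""
    n = sum(partition_sizes)
    return -sum((s / n) * math.log2(s / n) for s in partition_sizes)

def parametric_entropy(values):
    """Parametric entropy: an information function f assigns a positive
    value f(v) to each vertex; vertex probabilities are f(v) / sum f."""
    total = sum(values)
    return -sum((v / total) * math.log2(v / total) for v in values)

# path graph P4: degrees 1, 2, 2, 1
degrees = [1, 2, 2, 1]
h_classical = classical_entropy(list(Counter(degrees).values()))
h_parametric = parametric_entropy(degrees)
```

On this example the partition by degree has two classes of size two, so the classical entropy is exactly 1 bit, while the degree-based parametric entropy spreads probability over all four vertices and is larger.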
Sharp bounds and normalization of Wiener-type indices
PLoS ONE 8(11). DOI: 10.1371/journal.pone.0078448
Probabilistic Daily ILI Syndromic Surveillance with a Spatio-Temporal Bayesian Hierarchical Model
BACKGROUND: For daily syndromic surveillance to be effective, an efficient and sensible algorithm would be expected to detect aberrations in influenza illness and alert public health workers prior to any impending epidemic. This detection or alert surely contains uncertainty, and thus should be evaluated with a proper probabilistic measure. However, traditional monitoring mechanisms simply provide a binary alert, failing to adequately address this uncertainty. METHODS AND FINDINGS: Based on the Bayesian posterior probability of influenza-like illness (ILI) visits, the intensity of an outbreak can be directly assessed. The numbers of daily emergency room ILI visits at five community hospitals in Taipei City during 2006-2007 were collected and fitted with a Bayesian hierarchical model containing meteorological factors such as temperature and vapor pressure, spatial interaction with a conditional autoregressive structure, weekend and holiday effects, seasonality factors, and previous ILI visits. The proposed algorithm recommends an alert for action if the posterior probability is larger than 70%. External data from January to February of 2008 were retained for validation. The decision rule successfully detects the peak in the validation period. When comparing the posterior probability evaluation with the modified CUSUM method, results show that the proposed method is able to detect the signals 1-2 days prior to the rise of ILI visits. CONCLUSIONS: This Bayesian hierarchical model not only constitutes a dynamic surveillance system but also constructs a stochastic evaluation of the need to call for an alert. The monitoring mechanism provides earlier detection as well as a complementary tool for current surveillance programs.
Combining Independent, Weighted P-Values: Achieving Computational Stability by a Systematic Expansion with Controllable Accuracy
Given the expanding availability of scientific data and tools to analyze them, combining different assessments of the same piece of information has become increasingly important for the social, biological, and even physical sciences. This task demands, to begin with, a method-independent standard, such as the p-value, that can be used to assess the reliability of a piece of information. Good's formula and Fisher's method combine independent p-values with, respectively, unequal and equal weights. Both approaches may be regarded as limiting instances of a general case of combining p-values from groups; p-values within each group are weighted equally, while the weight varies by group. When some of the weights become nearly degenerate, as cautioned by Good, numerical instability occurs in the computation of the combined p-values. We deal explicitly with this difficulty by deriving a controlled expansion, in powers of differences in inverse weights, that provides both accurate statistics and stable numerics. We illustrate the utility of this systematic approach with a few examples. In addition, we also provide an alternative derivation for the probability distribution function of the general case and show how the analytic formula obtained reduces to both Good's and Fisher's methods as special cases. A C++ program, which computes the combined p-values with equal numerical stability regardless of whether the weights are (nearly) degenerate or not, is available for download at our group website http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads/CoinedPValues.html
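The equal-weight special case mentioned above, Fisher's method, can be sketched directly (this illustrates only the unweighted limit, not the paper's general weighted expansion): the statistic X = -2 * sum(log p_i) is chi-square with 2n degrees of freedom under the global null, and for even degrees of freedom the survival function has a closed form that avoids any special-function library.

```python
import math

def fisher_combined_pvalue(pvalues):
    """Fisher's method for combining independent, equally weighted
    p-values: X = -2 * sum(log p_i) ~ chi-square with 2n dof under the
    global null. For even dof the survival function is
        P(X > x) = exp(-x/2) * sum_{i=0}^{n-1} (x/2)^i / i!.
    """
    n = len(pvalues)
    x = -2.0 * sum(math.log(p) for p in pvalues)
    half = x / 2.0
    term, s = 1.0, 1.0          # i = 0 term of the Poisson-like sum
    for i in range(1, n):
        term *= half / i
        s += term
    return math.exp(-half) * s

p = fisher_combined_pvalue([0.1, 0.2, 0.3])
```

As a sanity check, combining a single p-value returns it unchanged, since the 2-dof chi-square survival function at -2 log p is exactly p.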