Markov bases and subbases for bounded contingency tables
In this paper we study the computation of Markov bases for contingency tables
whose cell entries have an upper bound. In general a Markov basis for unbounded
contingency tables under a certain model differs from a Markov basis for bounded
tables. Rapallo (2007) applied Lawrence lifting to compute a Markov basis for
contingency tables whose cell entries are bounded. However, in the process, one
has to compute the universal Gröbner basis of the ideal associated with the
design matrix for a model which is, in general, larger than any reduced
Gröbner basis. Thus, this is also infeasible in small- and medium-sized
problems. In this paper we focus on bounded two-way contingency tables under
the independence model and show that if these bounds on cells are positive, i.e.,
they are not structural zeros, the set of basic moves of all 2 × 2
minors connects all tables with given margins. We end this paper with an open
problem: given that the margins are positive, find a necessary and sufficient
condition on the set of structural zeros so that the
set of basic moves of all 2 × 2 minors connects all incomplete
contingency tables with given margins.
Comment: 22 pages. It will appear in the Annals of the Institute of
Statistical Mathematics
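A basic move on a 2 × 2 minor adds +1 to two diagonal cells and -1 to the two off-diagonal cells, which leaves all row and column sums unchanged. The following is a minimal sketch of one step of a random walk using such moves on a bounded two-way table; the function names, the rejection rule, and the uniform choice of minor are illustrative assumptions, not the authors' implementation.

```python
import random

def margins(table):
    """Return (row sums, column sums) of a two-way table."""
    return [sum(r) for r in table], [sum(c) for c in zip(*table)]

def basic_move_step(table, bounds, rng):
    """Try one basic move on a random 2 x 2 minor (hypothetical helper).

    The move is rejected (table returned unchanged) if any touched cell
    would fall below 0 or above its upper bound.
    """
    rows, cols = len(table), len(table[0])
    # Sampling without replacement in random order covers both signs of the move.
    i1, i2 = rng.sample(range(rows), 2)
    j1, j2 = rng.sample(range(cols), 2)
    new = [row[:] for row in table]
    new[i1][j1] += 1
    new[i2][j2] += 1
    new[i1][j2] -= 1
    new[i2][j1] -= 1
    for i in (i1, i2):
        for j in (j1, j2):
            if not 0 <= new[i][j] <= bounds[i][j]:
                return table
    return new
```

Whether the resulting walk reaches every table with the given margins is exactly the connectivity question the abstract addresses; the sketch only shows that each accepted step stays inside the bounded fiber.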
Markov bases for sudoku grids
Internal report of the Dipartimento di Matematica, Politecnico di Torino, No. 4, March 201
Semi-Markov Graph Dynamics
In this paper, we outline a model of graph (or network) dynamics based on two
ingredients. The first ingredient is a Markov chain on the space of possible
graphs. The second ingredient is a semi-Markov counting process of renewal
type. The model consists in subordinating the Markov chain to the semi-Markov
counting process. In simple words, this means that the chain transitions occur
at random time instants called epochs. The model is quite rich and its possible
connections with algebraic geometry are briefly discussed. Moreover, for the
sake of simplicity, we focus on the space of undirected graphs with a fixed
number of nodes. However, in an example, we present an interbank market model
where it is meaningful to use directed graphs or even weighted graphs.
Comment: 25 pages, 4 figures, submitted to PLoS-ONE
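Subordination here means that the graph-valued chain takes one step at each epoch of a renewal counting process. The sketch below illustrates this for undirected graphs on a fixed node set; the edge-toggling transition rule and the Weibull waiting times are assumptions chosen for illustration and are not taken from the paper.

```python
import random
from itertools import combinations

def simulate_graph_dynamics(n_nodes, t_max, rng, scale=1.0, shape=0.7):
    """Return a list of (epoch, edge set) pairs up to time t_max.

    Assumed chain: at each epoch one node pair is picked uniformly and
    its edge is toggled. Assumed renewal process: i.i.d. Weibull waiting
    times (non-exponential unless shape == 1).
    """
    pairs = list(combinations(range(n_nodes), 2))
    edges = set()
    t = 0.0
    history = [(0.0, frozenset(edges))]
    while True:
        t += rng.weibullvariate(scale, shape)  # waiting time to the next epoch
        if t > t_max:
            break
        edges ^= {rng.choice(pairs)}  # toggle one edge (symmetric difference)
        history.append((t, frozenset(edges)))
    return history
```

With `shape=1.0` the waiting times are exponential and the construction reduces to an ordinary continuous-time Markov chain on graphs.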
Markov basis and Groebner basis of Segre-Veronese configuration for testing independence in group-wise selections
We consider testing independence in group-wise selections with some
restrictions on combinations of choices. We present models for frequency data
of selections for which it is easy to perform conditional tests by Markov chain
Monte Carlo (MCMC) methods. When the restrictions on the combinations can be
described in terms of a Segre-Veronese configuration, an explicit form of a
Gröbner basis consisting of moves of degree two is readily available for
performing a Markov chain. We illustrate our setting with the National Center
Test for university entrance examinations in Japan. We also apply our method to
testing independence hypotheses involving genotypes at more than one locus or
haplotypes of alleles on the same chromosome.
Comment: 25 pages, 5 figures
Likelihood Geometry
We study the critical points of monomial functions over an algebraic subset
of the probability simplex. The number of critical points on the Zariski
closure is a topological invariant of that embedded projective variety, known
as its maximum likelihood degree. We present an introduction to this theory and
its statistical motivations. Many favorite objects from combinatorial algebraic
geometry are featured: toric varieties, A-discriminants, hyperplane
arrangements, Grassmannians, and determinantal varieties. Several new results
are included, especially on the likelihood correspondence and its bidegree.
These notes were written for the second author's lectures at the CIME-CIRM
summer course on Combinatorial Algebraic Geometry at Levico Terme in June 2013.
Comment: 45 pages; minor changes and additions
Efficient and exact sampling of simple graphs with given arbitrary degree sequence
Uniform sampling from graphical realizations of a given degree sequence is a
fundamental component in simulation-based measurements of network observables,
with applications ranging from epidemics, through social networks to Internet
modeling. Existing graph sampling methods are either link-swap based
(Markov-Chain Monte Carlo algorithms) or stub-matching based (the Configuration
Model). Both types are ill-controlled, with typically unknown mixing times for
link-swap methods and uncontrolled rejections for the Configuration Model. Here
we propose an efficient, polynomial time algorithm that generates statistically
independent graph samples with a given, arbitrary, degree sequence. The
algorithm provides a weight associated with each sample, allowing the
observable to be measured either uniformly over the graph ensemble, or,
alternatively, with a desired distribution. Unlike other algorithms, this
method always produces a sample, without back-tracking or rejections. Using a
central limit theorem-based reasoning, we argue that for large N, and for
degree sequences admitting many realizations, the sample weights are expected
to have a lognormal distribution. As examples, we apply our algorithm to
generate networks with degree sequences drawn from power-law distributions and
from binomial distributions.
Comment: 8 pages, 3 figures
Statistical auditing and randomness test of lotto k/N-type games
One of the most popular lottery games worldwide is the so-called "lotto
k/N". It considers N numbers 1,2,...,N from which k are drawn randomly,
without replacement. A player selects k or more numbers and the first prize is
shared amongst those players whose selected numbers match all of the k randomly
drawn. Exact rules may vary in different countries.
In this paper, mean values and covariances for the random variables
representing the numbers drawn from this kind of game are presented, with the
aim of using them to audit statistically the consistency of a given sample of
historical results with theoretical values coming from a hypergeometric
statistical model. The method can be adapted to test pseudorandom number
generators.
Comment: 10 pages, no figures
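For numbers taken in draw order from 1,...,N without replacement, standard sampling theory gives mean (N+1)/2 and variance (N^2-1)/12 for each draw, and covariance -(N+1)/12 between two distinct draws. The sketch below checks these values by exact enumeration for a small game; it assumes draw-order variables (the paper may instead work with sorted numbers), and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def exact_moments(N, k):
    """Exact mean, variance and pairwise covariance of draw-order numbers.

    Enumerates all ordered draws of k numbers from 1..N (requires k >= 2),
    so it is only practical for small N and k.
    """
    draws = list(permutations(range(1, N + 1), k))
    n = len(draws)
    mean = Fraction(sum(d[0] for d in draws), n)
    var = Fraction(sum(d[0] ** 2 for d in draws), n) - mean ** 2
    # By exchangeability the second draw has the same mean as the first.
    cov = Fraction(sum(d[0] * d[1] for d in draws), n) - mean ** 2
    return mean, var, cov

mean, var, cov = exact_moments(6, 3)
print(mean, var, cov)  # 7/2 35/12 -7/12
```

An audit along the lines of the abstract would compare sample means and covariances from historical draws against such theoretical values.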
Biophysically Realistic Filament Bending Dynamics in Agent-Based Biological Simulation
An appealing tool for the study of the complex biological behaviors that can emerge from networks of simple molecular interactions is an agent-based computational simulation that explicitly tracks small-scale local interactions, following thousands to millions of states through time. For many critical cell processes (e.g. cytokinetic furrow specification, nuclear centration, cytokinesis), the flexible nature of cytoskeletal filaments is likely to be critical. Any computer model that hopes to explain the complex emergent behaviors in these processes therefore needs to encode filament flexibility in a realistic manner. Here I present a numerically convenient and biophysically realistic method for modeling cytoskeletal filament flexibility in silico. Each cytoskeletal filament is represented by a series of rigid segments linked end-to-end, with a variable attachment point for the translational elastic element. This connection scheme allows an empirical tuning, for a wide range of segment sizes, viscosities, and time-steps, that endows any filament species with the experimentally observed (or theoretically expected) static force deflection, relaxation time-constant, and thermal writhing motions. I additionally employ a unique pair of elastic elements, one representing the axial and the other the bending rigidity, that formulate the restoring force in terms of single time-step constraint resolution. This method is highly local (adjacent rigid segments of a filament only interact with one another through constraint forces) and is thus well-suited to simulations in which arbitrary additional forces (e.g. those representing interactions of a filament with other bodies or cross-links / entanglements between filaments) may be present. Implementation in code is straightforward; Java source code is available at www.celldynamics.org
Two factor saturated designs: cycles, Gini index and state polytopes
In this paper we analyze and characterize the saturated fractions of two-factor designs under the simple effect model. Using linear algebra, we define a criterion to check whether a given fraction is saturated or not. We also compute the number of saturated fractions, providing an alternative proof of Cayley's formula. Finally we show how, given a list of saturated fractions, Gini indexes of their margins and the associated state polytopes could be used to classify them.
Minding impacting events in a model of stochastic variance
We introduce a generalisation of the well-known ARCH process, widely used for
generating uncorrelated stochastic time series with long-term non-Gaussian
distributions and long-lasting correlations in the (instantaneous) standard
deviation exhibiting a clustering profile. Specifically, inspired by the fact
that in a variety of systems impacting events are hardly forgotten, we split the
process into two different regimes: a first one for regular periods where the
average volatility of the fluctuations within a certain period of time is below
a certain threshold and another one when the local standard deviation
exceeds it. In the former situation we use standard rules for
heteroscedastic processes whereas in the latter case the system starts
recalling past values that surpassed the threshold. Our results show that for
appropriate parameter values the model is able to provide fat tailed
probability density functions and strong persistence of the instantaneous
variance characterised by large values of the Hurst exponent (greater than
0.8), which are ubiquitous features in complex systems.
Comment: 18 pages, 5 figures, 1 table. To be published in PLoS ONE
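For orientation, the standard ARCH(1) process that the abstract generalises sets the conditional variance to sigma_t^2 = a + b * x_{t-1}^2 and draws x_t = sigma_t * eps_t with Gaussian eps_t. The sketch below simulates only this baseline; the two-regime memory rule of the paper is not specified in the abstract and is not reproduced, and the function and parameter names are assumptions.

```python
import math
import random

def simulate_arch1(n, a, b, rng):
    """Simulate n steps of a standard ARCH(1) process (baseline model only).

    sigma_t^2 = a + b * x_{t-1}^2, x_t = sigma_t * eps_t, eps_t ~ N(0, 1),
    started from x_{-1} = 0.
    """
    x, sigma = [], []
    prev = 0.0
    for _ in range(n):
        s = math.sqrt(a + b * prev ** 2)  # conditional standard deviation
        prev = s * rng.gauss(0.0, 1.0)    # new observation
        sigma.append(s)
        x.append(prev)
    return x, sigma
```

Large values of `x` raise the next conditional variance, which produces the volatility clustering the abstract refers to.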