46 research outputs found
Estimating Latent Processes on a Network From Indirect Measurements
In a communication network, point-to-point traffic volumes over time are critical for designing protocols that route information efficiently and for maintaining security, whether at the scale of an Internet service provider or within a corporation. While technically feasible, the direct measurement of point-to-point traffic imposes a heavy burden on network performance and is typically not implemented. Instead, indirect aggregate traffic volumes are routinely collected. We consider the problem of estimating point-to-point traffic volumes, x_t, from aggregate traffic volumes, y_t, given information about the network routing protocol encoded in a matrix A. This estimation task can be reformulated as finding the solutions to a sequence of ill-posed linear inverse problems, y_t = A x_t, since the number of origin-destination routes of interest is higher than the number of aggregate measurements available.
Here, we introduce a novel multilevel state-space model (SSM) of aggregate traffic volumes with realistic features. We implement a naïve strategy for estimating unobserved point-to-point traffic volumes from indirect measurements of aggregate traffic, based on particle filtering. We then develop a more efficient two-stage inference strategy that relies on model-based regularization: a simple model is used to calibrate regularization parameters that lead to efficient/scalable inference in the multilevel SSM. We apply our methods to corporate and academic networks, where we show that the proposed inference strategy outperforms existing approaches and scales to larger networks. We also design a simulation study to explore the factors that influence the performance. Our results suggest that model-based regularization may be an efficient strategy for inference in other complex multilevel models. Supplementary materials for this article are available online.
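To make the ill-posedness concrete, here is a minimal sketch in Python of a single time step of the linear inverse problem that links aggregate volumes to point-to-point volumes through the routing matrix A. The toy routing matrix, the traffic values, and all variable names are assumptions made for illustration; this is not the authors' state-space model or their inference strategy.

```python
import numpy as np

# Hypothetical toy network: 4 origin-destination routes, 3 monitored links.
# A[i, j] = 1 if route j traverses link i (an assumed routing matrix).
A = np.array([
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
], dtype=float)

x_true = np.array([3.0, 5.0, 2.0, 7.0])  # unobserved point-to-point volumes
y = A @ x_true                            # observed aggregate link volumes

# Fewer independent measurements than unknowns: the system is underdetermined.
rank = np.linalg.matrix_rank(A)

# Minimum-norm solution: one of infinitely many x satisfying A x = y.
x_min_norm = np.linalg.pinv(A) @ y

# Adding any kernel vector of A leaves the aggregate measurements unchanged.
kernel_vec = np.array([1.0, -1.0, -1.0, 1.0])
x_alt = x_true + 0.5 * kernel_vec

print("rank of A:", rank, "unknowns:", A.shape[1])
print("min-norm solution reproduces y:", np.allclose(A @ x_min_norm, y))
print("alternative solution reproduces y:", np.allclose(A @ x_alt, y))
```

Both candidate solutions reproduce the measurements exactly, which is why some form of regularization or prior model is needed to single out a plausible estimate.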
An analytic approximation of the feasible space of metabolic networks
Assuming a steady-state condition within a cell, metabolic fluxes satisfy an
under-determined linear system of stoichiometric equations. Characterizing the
space of fluxes that satisfy such equations along with given bounds (and
possibly additional relevant constraints) is considered of utmost importance
for the understanding of cellular metabolism. Extreme values for each
individual flux can be computed with Linear Programming (as Flux Balance
Analysis), and their marginal distributions can be approximately computed with
Monte-Carlo sampling. Here we present an approximate analytic method for the
latter task based on Expectation Propagation equations that does not involve
sampling and can achieve much better predictions than other existing analytic
methods. The method is iterative, and its computation time is dominated by one
matrix inversion per iteration. With respect to sampling, we show through
extensive simulation that it has some advantages including computation time,
and the ability to efficiently fix empirically estimated distributions of
fluxes.
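As an illustration of the linear-programming step mentioned above (extreme values of each flux under steady state and bounds), the following is a minimal sketch using `scipy.optimize.linprog`. The four-reaction network, its stoichiometric matrix `S`, and the flux bounds are hypothetical; the Expectation Propagation method itself is not shown.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network with 2 metabolites (rows) and 4 reactions (columns):
# v1: uptake -> M1, v2: M1 -> M2, v3: M1 -> M2 (alternative), v4: M2 -> export
S = np.array([
    [1.0, -1.0, -1.0,  0.0],
    [0.0,  1.0,  1.0, -1.0],
])
bounds = [(0, 10), (0, 6), (0, 8), (0, 10)]  # assumed lower/upper flux bounds


def flux_range(S, bounds, j):
    """Return (min, max) of flux j subject to S v = 0 and the bounds."""
    c = np.zeros(S.shape[1])
    c[j] = 1.0
    b_eq = np.zeros(S.shape[0])
    lowest = linprog(c, A_eq=S, b_eq=b_eq, bounds=bounds)    # minimise v_j
    highest = linprog(-c, A_eq=S, b_eq=b_eq, bounds=bounds)  # maximise v_j
    return lowest.fun, -highest.fun


ranges = [flux_range(S, bounds, j) for j in range(S.shape[1])]
for j, (lo, hi) in enumerate(ranges, start=1):
    print(f"v{j}: [{lo:.1f}, {hi:.1f}]")
```

The LP gives only the end points of each flux's feasible interval; the marginal distribution inside that interval is what sampling or an analytic approximation has to supply.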
Efficient Markov bases for Z-polytope sampling : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Mathematics at Massey University, Manawatū, New Zealand
Listed in Dean's List of Exceptional Theses 2022. In this thesis we study the use of lattice bases for fibre sampling, with particular attention paid to applications in volume network tomography. We use a geometric interpretation of the fibre as a Z-polytope to provide insight into the connectivity properties of lattice bases.
Fibre sampling is used when we are interested in fitting a statistical model to a random process that may only be observed indirectly via the underdetermined linear system y = Ax. We consider the observed data y and random variable of interest x to contain count data. The likelihood function for such models requires a summation over the fibre Fy, the set of all non-negative integer vectors x satisfying this equation for some particular y. This can be computationally infeasible when Fy is large.
One approach to addressing this problem involves sampling from Fy using a Markov Chain Monte Carlo algorithm, which amounts to taking a random walk through Fy. This is facilitated by a Markov basis: a set of moves that can be used to construct such a walk, which is therefore a subset of the kernel of the configuration matrix A.
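The random walk through the fibre can be sketched as follows, in Python. The configuration matrix, the observation y, the single kernel move, and the function name `walk_fibre` are all hypothetical choices for a toy case in which that one move happens to connect the whole fibre; this is not the column partition lattice basis construction proposed in the thesis.

```python
import numpy as np

# Hypothetical configuration matrix and observed counts.
A = np.array([[1, 1, 0],
              [0, 1, 1]])
y = np.array([3, 4])

# A single move from the integer kernel of A (A @ move == 0).
move = np.array([1, -1, 1])


def walk_fibre(x0, n_steps, seed=0):
    """Random walk on {x >= 0 integer : A x = y} using +/- move proposals."""
    rng = np.random.default_rng(seed)
    x = np.array(x0)
    visited = {tuple(int(v) for v in x)}
    for _ in range(n_steps):
        sign = 1 if rng.random() < 0.5 else -1
        proposal = x + sign * move
        # Kernel moves preserve A x = y; only non-negativity can fail.
        if np.all(proposal >= 0):
            x = proposal
        visited.add(tuple(int(v) for v in x))
    return visited


visited = walk_fibre(x0=[3, 0, 4], n_steps=2000)

# Brute-force enumeration of the fibre, feasible only for tiny examples.
fibre = {(a, b, c)
         for a in range(5) for b in range(5) for c in range(5)
         if a + b == 3 and b + c == 4}

print("fibre size:", len(fibre))
print("walk stayed inside the fibre:", visited <= fibre)
```

Because proposals are symmetric and infeasible ones are rejected, the walk targets the uniform distribution on the part of the fibre it can reach. Whether it can reach every point is exactly the Markov basis property.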
Algebraic algorithms for finding Markov bases based on the theory of Gröbner bases are available, but these can fail when the configuration matrix is large and the calculations become computationally infeasible. Instead, we propose constructing a sampler based on a type of lattice basis we call a column partition lattice basis, defined by a matrix U. Constructing such a basis is computationally much cheaper than constructing a Gröbner basis.
It is known that lattice bases are not necessarily Markov bases. We give a condition on the matrix U that guarantees that it is a Markov basis, and show for a certain class of configuration matrices how a U matrix that is a Markov basis can be constructed.
Construction of lattice bases that are Markov bases is facilitated when the configuration matrix is unimodular, or has unimodular partitions. We consider configuration matrices from volume network tomography, and give classes of traffic network that have configuration matrices with these desirable properties.
If a Markov basis cannot be found, one alternative is to sample from some larger set that includes Fy. We give some larger sets that can be used, subject to certain conditions.
A Hierarchical Bayesian Model for the Unmixing Analysis of Compositional Data subject to Unit-sum Constraints
Modeling of compositional data is emerging as an active area in statistics. It is assumed that compositional data represent the convex linear mixing of definite numbers of independent sources usually referred to as end members. A generic problem in practice is to appropriately separate the end members and quantify their fractions from compositional data subject to nonnegative and unit-sum constraints. A number of methods essentially related to polytope expansion have been proposed. However, these deterministic methods have some potential problems.
In this study, a hierarchical Bayesian model was formulated, and the algorithms were coded in MATLAB®. A test run using both a synthetic and a real-world dataset yields scientifically sound and mathematically optimal outputs broadly consistent with other non-Bayesian methods. Also, the sensitivity of this model to the choice of different priors and to the structure of the covariance matrix of the error was discussed.
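The convex linear mixing assumption behind this unmixing problem can be illustrated with a short forward simulation in Python. The end-member compositions, the Dirichlet draw for the mixing fractions, and the variable names are assumptions for illustration only; the hierarchical Bayesian inference described in the abstract is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical end members: 2 sources described by 3 components each.
# Each row is a composition, so it is non-negative and sums to 1.
end_members = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
])

# Mixing fractions for 5 samples: non-negative with unit sum per sample.
fractions = rng.dirichlet(alpha=[1.0, 1.0], size=5)

# Convex linear mixing: each observed composition is a weighted average
# of the end members, so it inherits both constraints automatically.
compositions = fractions @ end_members

print(compositions.round(3))
print("row sums:", compositions.sum(axis=1).round(6))
```

Unmixing runs this model in reverse: given only `compositions` (plus noise), recover both `end_members` and `fractions` while respecting the same constraints.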