The Complexity of POMDPs with Long-run Average Objectives
We study the problem of approximation of optimal values in
partially-observable Markov decision processes (POMDPs) with long-run average
objectives. POMDPs are a standard model for dynamic systems with probabilistic
and nondeterministic behavior in uncertain environments. In long-run average
objectives rewards are associated with every transition of the POMDP and the
payoff is the long-run average of the rewards along the executions of the
POMDP. We establish strategy complexity and computational complexity results.
Our main result shows that finite-memory strategies suffice for approximation
of optimal values, and the related decision problem is recursively
enumerable-complete.
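As a hedged illustration of the objective studied above, the sketch below simulates a toy POMDP under a fixed observation-based strategy (the simplest kind of finite-memory strategy) and empirically estimates the long-run average reward; all states, transition probabilities, and rewards are hypothetical, not drawn from the paper.

```python
import random

# Hypothetical toy POMDP: 2 hidden states, 2 observations, 2 actions.
# trans[s][a] -> list of (next_state, prob); obs[s] -> observation emitted
# in state s; reward[s][a] -> reward of taking action a in state s.
trans = {
    0: {0: [(0, 0.9), (1, 0.1)], 1: [(1, 1.0)]},
    1: {0: [(0, 0.5), (1, 0.5)], 1: [(1, 0.8), (0, 0.2)]},
}
obs = {0: "a", 1: "b"}
reward = {0: {0: 1.0, 1: 0.0}, 1: {0: 0.0, 1: 2.0}}

# A memoryless strategy (observation -> action), i.e. a finite-memory
# strategy with a single memory state.
strategy = {"a": 0, "b": 1}

def sample(dist, rng):
    """Draw a successor state from a (state, prob) list."""
    r, acc = rng.random(), 0.0
    for s, p in dist:
        acc += p
        if r < acc:
            return s
    return dist[-1][0]

def long_run_average(steps=100_000, seed=0):
    """Estimate the long-run average reward under the fixed strategy."""
    rng = random.Random(seed)
    state, total = 0, 0.0
    for _ in range(steps):
        a = strategy[obs[state]]
        total += reward[state][a]
        state = sample(trans[state][a], rng)
    return total / steps
```

For this toy chain the induced Markov chain has stationary distribution (2/3, 1/3), so the estimate should hover near 4/3.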
Distributing the Kalman Filter for Large-Scale Systems
This paper derives a \emph{distributed} Kalman filter to estimate a sparsely
connected, large-scale, $n$-dimensional, dynamical system monitored by a
network of $N$ sensors. Local Kalman filters are implemented on the
$n_l$-dimensional sub-systems (where $n_l \ll n$) that are obtained after
spatially decomposing the large-scale system. The resulting sub-systems
overlap, which, along with an assimilation procedure on the local Kalman
filters, preserves an $L$th order Gauss-Markovian structure of the centralized
error processes. The information loss due to the $L$th order Gauss-Markovian
approximation is controllable, as it can be characterized by a divergence that
decreases as $L$ increases. The order of the approximation, $L$, leads to a lower
bound on the dimension of the sub-systems, hence providing a criterion for
sub-system selection. The assimilation procedure is carried out on the local
error covariances with a distributed iterate collapse inversion (DICI)
algorithm that we introduce. The DICI algorithm computes the (approximated)
centralized Riccati and Lyapunov equations iteratively with only local
communication and low-order computation. We fuse the observations that are
common among the local Kalman filters using bipartite fusion graphs and
consensus averaging algorithms. The proposed algorithm achieves full
distribution of the Kalman filter that is coherent with the centralized Kalman
filter with an $L$th order Gauss-Markovian structure on the centralized
error processes. Nowhere is storage, communication, or computation of
$n$-dimensional vectors and matrices needed; only $n_l$-dimensional
vectors and matrices are communicated or used in the computation at the
sensors
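The observation-fusion step above relies on consensus averaging. Below is a minimal sketch of average consensus with Metropolis weights on an undirected communication graph; the graph and initial values are our own illustration, not the paper's construction.

```python
def metropolis_weights(adj):
    """Metropolis-Hastings weights for average consensus on an undirected
    graph; adj[i] is the set of neighbours of node i. The resulting matrix
    is symmetric and doubly stochastic, so repeated application drives
    every node's value to the global average."""
    n = len(adj)
    W = [[0.0] * n for _ in range(n)]
    deg = [len(adj[i]) for i in range(n)]
    for i in range(n):
        for j in adj[i]:
            W[i][j] = 1.0 / (1 + max(deg[i], deg[j]))
        W[i][i] = 1.0 - sum(W[i][j] for j in adj[i])
    return W

def consensus(values, adj, iters=200):
    """Iterate x <- W x using only nearest-neighbour communication."""
    W = metropolis_weights(adj)
    x = list(values)
    n = len(x)
    for _ in range(iters):
        x = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x
```

Each node only ever combines its own value with its neighbours' values, which is the locality property the distributed filter depends on.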
Geometry and Expressive Power of Conditional Restricted Boltzmann Machines
Conditional restricted Boltzmann machines are undirected stochastic neural
networks with a layer of input and output units connected bipartitely to a
layer of hidden units. These networks define models of conditional probability
distributions on the states of the output units given the states of the input
units, parametrized by interaction weights and biases. We address the
representational power of these models, proving results on their ability to
represent conditional Markov random fields and conditional distributions with
restricted supports, on the minimal size of universal approximators, on the
maximal model approximation errors, and on the dimension of the set of representable
conditional distributions. We contribute new tools for investigating
conditional probability models, which allow us to improve the results that can
be derived from existing work on restricted Boltzmann machine probability
models. Comment: 30 pages, 5 figures, 1 algorithm
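To make the model class concrete, here is a hedged sketch that computes the exact conditional distribution of a tiny binary CRBM by summing out the hidden units analytically and enumerating the output states. The bipartite energy function is the standard one for such models, but the parameter names and the enumeration-based normalization are our own choices for illustration, not taken from the paper.

```python
import itertools
import math

def crbm_conditional(x, Wxh, Wyh, by, bh):
    """Exact p(y | x) for a tiny conditional RBM with binary units.
    Energy: E(y, h; x) = -(x^T Wxh h + y^T Wyh h + by.y + bh.h).
    Hidden units are summed out in closed form; the output layer is
    enumerated, so this is only feasible for a handful of output units."""
    n_y, n_h = len(by), len(bh)

    def unnorm(y):
        # Unnormalized p(y | x): exp(by.y) * prod_j (1 + exp(pre_j)),
        # where pre_j is the total input to hidden unit j.
        val = math.exp(sum(b * yi for b, yi in zip(by, y)))
        for j in range(n_h):
            pre = bh[j]
            pre += sum(x[i] * Wxh[i][j] for i in range(len(x)))
            pre += sum(y[i] * Wyh[i][j] for i in range(n_y))
            val *= 1.0 + math.exp(pre)
        return val

    ys = list(itertools.product([0, 1], repeat=n_y))
    weights = [unnorm(y) for y in ys]
    Z = sum(weights)
    return {y: w / Z for y, w in zip(ys, weights)}
```

With all weights and biases zero the conditional is uniform, a convenient sanity check on the normalization.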
Three decades of the Shuffled Complex Evolution (SCE-UA) optimization algorithm: Review and applications
On the Complexity of the Equivalence Problem for Probabilistic Automata
Checking two probabilistic automata for equivalence has been shown to be a
key problem for efficiently establishing various behavioural and anonymity
properties of probabilistic systems. In recent experiments a randomised
equivalence test based on polynomial identity testing outperformed
deterministic algorithms. In this paper we show that polynomial identity
testing yields efficient algorithms for various generalisations of the
equivalence problem. First, we provide a randomized NC procedure that also
outputs a counterexample trace in case of inequivalence. Second, we show how to
check for equivalence two probabilistic automata with (cumulative) rewards. Our
algorithm runs in deterministic polynomial time, if the number of reward
counters is fixed. Finally we show that the equivalence problem for
probabilistic visibly pushdown automata is logspace equivalent to the
Arithmetic Circuit Identity Testing problem, which is to decide whether a
polynomial represented by an arithmetic circuit is identically zero. Comment: technical report for a FoSSaCS'12 paper
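For contrast with the polynomial-identity-testing approach described above, a naive exact checker can exploit the classical fact that two probabilistic automata with n1 and n2 states are equivalent iff they agree on all words of length below n1 + n2. The sketch below is our own illustration, exponential in the word-length bound rather than one of the paper's algorithms; an automaton is represented as an initial row vector, one transition matrix per letter, and a final column vector.

```python
import itertools
from fractions import Fraction

def word_prob(init, mats, final, word):
    """Probability of `word`: init * M_{a1} * ... * M_{ak} * final."""
    v = list(init)
    for a in word:
        M = mats[a]
        v = [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]
    return sum(vi * f for vi, f in zip(v, final))

def equivalent(aut1, aut2, alphabet):
    """Brute-force equivalence check over all words of length < n1 + n2.
    Exponential in the alphabet size -- illustration only; the paper uses
    polynomial identity testing for efficiency."""
    n = len(aut1[0]) + len(aut2[0])
    for k in range(n):
        for word in itertools.product(alphabet, repeat=k):
            if word_prob(*aut1, word) != word_prob(*aut2, word):
                return False
    return True
```

Using exact rationals (`Fraction`) avoids the floating-point comparisons that would otherwise make an exact equality test meaningless.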
Clustering metagenomic sequences with interpolated Markov models
Background: Sequencing of environmental DNA (often called metagenomics) has shown tremendous potential to uncover the vast number of unknown microbes that cannot be cultured and sequenced by traditional methods. Because the output from metagenomic sequencing is a large set of reads of unknown origin, clustering together reads that were sequenced from the same species is a crucial analysis step. Many effective approaches to this task rely on sequenced genomes in public databases, but these genomes are a highly biased sample that is not necessarily representative of the environments interesting to many metagenomics projects.
Results: We present SCIMM (Sequence Clustering with Interpolated Markov Models), an unsupervised sequence clustering method. SCIMM achieves greater clustering accuracy than previous unsupervised approaches. We examine the limitations of unsupervised learning on complex datasets, and suggest a hybrid of SCIMM and the supervised learning method Phymm, called PhyScimm, that performs better when evolutionarily close training genomes are available.
Conclusions: SCIMM and PhyScimm are highly accurate methods to cluster metagenomic sequences. SCIMM operates entirely unsupervised, making it ideal for environments containing mostly novel microbes. PhyScimm uses supervised learning to improve clustering in environments containing microbial strains from well-characterized genera. SCIMM and PhyScimm are available open source from http://www.cbcb.umd.edu/software/scimm.
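The core scoring idea behind Markov-model clustering of reads can be sketched as follows: train one model per cluster and assign each read to the model under which it is most likely. This hedged sketch uses a single fixed-order model with add-alpha smoothing rather than SCIMM's interpolated models, and every parameter choice here is ours, not SCIMM's.

```python
import math
from collections import defaultdict

K = 3  # model order; SCIMM interpolates over several orders, this sketch fixes one

def train(seqs, k=K):
    """Count (k-mer context -> next base) transitions over training sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for s in seqs:
        for i in range(len(s) - k):
            counts[s[i:i + k]][s[i + k]] += 1
    return counts

def log_likelihood(seq, counts, k=K, alpha=1.0):
    """Score a read under a model, with add-alpha smoothing over the
    4-letter DNA alphabet so unseen contexts get nonzero probability."""
    ll = 0.0
    for i in range(len(seq) - k):
        ctx, nxt = seq[i:i + k], seq[i + k]
        c = counts[ctx]
        total = sum(c.values()) + 4 * alpha
        ll += math.log((c[nxt] + alpha) / total)
    return ll

def assign(read, models):
    """Assign a read to the cluster whose model gives the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(read, models[name]))
```

In an unsupervised setting such as SCIMM's, this scoring step alternates with re-training the per-cluster models until the assignments stabilize.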
Big data, modeling, simulation, computational platform and holistic approaches for the fourth industrial revolution
Naturally, the mathematical process starts from proving the existence and uniqueness of the solution using theorems, corollaries, lemmas, and propositions, dealing with simple, non-complex models. Existence and uniqueness proofs govern the infinite set of possible solutions, but such work has been limited to small-scale simulation on a single desktop CPU, where accuracy, consistency, and stability are easily controlled at small data scales. The fourth industrial revolution, however, can be described as the advent of cyber-physical systems involving entirely new capabilities for researchers and machines (Xing, 2017). From a numerical perspective, the fourth industrial revolution (4iR) requires a transition from uncomplicated models and small-scale simulation to complex models and big data for visualizing real-world applications, a dialectical and exciting opportunity in the digital era. Big data analytics and its classification offer a way past these limitations. Applications of 4iR extend the state of the art in models, derivatives and discretization, dimensions of space and time, behavior of initial and boundary conditions, grid generation, data extraction, numerical methods, and high-resolution image processing. In statistics, big data is characterized by data growth; from a numerical perspective, however, classification strategies must be investigated with respect to the specific classifier tool. This paper investigates a conceptual framework for big data classification, covering the governing mathematical models, the selection of a suitable numerical method, the handling of large sparse simulations, and parallel computing on a high-performance computing (HPC) platform.
The conceptual framework will benefit big data providers, algorithm providers, and system analyzers in classifying and recommending specific strategies for generating, handling, and analyzing big data. All of these perspectives take a holistic view of technology, and in the present research the conceptual framework is described in holistic terms: 4iR takes a holistic approach to explaining the importance of big data, complex modeling, large sparse simulation, and the HPC platform. Numerical analysis and parallel performance evaluation serve as the indicators for investigating the performance of a classification strategy. This research supports accurate decisions, predictions, and trending practice on how to obtain approximate solutions for science and engineering applications. In conclusion, classification strategies help generate fine granular meshes and identify the root causes of failures and issues in real-time solutions. Furthermore, big data-driven methods and the evolution of data transfer accelerate technology transfer to boost economic and social development in the 4iR (Xing, 2017; Marwala et al., 2017)
Parallelization of the SUFI2 algorithm: a Windows HPC approach.
The Soil and Water Assessment Tool (SWAT) has been used for evaluating the effects of land use changes on water resources worldwide, and like many models, SWAT requires calibration. However, the execution time of these calibrations can be rather long, reducing the time available for proper analysis. This paper presents a Windows approach for calibrating SWAT using a multinodal cluster computer, composed of six computers with i7 processors (3.2 GHz; 12 cores), 8 GB RAM and 1 TB HDD each. The only requirement for this type of cluster is to have 64-bit processors. Our computers were set up with Windows Server HPC 2012 R2, a 10/100 network switch, and regular Ethernet cables. We used the SUFI2 algorithm that comes with the SWAT-CUP package to perform calibrations with 100 simulations at node level. Calibration runs were configured as follows: 1-12 processes (interval of 1) and 12-72 processes (interval of 12), resulting in 17 runs. Each run was repeated three times, and results are presented as the mean execution time, in order to minimize any influence of resource fluctuations. Results showed that execution time was reduced by almost half using nine processes (15 min) in comparison with the one-node control (28 min). We observed a linear decrease of execution time from one to nine processes. With additional processes, execution time increased by about 23% and stabilized at 80% of the control. All processing is divided into five steps: distribute files (2.24% of all processing time), organize samples (0.89%), run SWAT (47.59%), collect results (46.51%) and cleanup (0.28%)
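The reported U-shaped execution time (improvement up to nine processes, then degradation) is consistent with a simple cost model in which per-process coordination overhead eventually outweighs the shrinking simulation phase. Below is a hedged sketch of such a model; every constant is hypothetical and not fitted to the paper's measurements.

```python
import math

def wall_time(p, n_sims=100, t_sim=0.25, t_fixed=2.0, t_per_proc=0.35):
    """Crude wall-time model for distributing n_sims independent SWAT runs
    over p processes: a fixed setup cost, a coordination cost that grows
    with the number of processes, and a simulation phase that takes
    ceil(n_sims / p) rounds. All constants are hypothetical."""
    return t_fixed + t_per_proc * p + math.ceil(n_sims / p) * t_sim

def best_process_count(max_p=72):
    """Process count minimizing modelled wall time."""
    return min(range(1, max_p + 1), key=wall_time)
```

The model reproduces the qualitative behavior in the abstract: wall time falls steeply at first, reaches a minimum at an intermediate process count, and then rises as coordination overhead dominates.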