Consistent Second-Order Conic Integer Programming for Learning Bayesian Networks
Bayesian Networks (BNs) represent conditional probability relations among a
set of random variables (nodes) in the form of a directed acyclic graph (DAG),
and have found diverse applications in knowledge discovery. We study the
problem of learning the sparse DAG structure of a BN from continuous
observational data. The central problem can be modeled as a mixed-integer
program with an objective function composed of a convex quadratic loss function
and a regularization penalty subject to linear constraints. The optimal
solution to this mathematical program is known to have desirable statistical
properties under certain conditions. However, the state-of-the-art optimization
solvers are not able to obtain provably optimal solutions to the existing
mathematical formulations for medium-size problems within reasonable
computational times. To address this difficulty, we tackle the problem from
both computational and statistical perspectives. On the one hand, we propose a
concrete early stopping criterion to terminate the branch-and-bound process in
order to obtain a near-optimal solution to the mixed-integer program, and
establish the consistency of this approximate solution. On the other hand, we
improve the existing formulations by replacing the linear "big-M" constraints
that represent the relationship between the continuous and binary indicator
variables with second-order conic constraints. Our numerical results
demonstrate the effectiveness of the proposed approaches
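
The abstract above does not give the constraints themselves, so the following is only a sketch of how such a replacement is commonly written. The symbols are assumptions for illustration: $\beta_{jk}$ is the continuous weight of arc $(j,k)$, $g_{jk}$ its binary indicator, $M$ a sufficiently large constant, and $s_{jk}$ an auxiliary variable.

```latex
% Linear big-M link between the continuous and binary variables:
% g_{jk} = 0 forces \beta_{jk} = 0; g_{jk} = 1 leaves |\beta_{jk}| \le M.
-M\, g_{jk} \;\le\; \beta_{jk} \;\le\; M\, g_{jk}, \qquad g_{jk} \in \{0,1\}

% Second-order conic (rotated cone) link with auxiliary s_{jk}:
% g_{jk} = 0 forces \beta_{jk}^2 \le 0, hence \beta_{jk} = 0;
% g_{jk} = 1 gives \beta_{jk}^2 \le s_{jk}.
\beta_{jk}^{2} \;\le\; s_{jk}\, g_{jk}, \qquad s_{jk} \ge 0, \quad g_{jk} \in \{0,1\}
```

In a formulation of this kind, a quadratic penalty on $\beta_{jk}$ can be expressed through $s_{jk}$, and no constant $M$ has to be chosen for the conic link.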
An overlay architecture for throughput optimal multipath routing
Legacy networks are often designed to operate with simple single-path routing, like shortest-path, which is known to be throughput suboptimal. On the other hand, previously proposed throughput optimal policies (i.e., backpressure) require every device in the network to make dynamic routing decisions. In this work, we study an overlay architecture for dynamic routing such that only a subset of devices (overlay nodes) need to make dynamic routing decisions. We determine the essential collection of nodes that must bifurcate traffic for achieving the maximum multicommodity network throughput. We apply our optimal node placement algorithm to several graphs and the results show that a small fraction of overlay nodes is sufficient for achieving maximum throughput. Finally, we propose a heuristic policy (OBP), which dynamically controls traffic bifurcations at overlay nodes. In all studied simulation scenarios, OBP not only achieves full throughput, but also reduces delay in comparison to the throughput optimal backpressure routing.
United States. Air Force (Contract FA8721-05-C-0002); National Science Foundation (U.S.) (Grant CNS-0915988); United States. Office of Naval Research (Grant N00014-12-1-0064); United States. Army Research Office. Multidisciplinary University Research Initiative (Grant W911NF-08-1-0238); European Social Fund (WiNC Project of the Action:Supporting Postdoctoral Researchers
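
For context, the classical backpressure rule that this abstract uses as its baseline can be sketched as below. This is not the OBP policy, which the abstract does not specify. The function name, the queue layout and the example backlogs are hypothetical, chosen only for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical layout: node -> commodity -> queue backlog (packets).
Queues = Dict[str, Dict[str, int]]


def backpressure_choice(queues: Queues, src: str, dst: str) -> Tuple[Optional[str], int]:
    """Pick the commodity to serve on link src->dst.

    Classical backpressure: serve the commodity with the largest positive
    differential backlog (backlog at src minus backlog at dst). If no
    differential is positive, the link stays idle (None, 0).
    """
    best_commodity: Optional[str] = None
    best_weight = 0
    for commodity, backlog in sorted(queues[src].items()):
        diff = backlog - queues[dst].get(commodity, 0)
        if diff > best_weight:
            best_commodity, best_weight = commodity, diff
    return best_commodity, best_weight


# Illustrative backlogs (made up for this sketch).
queues = {
    "a": {"c1": 5, "c2": 9},
    "b": {"c1": 1, "c2": 8},
}
forward = backpressure_choice(queues, "a", "b")   # differentials: c1 = 4, c2 = 1
backward = backpressure_choice(queues, "b", "a")  # all differentials negative
```

Every node has to run this rule on each of its links in every time slot, and that per-device requirement is what the overlay architecture in the abstract aims to relax.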
Achieving target equilibria in network routing games without knowing the latency functions
The analysis of network routing games typically assumes precise, detailed information about the latency functions. Such information may, however, be unavailable or difficult to obtain. Moreover, one is often primarily interested in enforcing a desired target flow as an equilibrium. We ask whether one can achieve target flows as equilibria without knowing the underlying latency functions. We give a crisp positive answer to this question. We show that one can efficiently compute edge tolls that induce a given target multicommodity flow in a nonatomic routing game using a polynomial number of queries to an oracle that takes tolls as input and outputs the resulting equilibrium flow. This result is obtained via a novel application of the ellipsoid method, and extends to various other settings. We obtain improved query-complexity bounds for series-parallel networks, and single-commodity routing games with linear latency functions. Our techniques provide new insights into network routing games
Communication Efficiency in Information Gathering through Dynamic Information Flow
This thesis addresses the problem of how to improve the performance of multi-robot information gathering tasks by actively controlling the rate of communication between robots. Examples of such tasks include cooperative tracking and cooperative environmental monitoring. Communication is essential in such systems for both decentralised data fusion and decision making, but wireless networks impose capacity constraints that are frequently overlooked. While existing research has focussed on improving available communication throughput, the aim in this thesis is to develop algorithms that make more efficient use of the available communication capacity. Since information may be shared at various levels of abstraction, another challenge is the decision of where information should be processed based on limits of the computational resources available. Therefore, the flow of information needs to be controlled based on the trade-off between communication limits, computation limits and information value. In this thesis, we approach the trade-off by introducing the dynamic information flow (DIF) problem. We suggest variants of DIF that either consider data fusion communication independently or both data fusion and decision making communication simultaneously. For the data fusion case, we propose efficient decentralised solutions that dynamically adjust the flow of information. For the decision making case, we present an algorithm for communication efficiency based on local LQ approximations of information gathering problems. The algorithm is then integrated with our solution for the data fusion case to produce a complete communication efficiency solution for information gathering. We analyse our suggested algorithms and present important performance guarantees. The algorithms are validated in a custom-designed decentralised simulation framework and through field-robotic experimental demonstrations
Parallelizing support vector machines for scalable image annotation
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them, Support Vector Machines (SVMs) are used extensively due to their generalization properties. However, SVM training is notably a computationally intensive process, especially when the training dataset is large.
In this thesis, distributed computing paradigms have been investigated to speed up SVM training by partitioning a large training dataset into small data chunks and processing each chunk in parallel, utilizing the resources of a cluster of computers. A resource aware parallel SVM algorithm is introduced for large scale image annotation using a cluster of computers. A genetic algorithm based load balancing scheme is designed to optimize the performance of the algorithm in heterogeneous computing environments.
SVM was initially designed for binary classifications. However, most classification problems arising in domains such as image annotation usually involve more than two classes. A resource aware parallel multiclass SVM algorithm for large scale image annotation in parallel using a cluster of computers is introduced.
The combination of classifiers leads to substantial reduction of classification error in a wide range of applications. Among them, SVM ensembles with bagging are shown to outperform a single SVM in terms of classification accuracy. However, SVM ensemble training is notably a computationally intensive process, especially when the number of replicated samples based on bootstrapping is large. A distributed SVM ensemble algorithm for image annotation is introduced which re-samples the training data based on bootstrapping and trains an SVM on each sample in parallel using a cluster of computers.
The above algorithms are evaluated in both experimental and simulation environments, showing that the distributed SVM algorithm, distributed multiclass SVM algorithm, and distributed SVM ensemble algorithm reduce the training time significantly while maintaining a high level of accuracy in classifications
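
The bagging scheme described in this abstract (bootstrap re-sampling, one model trained per sample in parallel, predictions combined by vote) can be sketched as follows. This is a minimal illustration under loud assumptions: a nearest-centroid learner stands in for the SVM, a local thread pool stands in for the cluster, and all names and data are hypothetical.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def bootstrap(data, rng):
    """Draw len(data) points from data with replacement."""
    return [rng.choice(data) for _ in data]


def train_centroid(sample):
    """Stand-in learner (NOT an SVM): mean of the 1-D feature per class."""
    sums, counts = {}, Counter()
    for x, label in sample:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(model, x):
    """Return the class whose centroid is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))


def bagged_predict(models, x):
    """Majority vote over the ensemble members."""
    votes = Counter(predict(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Toy data, made up for this sketch: (feature, label) pairs.
data = [(0.0, "neg"), (0.2, "neg"), (0.4, "neg"),
        (1.6, "pos"), (1.8, "pos"), (2.0, "pos")]
rng = random.Random(0)
samples = [bootstrap(data, rng) for _ in range(5)]

# Each bootstrap sample is trained independently, so the work parallelises;
# the thread pool here is only a local stand-in for a cluster of machines.
with ThreadPoolExecutor(max_workers=5) as pool:
    models = list(pool.map(train_centroid, samples))

label = bagged_predict(models, 0.1)
```

Because the ensemble members share no state during training, the cost of adding more bootstrap replicates can be spread across workers, which is the property the thesis exploits.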