Distributive Network Utility Maximization (NUM) over Time-Varying Fading Channels
Distributed network utility maximization (NUM) has received increasing
interest over the past few years. Distributed solutions (e.g., the
primal-dual gradient method) have been intensively investigated under fading
channels. Since such distributed solutions involve iterative updating and
explicit message passing, it is unrealistic to assume that the wireless
channel remains unchanged during the iterations. Unfortunately, the behavior
of these distributed solutions under time-varying channels is in general unknown. In
this paper, we shall investigate the convergence behavior and tracking errors
of the iterative primal-dual scaled gradient algorithm (PDSGA) with dynamic
scaling matrices (DSC) for solving distributive NUM problems under time-varying
fading channels. We shall also study a specific application example, namely the
multi-commodity flow control and multi-carrier power allocation problem in
multi-hop ad hoc networks. Our analysis shows that the PDSGA converges to a
limit region rather than a single point under the finite state Markov chain
(FSMC) fading channels. We also show that the order of growth of the tracking
errors is given by O(T/N), where T and N are the update interval and the
average sojourn time of the FSMC, respectively. Based on this analysis, we
derive a low-complexity distributive adaptation algorithm for determining the
adaptive scaling matrices, which can be implemented distributively at each
transmitter. The numerical results show the superior performance of the
proposed dynamic scaling matrix algorithm over several baseline schemes, such
as the regular primal-dual gradient algorithm.
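As a concrete illustration of the update being analyzed, here is a minimal Python sketch of a scaled primal-dual gradient iteration on a toy NUM instance. It is not the paper's PDSGA: the channel (and hence the capacity vector c) is held static, the scaling matrix S is an identity placeholder rather than the dynamic matrices studied above, and the routing matrix A is made up for illustration.

```python
# A minimal sketch, not the paper's PDSGA: S is a fixed identity placeholder,
# the channel (hence the capacity vector c) is static, and A is a made-up
# routing matrix. The toy NUM problem is: maximize sum_i log(x_i) s.t. Ax <= c.
import numpy as np

def primal_dual_scaled_gradient(A, c, steps=2000, gamma=0.01):
    n_links, n_flows = A.shape
    x = np.ones(n_flows)        # primal variables: source rates
    lam = np.ones(n_links)      # dual variables: link prices
    S = np.eye(n_flows)         # scaling matrix (dynamic in the paper)
    for _ in range(steps):
        # primal ascent on the Lagrangian: utility gradient minus path price
        grad_x = 1.0 / x - A.T @ lam
        x = np.maximum(x + gamma * (S @ grad_x), 1e-6)
        # dual ascent: prices rise on links whose capacity is exceeded
        lam = np.maximum(lam + gamma * (A @ x - c), 0.0)
    return x, lam

A = np.array([[1.0, 1.0, 0.0],      # flows 1 and 2 share link 1
              [0.0, 1.0, 1.0]])     # flows 2 and 3 share link 2
x, lam = primal_dual_scaled_gradient(A, c=np.array([1.0, 2.0]))
print(x, lam)
```

Under the FSMC fading model of the paper, c would change between iterations, which is why the iterates track a limit region rather than converge to a single point.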
Block-Coordinate Frank-Wolfe Optimization for Structural SVMs
We propose a randomized block-coordinate variant of the classic Frank-Wolfe
algorithm for convex optimization with block-separable constraints. Despite its
lower iteration cost, we show that it achieves a similar convergence rate in
duality gap as the full Frank-Wolfe algorithm. We also show that, when applied
to the dual structural support vector machine (SVM) objective, this yields an
online algorithm that has the same low iteration complexity as primal
stochastic subgradient methods. However, unlike stochastic subgradient methods,
the block-coordinate Frank-Wolfe algorithm allows us to compute the optimal
step-size and yields a computable duality gap guarantee. Our experiments
indicate that this simple algorithm outperforms competing structural SVM
solvers.
Comment: Appears in Proceedings of the 30th International Conference on
Machine Learning (ICML 2013). 9 pages main text + 22 pages appendix. Changes
from v3 to v4: 1) Re-organized appendix; improved & clarified duality gap
proofs; re-drew all plots; 2) Changed convention for Cf definition; 3) Added
weighted averaging experiments + convergence results; 4) Clarified main text
and relationship with appendix.
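As a concrete illustration of the block-coordinate variant, the toy sketch below swaps in an assumed stand-in objective, f(x) = 0.5 ||x - b||^2 over a product of unit simplices, for the dual structural SVM objective. Each iteration touches one random block, calls that block's linear minimization oracle, and applies an exact closed-form line search:

```python
# A toy block-coordinate Frank-Wolfe, not the structural-SVM instantiation:
# the stand-in objective f(x) = 0.5 ||x - b||^2 is minimized over a product
# of unit simplices, one simplex per block. Each iteration updates a single
# random block, so the per-iteration cost is one oracle call on that block.
import numpy as np

def bcfw(b, n_blocks, block_dim, iters=2000, seed=0):
    rng = np.random.default_rng(seed)
    x = np.full((n_blocks, block_dim), 1.0 / block_dim)  # feasible start
    b = b.reshape(n_blocks, block_dim)
    for _ in range(iters):
        i = rng.integers(n_blocks)        # pick one block at random
        g = x[i] - b[i]                   # gradient restricted to block i
        s = np.zeros(block_dim)           # linear minimization oracle over
        s[np.argmin(g)] = 1.0             # the simplex: the best vertex
        d = s - x[i]                      # Frank-Wolfe direction
        denom = d @ d                     # exact line search for quadratic f:
        t = np.clip(-(g @ d) / denom, 0.0, 1.0) if denom > 0 else 0.0
        x[i] += t * d                     # update block i only
    return x.ravel()

print(bcfw(np.array([0.2, 0.8, 0.5, 0.5]), n_blocks=2, block_dim=2))
```

For the structural SVM dual, the oracle call becomes loss-augmented decoding, and the analogous closed-form line search yields the optimal step size mentioned in the abstract.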
Block Belief Propagation for Parameter Learning in Markov Random Fields
Traditional learning methods for training Markov random fields require doing
inference over all variables to compute the likelihood gradient. The iteration
complexity for those methods therefore scales with the size of the graphical
models. In this paper, we propose \emph{block belief propagation learning}
(BBPL), which uses block-coordinate updates of approximate marginals to compute
approximate gradients, removing the need to run inference over the entire
graphical model. Thus, the iteration complexity of BBPL does not scale with the
size of the graphs. We prove that the method converges to the same solution as
that obtained by using full inference per iteration, despite these
approximations, and we empirically demonstrate its scalability improvements
over standard training methods.
Comment: Accepted to AAAI 201
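The toy sketch below illustrates the BBPL idea under heavy simplifying assumptions (a binary pairwise MRF on a cycle with a single shared agreement parameter theta, synthetic data, and plain loopy belief propagation; none of this is the authors' code). Each learning step refreshes BP messages on a random block of messages only, then takes a gradient step using the current, possibly stale, beliefs:

```python
# A schematic toy, not the authors' code: binary pairwise MRF on a cycle with
# one shared "agreement" parameter theta. Each learning step refreshes BP
# messages on a random block only, then uses the current (possibly stale)
# edge beliefs to form the approximate gradient: empirical - model counts.
import numpy as np

n, S = 8, 2
edges = [(i, (i + 1) % n) for i in range(n)]          # cycle graph
nbrs = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
msgs = {}
for u, v in edges:
    msgs[(u, v)] = np.ones(S) / S
    msgs[(v, u)] = np.ones(S) / S

def psi(theta):                                       # edge potential:
    return np.exp(theta * np.eye(S))                  # favors agreement

def bp_block_update(theta, block):
    for u, v in block:                                # refresh block only
        inc = np.ones(S)
        for w in nbrs[u]:
            if w != v:
                inc = inc * msgs[(w, u)]
        m = psi(theta).T @ inc
        msgs[(u, v)] = m / m.sum()

def edge_belief(theta, u, v):
    bu, bv = np.ones(S), np.ones(S)
    for w in nbrs[u]:
        if w != v:
            bu = bu * msgs[(w, u)]
    for w in nbrs[v]:
        if w != u:
            bv = bv * msgs[(w, v)]
    b = psi(theta) * np.outer(bu, bv)
    return b / b.sum()

rng = np.random.default_rng(0)
data = rng.integers(0, S, size=(50, n))               # synthetic samples
emp = np.mean([x[u] == x[v] for x in data for u, v in edges])

theta, lr = 0.0, 0.5
directed = list(msgs)
for step in range(200):
    idx = rng.choice(len(directed), size=4, replace=False)
    bp_block_update(theta, [directed[j] for j in idx])
    model = np.mean([np.trace(edge_belief(theta, u, v)) for u, v in edges])
    theta += lr * (emp - model)                       # approximate gradient
print(theta)                                          # near 0 for i.i.d. data
```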
Efficient Multi-Template Learning for Structured Prediction
Conditional random fields (CRFs) and structural support vector machines
(structural SVMs) are two state-of-the-art methods for structured prediction,
which capture the interdependencies among output variables. The success of
these methods is attributed to the fact that their discriminative models can
account for overlapping features over the whole input observation. These
features are usually generated by applying a given set of templates to labeled
data, but improper templates may lead to degraded performance. To alleviate
this issue, in this paper we propose a novel multiple-template learning
paradigm that learns the structured prediction model and the importance of
each template simultaneously, so that hundreds of arbitrary templates can be
added to the learning model without special caution. This paradigm can be
formulated as a special multiple kernel learning problem with an exponential
number of constraints. We then introduce an efficient cutting-plane algorithm
to solve this problem in the primal and establish its convergence. We also
evaluate the proposed learning paradigm on two widely studied structured
prediction tasks, \emph{i.e.}, sequence labeling and dependency parsing.
Extensive experimental results show that the proposed method outperforms CRFs
and structural SVMs by exploiting the importance of each template. Our
complexity analysis and empirical results also show that the proposed method
is more efficient than OnlineMKL on very sparse and high-dimensional data. We
further extend this paradigm to structured prediction using generalized
$p$-block norm regularization with $p > 1$, and experiments show competitive
performances when $p$ …
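To make the cutting-plane pattern for exponentially constrained problems concrete, here is a generic Python sketch on an assumed stand-in problem rather than the paper's MKL formulation: minimizing $\|Ax - b\|_1$, written as min t subject to $t \ge s^\top (Ax - b)$ for all $2^m$ sign vectors $s$. Each round solves the restricted problem over the cuts generated so far and adds the most violated constraint:

```python
# A generic cutting-plane sketch, not the paper's algorithm or its MKL
# objective: minimize ||Ax - b||_1 via  min t  s.t.  t >= s.(Ax - b)  for all
# 2^m sign vectors s -- exponentially many constraints, of which only the
# most violated one is generated per round.
import numpy as np
from scipy.optimize import linprog

def cutting_plane_l1(A, b, tol=1e-8, max_rounds=50):
    m, n = A.shape
    cuts = [np.where(b > 0, -1.0, 1.0)]      # initial cut from x = 0
    bounds = [(None, None)] * n + [(0, None)]
    cost = np.zeros(n + 1); cost[-1] = 1.0   # objective: minimize t
    x = np.zeros(n)
    for _ in range(max_rounds):
        # restricted master problem over the cuts found so far
        A_ub = np.array([np.append(s @ A, -1.0) for s in cuts])
        b_ub = np.array([s @ b for s in cuts])
        res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
        x, t = res.x[:n], res.x[-1]
        r = A @ x - b
        s = np.where(r >= 0, 1.0, -1.0)      # most violated sign vector
        if s @ r <= t + tol:                 # no violated cut remains: done
            break
        cuts.append(s)
    return x

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([1.0, 2.0, 2.5])
print(cutting_plane_l1(A, b))                # an L1-regression estimate
```

Termination is guaranteed because each round either stops or adds a sign vector not yet in the active set, and there are finitely many of them; in practice far fewer than $2^m$ cuts are needed, which is the point of the approach.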