2,142 research outputs found
A Tutorial on Clique Problems in Communications and Signal Processing
Since its first use by Euler on the problem of the seven bridges of
Königsberg, graph theory has shown excellent abilities in solving and
unveiling the properties of multiple discrete optimization problems. The study
of the structure of some integer programs reveals equivalence with graph theory
problems making a large body of the literature readily available for solving
and characterizing the complexity of these problems. This tutorial presents a
framework for utilizing a particular graph theory problem, known as the clique
problem, for solving communications and signal processing problems. In
particular, the paper aims to illustrate the structural properties of integer
programs that can be formulated as clique problems through multiple examples in
communications and signal processing. To that end, the first part of the
tutorial provides various optimal and heuristic solutions for the maximum
clique, maximum weight clique, and k-clique problems. The tutorial further
illustrates the use of the clique formulation through numerous contemporary
examples in communications and signal processing, mainly in maximum access for
non-orthogonal multiple access networks, throughput maximization using index
and instantly decodable network coding, collision-free radio frequency
identification networks, and resource allocation in cloud-radio access
networks. Finally, the tutorial sheds light on the recent advances of such
applications, and provides technical insights on ways of dealing with mixed
discrete-continuous optimization problems
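
To make the maximum clique problem named in the abstract above concrete, here is a minimal sketch of an exact search (Bron-Kerbosch with pivoting plus a simple size bound). It is a textbook technique chosen for illustration, not necessarily one of the solvers the tutorial presents; the function name `max_clique` and the toy graph are assumptions.

```python
def max_clique(adj):
    """Return one maximum clique of an undirected graph, as a sorted list.

    adj maps each vertex to the set of its neighbours (illustrative format).
    """
    best = []

    def expand(r, p, x):
        # r: current clique, p: candidates that extend r, x: already-explored vertices
        nonlocal best
        if not p and not x:
            # r is a maximal clique; keep it if it is the largest seen so far
            if len(r) > len(best):
                best = list(r)
            return
        # Bound: even taking every candidate cannot beat the incumbent
        if len(r) + len(p) <= len(best):
            return
        # Pivot on the vertex with the most neighbours among the candidates
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return sorted(best)


# Toy graph (assumed for illustration): vertices 1-4 are pairwise adjacent, 5 hangs off 4
graph = {
    1: {2, 3, 4},
    2: {1, 3, 4},
    3: {1, 2, 4},
    4: {1, 2, 3, 5},
    5: {4},
}
print(max_clique(graph))
```

The search is exponential in the worst case, which is consistent with the hardness of the problem; heuristic variants trade this guarantee for speed.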
Solving Hard Computational Problems Efficiently: Asymptotic Parametric Complexity 3-Coloring Algorithm
Many practical problems in almost all scientific and technological
disciplines have been classified as computationally hard (NP-hard or even
NP-complete). In life sciences, combinatorial optimization problems frequently
arise in molecular biology, e.g., genome sequencing; global alignment of
multiple genomes; identifying siblings; or discovery of dysregulated pathways. In
almost all of these problems, there is a need to prove a hypothesis about
a certain property of an object that is present only when the object adopts some
particular admissible structure (an NP-certificate) and absent otherwise (no admissible
structure). However, none of the standard approaches can discard the hypothesis
when no solution can be found, since none can provide a proof that there is no
admissible structure. This article presents an algorithm that introduces a
novel type of solution method to "efficiently" solve the graph 3-coloring
problem; an NP-complete problem. The proposed method provides certificates
(proofs) in both cases: present or absent, so it is possible to accept or
reject the hypothesis on the basis of a rigorous proof. It provides exact
solutions and is polynomial-time (i.e., efficient), although parametric. The only
requirement is sufficient computational power, which is controlled by the
parameter . Nevertheless, here it is proved that the
probability of requiring a value of to obtain a solution for a
random graph decreases exponentially: , making
tractable almost all problem instances. Thorough experimental analyses were
performed. The algorithm was tested on random graphs, planar graphs and
4-regular planar graphs. The obtained experimental results are in accordance
with the theoretically expected results. Comment: Working pape
Dynamic load balancing for the distributed mining of molecular structures
In molecular biology, it is often desirable to find common properties in large numbers of drug candidates. One family of
methods stems from the data mining community, where algorithms to find frequent graphs have received increasing attention over the
past years. However, the computational complexity of the underlying problem and the large amount of data to be explored essentially
render sequential algorithms useless. In this paper, we present a distributed approach to the frequent subgraph mining problem to
discover interesting patterns in molecular compounds. This problem is characterized by a highly irregular search tree, whereby no
reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely, a dynamic
partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated
load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer
Institute's HIV-screening data set, where we were able to show close-to-linear speedup in a network of workstations. The proposed
approach also allows for dynamic resource aggregation in a non-dedicated computational environment. These features make it suitable
for large-scale, multi-domain, heterogeneous environments, such as computational grids
Importance sampling strategy for non-convex randomized block-coordinate descent
As the number of samples and dimensionality of optimization problems related
to statistics and machine learning explode, block coordinate descent algorithms
have gained popularity since they reduce the original problem to several
smaller ones. Coordinates to be optimized are usually selected randomly
according to a given probability distribution. We introduce an importance
sampling strategy that helps randomized coordinate descent algorithms to focus
on blocks that are still far from convergence. The framework applies to
problems composed of the sum of two possibly non-convex terms, one being
separable and non-smooth. We have compared our algorithm to a full gradient
proximal approach as well as to a randomized block coordinate algorithm that
considers uniform sampling and cyclic block coordinate descent. Experimental
evidence shows the clear benefit of using an importance sampling strategy
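
The idea of sampling blocks that are still far from convergence can be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: each block is a single coordinate, the smooth term is a convex quadratic (the paper also covers non-convex terms), the non-smooth separable term is an L1 penalty, and the sampling weight of a coordinate is the size of its most recent update plus a small floor. The names `importance_cd`, `score`, and `floor` are hypothetical.

```python
import random


def soft_threshold(z, t):
    """Proximal operator of t * |.|, the separable non-smooth term."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def importance_cd(Q, b, lam, iters=2000, floor=1e-3, seed=0):
    """Minimise 0.5*x'Qx - b'x + lam*||x||_1 by randomized coordinate descent.

    Coordinates are drawn with probability proportional to the magnitude of
    their last update (an assumed proxy for distance from convergence).
    """
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n
    score = [1.0] * n  # start uniform so every coordinate gets visited
    for _ in range(iters):
        # The floor keeps stale zero scores from starving a coordinate
        weights = [s + floor for s in score]
        i = rng.choices(range(n), weights=weights)[0]
        # Partial gradient of the smooth part with respect to x[i]
        grad_i = sum(Q[i][j] * x[j] for j in range(n)) - b[i]
        # Proximal coordinate step with step size 1 / Q[i][i]
        new_xi = soft_threshold(x[i] - grad_i / Q[i][i], lam / Q[i][i])
        score[i] = abs(new_xi - x[i])
        x[i] = new_xi
    return x


# Small assumed test problem with a positive-definite Q
Q = [[2.0, 0.5], [0.5, 1.0]]
b = [1.0, 1.0]
x = importance_cd(Q, b, lam=0.1)
print(x)
```

Replacing `weights` with a constant list recovers uniform sampling, which gives a simple baseline for comparison.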
High performance subgraph mining in molecular compounds
Structured data represented in the form of graphs arises in
several fields of science, and the growing amount of available data makes distributed graph mining techniques particularly relevant. In this paper, we present a distributed approach to the frequent subgraph mining
problem to discover interesting patterns in molecular compounds. The problem is characterized by a highly irregular search tree, whereby no reliable workload prediction is available. We describe the three main
aspects of the proposed distributed algorithm, namely a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated load balancing
algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute's HIV-screening dataset, where the approach attains close-to-linear speedup in a network
of workstations
Statistical mechanics of the vertex-cover problem
We review recent progress in the study of the vertex-cover problem (VC). VC
belongs to the class of NP-complete graph theoretical problems, which plays a
central role in theoretical computer science. On ensembles of random graphs, VC
exhibits a coverable-uncoverable phase transition. Very close to this
transition, depending on the solution algorithm, easy-hard transitions in the
typical running time of the algorithms occur.
We explain a statistical mechanics approach, which works by mapping VC to a
hard-core lattice gas, and then applying techniques like the replica trick or
the cavity approach. Using these methods, the phase diagram of VC could be
obtained exactly for connectivities , where VC is replica symmetric.
Recently, this result could be confirmed using traditional mathematical
techniques. For , the solution of VC exhibits full replica symmetry
breaking.
The statistical mechanics approach can also be used to study analytically the
typical running time of simple complete and incomplete algorithms for VC.
Finally, we describe recent results for VC when studied on other ensembles of
finite- and infinite-dimensional graphs. Comment: review article, 26 pages, 9 figures, to appear in J. Phys. A: Math.
Ge
Non-Uniform Stochastic Average Gradient Method for Training Conditional Random Fields
We apply stochastic average gradient (SAG) algorithms for training
conditional random fields (CRFs). We describe a practical implementation that
uses structure in the CRF gradient to reduce the memory requirement of this
linearly-convergent stochastic gradient method, propose a non-uniform sampling
scheme that substantially improves practical performance, and analyze the rate
of convergence of the SAGA variant under non-uniform sampling. Our experimental
results reveal that our method often significantly outperforms existing methods
in terms of the training objective, and performs as well as or better than
optimally-tuned stochastic gradient methods in terms of test error. Comment: AI/Stats 2015, 24 page