Faster Approximate Multicommodity Flow Using Quadratically Coupled Flows
The maximum multicommodity flow problem is a natural generalization of the
maximum flow problem to route multiple distinct flows. Obtaining an
approximation to the multicommodity flow problem on graphs is a well-studied
problem. In this paper we present an adaptation of recent advances in
single-commodity flow algorithms to this problem. As the underlying linear
systems in the electrical problems of multicommodity flow problems are no
longer Laplacians, our approach is tailored to generate specialized systems
which can be preconditioned and solved efficiently using Laplacians. Given an
undirected graph with m edges and k commodities, we give algorithms that find
approximate solutions to the maximum concurrent flow problem and
the maximum weighted multicommodity flow problem in time
$\tilde{O}(m^{4/3}\,\mathrm{poly}(k,\epsilon^{-1}))$
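For reference, the maximum concurrent flow problem named in this abstract can be written as the program below. This is the standard textbook formulation, not quoted from the paper. The symbols $u_e$ (capacity of edge $e$), $d_i$ (demand of commodity $i$), $(s_i, t_i)$ (its source and sink) and $f_i$ (its flow) are notation assumed here for illustration.

```latex
\begin{aligned}
\max\ & \lambda \\
\text{s.t.}\ & f_i \text{ routes } \lambda d_i \text{ units from } s_i \text{ to } t_i, && i = 1,\dots,k, \\
& \sum_{i=1}^{k} |f_i(e)| \le u_e, && e \in E.
\end{aligned}
```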
SCOR: Software-defined Constrained Optimal Routing Platform for SDN
A Software-defined Constrained Optimal Routing (SCOR) platform is introduced
as a Northbound interface in SDN architecture. It is based on constraint
programming techniques and is implemented in MiniZinc modelling language. Using
constraint programming techniques in this Northbound interface has created an
efficient tool for implementing complex Quality of Service routing applications
in a few lines of code. The code includes only the problem statement and the
solution is found by a general solver program. A routing framework is
introduced based on SDN's architecture model which uses SCOR as its Northbound
interface and an upper layer of applications implemented in SCOR. The performance
of a few implemented routing applications is evaluated on different network
topologies, network sizes, and numbers of concurrent flows.
Comment: 19 pages, 11 figures, 11 algorithms, 3 tables
Sequential and Parallel Algorithms for Mixed Packing and Covering
Mixed packing and covering problems are problems that can be formulated as
linear programs using only non-negative coefficients. Examples include
multicommodity network flow, the Held-Karp lower bound on TSP, fractional
relaxations of set cover, bin-packing, knapsack, scheduling problems,
minimum-weight triangulation, etc. This paper gives approximation algorithms
for the general class of problems. The sequential algorithm is a simple greedy
algorithm that can be implemented to find an epsilon-approximate solution in
O(epsilon^-2 log m) linear-time iterations. The parallel algorithm does
comparable work but finishes in polylogarithmic time.
The results generalize previous work on pure packing and covering (the
special case when the constraints are all "less-than" or all "greater-than") by
Michael Luby and Noam Nisan (1993) and Naveen Garg and Jochen Konemann (1998)
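To make the problem class concrete, the following Python sketch checks whether a candidate vector is an epsilon-approximate solution of a mixed packing and covering instance. It assumes a normalized form with all right-hand sides equal to 1 and a relaxation on the packing side only (x >= 0, Px <= 1 + eps, Cx >= 1); the paper's exact definition may differ. The function name `is_eps_feasible` is hypothetical, and the code only verifies a candidate; it is not the greedy algorithm from the paper.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def is_eps_feasible(P: Matrix, C: Matrix, x: Sequence[float], eps: float) -> bool:
    """Return True if x >= 0, every packing row satisfies (P x)_i <= 1 + eps,
    and every covering row satisfies (C x)_j >= 1.

    Assumption: the instance is normalized so all right-hand sides are 1.
    """
    if any(v < 0 for v in x):
        return False

    def dot(row: Sequence[float]) -> float:
        return sum(a * b for a, b in zip(row, x))

    packing_ok = all(dot(row) <= 1 + eps for row in P)
    covering_ok = all(dot(row) >= 1 for row in C)
    return packing_ok and covering_ok
```

For example, with one packing row `[1, 1]` and covering rows `[2, 0]` and `[0, 2]`, the point `[0.5, 0.5]` meets every constraint exactly, while `[0.6, 0.6]` violates the packing row for `eps = 0.1`.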
GAMER: a GPU-Accelerated Adaptive Mesh Refinement Code for Astrophysics
We present the newly developed code, GAMER (GPU-accelerated Adaptive MEsh
Refinement code), which has adopted a novel approach to improve the performance
of adaptive mesh refinement (AMR) astrophysical simulations by a large factor
with the use of the graphics processing unit (GPU). The AMR implementation is
based on a hierarchy of grid patches with an oct-tree data structure. We adopt
a three-dimensional relaxing TVD scheme for the hydrodynamic solver, and a
multi-level relaxation scheme for the Poisson solver. Both solvers have been
implemented on the GPU, so that hundreds of patches can be advanced in parallel.
The computational overhead associated with the data transfer between CPU and
GPU is carefully reduced by utilizing the capability of asynchronous memory
copies on the GPU, and the time spent computing ghost-zone values for each patch
is hidden by overlapping it with the GPU computations. We demonstrate
the accuracy of the code by performing several standard test problems in
astrophysics. GAMER is a parallel code that can be run in a multi-GPU cluster
system. We measure the performance of the code by performing purely-baryonic
cosmological simulations in different hardware implementations, in which
detailed timing analyses provide comparison between the computations with and
without GPU acceleration. Maximum speed-up factors of 12.19 and 10.47 are
demonstrated using 1 GPU with 4096^3 effective resolution and 16 GPUs with
8192^3 effective resolution, respectively.
Comment: 60 pages, 22 figures, 3 tables. More accuracy tests are included.
Accepted for publication in ApJ
Optimized shunting with mixed-usage tracks
We consider the planning of railway freight classification at hump yards, where the problem
involves the formation of departing freight train blocks from arriving trains subject to
scheduling and capacity constraints. The hump yard layout considered consists of arrival
tracks of sufficient length at an arrival yard, a hump, classification tracks of non-uniform
and possibly non-sufficient length at a classification yard, and departure tracks of sufficient
length. To increase yard capacity, freight cars arriving early can be stored temporarily
on specific mixed-usage tracks. The entire hump yard planning process is covered in this
paper, and heuristics for arrival and departure track assignment, as well as hump scheduling,
have been included to provide the necessary input data. However, the central problem
considered is the classification track allocation problem. This problem has previously
been modeled using direct mixed integer programming models, but this approach did not
yield lower bounds of sufficient quality to prove optimality. Later attempts focused on
a column generation approach based on branch-and-price that could solve problem instances
of industrial size. Building upon the column generation approach we introduce
a direct arc-based integer programming model, where the arcs are precedence relations
between blocks on the same classification track. Further, the most promising models
are adapted for rolling-horizon planning. We evaluate the methods on historical data
from the Hallsberg shunting yard in Sweden. The results show that the new arc-based
model performs as well as the column generation approach. It returns an optimal schedule
within the execution time limit for all instances but one, and executes as fast
as the column generation approach. Further, the short execution times of the column
generation approach and the arc-indexed model make them suitable for rolling-horizon
planning, while the direct mixed integer program proved to be too slow for this.
Extended analysis of the results shows that mixing was required only when the maximum
number of concurrent trains on the classification yard exceeded 29 (there are 32 available
tracks), and that beyond this point the number of extra car roll-ins increased sharply
Probabilistic Graphical Models on Multi-Core CPUs using Java 8
In this paper, we discuss software design issues related to the development
of parallel computational intelligence algorithms on multi-core CPUs, using the
new Java 8 functional programming features. In particular, we focus on
probabilistic graphical models (PGMs) and present the parallelisation of a
collection of algorithms that deal with inference and learning of PGMs from
data, namely maximum likelihood estimation, importance sampling, and greedy
search for solving combinatorial optimisation problems. Through these concrete
examples, we tackle the problem of defining efficient data structures for PGMs
and parallel processing of same-size batches of data sets using Java 8
features. We also provide straightforward techniques to code parallel
algorithms that seamlessly exploit multi-core processors. The experimental
analysis, carried out using our open source AMIDST (Analysis of MassIve Data
STreams) Java toolbox, shows the merits of the proposed solutions.
Comment: Pre-print version of the paper presented in the special issue on
Computational Intelligence Software at IEEE Computational Intelligence
Magazine journal
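The batch-parallel pattern described in this abstract (process same-size batches independently, then combine the partial results) can be sketched as follows for maximum likelihood estimation of a univariate Gaussian. This is an illustrative Python analogue, not the paper's Java 8 code and not the AMIDST API; the names `batch_stats`, `split_batches` and `gaussian_mle` are hypothetical, and the input is assumed to be non-empty.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

Stats = Tuple[int, float, float]  # (count, sum, sum of squares)


def batch_stats(batch: Sequence[float]) -> Stats:
    """Sufficient statistics of one batch; batches are independent."""
    return (len(batch), sum(batch), sum(v * v for v in batch))


def split_batches(data: Sequence[float], size: int) -> List[Sequence[float]]:
    """Cut the data into batches of `size` items (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def gaussian_mle(data: Sequence[float], batch_size: int = 2,
                 workers: int = 4) -> Tuple[float, float]:
    """Map batches to sufficient statistics in a pool, then reduce them
    into the MLE of the mean and the (biased) variance."""
    batches = split_batches(data, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(batch_stats, batches))
    n = sum(p[0] for p in partial)
    total = sum(p[1] for p in partial)
    total_sq = sum(p[2] for p in partial)
    mean = total / n
    variance = total_sq / n - mean * mean
    return mean, variance
```

Because the per-batch statistics are simply added together, the result does not depend on how the data is split. In CPython a thread pool mainly illustrates the map-then-combine structure; a process pool would be needed for CPU-bound speed-ups.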
Faster Algorithms for Weighted Recursive State Machines
Pushdown systems (PDSs) and recursive state machines (RSMs), which are
linearly equivalent, are standard models for interprocedural analysis. Yet RSMs
are more convenient as they (a) explicitly model function calls and returns,
and (b) specify many natural parameters for algorithmic analysis, e.g., the
number of entries and exits. We consider a general framework where RSM
transitions are labeled from a semiring and path properties are algebraic with
semiring operations, which can model, e.g., interprocedural reachability and
dataflow analysis problems.
Our main contributions are new algorithms for several fundamental problems.
As compared to a direct translation of RSMs to PDSs and the best-known existing
bounds of PDSs, our analysis algorithm improves the complexity for
finite-height semirings (which subsume reachability and standard dataflow
properties). We further consider the problem of extracting distance values from
the representation structures computed by our algorithm, and give efficient
algorithms that distinguish the complexity of a one-time preprocessing from the
complexity of each individual query. Another advantage of our algorithm is that
our improvements carry over to the concurrent setting, where we improve the
best-known complexity for the context-bounded analysis of concurrent RSMs.
Finally, we provide a prototype implementation that gives a significant
speed-up on several benchmarks from the SLAM/SDV project
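The idea of labeling transitions from a semiring and computing path properties with semiring operations can be illustrated on a plain labeled graph. The sketch below is a simple fixed-point iteration, not the paper's algorithm: it ignores function calls and returns entirely, and the names `Semiring` and `path_values` are hypothetical. It assumes the iteration terminates, which holds for finite-height semirings such as Boolean reachability, and for the min-plus semiring when weights are non-negative.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Generic, Hashable, List, Tuple, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Semiring(Generic[T]):
    plus: Callable[[T, T], T]   # combines the values of alternative paths
    times: Callable[[T, T], T]  # extends a path value by one edge label
    zero: T                     # identity of plus: "no path"
    one: T                      # identity of times: "empty path"


def path_values(edges: List[Tuple[Hashable, Hashable, T]],
                source: Hashable, sr: Semiring) -> Dict[Hashable, T]:
    """Combine, with `plus`, the `times`-products of edge labels along
    paths from `source` to every node, by iterating to a fixed point."""
    nodes = {source} | {u for u, _, _ in edges} | {v for _, v, _ in edges}
    val = {n: sr.zero for n in nodes}
    val[source] = sr.one
    changed = True
    while changed:
        changed = False
        for u, v, label in edges:
            new = sr.plus(val[v], sr.times(val[u], label))
            if new != val[v]:
                val[v] = new
                changed = True
    return val


# Two example semirings: shortest distance and plain reachability.
MIN_PLUS = Semiring(plus=min, times=lambda a, b: a + b,
                    zero=float("inf"), one=0)
BOOLEAN = Semiring(plus=lambda a, b: a or b, times=lambda a, b: a and b,
                   zero=False, one=True)
```

Swapping `MIN_PLUS` for `BOOLEAN` changes the question from "how far" to "whether reachable" without touching `path_values`, which is the appeal of the algebraic framing.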