The H2 Control Problem for Quadratically Invariant Systems with Delays
This paper gives a new solution to the output feedback H2 problem for
quadratically invariant communication delay patterns. A characterization of all
stabilizing controllers satisfying the delay constraints is given and the
decentralized H2 problem is cast as a convex model matching problem. The main
result shows that the model matching problem can be reduced to a
finite-dimensional quadratic program. A recursive state-space method for
computing the optimal controller based on vectorization is given.
Comment: Draft submitted to IEEE Transactions on Automatic Contro
Memory-constrained Vectorization and Scheduling of Dataflow Graphs for Hybrid CPU-GPU Platforms
The increasing use of heterogeneous embedded systems with multi-core CPUs and
Graphics Processing Units (GPUs) presents important challenges in effectively
exploiting pipeline, task and data-level parallelism to meet throughput
requirements of digital signal processing (DSP) applications. Moreover, in the
presence of system-level memory constraints, hand optimization of code to
satisfy these requirements is inefficient and error-prone, and can therefore
greatly slow down development time or result in highly underutilized processing
resources. In this paper, we present vectorization and scheduling methods to
effectively exploit multiple forms of parallelism for throughput optimization
on hybrid CPU-GPU platforms, while conforming to system-level memory
constraints. The methods operate on synchronous dataflow representations, which
are widely used in the design of embedded systems for signal and information
processing. We show that our novel methods can significantly improve system
throughput compared to previous vectorization and scheduling approaches under
the same memory constraints. In addition, we present a practical case-study of
applying our methods to significantly improve the throughput of an orthogonal
frequency division multiplexing (OFDM) receiver system for wireless
communications.
Comment: 25 page
Convex Global 3D Registration with Lagrangian Duality
The registration of 3D models by a Euclidean transformation is a fundamental task at the core of many applications in computer vision. This problem is non-convex due to the presence of rotational constraints, making traditional local optimization methods prone to getting stuck in local minima. This paper addresses finding the globally optimal transformation in various 3D registration problems by a unified formulation that integrates common geometric registration modalities (namely point-to-point, point-to-line and point-to-plane). This formulation renders the optimization problem independent of both the number and nature of the correspondences.
The main novelty of our proposal is the introduction of a strengthened Lagrangian dual relaxation for this problem, which surpasses previous similar approaches [32] in effectiveness.
In fact, although there are no theoretical guarantees, exhaustive empirical evaluation in both synthetic and real experiments always resulted in a tight relaxation that allowed us to recover a guaranteed globally optimal solution by exploiting duality theory.
Thus, our approach allows for effectively solving the 3D registration problem with global optimality guarantees while running in a fraction of the time required by the state-of-the-art alternative [34], which is based on a more computationally intensive Branch and Bound method.
Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech
Higher-Order Low-Rank Regression
This paper proposes an efficient algorithm (HOLRR) to handle regression tasks
where the outputs have a tensor structure. We formulate the regression problem
as the minimization of a least squares criterion under a multilinear rank
constraint, a difficult nonconvex problem. HOLRR efficiently computes an
approximate solution of this problem, with solid theoretical guarantees. A
kernel extension is also presented. Experiments on synthetic and real data show
that HOLRR outperforms multivariate and multilinear regression methods and is
considerably faster than existing tensor methods.
Comment: submitted to ICML 201
Polly's Polyhedral Scheduling in the Presence of Reductions
The polyhedral model provides a powerful mathematical abstraction to enable
effective optimization of loop nests with respect to a given optimization goal,
e.g., exploiting parallelism. Unexploited reduction properties are a frequent
reason for polyhedral optimizers to assume parallelism-prohibiting dependences.
To our knowledge, no polyhedral loop optimizer available in any production
compiler provides support for reductions. In this paper, we show that
leveraging the parallelism of reductions can lead to a significant performance
increase. We give a precise, dependence-based definition of reductions and
discuss ways to extend polyhedral optimization to exploit the associativity and
commutativity of reduction computations. We have implemented a
reduction-enabled scheduling approach in the Polly polyhedral optimizer and
evaluate it on the standard Polybench 3.2 benchmark suite. We were able to
detect and model all 52 arithmetic reductions and achieve speedups of up to
2.21x on a quad-core machine by exploiting the multidimensional
reduction in the BiCG benchmark.
Comment: Presented at the IMPACT15 worksho
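The property the abstract above relies on can be illustrated with a small sketch. It is not Polly's implementation: `chunked_sum` and its `workers` parameter are hypothetical names, and the example only shows that an associative, commutative reduction can be split into independent partial reductions that are combined afterwards.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked_sum(values, workers=4):
    """Sum `values` by reducing independent chunks, then combining the partials.

    Hypothetical illustration: because + is associative and commutative,
    the chunks can be reduced in any order (or concurrently).
    """
    if not values:
        return 0
    # Ceiling division so that at most `workers` chunks are produced.
    size = -(-len(values) // workers)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each partial sum carries no dependence on the other chunks.
        partials = list(pool.map(sum, chunks))
    # Final combine step over the partial results.
    return sum(partials)


if __name__ == "__main__":
    data = list(range(1000))
    print(chunked_sum(data) == sum(data))
```

For integers the reordered result is exact. For floating-point values, reassociation can change rounding, so a compiler typically needs permission to treat such a reduction as reorderable.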
Particle-in-Cell Laser-Plasma Simulation on Xeon Phi Coprocessors
This paper concerns the development of a high-performance implementation of the
Particle-in-Cell method for plasma simulation on Intel Xeon Phi coprocessors.
We discuss the suitability of the method for the Xeon Phi architecture and
present our experience of porting and optimizing the existing parallel
Particle-in-Cell code PICADOR. Direct porting with no code modification gives
performance on Xeon Phi close to that of an 8-core CPU on a benchmark problem
with 50 particles per cell. We demonstrate the step-by-step application of
optimization techniques, such as improving data locality, enhancing
parallelization efficiency, and vectorization, which leads to a 3.75x speedup
on CPU and 7.5x on Xeon Phi. The optimized version achieves 18.8 ns per
particle update on an Intel Xeon E5-2660 CPU and 9.3 ns per particle update on
an Intel Xeon Phi 5110P. On a real problem of laser ion acceleration in targets
with surface grating, which requires a large number of macroparticles per cell,
the speedup of Xeon Phi compared to CPU is 1.6x.
Comment: 16 pages, 3 figure
An algorithm for the optimization of finite element integration loops
We present an algorithm for the optimization of a class of finite element
integration loop nests. This algorithm, which exploits fundamental mathematical
properties of finite element operators, is proven to achieve a locally optimal
operation count. In specified circumstances the optimum achieved is global.
Extensive numerical experiments demonstrate significant performance
improvements over the state of the art in finite element code generation in
almost all cases. This validates the effectiveness of the algorithm presented
here, and illustrates its limitations.
Vectorization of quantum operations and its use
We give a detailed exposition of the "vectorized" notation for dealing with
quantum operations. This notation is used to highlight the relationships
between representations of completely-positive dynamics. Vectorization
considerably simplifies the analysis of different methods of quantum process
tomography, and enables us to derive a compact representation of the investigated
quantum operations in terms of the resulting data.
Comment: 12 pages, long-overdue modifications + corrections following the
feedbac
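A minimal sketch of what "vectorized" notation usually means may help here. It assumes the column-stacking convention, under which vec(A X B) = (B^T ⊗ A) vec(X); the paper itself may use a different stacking order. The helper names (`matmul`, `kron`, `vec`, and so on) are illustrative and not taken from the paper.

```python
def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def kron(a, b):
    """Kronecker product a ⊗ b."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def vec(m):
    """Stack the columns of m into one flat list (assumed convention)."""
    return [m[i][j] for j in range(len(m[0])) for i in range(len(m))]


def matvec(m, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    X = [[0, 1], [1, 0]]
    B = [[2, 0], [1, 3]]
    lhs = vec(matmul(matmul(A, X), B))           # vec(A X B)
    rhs = matvec(kron(transpose(B), A), vec(X))  # (B^T ⊗ A) vec(X)
    print(lhs == rhs)
```

Under this convention, a map of the form X -> A X B acts on vec(X) as a single matrix, which is why a linear operation on matrices can be written as an ordinary matrix acting on a vector.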
What's In A Patch, I: Tensors, Differential Geometry and Statistical Shading Analysis
We develop a linear algebraic framework for the shape-from-shading problem,
because tensors arise when scalar (e.g. image) and vector (e.g. surface normal)
fields are differentiated multiple times. The work is in two parts. In this
first part we investigate when image derivatives exhibit invariance to changing
illumination by calculating the statistics of image derivatives under general
distributions on the light source. We computationally validate the hypothesis
that image orientations (derivatives) provide increased invariance to
illumination by showing (for a Lambertian model) that a shape-from-shading
algorithm matching gradients instead of intensities provides more accurate
reconstructions when illumination is incorrectly estimated under a flatness
prior.
Continuous-Time Inverse Quadratic Optimal Control Problem
In this paper, the problem of finite horizon inverse optimal control (IOC) is
investigated, where the quadratic cost function of a dynamic process is
required to be recovered based on the observation of optimal control sequences.
We give the first complete necessary and sufficient condition for the
existence of corresponding LQ cost functions. In feasible cases, the
analytic expression of the whole solution space is derived and the equivalence
of weighting matrices in LQ problems is discussed. For infeasible problems, an
infinite-dimensional convex problem is formulated to obtain a best-fit
approximate solution with minimal control residual. The optimality
condition is then solved within a static quadratic programming framework to
facilitate the computation. Finally, numerical simulations are used to
demonstrate the effectiveness and feasibility of the proposed methods.
Comment: 16 pages, 2 figure